Automated Web Annotator of Biomedical Entities in Spanish Language


Abstract:

In Natural Language Processing (NLP) and supervised machine learning, the scarcity of labeled corpora leads to poor model performance. In the medical domain, labeled corpora are also scarcer in Spanish than in English. We propose a method to identify biomedical entities in Spanish-language clinical texts through automatic translation and word alignment. The source text (Spanish) is translated into the target language (English), the target text is labeled with automatic biomedical entity taggers, and the biomedical entities are then projected from the target text onto the corresponding spans of the source text using the word alignment generated during translation. This allows the source text to be annotated with English-language tools (automatic annotators). The result is an efficient method that processes and annotates biomedical entities in Spanish with high precision, since it integrates several automatic annotators in a single web system.
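
The projection step described in the abstract can be illustrated with a minimal sketch. It assumes that the translation, the entity tagging of the English text, and the word alignment have already been produced by external tools; the abstract does not name them, so here they are hard-coded. The function name `project_entities`, the token-index span format, and the `SYMPTOM` label are hypothetical choices made for illustration, not the authors' implementation.

```python
from typing import Dict, List, Tuple

# (start, end_exclusive, label) over token indices
Entity = Tuple[int, int, str]
# (source_token_index, target_token_index) pairs
Alignment = List[Tuple[int, int]]


def project_entities(
    source_tokens: List[str],
    target_entities: List[Entity],
    alignment: Alignment,
) -> List[Tuple[int, int, str, str]]:
    """Project entity spans from target (English) tokens onto source (Spanish) tokens.

    Each target span is mapped to the smallest contiguous source span that
    covers every source token aligned to a token inside the target span.
    Entities with no aligned source token are dropped.
    """
    # Index the alignment by target token for quick lookup.
    tgt_to_src: Dict[int, List[int]] = {}
    for src_idx, tgt_idx in alignment:
        tgt_to_src.setdefault(tgt_idx, []).append(src_idx)

    projected = []
    for start, end, label in target_entities:
        aligned = sorted(
            {s for t in range(start, end) for s in tgt_to_src.get(t, [])}
        )
        if not aligned:
            continue  # nothing to project onto
        lo, hi = aligned[0], aligned[-1] + 1
        projected.append((lo, hi, label, " ".join(source_tokens[lo:hi])))
    return projected


# Hypothetical example: all three inputs below are hand-written placeholders
# standing in for the output of a translator, an aligner, and a tagger.
source_tokens = ["El", "paciente", "presenta", "dolor", "abdominal", "agudo"]
target_tokens = ["The", "patient", "has", "acute", "abdominal", "pain"]
alignment = [(0, 0), (1, 1), (2, 2), (3, 5), (4, 4), (5, 3)]
target_entities = [(3, 6, "SYMPTOM")]  # "acute abdominal pain"

result = project_entities(source_tokens, target_entities, alignment)
print(result)
```

Taking the smallest covering span keeps the projected annotation contiguous even when word order differs between the two languages, as with the adjective placement in this example. A real system would also need a policy for unaligned or many-to-many tokens.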

Publication year:

2022

Keywords:

  • web system
  • biomedical entity
  • natural language processing (NLP)
  • automatic annotation
  • word alignment

Source:

Scopus

Document type:

Conference Object

Status:

Restricted access

Knowledge areas:

  • Machine learning
  • Biotechnology

Subject areas:

  • Medicine and health
  • Natural sciences and mathematics
  • Language