Automated Web Annotator of Biomedical Entities in Spanish Language
Abstract:
In Natural Language Processing (NLP) and supervised machine learning, the scarcity of labeled corpora leads to poorly performing models. In the medical domain, labeled corpora are also scarcer in Spanish than in English. We propose a method to identify biomedical entities in Spanish-language clinical texts through automatic translation and word alignment. The source text (Spanish) is first translated into the target language (English); the translated text is then labeled with automatic biomedical entity taggers; finally, the biomedical entities are projected from the target text onto the corresponding spans of the source text using the word alignment generated during translation. The goal is to annotate the source text using English-language tools (automatic annotators). The result is an efficient method capable of processing and annotating biomedical entities in Spanish with high precision, since it integrates several automatic annotators in a single web system.
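The projection step described in the abstract can be illustrated with a minimal sketch. It assumes that translation, word alignment, and English entity tagging have already run, and that their outputs are available as token lists, (source index, target index) pairs, and token-level entity spans. The function name `project_entities`, the example sentence, the alignment pairs, and the `DISEASE` label are all hypothetical and chosen for illustration. Taking the smallest contiguous source span that covers all aligned tokens is one common heuristic, not necessarily the one used by the authors.

```python
from collections import defaultdict


def project_entities(alignment, target_entities):
    """Project token-level entity spans from the target text onto the source text.

    alignment: iterable of (source_index, target_index) token pairs.
    target_entities: list of (start, end, label) over target tokens, end exclusive.
    Returns a list of (start, end, label) over source tokens, end exclusive.
    """
    # Index the alignment by target token so each entity can be looked up quickly.
    tgt_to_src = defaultdict(list)
    for src_i, tgt_i in alignment:
        tgt_to_src[tgt_i].append(src_i)

    projected = []
    for start, end, label in target_entities:
        # Collect every source token aligned to any token inside the entity.
        src_indices = [s for t in range(start, end) for s in tgt_to_src.get(t, [])]
        if not src_indices:
            continue  # no aligned source tokens: the entity cannot be projected
        # Heuristic (assumption): use the smallest contiguous covering span.
        projected.append((min(src_indices), max(src_indices) + 1, label))
    return projected


# Hypothetical example: Spanish source, English translation, and alignment.
source = ["El", "paciente", "presenta", "diabetes", "mellitus", "tipo", "2"]
target = ["The", "patient", "has", "type", "2", "diabetes", "mellitus"]
alignment = [(0, 0), (1, 1), (2, 2), (3, 5), (4, 6), (5, 3), (6, 4)]

# Hypothetical tagger output on the English text: "type 2 diabetes mellitus".
target_entities = [(3, 7, "DISEASE")]

spans = project_entities(alignment, target_entities)
for start, end, label in spans:
    print(label, " ".join(source[start:end]))
```

Because word order differs between the two languages, the projected span is computed from the aligned indices rather than copied from the target positions. Entities whose tokens have no alignment are dropped in this sketch; a real system might instead flag them for review.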
Publication year:
2022
Keywords:
- Web system
- Biomedical entity
- Natural language processing (NLP)
- Automatic annotation
- Word alignment
Source:
Document type:
Conference Object
Status:
Restricted access
Knowledge areas:
- Machine learning
- Biotechnology
Subject areas:
- Medicine and health
- Natural sciences and mathematics
- Language