Evaluating Named Entities Recognition (NER) tools vs algorithms adapted to the extraction of locations


Abstract:

Named-entity recognition (NER) is an important area in the field of text information extraction. In recent years, several NER tools have emerged, each with its own methodology for identifying different entity types and each supporting several languages. This variety makes it difficult for users to choose the right tool for a specific application. This work therefore evaluates an NER technique, via the SpaCy library, against an algorithm adapted to entity extraction (the Levenshtein algorithm). The evaluation focused on identifying location tags in Spanish-language texts. Based on the results, the appropriate method and its associated libraries or algorithms were identified. For this domain, the technique with the best performance metrics was SpaCy.
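As a rough illustration of how a Levenshtein-based approach to location extraction might work, the sketch below compares each token of a text against a small list of place names and accepts near matches. It is not the authors' implementation: the gazetteer contents, the `find_locations` helper, and the `max_distance` threshold are all assumptions made for this example, and multi-word place names are not handled.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance (insertions, deletions, substitutions) via dynamic programming."""
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


# Hypothetical gazetteer: the place names are illustrative only.
GAZETTEER = ["Quito", "Guayaquil", "Cuenca"]


def find_locations(text, gazetteer=GAZETTEER, max_distance=1):
    """Return (token, place) pairs whose edit distance is within max_distance.

    Assumption: single-token place names and a fixed distance threshold.
    """
    matches = []
    for token in text.split():
        word = token.strip(".,;:!?¡¿\"'()#@").lower()
        if not word:
            continue
        for place in gazetteer:
            if levenshtein(word, place.lower()) <= max_distance:
                matches.append((word, place))
                break
    return matches


if __name__ == "__main__":
    print(find_locations("Hay tráfico en Guayaquill y lluvia en quito."))
```

The threshold trades recall for precision: a larger `max_distance` tolerates more misspellings, which are common in short informal texts, but also produces more false matches on short words.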

Year of publication:

2021

Keywords:

  • Twitter
  • NER techniques
  • LEVENSHTEIN
  • Named entity recognition
  • information extraction

Source:

Scopus

Document type:

Conference Object

Status:

Restricted access

Knowledge areas:

  • Computer science

Dewey subject areas:

  • Computer programming, programs, data, security