A proposal of an entity name recognition algorithm to integrate governmental databases


Abstract:

Based on an analysis of existing name recognition techniques, a new algorithm is proposed to improve the efficiency of name matching across citizen registers. The case study begins by drawing two representative random samples that reflect real-life conditions. The first sample contains a wide variety of tourists' names from all over the world, collected from visitors to the Galápagos Islands over the past three years, around the time of the last population census. The second sample consists of the names of the islands' residents taken from the last census. The algorithm matches the sampled names against those of the citizens held in the database of the National Registration Identity Card Number Department of Ecuador in two steps. The first step separates the exact matches and identifies the top approximate name matches through a phonetic code comparison. The second step refines the results of the first using an edit distance technique. To provide evidence of the effectiveness of these steps, the accuracy and viability quality factors of the algorithm have been evaluated. The results are finally confirmed with a paired t-test, which determines whether those factors differ significantly before and after the execution of the new algorithm.
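
The two-step matching described in the abstract can be illustrated with a minimal sketch. The abstract does not name the phonetic code or the edit distance variant used, so this example assumes a simplified American Soundex and the classic Levenshtein distance. The function names, the `max_distance` threshold, and the sample register are hypothetical and are not taken from the paper.

```python
import unicodedata

# Soundex digit groups (assumption: the paper may use a different phonetic code).
CODES = {}
for letters, digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                       ("L", "4"), ("MN", "5"), ("R", "6")):
    for ch in letters:
        CODES[ch] = digit


def normalize(name):
    """Uppercase, strip accents, and collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", name)
    stripped = "".join(c for c in decomposed if not unicodedata.combining(c))
    return " ".join(stripped.upper().split())


def soundex(name):
    """Simplified four-character Soundex code of a name."""
    letters = [c for c in normalize(name) if "A" <= c <= "Z"]
    if not letters:
        return ""
    code = letters[0]
    prev = CODES.get(letters[0], "")
    for ch in letters[1:]:
        digit = CODES.get(ch, "")
        if digit and digit != prev:
            code += digit
        if ch not in "HW":  # H and W do not separate repeated codes
            prev = digit
    return (code + "000")[:4]


def levenshtein(a, b):
    """Classic edit distance (insertions, deletions, substitutions)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


def match_name(query, register, max_distance=2):
    """Step 1: exact matches, else phonetic candidates. Step 2: edit distance refinement."""
    q = normalize(query)
    exact = [r for r in register if normalize(r) == q]
    if exact:
        return exact
    key = soundex(q)
    candidates = [r for r in register if soundex(r) == key]
    scored = sorted((levenshtein(q, normalize(r)), r) for r in candidates)
    return [r for d, r in scored if d <= max_distance]


# Hypothetical register entries, for illustration only.
register = ["Gonzalez", "Gonzales", "Gomez", "Guzman"]
print(match_name("Gonsalez", register))
```

In this sketch the phonetic code acts as a cheap filter that shrinks the candidate set, and the edit distance then ranks the survivors, so the costlier comparison runs only on names that already sound alike.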

Year of publication:

2016

Keywords:

  • named entity recognition
  • name matching
  • natural language processing
  • phonetic matching techniques

Source:

Google
Scopus

Document type:

Conference Object

Status:

Restricted access

Knowledge areas:

  • Algorithm
  • Public administration

Subject areas:

  • Operation of libraries and archives