Incorporation of discriminative n-grams to improve a phonotactic language recognizer based on i-vectors


Abstract:

This paper describes a novel technique for combining information from two phonotactic systems in order to improve an automatic language recognition system. The first system builds posteriorgram counts that are used to generate i-vectors. The second is a variation of the first that takes into account the most discriminative n-grams, selected according to how often they occur in one language compared with all other languages. The proposed technique yields a relative improvement of 8.63% in Cavg on the official set used for the ALBAYZIN 2012 LRE evaluation. © 2013 Sociedad Española Para el Procesamiento del Lenguaje Natural.
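The idea of ranking n-grams by their occurrence in one language relative to all others can be illustrated with a minimal Python sketch. The abstract does not give the exact ranking criterion, so the smoothed frequency ratio below is an assumption, as are the function names `ngrams` and `rank_discriminative` and the toy phone sequences. The sketch also uses hard n-gram counts from phone strings, whereas the paper works with posteriorgram counts.

```python
from collections import Counter


def ngrams(phones, n):
    """Return all contiguous n-grams of a phone sequence as tuples."""
    return [tuple(phones[i:i + n]) for i in range(len(phones) - n + 1)]


def rank_discriminative(data, target, n=2, eps=1e-6):
    """Rank n-grams by relative frequency in `target` vs. all other languages.

    `data` maps a language name to a list of phone sequences.
    The score (a smoothed frequency ratio) is an assumed criterion,
    not necessarily the one used in the paper.
    """
    tgt, rest = Counter(), Counter()
    for lang, seqs in data.items():
        bucket = tgt if lang == target else rest
        for seq in seqs:
            bucket.update(ngrams(seq, n))
    tgt_total = sum(tgt.values()) or 1
    rest_total = sum(rest.values()) or 1
    scores = {
        g: (tgt[g] / tgt_total + eps) / (rest[g] / rest_total + eps)
        for g in set(tgt) | set(rest)
    }
    # Highest score first: most characteristic of the target language.
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical toy data: two "languages" with made-up phone sequences.
toy = {
    "es": [["a", "b", "a", "b"], ["a", "b", "c"]],
    "en": [["c", "a", "c", "a"], ["b", "c", "a"]],
}
ranking = rank_discriminative(toy, "es")
```

In this toy setup the bigram `("a", "b")` ranks first for `"es"` because it occurs only in that language. A real system would keep the top-ranked n-grams per language and build its count vectors from them.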

Publication year:

2013

Keywords:

  • i-Vectors
  • N-grams
  • Phonotactic
  • Discriminative rankings
  • Posteriorgram

Source:

Google
Scopus

Document type:

Article

Status:

Restricted access

Knowledge areas:

  • Machine learning
  • Computer science

Subject areas:

  • Linguistics
  • Language