Transformers for Lexical Complexity Prediction in Spanish Language


Abstract:

In this article we present a contribution to the prediction of the complexity of single words in the Spanish language, founded on the combination of a large number of features of different types. We obtained the results by running fine-tuned Transformer-based models, built on the pretrained models BERT, XLM-RoBERTa, and RoBERTa-large-BNE, over several Spanish datasets, and by feeding their outputs to several regression algorithms. The evaluation showed that good performance was achieved, with a Mean Absolute Error (MAE) of 0.1598 and a Pearson correlation of 0.9883, obtained by training and evaluating the Random Forest Regressor algorithm on the fine-tuned BERT model. To achieve better lexical complexity prediction, we intend to continue experimenting with Spanish datasets and testing state-of-the-art Transformer models.
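As a rough illustration of the pipeline the abstract describes, the sketch below encodes each target word with a pretrained multilingual BERT, fits scikit-learn's Random Forest Regressor on the resulting embeddings, and reports MAE and Pearson correlation. The model name (bert-base-multilingual-cased), the toy sentences, and the subword pooling heuristic are illustrative assumptions, not the paper's exact feature set or experimental setup.

```python
# Minimal sketch: BERT embeddings of target words fed to a Random Forest
# regressor for lexical complexity prediction, evaluated with MAE and Pearson.
import numpy as np
import torch
from transformers import AutoTokenizer, AutoModel
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error
from scipy.stats import pearsonr

# Hypothetical toy data: (sentence, target word, gold complexity in [0, 1]).
data = [
    ("El tratado fue ratificado por el parlamento.", "ratificado", 0.45),
    ("El niño come una manzana.", "manzana", 0.05),
    ("La hipoteca fue amortizada anticipadamente.", "amortizada", 0.60),
    ("Ella abrió la puerta.", "puerta", 0.02),
]

# Assumption: a multilingual BERT stands in for the Spanish models used
# in the paper (BERT, XLM-RoBERTa, RoBERTa-large-BNE).
tokenizer = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
model = AutoModel.from_pretrained("bert-base-multilingual-cased")
model.eval()

def embed(sentence: str, word: str) -> np.ndarray:
    """Mean-pool the last hidden states of the target word's subword tokens."""
    enc = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**enc).last_hidden_state[0]  # (seq_len, hidden_size)
    # Locate the target word's subword span (a simple heuristic for this sketch).
    word_ids = tokenizer(word, add_special_tokens=False)["input_ids"]
    ids = enc["input_ids"][0].tolist()
    for i in range(len(ids) - len(word_ids) + 1):
        if ids[i:i + len(word_ids)] == word_ids:
            return hidden[i:i + len(word_ids)].mean(dim=0).numpy()
    return hidden.mean(dim=0).numpy()  # fall back to the sentence mean

X = np.stack([embed(s, w) for s, w, _ in data])
y = np.array([c for _, _, c in data])

reg = RandomForestRegressor(n_estimators=100, random_state=0)
reg.fit(X, y)  # a real experiment would evaluate on a held-out split

pred = reg.predict(X)
print("MAE:", mean_absolute_error(y, pred))
print("Pearson:", pearsonr(y, pred)[0])
```

In the paper's setting, the embedding step would instead use the fine-tuned Spanish models and a richer feature combination; the Random Forest stage and the MAE/Pearson evaluation follow the same pattern as above.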

Year of publication:

2022

Keywords:

  • Encodings
  • Prediction
  • Transformers
  • Lexical complexity

Source:

  • Google
  • Scopus

Document type:

Article

Status:

Restricted access

Knowledge areas:

  • Artificial intelligence
  • Computer science

Subject areas:

  • Linguistics
  • Language
  • Italian, Romanian and related languages