Transformers for Lexical Complexity Prediction in Spanish Language
Abstract:
In this article we present a contribution to the prediction of the complexity of single words in the Spanish language, based on the combination of a large number of features of different types. We obtained the results by running fine-tuned Transformer models, built on the pretrained BERT, XLM-RoBERTa, and RoBERTa-large-BNE checkpoints, over several Spanish datasets and several regression algorithms. The evaluation showed that good performance was achieved, with a Mean Absolute Error (MAE) of 0.1598 and a Pearson correlation of 0.9883 obtained by training and evaluating the Random Forest Regressor algorithm on features from the fine-tuned BERT model. As a possible path toward better lexical complexity prediction, we are very interested in continuing to experiment with Spanish datasets, testing state-of-the-art Transformer models.
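The abstract describes a two-stage pipeline: contextual embeddings from a pretrained Transformer feed a classical regressor that predicts a word's complexity score. The sketch below is a minimal illustration of that idea, not the paper's actual implementation: the checkpoint name (BETO, a Spanish BERT), the toy data, and the feature construction (mean-pooled sentence embedding concatenated with the target-word embedding) are all assumptions of this sketch.

```python
import numpy as np
import torch
from transformers import AutoModel, AutoTokenizer
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error
from scipy.stats import pearsonr

# Hypothetical toy data: (sentence, target word, gold complexity in [0, 1]).
data = [
    ("El tratado fue ratificado por el parlamento.", "ratificado", 0.55),
    ("El niño come una manzana roja.", "manzana", 0.10),
    ("La hipoteca fue amortizada anticipadamente.", "amortizada", 0.70),
    ("Vimos una película muy divertida.", "película", 0.15),
]

# Assumed checkpoint; the paper only names "BERT" among its pretrained models.
name = "dccuchile/bert-base-spanish-wwm-cased"
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModel.from_pretrained(name)
model.eval()

def embed(text: str) -> np.ndarray:
    """Mean-pooled last-hidden-state embedding of the input text."""
    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state  # (1, seq_len, 768)
    return hidden.mean(dim=1).squeeze(0).numpy()

# Features: sentence embedding concatenated with target-word embedding.
X = np.stack([np.concatenate([embed(s), embed(w)]) for s, w, _ in data])
y = np.array([c for _, _, c in data])

# Stage two: a Random Forest Regressor over the Transformer features.
reg = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)
pred = reg.predict(X)

# The paper reports MAE and Pearson correlation as its evaluation metrics.
print("MAE:", mean_absolute_error(y, pred))
print("Pearson:", pearsonr(y, pred)[0])
```

In a real experiment the regressor would be evaluated on a held-out split rather than the training data, and the encoder would first be fine-tuned on the complexity task, as the abstract indicates.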
Year of publication:
2022
Keywords:
- Encodings
- Prediction
- Transformers
- Lexical Complexity
Source:
Document type:
Article
Status:
Restricted access
Knowledge areas:
- Artificial intelligence
- Computer science
Subject areas:
- Linguistics
- Language
- Italian, Romanian and related languages