SINAI at PoliticEs 2022: Exploring Relative Frequency of Words in Stylometrics for Profile Discovery
Abstract:
In this article we summarise our participation in the PoliticEs task of the 2022 edition of the IberLEF evaluation forum. The task is entitled Spanish Author Profiling for Political Ideology. We proposed a Voting Classifier model that combines several classical classifiers, using as features stylometric measures together with embeddings obtained from a Spanish RoBERTa model for text representation. Our final system achieved an F1 score of 0.785 for Gender prediction, 0.753 for Profession, 0.784 for Ideology_Binary and 0.561 for Ideology_Multiclass, with a final macro-averaged F1 of 0.721. These results indicate that the combination of stylometric features can be useful in determining user profiles.
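As a quick check on the reported figures, the short sketch below recomputes the final macro average from the four per-trait F1 scores given in the abstract. It assumes the final score is the unweighted mean of the four traits, which is consistent with the numbers reported; the variable names are illustrative only.

```python
# Per-trait F1 scores as reported in the abstract.
f1_scores = {
    "Gender": 0.785,
    "Profession": 0.753,
    "Ideology_Binary": 0.784,
    "Ideology_Multiclass": 0.561,
}

# Assumption: the final score is the unweighted mean over the four traits.
macro_avg = sum(f1_scores.values()) / len(f1_scores)

print(round(macro_avg, 3))  # 0.721
```

The unrounded mean is 0.72075, which rounds to the 0.721 stated in the abstract.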
Publication year:
2022
Keywords:
- Ensemble learning
- Stylometry
- Transformer model
- Voting classifier
- Author profiling
Source:
Document type:
Conference Object
Status:
Restricted access
Knowledge areas:
- Data analysis
- Communication
Subject areas:
- Library and archive operations
- Social interaction
- Linguistics