SINAI at PoliticEs 2022: Exploring Relative Frequency of Words in Stylometrics for Profile Discovery


Abstract:

In this article we summarise our participation in the PoliticEs task, entitled Spanish Author Profiling for Political Ideology, at the 2022 edition of the IberLEF evaluation forum. We proposed a Voting Classifier model that combines several classical classifiers, using as features stylometry measures together with embeddings obtained from a Spanish RoBERTa model for text representation. Our final system achieved an F1 score of 0.785 for Gender prediction, 0.753 for Profession, 0.784 for Ideology_Binary and 0.561 for Ideology_Multiclass, with a final macro-averaged F1 of 0.721. These results indicate that combining stylometric features with transformer embeddings can be useful for determining user profiles.
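The ideas named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy vocabulary, the placeholder embedding values and the use of hard (majority) voting are all assumptions made for illustration, since the abstract does not specify them. The sketch shows a relative word frequency feature, its concatenation with an embedding vector, a majority vote over classifier outputs, and the macro average of the four reported F1 scores.

```python
import re
from collections import Counter


def relative_word_frequencies(text, vocabulary):
    """For each word in `vocabulary`, return its count divided by the number of tokens in `text`."""
    tokens = re.findall(r"\w+", text.lower())
    total = len(tokens)
    if total == 0:
        return [0.0] * len(vocabulary)
    counts = Counter(tokens)
    return [counts[word] / total for word in vocabulary]


def combine_features(stylometric, embedding):
    """Concatenate stylometric features with a text embedding into one feature vector."""
    return list(stylometric) + list(embedding)


def majority_vote(predictions):
    """Hard voting (an assumption): return the label predicted by most classifiers."""
    return Counter(predictions).most_common(1)[0][0]


def macro_average(scores):
    """Unweighted mean of per-task scores."""
    return sum(scores) / len(scores)


# Hypothetical inputs, chosen only to make the sketch runnable.
vocabulary = ["de", "la", "que"]
text = "La casa de la abuela"
stylo = relative_word_frequencies(text, vocabulary)

# Placeholder values standing in for a Spanish RoBERTa embedding.
embedding = [0.1, -0.3]
features = combine_features(stylo, embedding)

# Labels that three hypothetical classical classifiers might output.
label = majority_vote(["left", "right", "left"])

# F1 scores reported in the abstract: Gender, Profession,
# Ideology_Binary, Ideology_Multiclass.
f1_macro = macro_average([0.785, 0.753, 0.784, 0.561])
print(round(f1_macro, 3))
```

The last step reproduces the arithmetic behind the reported figure: (0.785 + 0.753 + 0.784 + 0.561) / 4 = 0.72075, which rounds to 0.721.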

Publication year:

2022

Keywords:

  • Ensemble learning
  • Stylometry
  • Transformer model
  • Voting classifier
  • Author profiling

Source:

Scopus
Google

Document type:

Conference Object

Status:

Restricted access

Knowledge areas:

  • Data analysis
  • Communication

Subject areas:

  • Operations of libraries and archives
  • Social interaction
  • Linguistics