An Unsupervised Learning Approach for Automatically to Categorize Potential Suicide Messages in Social Media


Abstract:

In this paper, we present an unsupervised learning approach to categorizing potential suicide messages in social media. Our approach has five phases: the first two correspond to data acquisition and pre-processing, where texts available in a corpus for suicide detection were taken and converted into a structured format; in the third phase, the similarity between texts is computed using semantic similarity measures; in the fourth phase, traditional clustering algorithms were used to identify categories of potential suicide messages; and, in the last phase, we used validation metrics to verify how well our approach replicates the allocation of texts into categories found in the original corpus data. Computational results showed that our approach replicates the grouping of messages labeled 'No risk' and 'Risk' at average rates of 79 % and 87 %, and at rates of up to 13 % and 9 % in alert levels, for English and Spanish, respectively.
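
The phases described in the abstract can be illustrated with a minimal, standard-library sketch. It is not the authors' implementation, and every concrete choice in it is an assumption made for illustration: bag-of-words cosine similarity stands in for the semantic similarity measures, average-linkage agglomerative clustering stands in for the unnamed "traditional clustering algorithms", cluster purity stands in for the validation metrics, and the four labeled messages are invented toy data rather than entries from the corpus.

```python
import math
import re
from collections import Counter

# Phase 1 (acquisition): invented toy messages with hypothetical labels.
corpus = [
    ("I feel hopeless and want to end it all", "Risk"),
    ("I feel hopeless and alone tonight", "Risk"),
    ("great football game with friends today", "No risk"),
    ("watching a football game with friends", "No risk"),
]
labels = [label for _, label in corpus]

def preprocess(text):
    """Phase 2: lowercase and tokenize a message into a bag of words."""
    return Counter(re.findall(r"[a-z']+", text.lower()))

def cosine(a, b):
    """Phase 3 (assumed measure): cosine similarity of two bags of words."""
    dot = sum(count * b[word] for word, count in a.items() if word in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def agglomerate(sim, k):
    """Phase 4 (assumed algorithm): average-linkage clustering down to k groups."""
    clusters = [[i] for i in range(len(sim))]
    while len(clusters) > k:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                pairs = [(a, b) for a in clusters[i] for b in clusters[j]]
                avg = sum(sim[a][b] for a, b in pairs) / len(pairs)
                if best is None or avg > best[0]:
                    best = (avg, i, j)
        _, i, j = best
        clusters[i] += clusters.pop(j)  # merge the two most similar clusters
    return clusters

def purity(clusters, labels):
    """Phase 5 (assumed metric): share of messages in their cluster's majority label."""
    hits = sum(Counter(labels[i] for i in c).most_common(1)[0][1] for c in clusters)
    return hits / len(labels)

vectors = [preprocess(text) for text, _ in corpus]
sim = [[cosine(u, v) for v in vectors] for u in vectors]
clusters = agglomerate(sim, k=2)
score = purity(clusters, labels)
print(clusters, score)
```

On this toy input the two clusters coincide with the two labels, so the purity is 1.0. Real social media text would call for a proper semantic similarity measure and the validation metrics reported in the paper; the sketch only shows how the five phases fit together.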

Publication year:

2019

Keywords:

  • Suicide
  • Unsupervised learning
  • Corpus
  • Clustering
  • Automatic annotation
  • Social media

Source:

Google
Scopus

Document type:

Conference Object

Status:

Restricted access

Knowledge areas:

  • Psychopathology
  • Machine learning
  • Social networks

Subject areas:

  • Special computer methods
  • Social interaction
  • Diseases