Crowdsourced corpus with entity salience annotations

Abstract:

In this paper, we present a crowdsourced dataset that adds entity salience (importance) annotations to the Reuters-128 dataset, a subset of Reuters-21578. The dataset is distributed under a free license and published in the NLP Interchange Format, which fosters interoperability and reuse. We demonstrate the potential of the dataset on the task of learning an entity salience classifier and report the results of several experiments.

Year of publication:

2016

Keywords:

Named entities
Entity importance
Text analysis
Entity salience
Document aboutness

Source:

Scopus

Document type:

Conference Object

Status:

Restricted access

Knowledge areas:

Machine learning
Computer science

Dewey subject areas:

Operations of libraries and archives
Philosophical miscellany
Linguistics

Processed with AI

Sustainable Development Goals:

SDG 17: Partnerships for the goals
SDG 10: Reduced inequalities
SDG 9: Industry, innovation and infrastructure

Processed with AI

Contributors:

Kliegr T.

Reddy D.

Dojchinovski M.

Tomas Vitvar

Sack H.