Observing lemmatization effect in LSA coherence and comprehension grading of learner summaries


Abstract:

Current work on learner evaluation in Intelligent Tutoring Systems (ITSs) is moving towards open-ended educational content diagnosis. One of the main difficulties of this approach is automatically understanding natural language. Our work aims to produce automatic evaluation of learner summaries in Basque. Therefore, in addition to language comprehension, difficulties emerge from Basque morphology itself. In this work, Latent Semantic Analysis (LSA) is used to model comprehension in a language in which lemmatization has been shown to be highly significant. This paper tests the influence of corpus lemmatization on automatic comprehension and coherence grading. Summaries graded by human judges for coherence and comprehension have been tested against LSA-based measures derived from lemmatized and non-lemmatized source corpora. After lemmatization, the number of single terms known to LSA was reduced by 56%. As a result, LSA grades almost match human measures, producing no significant differences between the lemmatized and non-lemmatized approaches. © Springer-Verlag Berlin Heidelberg 2006.

Publication year:

2006

Keywords:

    Source:

    Scopus

    Document type:

    Conference Object

    Status:

    Restricted access

    Knowledge areas:

      Subject areas:

      • Library and archive operations
      • Social interaction
      • Applied linguistics