Data mining for grammatical inference with bioinformatics criteria


Abstract:

This work describes a novel data mining process that combines association analysis with classical sequencing algorithms from genomics to generate the grammatical structures of a specific language. These structures are then converted to context-free grammars. The method initially applies to context-free languages and could be extended to other languages: structured programming languages, the language of the "book of life" expressed in the genome and proteome, and even natural languages. We used a compiler generator system to develop a practical application within the area of grammarware, where the concepts of language analysis are applied to other disciplines, such as bioinformatics. The tool automatically measures the complexity of the grammar obtained from textual data. © 2011 Elsevier Ltd. All rights reserved.
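As a rough illustration of turning recurring sequential patterns into context-free rules, the sketch below repeatedly replaces the most frequent adjacent symbol pair with a new nonterminal. This is a simple pair-replacement heuristic swapped in for illustration, not the method of the article; the function names, the `min_support` threshold, and the nonterminal names (`N1`, `N2`, ..., `S`) are all assumptions.

```python
from collections import Counter


def infer_grammar(tokens, min_support=2):
    """Build rules by replacing the most frequent adjacent pair (hypothetical helper).

    Assumes the input symbols never collide with the generated names N1, N2, ... or S.
    """
    seq = list(tokens)
    rules = {}
    while True:
        pairs = Counter(zip(seq, seq[1:]))
        if not pairs:
            break
        pair, count = pairs.most_common(1)[0]
        if count < min_support:
            break
        name = f"N{len(rules) + 1}"
        rules[name] = pair
        # Rewrite the sequence left to right, substituting the new nonterminal.
        out, i = [], 0
        while i < len(seq):
            if i + 1 < len(seq) and (seq[i], seq[i + 1]) == pair:
                out.append(name)
                i += 2
            else:
                out.append(seq[i])
                i += 1
        seq = out
    rules["S"] = tuple(seq)  # start symbol derives the compressed sequence
    return rules


def expand(symbol, rules):
    """Expand a symbol back into its terminal string."""
    if symbol not in rules:
        return [symbol]
    result = []
    for part in rules[symbol]:
        result.extend(expand(part, rules))
    return result


dna = "ACGTACGTAC"
grammar = infer_grammar(dna)
for head, body in grammar.items():
    print(head, "->", " ".join(body))
```

Expanding `S` reproduces the input string, so the rules form a grammar that derives exactly that sequence. The number of rules could serve as one crude size measure for such a grammar, though the article's own complexity measure is not specified here.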

Publication year:

2012

Keywords:

  • Context-free grammar
  • Grammatical inference
  • DNA
  • Bioinformatics
  • Sequential patterns

Source:

Scopus

Document type:

Article

Status:

Restricted access

Knowledge areas:

  • Data mining
  • Computer science

Subject areas:

  • Special computer methods