Efficient techniques to explore and rank paths in life science data sources
Abstract:
Life science data sources form a complex, link-driven federation of publicly available, Web-accessible sources. A fundamental need for scientists today is the ability to explore completely all relationships between scientific classes, e.g., genes and citations, that may be retrieved from various data sources. A challenge to such exploration is that each path between data sources potentially has different domain-specific semantics and yields a different benefit to the scientist. Thus, it is important to explore paths efficiently so as to generate those with the highest benefit. In this paper, we explore the search space of paths that satisfy queries expressed as regular expressions. We propose an algorithm, ESearch, that runs in time polynomial in the size of the graph when the graph is acyclic. We present expressions to determine the benefit of a path based on metadata (statistics). We develop a heuristic search, OnlyBestXX%. Finally, we compare OnlyBestXX% and ESearch. © Springer-Verlag 2004.
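To make the idea of regular-expression path queries concrete, the following is a minimal sketch, not the ESearch algorithm or the OnlyBestXX% heuristic from the paper. It naively enumerates every path in a small acyclic graph, keeps the paths whose sequence of class labels matches a regular expression, and ranks them by benefit. The source names, the one-letter class labels (g = gene, p = protein, c = citation), the link scores, and the choice to multiply scores along a path are all assumptions made for illustration.

```python
import re

# Hypothetical toy federation: each source has a one-letter class label.
SOURCE_CLASS = {
    "GeneDB": "g",
    "ProtDB": "p",
    "CiteA": "c",
    "CiteB": "c",
}

# Hypothetical links between sources with made-up benefit scores in (0, 1].
# The graph is acyclic, so the recursive walk below always terminates.
LINKS = {
    "GeneDB": {"ProtDB": 0.9, "CiteA": 0.4},
    "ProtDB": {"CiteA": 0.5, "CiteB": 0.8},
    "CiteA": {},
    "CiteB": {},
}


def matching_paths(start, pattern):
    """Return (benefit, path) pairs for paths from `start` whose class-label
    string fully matches `pattern`, sorted by decreasing benefit.

    Benefit is assumed here to be the product of link scores along the path.
    """
    regex = re.compile(pattern)
    results = []

    def walk(node, path, labels, benefit):
        path = path + [node]
        labels = labels + SOURCE_CLASS[node]
        if regex.fullmatch(labels):
            results.append((benefit, path))
        for nxt, score in LINKS[node].items():
            walk(nxt, path, labels, benefit * score)

    walk(start, [], "", 1.0)
    return sorted(results, key=lambda pair: pair[0], reverse=True)


# Query: from a gene source to a citation source, optionally via a protein source.
ranked = matching_paths("GeneDB", r"gp?c")
for benefit, path in ranked:
    print(f"{benefit:.2f}", " -> ".join(path))
```

This exhaustive walk can take exponential time on larger graphs, which is the cost that a polynomial-time algorithm for acyclic graphs or a benefit-driven heuristic is meant to avoid.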
Publication year:
2004
Keywords:
Source:
Document type:
Article
Status:
Restricted access
Knowledge areas:
- Data mining
- Computer science
Subject areas:
- Library and archive operations
- Technology (Applied sciences)
- Medicine and health