An assessment of genome annotation coverage across the bacterial tree of life
Abstract:
Although gene-finding in bacterial genomes is relatively straightforward, the automated assignment of gene function is still challenging, resulting in a vast quantity of hypothetical sequences of unknown function. But how prevalent are hypothetical sequences across bacteria, what proportion of genes in different bacterial genomes remain unannotated, and what factors affect annotation completeness? To address these questions, we surveyed over 27 000 bacterial genomes from the Genome Taxonomy Database, and measured genome annotation completeness as a function of annotation method, taxonomy, genome size, 'research bias' and publication date. Our analysis revealed that 52% and 79% of the average bacterial proteome could be functionally annotated based on protein and domain-based homology searches, respectively. Annotation coverage using protein homology search varied significantly from as …
Year of publication:
2020
Keywords:
Source:

Document type:
Other
Status:
Open access
Knowledge areas:
- Microbiology
Dewey subject areas:
- Biology
- Biochemistry
- Microorganisms, fungi and algae

Sustainable Development Goals:
- SDG 9: Industry, innovation and infrastructure
- SDG 17: Partnerships for the goals
- SDG 4: Quality education
