An assessment of genome annotation coverage across the bacterial tree of life


Abstract:

Although gene-finding in bacterial genomes is relatively straightforward, the automated assignment of gene function is still challenging, resulting in a vast quantity of hypothetical sequences of unknown function. But how prevalent are hypothetical sequences across bacteria, what proportion of genes in different bacterial genomes remain unannotated, and what factors affect annotation completeness? To address these questions, we surveyed over 27 000 bacterial genomes from the Genome Taxonomy Database, and measured genome annotation completeness as a function of annotation method, taxonomy, genome size, 'research bias' and publication date. Our analysis revealed that 52% and 79% of the average bacterial proteome could be functionally annotated based on protein and domain-based homology searches, respectively. Annotation coverage using protein homology search varied significantly from as …

Publication year:

2020

Keywords:

    Source:

    google

    Document type:

    Other

    Status:

    Open access

    Knowledge areas:

    • Microbiology

    Dewey subject areas:

    • Biology
    • Biochemistry
    • Microorganisms, fungi and algae
    Processed with AI

    Sustainable Development Goals:

    • SDG 9: Industry, innovation and infrastructure
    • SDG 17: Partnerships for the goals
    • SDG 4: Quality education
    Processed with AI

    Contributors: