Regresar

An insight into structure and composition of the fig genome

Abstract:

Ficus carica L. is a diploid species, with a genome size of 0.36 pg/2C, still poorly characterized at genetic and genomic level. With the aim of analysing the fig genome structure, we used Illumina technology to produce 25.64 genome equivalents of 35-511 nt long MiSeq sequences and 12.96 genome equivalents of 25-100 nt long HiSeq paired-end reads. The two libraries were subject to a first assembly run separately, then a hybrid assembly was performed; finally, contigs and supercontigs were scaffolded. This first rough assembly is composed of 264,088 scaffolds, up to 41,760 nt in length, covering 323,708,138 nt, that corresponds to 87.5% of the fig genome, with N50 = 2,523. Masking the scaffolds with a transcriptome of Rosaceae, from which sequences related to repetitive elements were removed, allowed us to establish that coding genes account for at least 6.8% of the fig genome. Gene prediction analysis produced 44,419 putative genes. A sample of around 5,000 predicted genes were annotated with regard to gene ontology and function. Concerning the repetitive component, the fig genome resulted composed for around 58% of repeated sequences, of which none was especially redundant. Among identified repeats, the most represented were LTR-retrotransposons, with Gypsy elements more frequent than Copia.