Cacao Matina1-6 Genome v0.9

Analysis Name Cacao Matina1-6 Genome v0.9
Software Customized version of the Arachne2 assembler (in-house)
Source 454 Titanium STD reads (USDA-ARS, Stoneville)
Date performed 2021-03-28

Theobroma cacao Matina1-6 v0.9 Preliminary Release

From this page you can read about the genome assembly statistics and annotation protocol; browse and download the whole v0.9 genome sequence; and obtain the predicted gene transcripts, their locations, and their putative functions based on homology to known genes in Arabidopsis and Swiss-Prot. You can BLAST your sequences against the genome sequence and against the predicted gene transcripts and peptides. You can also search and browse the chromosomes, contigs, markers, and genes in GBrowse and view the evidence for each prediction and feature.

Genome v0.9 Statistics
The v0.9 preliminary release of the Theobroma cacao Matina1-6 genome consists of 1,782 supercontigs. The first ten supercontigs represent the ten pseudomolecules (chromosomes) and account for 92.1% of the genome. A total of 34,997 consensus genes are predicted in this release. These consensus genes are designated with Cacao Genome Database descriptors (e.g. cgd0000001 to cgd0034997). Submission of the genes to NCBI will likely follow the next release, at which point the genes will be designated with Theobroma cacao Matina descriptors (tcm).
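As an illustration of the descriptor scheme, the sketch below builds Cacao Genome Database descriptors from a gene number. It assumes, based only on the two examples given above, that a descriptor is the prefix "cgd" followed by a seven-digit zero-padded number between 1 and 34,997; the function name cgd_descriptor is hypothetical and not part of any released tool.

```python
# Assumption: descriptors are "cgd" + a 7-digit zero-padded gene number,
# inferred from the examples cgd0000001 and cgd0034997.
GENE_COUNT = 34997  # consensus genes predicted in the v0.9 release


def cgd_descriptor(gene_number: int) -> str:
    """Return the descriptor for a 1-based gene number (hypothetical helper)."""
    if not 1 <= gene_number <= GENE_COUNT:
        raise ValueError(f"gene number must be between 1 and {GENE_COUNT}")
    return f"cgd{gene_number:07d}"


if __name__ == "__main__":
    print(cgd_descriptor(1))
    print(cgd_descriptor(GENE_COUNT))
```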

More Information


Organisms Theobroma cacao
Cacao Genome Sequence (v0.9)
Supercontigs (FASTA file, 98MB compressed) Cacao_genome_v0.9_supercontigs.fa.gz
Supercontigs w/ masked repeats (FASTA file, 91MB compressed) Cacao_genome_v0.9_supercontigs_RM.fa.gz
Supercontigs (GFF3 file, 86MB compressed) Cacao_genome_v0.9_supercontigs.gff3.gz
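After downloading the supercontig FASTA file, a quick sanity check is to tally sequence lengths and see how much of the assembly falls in the first ten records. The following is a minimal sketch using only the Python standard library. It assumes the ten pseudomolecules are the first ten records in file order and that the fraction is measured in bases; neither is confirmed here, so the result may not match the reported figure exactly. The function names are illustrative.

```python
import gzip
from typing import Dict, Iterable


def sequence_lengths(lines: Iterable[str]) -> Dict[str, int]:
    """Map each FASTA record name to its sequence length, in file order."""
    lengths: Dict[str, int] = {}
    name = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first whitespace-delimited token of the header as the name.
            parts = line[1:].split()
            name = parts[0] if parts else ""
            lengths[name] = 0
        elif name is not None:
            lengths[name] += len(line)
    return lengths


def fraction_in_first(lengths: Dict[str, int], n: int = 10) -> float:
    """Fraction of all bases contained in the first n records."""
    total = sum(lengths.values())
    if total == 0:
        return 0.0
    return sum(list(lengths.values())[:n]) / total


def lengths_from_gzip(path: str) -> Dict[str, int]:
    """Read a gzip-compressed FASTA file such as the supercontig download."""
    with gzip.open(path, "rt") as handle:
        return sequence_lengths(handle)
```

For example, fraction_in_first(lengths_from_gzip("Cacao_genome_v0.9_supercontigs.fa.gz")) would return the share of bases in the first ten supercontigs, assuming the file is in the working directory.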

Cacao Genome Annotation (v0.9)
Predicted gene transcripts (FASTA file, 21MB compressed) Cacao_genome_v0.9_transcript.fa.gz
Predicted gene peptides (FASTA file, 9MB compressed) Cacao_genome_v0.9_peptide.fa.gz
Predicted gene models (GFF3 file, 19MB compressed) Cacao_genome_v0.9_gene.gff3.gz
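The gene model file can be summarized by counting features per type. The sketch below relies on the standard GFF3 layout (nine tab-separated columns, with the feature type in the third column and comment or directive lines starting with "#"). Whether the gene features in this particular file are labeled "gene" is an assumption, and the function name is illustrative.

```python
from collections import Counter
from typing import Iterable


def count_feature_types(lines: Iterable[str]) -> Counter:
    """Count GFF3 features by the type given in column 3."""
    counts: Counter = Counter()
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines, comments, and directives
        if line.startswith(">"):
            break  # an embedded FASTA section has started; no more features
        columns = line.split("\t")
        if len(columns) != 9:
            continue  # ignore malformed lines rather than guessing
        counts[columns[2]] += 1
    return counts
```

To apply it to the download, open the file with gzip.open("Cacao_genome_v0.9_gene.gff3.gz", "rt") and pass the handle to count_feature_types; the count for "gene" (if that is the label used) can then be compared with the number of predicted genes.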

BLASTX of cacao predicted genes vs. protein databases: best-hit reports in Excel format
UniProtKB/Swiss-Prot (14MB) Cacao_genome_v0.9_vs_sprot.xls
UniProtKB/TrEMBL (21MB) Cacao_genome_v0.9_vs_TrEMBL.xls
TAIR9 (Arabidopsis) proteins (28MB) Cacao_genome_v0.9_vs_arabidopsis.xls
Prunus persica (peach) v1.0 proteins (35MB) Cacao_genome_v0.9_vs_peach.xls
Vitis vinifera (grape) proteins (25MB) Cacao_genome_v0.9_vs_grape.xls
Populus trichocarpa (poplar) v2.0 proteins (26MB) Cacao_genome_v0.9_vs_poplar.xls