Most genomes in CoGe are annotated only with publicly available gene models. The Freeling Lab has also added further annotation to a number of genomes. For reference, the specific genome versions that carry this extra annotation are listed here.
Conserved Noncoding Sequence Data
Conserved noncoding sequences are identified by comparing the non-exon regions surrounding orthologous genes in two species or homeologous genes within a single species. Most CNS datasets in CoGe were generated using the CNS Discovery Pipeline.
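As a rough illustration of what "non-exon regions surrounding a gene" means, the sketch below subtracts exon intervals from a window around a gene and returns what is left. It is not the CNS Discovery Pipeline's code: the function name `non_exon_regions`, the half-open coordinate convention, and the example coordinates are all assumptions made for the example.

```python
def non_exon_regions(window, exons):
    """Return the parts of `window` not covered by any exon.

    `window` is a (start, end) pair and `exons` is a list of (start, end)
    pairs; all intervals are half-open, [start, end). Hypothetical helper
    for illustration only.
    """
    start, end = window
    gaps = []
    cursor = start  # leftmost position not yet accounted for
    for ex_start, ex_end in sorted(exons):
        if ex_end <= cursor:
            continue  # exon lies entirely to the left of the cursor
        if ex_start >= end:
            break  # exon lies entirely to the right of the window
        if ex_start > cursor:
            gaps.append((cursor, ex_start))  # noncoding stretch before this exon
        cursor = max(cursor, ex_end)
    if cursor < end:
        gaps.append((cursor, end))  # noncoding stretch after the last exon
    return gaps


# Made-up coordinates: a 1 kb window with two overlapping exons and a third exon.
regions = non_exon_regions((0, 1000), [(200, 300), (250, 400), (700, 800)])
print(regions)
```

In a real comparison, the regions returned for each gene of an orthologous or homeologous pair would then be aligned against each other to look for conserved stretches.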
|Species||Data set group ID||Data set ID||Comparison||Method||Full Name|
|Arabidopsis||3||39598||Arabidopsis (homeologs)||Manual Annotations and the CNS Discovery Pipeline||Arabidopsis thaliana Col-0 (thale cress) (with CNS) masked repeats 50|
|Peach||8400||42478||Chocolate (orthologs)||CNS Discovery Pipeline||Prunus persica (peach) (with CNS) unmasked|
|Chocolate||10997||46486||Peach (orthologs)||CNS Discovery Pipeline||Theobroma cacao (chocolate) Belizian Criollo genotype (B97-61/B2) (with CNS)|
|Rice||11822||47668||Sorghum (orthologs)||CNS Discovery Pipeline||Oryza sativa japonica (Rice) (with CNS) masked repeats 50x|
|Sorghum||11821||47667||Rice (orthologs)||CNS Discovery Pipeline||Sorghum bicolor (with CNS) masked repeats 50x|
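If the IDs above are needed in a script, the table can be read into records keyed by species. The following is a minimal sketch that parses the rows exactly as laid out above; the helper name `parse_rows` is an assumption, and the code only handles rows with no empty first or last cell.

```python
# Rows copied verbatim from the table above.
TABLE = """\
|Species||Data set group ID||Data set ID||Comparison||Method||Full Name|
|Arabidopsis||3||39598||Arabidopsis (homeologs)||Manual Annotations and the CNS Discovery Pipeline||Arabidopsis thaliana Col-0 (thale cress) (with CNS) masked repeats 50|
|Peach||8400||42478||Chocolate (orthologs)||CNS Discovery Pipeline||Prunus persica (peach) (with CNS) unmasked|
|Chocolate||10997||46486||Peach (orthologs)||CNS Discovery Pipeline||Theobroma cacao (chocolate) Belizian Criollo genotype (B97-61/B2) (with CNS)|
|Rice||11822||47668||Sorghum (orthologs)||CNS Discovery Pipeline||Oryza sativa japonica (Rice) (with CNS) masked repeats 50x|
|Sorghum||11821||47667||Rice (orthologs)||CNS Discovery Pipeline||Sorghum bicolor (with CNS) masked repeats 50x|
"""


def parse_rows(text):
    """Split each '|a||b||c|' line into cells and pair them with the header."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    # Drop the outer pipes, then split on the double-pipe cell separator.
    header, *body = [ln.strip("|").split("||") for ln in lines]
    return [dict(zip(header, row)) for row in body]


rows = parse_rows(TABLE)
by_species = {row["Species"]: row for row in rows}

# Look up the data set ID for the rice genome that carries CNS annotation.
print(by_species["Rice"]["Data set ID"])
```

The IDs are kept as strings so they can be pasted into CoGe as they appear in the table.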