GFF ingestion

From CoGepedia

Revision as of 23:43, 7 July 2014

Figure: CoGe visualization of a genomic feature from the rice genome

CoGe translates many of the features in a standard GFF file (GFF3 specification: http://www.sequenceontology.org/resources/gff3.html) into genomic features in CoGe's database. For a basic protein-coding gene, CoGe tracks three major genomic features:

  • gene: the full extent of the transcribed unit including introns
  • mRNA: the spliced transcript
  • CDS: the regions that code for protein

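In the GFF3 entry below, the gene and mRNA features are collapsed into a single gene feature in CoGe, the exons are combined into one mRNA feature, and the CDS segments are combined into one CDS feature. The UTRs are skipped because they are redundant with the exons: each UTR lies within an exon, so its coordinates are already covered by the mRNA feature.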

Example GFF entry for a protein-coding gene from the rice genome (v7):

Chr1    MSU_osa1r7      gene    12648   15915   .       +       .       ID=LOC_Os01g01030;Name=LOC_Os01g01030;Note=monocopper%20oxidase%2C%20putative%2C%20expressed
Chr1    MSU_osa1r7      mRNA    12648   15915   .       +       .       ID=LOC_Os01g01030.1;Name=LOC_Os01g01030.1;Parent=LOC_Os01g01030
Chr1    MSU_osa1r7      exon    12648   13813   .       +       .       ID=LOC_Os01g01030.1:exon_1;Parent=LOC_Os01g01030.1
Chr1    MSU_osa1r7      exon    13906   14271   .       +       .       ID=LOC_Os01g01030.1:exon_2;Parent=LOC_Os01g01030.1
Chr1    MSU_osa1r7      exon    14359   14437   .       +       .       ID=LOC_Os01g01030.1:exon_3;Parent=LOC_Os01g01030.1
Chr1    MSU_osa1r7      exon    14969   15171   .       +       .       ID=LOC_Os01g01030.1:exon_4;Parent=LOC_Os01g01030.1
Chr1    MSU_osa1r7      exon    15266   15915   .       +       .       ID=LOC_Os01g01030.1:exon_5;Parent=LOC_Os01g01030.1
Chr1    MSU_osa1r7      five_prime_UTR  12648   12773   .       +       .       ID=LOC_Os01g01030.1:utr_1;Parent=LOC_Os01g01030.1
Chr1    MSU_osa1r7      CDS     12774   13813   .       +       .       ID=LOC_Os01g01030.1:cds_1;Parent=LOC_Os01g01030.1
Chr1    MSU_osa1r7      CDS     13906   14271   .       +       .       ID=LOC_Os01g01030.1:cds_2;Parent=LOC_Os01g01030.1
Chr1    MSU_osa1r7      CDS     14359   14437   .       +       .       ID=LOC_Os01g01030.1:cds_3;Parent=LOC_Os01g01030.1
Chr1    MSU_osa1r7      CDS     14969   15171   .       +       .       ID=LOC_Os01g01030.1:cds_4;Parent=LOC_Os01g01030.1
Chr1    MSU_osa1r7      CDS     15266   15359   .       +       .       ID=LOC_Os01g01030.1:cds_5;Parent=LOC_Os01g01030.1
Chr1    MSU_osa1r7      three_prime_UTR 15360   15915   .       +       .       ID=LOC_Os01g01030.1:utr_2;Parent=LOC_Os01g01030.1
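
The collapsing rules described above can be illustrated with a short Python sketch. This is not CoGe's actual loader; the names GFF_TO_COGE, SKIPPED, parse_attributes, and collapse are hypothetical and chosen only for illustration. The sketch assumes the input holds a single gene model with one transcript, and it uses only the first two exons and CDS segments of the rice entry to stay short.

```python
from collections import defaultdict
from urllib.parse import unquote

# Hypothetical mapping that mirrors the rules in the text:
# gene + mRNA -> gene, exons -> mRNA, CDS -> CDS. UTRs are skipped.
GFF_TO_COGE = {"gene": "gene", "mRNA": "gene", "exon": "mRNA", "CDS": "CDS"}
SKIPPED = {"five_prime_UTR", "three_prime_UTR"}


def parse_attributes(column9):
    """Split the ninth GFF3 column into a dict, decoding %XX escapes."""
    attrs = {}
    for pair in column9.strip().split(";"):
        if pair:
            key, _, value = pair.partition("=")
            attrs[key] = unquote(value)
    return attrs


def collapse(gff_lines):
    """Group the segments of ONE gene model by the CoGe feature they feed.

    Assumption: the input describes a single transcript, so segments are
    not grouped by Parent ID.
    """
    features = defaultdict(set)
    for line in gff_lines:
        if not line.strip() or line.startswith("#"):
            continue
        # GFF3 is tab-delimited; splitting on any whitespace also accepts
        # space-padded copies, provided the attributes contain no spaces.
        cols = line.split(None, 8)
        ftype, start, end = cols[2], int(cols[3]), int(cols[4])
        if ftype in SKIPPED:
            continue
        coge_type = GFF_TO_COGE.get(ftype)
        if coge_type is not None:
            # A set removes the duplicate span shared by gene and mRNA.
            features[coge_type].add((start, end))
    return {name: sorted(segs) for name, segs in features.items()}


# Subset of the rice entry (first two exons and CDS segments only).
EXAMPLE = """\
Chr1\tMSU_osa1r7\tgene\t12648\t15915\t.\t+\t.\tID=LOC_Os01g01030;Name=LOC_Os01g01030;Note=monocopper%20oxidase%2C%20putative%2C%20expressed
Chr1\tMSU_osa1r7\tmRNA\t12648\t15915\t.\t+\t.\tID=LOC_Os01g01030.1;Name=LOC_Os01g01030.1;Parent=LOC_Os01g01030
Chr1\tMSU_osa1r7\texon\t12648\t13813\t.\t+\t.\tID=LOC_Os01g01030.1:exon_1;Parent=LOC_Os01g01030.1
Chr1\tMSU_osa1r7\texon\t13906\t14271\t.\t+\t.\tID=LOC_Os01g01030.1:exon_2;Parent=LOC_Os01g01030.1
Chr1\tMSU_osa1r7\tfive_prime_UTR\t12648\t12773\t.\t+\t.\tID=LOC_Os01g01030.1:utr_1;Parent=LOC_Os01g01030.1
Chr1\tMSU_osa1r7\tCDS\t12774\t13813\t.\t+\t.\tID=LOC_Os01g01030.1:cds_1;Parent=LOC_Os01g01030.1
Chr1\tMSU_osa1r7\tCDS\t13906\t14271\t.\t+\t.\tID=LOC_Os01g01030.1:cds_2;Parent=LOC_Os01g01030.1
""".splitlines()

if __name__ == "__main__":
    for name, segments in collapse(EXAMPLE).items():
        print(name, segments)
```

In this subset the gene and mRNA lines share the span 12648 to 15915, so they yield one gene segment. The 5' UTR (12648 to 12773) and the first CDS segment (12774 to 13813) together cover exactly the first exon (12648 to 13813), which is why dropping the UTR loses no coordinates.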