Difference between revisions of "Expression Analysis Pipeline"

From CoGepedia
(Created page with 'CoGe can generate gene/transcript expression measurements given a FASTQ input and an annotated genome. When a FASTQ file of sequence reads is loaded in LoadExperiment and ...')
 
Line 3:

  When a FASTQ file of sequence reads is loaded in [[LoadExperiment]] and associated with an annotated genome, the following analysis steps are performed:
  # The FASTQ file is verified for correct format.
- # [https://code.google.com/p/cutadapt/ CutAdapt] is run to trim adapter sequence from the reads.
+ # [http://code.google.com/p/cutadapt/ CutAdapt] is run to trim adapter sequence from the reads.
  # [http://research-pub.gene.com/gmap/ GMAP] is run to index the reference genome sequence.
  # [http://research-pub.gene.com/gmap/ GSNAP] is run to align the reads to the reference sequence.

Revision as of 12:10, 27 February 2014

CoGe can generate gene/transcript expression measurements given a FASTQ input and an annotated genome.

When a FASTQ file of sequence reads is loaded in LoadExperiment and associated with an annotated genome, the following analysis steps are performed:

  1. The FASTQ file is verified for correct format.
  2. CutAdapt is run to trim adapter sequence from the reads.
  3. GMAP is run to index the reference genome sequence.
  4. GSNAP is run to align the reads to the reference sequence.
  5. SAMtools is run to compute per-position read depth of the resulting alignment.
  6. Cufflinks is run to compute per-transcript FPKM (fragments per kilobase of transcript per million mapped reads).
  7. The three results (raw alignment, per-position read depth, and per-transcript FPKM) are loaded as separate Experiments into a Notebook.
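
The format check in step 1 is not specified in detail on this page. As a rough illustration only, the sketch below assumes the simple four-line FASTQ layout (header starting with "@", sequence, separator starting with "+", quality string of the same length as the sequence). The function name validate_fastq is hypothetical and is not part of CoGe.

```python
import io


def validate_fastq(handle):
    """Count records in a four-line FASTQ stream.

    Hypothetical helper, not CoGe's actual check. Raises ValueError
    on the first malformed record.
    """
    count = 0
    while True:
        header = handle.readline()
        if not header:
            break  # end of file
        if not header.strip():
            continue  # tolerate blank lines between or after records
        header = header.rstrip("\r\n")
        seq = handle.readline().rstrip("\r\n")
        plus = handle.readline().rstrip("\r\n")
        qual = handle.readline().rstrip("\r\n")
        record_no = count + 1
        if not header.startswith("@"):
            raise ValueError(f"record {record_no}: header does not start with '@'")
        if not seq:
            raise ValueError(f"record {record_no}: empty sequence line")
        if not plus.startswith("+"):
            raise ValueError(f"record {record_no}: separator does not start with '+'")
        if len(seq) != len(qual):
            raise ValueError(f"record {record_no}: sequence and quality lengths differ")
        count += 1
    return count


# Usage with an in-memory example containing two well-formed records.
example = io.StringIO("@read1\nACGT\n+\nIIII\n@read2\nGG\n+read2\n##\n")
n_records = validate_fastq(example)
```

A real validator would likely also check the characters used in the sequence and quality lines; that is left out here to keep the sketch short.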
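
To make the FPKM value from step 6 concrete, the following sketch applies the basic definition: fragments mapped to a transcript, scaled per kilobase of transcript length and per million mapped fragments. This is an assumption-level simplification. Cufflinks itself estimates abundances with a statistical model and does not simply apply this formula to raw counts. The function name fpkm and the example numbers are illustrative only.

```python
def fpkm(fragments, transcript_length_bp, total_mapped_fragments):
    """Basic FPKM definition (illustrative, not the Cufflinks estimator).

    FPKM = fragments * 10^9 / (transcript length in bp * total mapped fragments)
    """
    if transcript_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("length and total mapped fragments must be positive")
    return fragments * 1e9 / (transcript_length_bp * total_mapped_fragments)


# Made-up numbers: 500 fragments on a 2,000 bp transcript,
# out of 10,000,000 mapped fragments in the whole experiment.
value = fpkm(500, 2000, 10_000_000)
```

With these made-up numbers the result is 25 FPKM: 500 fragments over 2 kb is 250 per kilobase, and dividing by 10 million mapped fragments (in units of one million) gives 25.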