Expression Analysis Pipeline
Revision as of 16:56, 4 March 2014
CoGe can generate gene/transcript expression measurements from a FASTQ file of sequence reads and an annotated genome. Thanks to James Schnable, creator of qTeller, for help developing this pipeline!

When a FASTQ file of sequence reads is loaded in LoadExperiment and associated with an annotated genome, the following analysis steps are performed:
- The FASTQ file is verified for correct format.
- CutAdapt is run to trim adapter sequence from the reads (parameters: -q 25 --quality-base=64 -m 17).
- GMAP is run to index the reference genome sequence.
- GSNAP is run to align the reads to the reference sequence (parameters: --nthreads=32 -n 5 --format=sam -Q --gmap-mode=none --nofails).
- SAMtools is run to compute per-position read depth of the resulting alignment (command: mpileup -D -Q 20).
- Cufflinks is run to compute per-transcript FPKM (parameters: -p 24).
- The per-position read depth and per-transcript FPKM values are log-transformed and normalized to the range 0 to 1 for loading.
- The three results (raw alignment, per-position read depth, and per-transcript FPKM) are loaded as separate Experiments into a Notebook.
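
The log transformation and 0-to-1 normalization step above could be sketched roughly as follows. This is only an illustration, not the pipeline's actual code: the page does not state the log base, how zero values are handled, or the scaling method, so the base-10 logarithm, the pseudocount of 1, min-max scaling, and the function name `log_normalize` are all assumptions.

```python
import math


def log_normalize(values, pseudocount=1.0):
    """Log-transform non-negative values, then min-max scale them into [0, 1].

    Assumptions (not stated in the pipeline description): base-10 log,
    a pseudocount so that zero depth/FPKM stays defined, min-max scaling.
    """
    if not values:
        return []
    # Adding the pseudocount keeps log10 defined for values equal to zero.
    logged = [math.log10(v + pseudocount) for v in values]
    lo, hi = min(logged), max(logged)
    if hi == lo:
        # All inputs identical: there is no range to scale over.
        return [0.0 for _ in logged]
    return [(x - lo) / (hi - lo) for x in logged]
```

Under these assumptions, `log_normalize([0, 9, 99])` maps the three values to 0, 0.5, and 1 (up to floating-point rounding), so the smallest value in a data set lands at 0 and the largest at 1.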
Genomes for which this analysis has been performed can have features imported into qTeller. TBD: how to do this ...