Expression Analysis Pipeline
Revision as of 18:13, 27 February 2014
CoGe can generate gene/transcript expression measurements given a FASTQ input and an annotated genome.
When a FASTQ file of sequence reads is loaded in LoadExperiment and associated with an annotated genome, the following analysis steps are performed:
- The FASTQ file is verified for correct format.
- CutAdapt is run to trim adapter sequence from the reads (parameters: -q 25 --quality-base=64 -m 17).
- GMAP is run to index the reference genome sequence.
- GSNAP is run to align the reads to the reference sequence (parameters: --nthreads=32 -n 5 --format=sam -Q --gmap-mode=none --nofails).
- SAMtools is run to compute per-position read depth of the resulting alignment.
- Cufflinks is run to compute per-transcript FPKM.
- The three results (raw alignment, per-position read depth, and per-transcript FPKM) are loaded as separate Experiments into a Notebook.
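The steps above can be sketched in Python as a rough illustration. This is not CoGe's actual implementation: the function names, the intermediate file names (`trimmed.fastq`, `aligned.sorted.bam`), the database name `reference`, and every option other than the CutAdapt and GSNAP parameters quoted in the list are assumptions made for the example. The format check also assumes the simple four-lines-per-record FASTQ layout, and the conversion of GSNAP's SAM output to a sorted BAM file is left out.

```python
from pathlib import Path

# Parameters quoted from the step list above.
CUTADAPT_PARAMS = ["-q", "25", "--quality-base=64", "-m", "17"]
GSNAP_PARAMS = ["--nthreads=32", "-n", "5", "--format=sam", "-Q",
                "--gmap-mode=none", "--nofails"]


def check_fastq(path):
    """Return the number of records, or raise ValueError if malformed.

    Assumption: every record is exactly four lines
    (header, sequence, separator, quality).
    """
    with open(path) as handle:
        lines = [line.rstrip("\n") for line in handle]
    if len(lines) % 4 != 0:
        raise ValueError("line count is not a multiple of 4")
    for i in range(0, len(lines), 4):
        header, seq, plus, qual = lines[i:i + 4]
        if not header.startswith("@"):
            raise ValueError(f"line {i + 1}: header must start with '@'")
        if not plus.startswith("+"):
            raise ValueError(f"line {i + 3}: separator must start with '+'")
        if len(seq) != len(qual):
            raise ValueError(f"line {i + 4}: quality length differs from sequence")
    return len(lines) // 4


def build_commands(fastq, genome_fasta, annotation, workdir):
    """Return a dict mapping each tool to a hypothetical argument list.

    Nothing is executed here. File and database names are illustrative.
    """
    work = Path(workdir)
    trimmed = str(work / "trimmed.fastq")
    sorted_bam = str(work / "aligned.sorted.bam")  # SAM-to-BAM step omitted
    return {
        "cutadapt": ["cutadapt", *CUTADAPT_PARAMS, "-o", trimmed, fastq],
        "gmap_build": ["gmap_build", "-D", str(work), "-d", "reference",
                       genome_fasta],
        # GSNAP writes SAM to standard output; the caller must redirect it.
        "gsnap": ["gsnap", "-D", str(work), "-d", "reference",
                  *GSNAP_PARAMS, trimmed],
        "samtools": ["samtools", "depth", sorted_bam],
        "cufflinks": ["cufflinks", "-G", annotation, "-o", str(work),
                      sorted_bam],
    }
```

Each argument list could be passed to `subprocess.run` in order once `check_fastq` has succeeded. Returning the commands instead of running them keeps the sketch testable without any of the tools installed.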