Expression Analysis Pipeline
CoGe can generate gene/transcript expression measurements given a FASTQ input and an annotated genome.
When a FASTQ file of sequence reads is loaded in LoadExperiment and associated with an annotated genome, the following analysis steps are performed:
- The FASTQ file is verified for correct format.
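- CutAdapt is run to trim adapter sequence and low-quality ends from the reads (parameters: -q 25 --quality-base=64 -m 17, i.e. a quality cutoff of 25, Phred+64 quality encoding, and a minimum read length of 17).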
- GMAP is run to index the reference genome sequence.
- GSNAP is run to align the reads to the reference sequence.
- SAMtools is run to compute per-position read depth of the resulting alignment.
- Cufflinks is run to compute per-transcript FPKM.
- The three results (raw alignment, per-position read depth, and per-transcript FPKM) are loaded as separate Experiments into a Notebook.
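
The tool steps above can be sketched as a sequence of command lines. This is only an illustrative sketch, not CoGe's actual implementation: the CutAdapt parameters are the only ones taken from this page. Everything else is an assumption made for the example, including the file names, the `work` directory, the `reference` database name, the explicit `samtools sort` step, and the choice of GMAP/GSNAP, SAMtools, and Cufflinks options. The function `build_pipeline_commands` is hypothetical. It only builds the commands and does not run them, and the FASTQ format check and the final loading into a Notebook are not covered.

```python
from pathlib import Path


def build_pipeline_commands(fastq, genome_fasta, annotation_gtf, workdir="work"):
    """Return the pipeline steps as dicts with a step name, argv, and stdout target.

    All paths and the database name are hypothetical placeholders.
    """
    work = Path(workdir)
    trimmed = work / "trimmed.fastq"
    index_dir = work / "gmap_index"
    db_name = "reference"  # assumed name for the GMAP database
    sam = work / "aligned.sam"
    bam = work / "aligned.sorted.bam"
    depth = work / "depth.tsv"
    cuff_dir = work / "cufflinks"

    return [
        # Trim adapters and low-quality ends (parameters from this page).
        {"step": "trim",
         "argv": ["cutadapt", "-q", "25", "--quality-base=64", "-m", "17",
                  "-o", str(trimmed), str(fastq)],
         "stdout": None},
        # Build the GMAP/GSNAP index of the reference genome.
        {"step": "index",
         "argv": ["gmap_build", "-D", str(index_dir), "-d", db_name,
                  str(genome_fasta)],
         "stdout": None},
        # Align reads; GSNAP writes SAM to standard output.
        {"step": "align",
         "argv": ["gsnap", "-D", str(index_dir), "-d", db_name,
                  "-A", "sam", str(trimmed)],
         "stdout": str(sam)},
        # Assumed extra step: coordinate-sort the alignment for downstream tools.
        {"step": "sort",
         "argv": ["samtools", "sort", "-o", str(bam), str(sam)],
         "stdout": None},
        # Per-position read depth; samtools depth writes to standard output.
        {"step": "depth",
         "argv": ["samtools", "depth", str(bam)],
         "stdout": str(depth)},
        # Per-transcript FPKM against the genome annotation.
        {"step": "fpkm",
         "argv": ["cufflinks", "-G", str(annotation_gtf),
                  "-o", str(cuff_dir), str(bam)],
         "stdout": None},
    ]


if __name__ == "__main__":
    for cmd in build_pipeline_commands("reads.fastq", "genome.fa", "genes.gtf"):
        redirect = f" > {cmd['stdout']}" if cmd["stdout"] else ""
        print(f"{cmd['step']}: {' '.join(cmd['argv'])}{redirect}")
```

Each entry could be passed to `subprocess.run(cmd["argv"], check=True)`, with standard output redirected to the `stdout` path where one is given. Keeping the commands as data makes the order of the steps easy to inspect before anything is executed.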