ChIP-seq Analysis Pipeline

From CoGepedia
Latest revision as of 10:53, 4 March 2016

Note: this document is a draft and still under revision.

CoGe can analyze chromatin immunoprecipitation sequencing (ChIP-seq) data using the software package Homer.

See the LoadExperiment tool to use the new pipeline.

This analysis pipeline was developed by Xiang Ju (in the lab of Brian Gregory at UPenn).

Inputs

Three FASTQ input files are required:

  • one input (control) sample
  • two replicates

Workflow Summary

  1. Trim FASTQ files (optional)
  2. Align FASTQ files to the reference genome sequence using the selected alignment tool (GSNAP, Bowtie, etc.)
    1. Build index of reference sequence
    2. Individually map FASTQ files to reference
  3. Create tag directories (Homer)
  4. Find peaks (Homer)
  5. Load results
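
The two Homer steps above can be sketched roughly as follows. This is only an illustration, not CoGe's actual implementation: it assumes the alignments already exist as BAM files, it uses Homer's makeTagDirectory and findPeaks commands, and the function name, directory names, output file names, and the "factor" peak style are all hypothetical choices made for the example. The function only builds the command lines and does not run anything.

```python
from pathlib import Path


def homer_commands(input_bam, replicate_bams, workdir, style="factor"):
    """Build argv lists for the Homer steps; nothing is executed here.

    All paths and names are illustrative assumptions, not CoGe's own.
    """
    workdir = Path(workdir)
    input_tags = workdir / "input_tags"

    # Step 3: one tag directory per aligned file (input first, then replicates).
    commands = [["makeTagDirectory", str(input_tags), str(input_bam)]]
    replicate_tags = []
    for i, bam in enumerate(replicate_bams, start=1):
        tags = workdir / f"replicate{i}_tags"
        replicate_tags.append(tags)
        commands.append(["makeTagDirectory", str(tags), str(bam)])

    # Step 4: call peaks in each replicate, using the input as the control (-i).
    for i, tags in enumerate(replicate_tags, start=1):
        commands.append([
            "findPeaks", str(tags),
            "-style", style,  # assumed; the pipeline's real setting is not documented here
            "-o", str(workdir / f"replicate{i}_peaks.txt"),
            "-i", str(input_tags),
        ])
    return commands
```

Each returned list could be passed to subprocess.run on a machine where Homer is installed. Keeping command construction separate from execution makes the sketch easy to inspect without running Homer.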

Outputs

The pipeline produces five outputs (represented as "Experiments" in CoGe):

  • Three BAM files, one for each FASTQ input file mapped to the reference genome sequence
  • Two peak tracks, one for each replicate analyzed with respect to the input
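
As a quick check on the output count, the list above can be modeled as a small function. This is a sketch under stated assumptions: the function name, the ("bam" / "peaks") labels, and the sample names are hypothetical and are not CoGe identifiers.

```python
def expected_outputs(input_name, replicate_names):
    """List (kind, label) pairs matching the outputs described above.

    Names and labels are illustrative only.
    """
    if len(replicate_names) != 2:
        raise ValueError("the pipeline description calls for exactly two replicates")

    # One BAM file per FASTQ file: the input plus both replicates.
    samples = [input_name, *replicate_names]
    bams = [("bam", name) for name in samples]

    # One peak track per replicate, each called against the input.
    peaks = [("peaks", f"{rep} vs {input_name}") for rep in replicate_names]
    return bams + peaks
```

With one input and two replicates this yields three BAM entries and two peak entries, which matches the five experiments described.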