Difference between revisions of "Creosote"

From CoGepedia
(elyons@icoge (~/projects/genome/data/creosote/Sample_lane3) $ python -m jcvi.apps.baseclean trim lane3_NoIndex_L003_R1_001.fastq lane3_NoIndex_L003_R2_001.fastq 14:00:38 [base::DEBUG] wget http://www.)
Line 12:
 
*Check quality with fastqc: http://www.bioinformatics.bbsrc.ac.uk/projects/fastqc/
 
 
**[[Creosote First Run FastQC]]
 
 +
 
*Sequences cleaned using trimReads by Haibao Tang: https://github.com/tanghaibao/trimReads/tree/
 
 +
**Note: Only use on single reads
 
**Ran with supplied adapter sequence file:
 
 
  >Adapter 4
 
 
  TGACCA
 
 +
>Adapter 4 rc
 +
TGGTCA
 
**Command-line run:
 
  /home/elyons/bin/trimReads  -Q 33 -f /home/elyons/projects/genome/data/creosote/src/adapters.fasta ./lane3_NoIndex_L003_R1_033.fastq
+
  Running /home/elyons/bin/trimReads  -Q 33 -f /home/elyons/projects/genome/data/creosote/Sample_lane3/adapter/adapter.faa ./lane3_NoIndex_L003_R2_015.fastq
 
**Output of trimReads:
 
[0] Illumina_PE-1 found 54 times
+
 
[1] Illumina_PE-2 found 3 times
+
 
[2] Illumina_PE-1rc found 2850 times
+
 
[3] Illumina_PE-2rc found 12 times
+
*Trim Paired ends with Trimmomatic: http://www.usadellab.org/cms/index.php?page=trimmomatic
+
*Assumes Illumina quality encoding (Phred+64, not Phred+33)
  A total of 92003 too short (trimmed length < 30) reads removed.
+
**Need to convert for the HiSeq reads:
A total of 949092 trimmed reads are written to `./lane3_NoIndex_L003_R2_041.trimmed.fastq`.
+
** easy_install biopython
Processed 1041095 sequences took 1557.84 seconds.
+
** git clone git://github.com/tanghaibao/jcvi.git
***The adapter file appears not to have the correct linkers, as I would expect to see more removed
+
** export PYTHONPATH=/lib/python (which is the dir above jcvi)
 +
** python -m jcvi.formats.fastq (Install missing packages)
 +
 
 +
Steps:
 +
*Merge R1 files; merge R2 files
 +
*gzip them
 +
*Run this: python -m jcvi.apps.baseclean trim R1.fastq.gz R2.fastq.gz
 +
**NOTE: This program should download Trimmomatic, but you may need to update the path to the Trimmomatic program inside the script
 +
*Note: Bao recommends CLC for genome assembly. It runs faster, uses less memory, and is less sensitive to bad data. It is compute intensive.
 +
 
 +
 
 +
*python -m jcvi.formats.fastq convert (read the help file for the default conversion)
 +
*python -m jcvi.apps.baseclean trim fastqfile (single-end)
 +
*python -m jcvi.apps.baseclean trim R1.fastq.gz R2.fastq.gz (paired-end)
 +
 
 +
*Cat all the R1s together
 +
*Cat all the R2s together
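
The trimReads output quoted earlier in this diff can be sanity-checked with a little arithmetic. The sketch below is not part of trimReads; `summarize_trim_run` is a hypothetical helper, and it assumes the four adapter counts and the three read totals all come from the same run.

```python
def summarize_trim_run(adapter_hits, too_short, written, processed):
    """Check that the reported read counts add up and compute simple rates."""
    if too_short + written != processed:
        raise ValueError("removed + written reads do not add up to processed reads")
    total_hits = sum(adapter_hits.values())
    return {
        "total_adapter_hits": total_hits,
        "adapter_hit_fraction": total_hits / processed,
        "too_short_fraction": too_short / processed,
    }


# Counts copied from the trimReads output quoted in the notes.
REPORTED_HITS = {
    "Illumina_PE-1": 54,
    "Illumina_PE-2": 3,
    "Illumina_PE-1rc": 2850,
    "Illumina_PE-2rc": 12,
}

summary = summarize_trim_run(
    REPORTED_HITS, too_short=92003, written=949092, processed=1041095
)
print(f"adapter hits:         {summary['total_adapter_hits']}")
print(f"adapter hit fraction: {summary['adapter_hit_fraction']:.4%}")
print(f"too-short fraction:   {summary['too_short_fraction']:.2%}")
```

The adapter matches come to well under 1% of the processed reads, which is consistent with the suspicion in the notes that the adapter file did not contain the right linkers. Whether "found N times" counts reads or individual matches is not stated, so treat the fraction as a rough indicator only.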

Revision as of 13:55, 4 August 2011

Creosote genome sequencing and assembly notes:

>Adapter 4
TGACCA
>Adapter 4 rc
TGGTCA
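
The second adapter record above is the reverse complement of the first. A minimal sketch of how such a pair could be generated is shown here; `reverse_complement` and `adapter_fasta` are illustrative names, not functions from trimReads or jcvi, and only the unambiguous bases A, C, G, T are handled.

```python
# Complement table for unambiguous DNA bases (upper and lower case).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def adapter_fasta(name: str, seq: str) -> str:
    """Format an adapter and its reverse complement as two FASTA records."""
    return f">{name}\n{seq}\n>{name} rc\n{reverse_complement(seq)}\n"


if __name__ == "__main__":
    print(adapter_fasta("Adapter 4", "TGACCA"), end="")
```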
    • Command-line run:
Running /home/elyons/bin/trimReads  -Q 33 -f /home/elyons/projects/genome/data/creosote/Sample_lane3/adapter/adapter.faa ./lane3_NoIndex_L003_R2_015.fastq
    • Output of trimReads:


Steps:

  • Merge R1 files; merge R2 files
  • gzip them
  • Run this: python -m jcvi.apps.baseclean trim R1.fastq.gz R2.fastq.gz
    • NOTE: This program should download Trimmomatic, but you may need to update the path to the Trimmomatic program inside the script
  • Note: Bao recommends CLC for genome assembly. It runs faster, uses less memory, and is less sensitive to bad data. It is compute intensive.
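
The merge and gzip steps above could be scripted roughly as follows. This is a sketch, not what was actually run: `lane_files` and `merge_and_gzip` are hypothetical helpers, and the glob pattern assumes chunk names like `lane3_NoIndex_L003_R1_001.fastq` with zero-padded numbers, so that a plain sort puts the R1 and R2 chunks in the same order.

```python
import gzip
import shutil
from pathlib import Path
from typing import Iterable, List


def lane_files(directory: str, read: str) -> List[str]:
    """Return the sorted FASTQ chunks for one read direction, e.g. read='R1'."""
    return sorted(str(p) for p in Path(directory).glob(f"*_{read}_*.fastq"))


def merge_and_gzip(parts: Iterable[str], out_path: str) -> None:
    """Concatenate FASTQ files, in the given order, into one gzip file."""
    with gzip.open(out_path, "wb") as out:
        for part in parts:
            with open(part, "rb") as src:
                # Stream the bytes so large files are never loaded into memory.
                shutil.copyfileobj(src, out)
```

Calling `merge_and_gzip(lane_files(".", "R1"), "R1.fastq.gz")` and the same for `"R2"` would produce the two inputs for the paired trim command. Keeping both merges in the same chunk order matters, because paired-end tools match reads by position in the two files.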


  • python -m jcvi.formats.fastq convert (read the help file for the default conversion)
  • python -m jcvi.apps.baseclean trim fastqfile (single-end)
  • python -m jcvi.apps.baseclean trim R1.fastq.gz R2.fastq.gz (paired-end)
  • Cat all the R1s together
  • Cat all the R2s together
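
For reference, the quality conversion mentioned in the convert step amounts to shifting each quality character by the difference between the two ASCII offsets (33 and 64). The sketch below only illustrates that idea and is not jcvi's implementation; `shift_quality` and `convert_fastq` are hypothetical names, the input is assumed to be plain four-line FASTQ records, and the direction of the conversion is left to the caller because the notes do not state it explicitly.

```python
from typing import Iterable, Iterator


def shift_quality(quality: str, from_offset: int, to_offset: int) -> str:
    """Re-encode a FASTQ quality string from one ASCII offset to another."""
    out = []
    for ch in quality:
        score = ord(ch) - from_offset
        if score < 0:
            # A negative score means the input was not in the assumed encoding.
            raise ValueError(f"{ch!r} is below offset {from_offset}")
        out.append(chr(score + to_offset))
    return "".join(out)


def convert_fastq(lines: Iterable[str], from_offset: int, to_offset: int) -> Iterator[str]:
    """Yield FASTQ lines with every fourth line (the qualities) re-encoded."""
    for i, line in enumerate(lines):
        text = line.rstrip("\n")
        if i % 4 == 3:
            text = shift_quality(text, from_offset, to_offset)
        yield text + "\n"
```

For example, `shift_quality("II!", 33, 64)` gives `"hh@"`, and the reverse call maps it back. The `ValueError` is a cheap guard against converting a file that is already in the target encoding.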