Creosote: Difference between revisions

From CoGepedia
elyons@icoge (~/projects/genome/data/creosote/Sample_lane3) $ python -m jcvi.apps.baseclean trim lane3_NoIndex_L003_R1_001.fastq lane3_NoIndex_L003_R2_001.fastq
14:00:38 [base::DEBUG] wget http://www.
*Check quality with fastqc: http://www.bioinformatics.bbsrc.ac.uk/projects/fastqc/
**[[Creosote First Run FastQC]]
*Sequences cleaned using trimReads by Haibao Tang: https://github.com/tanghaibao/trimReads/tree/
**Note: Only use on single reads
**Ran with supplied adapter sequence file:
  >Adapter 4
  TGACCA
  >Adapter 4 rc
  TGGTCA
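The "rc" record in the adapter file is the reverse complement of the adapter. A minimal Python sketch of how such a two-record file could be generated is below; `reverse_complement` and `adapter_fasta` are hypothetical helper names for illustration and are not part of trimReads.

```python
# Hypothetical helpers (not part of trimReads): write each adapter
# together with its reverse complement in FASTA format.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def adapter_fasta(adapters):
    """Format {name: sequence} as FASTA text, adding an 'rc' record per adapter."""
    lines = []
    for name, seq in adapters.items():
        lines += [f">{name}", seq, f">{name} rc", reverse_complement(seq)]
    return "\n".join(lines) + "\n"


# Reproduces the four-line adapter file shown above.
print(adapter_fasta({"Adapter 4": "TGACCA"}), end="")
```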
**Command-line run:
  /home/elyons/bin/trimReads  -Q 33 -f /home/elyons/projects/genome/data/creosote/src/adapters.fasta ./lane3_NoIndex_L003_R1_033.fastq
  Running /home/elyons/bin/trimReads  -Q 33 -f /home/elyons/projects/genome/data/creosote/Sample_lane3/adapter/adapter.faa ./lane3_NoIndex_L003_R2_015.fastq
**Output of trimReads:
  [0] Illumina_PE-1 found 54 times
  [1] Illumina_PE-2 found 3 times
  [2] Illumina_PE-1rc found 2850 times
  [3] Illumina_PE-2rc found 12 times
  A total of 92003 too short (trimmed length < 30) reads removed.
  A total of 949092 trimmed reads are written to `./lane3_NoIndex_L003_R2_041.trimmed.fastq`.
  Processed 1041095 sequences took 1557.84 seconds.
***Appears not to have the correct linkers, as I would expect to see more removed
*Trim paired ends with Trimmomatic: http://www.usadellab.org/cms/index.php?page=trimmomatic
*Assumes Illumina encoding (code: 64, not code: 33)
**Need to convert for the HiSeq reads:
** easy_install biopython
** git clone git://github.com/tanghaibao/jcvi.git
** export PYTHONPATH=/lib/python (which is the dir above jcvi)
** python -m jcvi.formats.fastq (Install missing packages)
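The encoding note refers to the ASCII offset used on the quality line of each FASTQ record: 64 for the older Illumina encoding, 33 for the Sanger-style encoding. What `jcvi.formats.fastq convert` does internally is not shown in these notes; as a rough illustration of what such a conversion involves, here is a standalone Python sketch. The function names are hypothetical, and it assumes plain four-line FASTQ records.

```python
def phred64_to_phred33(quality):
    """Shift a quality string from ASCII offset 64 to ASCII offset 33."""
    converted = []
    for ch in quality:
        score = ord(ch) - 64
        if score < 0:
            # A negative score means the input was probably not offset-64 data.
            raise ValueError(f"character {ch!r} is below the offset-64 range")
        converted.append(chr(score + 33))
    return "".join(converted)


def convert_fastq(lines):
    """Yield FASTQ lines with every fourth line (the qualities) re-encoded.

    Assumes strict four-line records: header, sequence, '+', quality.
    """
    for i, line in enumerate(lines):
        line = line.rstrip("\n")
        yield phred64_to_phred33(line) if i % 4 == 3 else line


record = ["@read1\n", "ACGT\n", "+\n", "hhhh\n"]
print("\n".join(convert_fastq(record)))
```

Raising on a negative score is a deliberate choice in this sketch: it stops a file that is already offset-33 from being silently shifted a second time.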
 
Steps:
*Merge R1 files; merge R2 files
*gzip them
*Run this: python -m jcvi.apps.baseclean trim R1.fastq.gz R2.fastq.gz
**NOTE: This program should download Trimmomatic itself, but the path to the Trimmomatic program set inside the program may need to be updated
*Note: Bao recommends CLC for genome assembly: it runs faster, uses less memory, and is less sensitive to bad data. Compute intensive.
 
 
*python -m jcvi.formats.fastq convert (read help file, default conversion)
*python -m jcvi.apps.baseclean trim fastqfile (single ended)
*python -m jcvi.apps.baseclean trim R1.fastq.gz R2.fastq.gz (paired ended)
 
*Cat all the R1s together
*Cat all the R2s together

Revision as of 20:55, 4 August 2011

Creosote genome sequencing and assembly notes:

>Adapter 4
TGACCA
>Adapter 4 rc
TGGTCA
    • Command-line run:
Running /home/elyons/bin/trimReads  -Q 33 -f /home/elyons/projects/genome/data/creosote/Sample_lane3/adapter/adapter.faa ./lane3_NoIndex_L003_R2_015.fastq
    • Output of trimReads:


Steps:

  • Merge R1 files; merge R2 files
  • gzip them
  • Run this: python -m jcvi.apps.baseclean trim R1.fastq.gz R2.fastq.gz
    • NOTE: This program should download Trimmomatic itself, but the path to the Trimmomatic program set inside the program may need to be updated
  • Note: Bao recommends CLC for genome assembly: it runs faster, uses less memory, and is less sensitive to bad data. Compute intensive.


  • python -m jcvi.formats.fastq convert (read help file, default conversion)
  • python -m jcvi.apps.baseclean trim fastqfile (single ended)
  • python -m jcvi.apps.baseclean trim R1.fastq.gz R2.fastq.gz (paired ended)
  • Cat all the R1s together
  • Cat all the R2s together
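The merge and gzip steps can be sketched in Python as follows. This is an illustration only, not the procedure actually used for these notes (which was presumably `cat` and `gzip` on the command line). The helper name `merge_and_gzip` and the glob patterns in the commented usage are assumptions based on the lane3 file names above. Files are concatenated in sorted order so that the merged R1 and R2 files keep their reads in matching order.

```python
import gzip
import shutil


def merge_and_gzip(fastq_paths, out_path):
    """Concatenate FASTQ files, in sorted name order, into one gzipped file.

    Assumes every input file ends with a newline, so records do not run
    together at the file boundaries.
    """
    with gzip.open(out_path, "wb") as out:
        for path in sorted(fastq_paths):
            with open(path, "rb") as src:
                shutil.copyfileobj(src, out)


# Example usage (file patterns are assumptions):
#   import glob
#   merge_and_gzip(glob.glob("lane3_NoIndex_L003_R1_*.fastq"), "R1.fastq.gz")
#   merge_and_gzip(glob.glob("lane3_NoIndex_L003_R2_*.fastq"), "R2.fastq.gz")
```

The resulting `R1.fastq.gz` and `R2.fastq.gz` are the inputs expected by the paired-end `jcvi.apps.baseclean trim` command listed above.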