Marijuana assembly

From CoGepedia

Latest revision as of 09:13, 20 August 2011

Obtain raw reads

Sequences obtained from: http://csativa.elasticbeanstalk.com/

Info:

The sequence data come from an Illumina HiSeq run (v2.0 chemistry) with 2x100 bp paired reads. There are 7 lanes in total, which add up to 131 Gb of sequence.
The genome is estimated to be 400 Mb, which gives an estimated 327X coverage (131 Gb / 400 Mb).
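
The coverage figure above can be reproduced with a quick calculation. This is only a sketch of the arithmetic; the variable names are made up for illustration, and the inputs are the 131 Gb and 400 Mb figures quoted above.

```shell
# Both values are expressed in megabases so integer arithmetic is enough.
total_mb=131000   # 131 Gb of raw sequence (from the text above)
genome_mb=400     # estimated genome size (from the text above)

# Integer division: 131000 / 400 = 327.5, truncated to 327.
coverage=$(( total_mb / genome_mb ))
echo "${coverage}X"
```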

Merge read files

cat *_1_sequence* > R1_all.fastq.gz &
cat *_2_sequence* > R2_all.fastq.gz &
wait
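
Plain cat works here because concatenated gzip streams are themselves a valid gzip file. The following is a small self-contained check of that, using two made-up one-read files in a temporary directory (the lane file names are hypothetical, chosen only to match the *_1_sequence* pattern).

```shell
# Build two tiny gzipped FASTQ files, merge them with cat, and count the reads.
tmp=$(mktemp -d)
printf '@r1\nACGT\n+\nIIII\n' | gzip > "$tmp/lane1_1_sequence.txt.gz"
printf '@r2\nTTGA\n+\nIIII\n' | gzip > "$tmp/lane2_1_sequence.txt.gz"

# Same merge pattern as above, restricted to the temporary directory.
cat "$tmp"/*_1_sequence* > "$tmp/R1_all.fastq.gz"

# Each FASTQ record is 4 lines, so reads = lines / 4.
lines=$(gzip -dc "$tmp/R1_all.fastq.gz" | wc -l)
reads=$(( lines / 4 ))
echo "$reads"

rm -rf "$tmp"
```

Running the same line count on R1_all.fastq.gz and R2_all.fastq.gz is a cheap way to confirm that both merged files contain the same number of reads before going further.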

Convert files

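Convert the quality encoding from fastq-sanger to fastq-illumina using seqret from EMBOSS. On a fresh install, don't forget to run "sudo ldconfig" so that the newly installed libraries are loaded.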

zcat R1_all.fastq.gz | seqret fastq-sanger::stdin fastq-illumina::stdout | gzip > R1_converted.fastq.gz
zcat R2_all.fastq.gz | seqret fastq-sanger::stdin fastq-illumina::stdout | gzip > R2_converted.fastq.gz
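
To illustrate what this conversion does to a quality string, here is a minimal sketch using tr. It is not a replacement for seqret: it assumes the qualities are Phred scores 0 to 41 encoded with offset 33 (fastq-sanger) and re-encodes them with offset 64 (fastq-illumina), and it operates on a single example quality string rather than on a whole FASTQ file.

```shell
# Example Sanger-encoded quality string: Q40, Q40, Q20, Q10, Q0.
sanger_qual='II5+!'

# Map ASCII 33..74 ('!'..'J') onto ASCII 64..105 ('@'..'i'), i.e. shift by 31.
converted=$(printf '%s\n' "$sanger_qual" | tr '!-J' '@-i')
echo "$converted"
```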