Creosote genome sequencing and assembly notes:
==Twig2Genome Notes==
[[Twig2Genome]]
*Sample obtained from front yard of 4951 W. McElroy Dr.
==Assembly==
*Sequences obtained from one lane of Illumina HiSeq2000
[[Creosote Assembly]]
*Fastq files delivered from UAGC
**82 files
***lane3_NoIndex_L003_R1_041.fastq
***lane3_NoIndex_L003_R2_041.fastq
**Need to understand if these are paired-end reads
**Need to get adapter sequences used in sequencing
**Description of Fastq file format with notes on specific decoding of header names generated by various technologies: http://en.wikipedia.org/wiki/FASTQ_format
*Check quality with fastqc: http://www.bioinformatics.bbsrc.ac.uk/projects/fastqc/
**[[Creosote First Run FastQC]]
*Sequences cleaned using trimReads by Haibao Tang: https://github.com/tanghaibao/trimReads/tree/
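One way to approach the open paired-end question above is to compare read names between a matching pair of R1 and R2 files. The following is only a minimal sketch under stated assumptions: the function names (<code>normalize</code>, <code>read_ids</code>, <code>looks_paired</code>) are hypothetical, every record is assumed to occupy exactly four lines, and two files are treated as paired when the first <code>max_reads</code> read names agree once any trailing <code>/1</code> or <code>/2</code> is removed. It does not confirm how the library was actually prepared.

```python
import gzip
from itertools import islice


def normalize(header):
    """Return the read name from a FASTQ header line, without mate suffix."""
    fields = header[1:].split()          # drop '@', split off any comment field
    name = fields[0] if fields else ""
    if name.endswith(("/1", "/2")):      # older Illumina mate suffix
        name = name[:-2]
    return name


def read_ids(path, max_reads=1000):
    """Yield normalized read names from the first max_reads records."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        # Assumption: four lines per record (no wrapped sequence lines).
        for i, line in enumerate(islice(handle, max_reads * 4)):
            if i % 4 == 0:
                yield normalize(line.rstrip("\n"))


def looks_paired(r1_path, r2_path, max_reads=1000):
    """True if both files start with the same read names in the same order."""
    ids1 = list(read_ids(r1_path, max_reads))
    ids2 = list(read_ids(r2_path, max_reads))
    return len(ids1) > 0 and ids1 == ids2
```

For example, <code>looks_paired("lane3_NoIndex_L003_R1_041.fastq", "lane3_NoIndex_L003_R2_041.fastq")</code> would return <code>True</code> if the two files list the same reads in the same order.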
==Loading into CoGe==
**Note: Only use on single reads
SOAPdenovo assembly: 1,570,116 contigs (370MB): http://genomevolution.org/CoGe/OrganismView.pl?dsgid=12183 ('''Note:''' too many contigs to process and visualize in SynMap)
**Note: if SOAP crashes, try another XXmer binary (e.g. 63mer)
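Contig counts and total size, like those quoted above, can be recomputed from an assembly FASTA file. The sketch below is illustrative only: the function names (<code>contig_lengths</code>, <code>n50</code>) are hypothetical, the input is assumed to be a plain uncompressed FASTA file, and N50 is included as an additional common summary statistic that these notes do not report.

```python
def contig_lengths(fasta_path):
    """Return a list with the sequence length of every FASTA record."""
    lengths = []
    current = None                      # None until the first '>' header is seen
    with open(fasta_path) as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                if current is not None:
                    lengths.append(current)
                current = 0
            elif current is not None:
                current += len(line)
    if current is not None:
        lengths.append(current)
    return lengths


def n50(lengths):
    """Length L such that contigs of length >= L hold at least half the bases."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    return 0                            # empty input
```

With <code>lengths = contig_lengths("contigs.fa")</code> (a hypothetical file name), <code>len(lengths)</code> gives the number of contigs and <code>sum(lengths)</code> the total number of bases.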
'''Running Velvet'''
[[File:Master 12185 8400.genomic-CDS.lastz.dag.go c20 D20 g10 A2.aligncoords.gcoords ct0.w1000.ass2.cs1.csoS.nsd.png|thumb|600px|left|Syntenic path assembly with SynMap of creosote (x-axis) and peach (y-axis). Results may be regenerated at: http://genomevolution.org/r/3w95]]
OMP_NUM_THREADS=32 velvetg VelvetAssem -scaffolding yes -exp_cov auto -cov_cutoff auto -min_contig_lgth 200 -ins_length 150
[[File:Master 11022 12185.CDS-genomic.lastz.dag.go c20 D20 g10 A2.aligncoords.gcoords ct0.w1000.ass2.cs1.csoS.nsd.png|thumb|600px|left|Syntenic path assembly with SynMap of creosote (y-axis) and Arabidopsis thaliana Col-0 (x-axis). Results may be regenerated at: http://genomevolution.org/r/3w96]]
==Pseudo-Assembly==
[[File:Pseudo-assembly-creosote.png|thumb|600px|left|Pseudo-assembly of creosote (x-axis) using the peach (y-axis) genome. Syntenic comparison to the peach genome.]]
[[File:Screen Shot 2014-05-14 at 8.33.27 AM.png|thumb|600px|left|Microsynteny analysis of the pseudo-assembled creosote genome to the peach genome. Orange bars are unsequenced Ns that represent contigs glued together in creosote by the [[syntenic path assembly]] method. Note the concordance of gene model coding sequences.]]
'''Other Stuff'''
*python -m jcvi.formats.fastq convert (read help file, default conversion
*python -m jcvi.apps.baseclean trim fastqfile (single-end)
*python -m jcvi.apps.baseclean trim R1.fastq.gz R2.fastq.gz (paired-end)