Difference between revisions of "Creosote"

From CoGepedia
 
(18 intermediate revisions by the same user not shown)
Line 1 (earlier revision, lines removed):
Creosote genome sequencing and assembly notes:

*Sample obtained from front yard of 4951 W. McElroy Dr.
*Sequences obtained from one lane of Illumina HiSeq2000
*Fastq files delivered from UAGC
**82 files
***lane3_NoIndex_L003_R1_041.fastq
***lane3_NoIndex_L003_R2_041.fastq
**Need to understand if these are paired-end reads
**Need to get adapter sequences used in sequencing
**Description of Fastq file format with notes on specific decoding of header names generated by various technologies: http://en.wikipedia.org/wiki/FASTQ_format
*Check quality with fastqc: http://www.bioinformatics.bbsrc.ac.uk/projects/fastqc/
**[[Creosote First Run FastQC]]
*Sequences cleaned using trimReads by Haibao Tang: https://github.com/tanghaibao/trimReads/tree/
**Ran with supplied adapter sequence file:
 >Illumina_PE-1
 AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCT
 >Illumina_PE-2
 CAAGCAGAAGACGGCATACGAGATCGGTCTCGGCATTCCTGCTGAACCGCTCTTCCGATCT
 >Illumina_PE-1rc
 AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGTAGATCTCGGTGGTCGCCGTATCATT
 >Illumina_PE-2rc
 AGATCGGAAGAGCGGTTCAGCAGGAATGCCGAGACCGATCTCGTATGCCGTCTTCTGCTTG
**Command-line run:
 /home/elyons/bin/trimReads -Q 33 -f /home/elyons/projects/genome/data/creosote/src/adapters.fasta ./lane3_NoIndex_L003_R1_033.fastq
**Output of trimReads:
 [0] Illumina_PE-1 found 54 times
 [1] Illumina_PE-2 found 3 times
 [2] Illumina_PE-1rc found 2850 times
 [3] Illumina_PE-2rc found 12 times
 A total of 92003 too short (trimmed length < 30) reads removed.
 A total of 949092 trimmed reads are written to `./lane3_NoIndex_L003_R2_041.trimmed.fastq`.
 Processed 1041095 sequences took 1557.84 seconds.
***Appears not to have the correct linkers, as I would expect to see more removed

Line 1 (latest revision, lines added):
==Twig2Genome Notes==
[[Twig2Genome]]

==Assembly==
[[Creosote Assembly]]

==Loading into CoGe==
SoapDeNovo assembly: 1,570,116 contigs (370MB): http://genomevolution.org/CoGe/OrganismView.pl?dsgid=12183 ('''Note:''' too many contigs to process and visualize in SynMap)

SoapDeNovo assembly of contigs >= 2000nt: 6,976 contigs (20MB): http://genomevolution.org/CoGe/OrganismView.pl?dsgid=12185

ABySS assembly (bpsize=64, 2kb minimum contig size): 122,972 contigs (392MB): http://genomevolution.org/CoGe/OrganismView.pl?dsgid=12275
*Syntenic Path Assembly to peach: http://genomevolution.org/r/3xmq

Velvet assembly: 515,190 contigs (241MB): http://genomevolution.org/CoGe/OrganismView.pl?dsgid=12245

CLC4 assembly: 685,475 contigs (508MB): http://genomevolution.org/CoGe/OrganismView.pl?dsgid=12244
*Syntenic Path Assembly to peach: http://genomevolution.org/r/3xpd

==[[Syntenic Path Assembly]]==

[[File:Master 12185 8400.genomic-CDS.lastz.dag.go c20 D20 g10 A2.aligncoords.gcoords ct0.w1000.ass2.cs1.csoS.nsd.png|thumb|600px|left|Syntenic path assembly with SynMap of creosote (x-axis) and peach (y-axis). Results may be regenerated at: http://genomevolution.org/r/3w95]]

[[File:Master 11022 12185.CDS-genomic.lastz.dag.go c20 D20 g10 A2.aligncoords.gcoords ct0.w1000.ass2.cs1.csoS.nsd.png|thumb|600px|left|Syntenic path assembly with SynMap of creosote (y-axis) and Arabidopsis thaliana Col-0 (x-axis). Results may be regenerated at: http://genomevolution.org/r/3w96]]

==Pseudo-Assembly==
[[File:Pseudo-assembly-creosote.png|thumb|600px|left|Pseudo-assembly of creosote (x-axis) using the peach (y-axis) genome. Syntenic comparison to the peach genome.]]

[[File:Screen Shot 2014-05-14 at 8.33.27 AM.png|thumb|600px|left|Microsynteny analysis of the pseudo-assembled creosote genome to the peach genome. Orange bars are unsequenced Ns that represent contigs glued together in creosote by the [[syntenic path assembly]] method. Note the concordance of gene model coding sequences.]]
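The adapter records and trimReads counts in the removed notes can be sanity-checked with a few lines of Python. This is only a sketch: the sequences and numbers are copied from the notes, while the names `ADAPTERS`, `HITS`, `reverse_complement` and `hit_fraction` are illustrative and not part of trimReads. It checks that each `rc` record really is the reverse complement of its partner, that removed plus written reads add up to the processed total, and how rare the adapter hits were.

```python
# Adapter records as listed in the notes (names and sequences copied verbatim).
ADAPTERS = {
    "Illumina_PE-1": "AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCT",
    "Illumina_PE-2": "CAAGCAGAAGACGGCATACGAGATCGGTCTCGGCATTCCTGCTGAACCGCTCTTCCGATCT",
    "Illumina_PE-1rc": "AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGTAGATCTCGGTGGTCGCCGTATCATT",
    "Illumina_PE-2rc": "AGATCGGAAGAGCGGTTCAGCAGGAATGCCGAGACCGATCTCGTATGCCGTCTTCTGCTTG",
}

# Counts reported in the trimReads output quoted in the notes.
HITS = {
    "Illumina_PE-1": 54,
    "Illumina_PE-2": 3,
    "Illumina_PE-1rc": 2850,
    "Illumina_PE-2rc": 12,
}
PROCESSED = 1041095   # "Processed 1041095 sequences"
TOO_SHORT = 92003     # reads removed because trimmed length < 30
WRITTEN = 949092      # trimmed reads written to the output file

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase A/C/G/T sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def hit_fraction() -> float:
    """Adapter hits per processed read (hits are not necessarily distinct reads)."""
    return sum(HITS.values()) / PROCESSED


if __name__ == "__main__":
    for name in ("Illumina_PE-1", "Illumina_PE-2"):
        matches = reverse_complement(ADAPTERS[name]) == ADAPTERS[name + "rc"]
        print(f"{name}rc is the reverse complement of {name}: {matches}")
    print("removed + written == processed:", TOO_SHORT + WRITTEN == PROCESSED)
    print(f"adapter hits per processed read: {hit_fraction():.4%}")
    print(f"reads dropped as too short: {TOO_SHORT / PROCESSED:.2%}")
```

With the numbers above, the four adapters together were found in well under 1% of reads, which fits the note's suspicion that the supplied file may not hold the linkers actually used.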

Latest revision as of 07:35, 14 May 2014

Twig2Genome Notes

Twig2Genome

Assembly

Creosote Assembly

Loading into CoGe

SoapDeNovo assembly: 1,570,116 contigs (370MB): http://genomevolution.org/CoGe/OrganismView.pl?dsgid=12183 (Note: too many contigs to process and visualize in SynMap)

SoapDeNovo assembly of contigs >= 2000nt: 6,976 contigs (20MB): http://genomevolution.org/CoGe/OrganismView.pl?dsgid=12185

ABySS assembly (bpsize=64, 2kb minimum contig size): 122,972 contigs (392MB): http://genomevolution.org/CoGe/OrganismView.pl?dsgid=12275

Velvet assembly: 515,190 contigs (241MB): http://genomevolution.org/CoGe/OrganismView.pl?dsgid=12245

CLC4 assembly: 685,475 contigs (508MB): http://genomevolution.org/CoGe/OrganismView.pl?dsgid=12244
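The ">= 2000nt" SoapDeNovo dataset listed above is a length-filtered subset of the full contig set. The page does not say which tool produced it, so the following is only a minimal sketch of that kind of filter in plain Python; `read_fasta`, `filter_contigs` and the file name in the usage note are hypothetical.

```python
def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines.

    Sequence lines that appear before the first '>' header are ignored.
    """
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def filter_contigs(records, min_len=2000):
    """Keep only records whose sequence is at least min_len nucleotides long."""
    for header, seq in records:
        if len(seq) >= min_len:
            yield header, seq


def write_fasta(records, out, width=60):
    """Write records to a file-like object, wrapping sequence lines at `width`."""
    for header, seq in records:
        out.write(f">{header}\n")
        for start in range(0, len(seq), width):
            out.write(seq[start:start + width] + "\n")
```

A possible use, assuming a contig file named `contigs.fa` (a placeholder): open it, pass the handle to `read_fasta`, pipe the result through `filter_contigs(..., min_len=2000)`, and hand that to `write_fasta` with an output handle. Because everything is a generator, the full contig set never has to be held in memory.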

Syntenic Path Assembly

Syntenic path assembly with SynMap of creosote (x-axis) and peach (y-axis). Results may be regenerated at: http://genomevolution.org/r/3w95
Syntenic path assembly with SynMap of creosote (y-axis) and Arabidopsis thaliana Col-0 (x-axis). Results may be regenerated at: http://genomevolution.org/r/3w96

Pseudo-Assembly

Pseudo-assembly of creosote (x-axis) using the peach (y-axis) genome. Syntenic comparison to the peach genome.
Microsynteny analysis of the pseudo-assembled creosote genome to the peach genome. Orange bars are unsequenced Ns that represent contigs glued together in creosote by the syntenic path assembly method. Note the concordance of gene model coding sequences.
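The caption describes the pseudo-assembly as contigs glued together with runs of unsequenced Ns. As a rough illustration of that gluing step only, and not CoGe's actual implementation, the sketch below joins contigs that are assumed to be already ordered and oriented against the reference. The input format, the function name `glue_contigs` and the default gap length of 100 Ns are all assumptions.

```python
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a nucleotide sequence (N stays N)."""
    return seq.translate(_COMPLEMENT)[::-1]


def glue_contigs(placements, gap_len=100):
    """Join ordered contigs into one pseudo-molecule separated by N gaps.

    placements: list of (sequence, strand) tuples, already sorted by their
    position along the reference chromosome; strand is '+' or '-'.
    gap_len: number of Ns placed between neighbouring contigs (assumed value).
    """
    oriented = []
    for seq, strand in placements:
        if strand not in ("+", "-"):
            raise ValueError(f"unexpected strand: {strand!r}")
        # Contigs that match the reference on the minus strand are flipped
        # so the whole pseudo-molecule reads in the reference orientation.
        oriented.append(seq if strand == "+" else reverse_complement(seq))
    return ("N" * gap_len).join(oriented)
```

The N runs carry no sequence information; they only mark where two contigs were placed next to each other based on synteny, which is what the orange bars in the microsynteny view represent.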