Difference between revisions of "Analysis of differences found between Escherichia coli strain K12 DH10B and strain B REL606 using SynMap and GEvo analysis"

From CoGepedia
 
(13 intermediate revisions by the same user not shown)
Line 1: Line 1:
In this exercise you will compare the genomes of two Escherichia coli strains, K12 DH10B and B REL606 using SynMap and GEvo analysis. In addition, we will observe the differences between these two genomes as a result of lineage divergence of the two E-coli strains. The computational tools used to do this analysis can be used for comparing genomes of any species. In two closely related bacterial genomes, for instance, several differences could be found such as transposition, insertions, deletions, duplications, inversion and translocations.
+
In this exercise you will compare the genomes of two ''Escherichia coli'' strains, K12 DH10B and B REL606 using SynMap and GEvo analysis. In addition, we will observe the differences between these two genomes as a result of lineage divergence of the two ''E-coli'' strains. The computational tools used to do this analysis can be used for comparing genomes of any species. In two closely related bacterial genomes, for instance, several differences could be found such as transposition, insertions, deletions, duplications, inversion and translocations.  
  
 +
<br> First, we need to construct a syntenic dotplot of K12 DH10B and B REL606 using SynMap. Go to [http://www.synteny.cnr.berkeley.edu/CoGe/Synmap.pl SynMap] Search for ''E-coli'' strain K12 DH10B and ''E-coli'' strain B REL606 in the database of Organism 1 and Organism 2 respectively. Click "Generate SynMap". This program will lay the two genomes on the axes and indicate regions of similarities between the two as green dots on a syntenic dotplot. The collection of these dots appear as green line. [[Image:Generate synmap.png|thumb|center|700px]] [[Image:Dotplot.png|thumb|center|700px]]
  
First, we need to construct a syntenic dotplot of K12 DH10B and B REL606 using SynMap. Go to [http://www.synteny.cnr.berkeley.edu/CoGe/Synmap.pl] Search for E-coli K12 strain DH10B and E-coli B strain REL606 in the database of Organism 1 and Organism 2 respectively. Click on "generate synmap". This program will lay the two genomes on the axes and indicate regions of similarities between the two as green dots on a syntenic dotplot. The collection of these dots appear as green line. [[Image:generate_synmap.png|frame|center|100px|Webpage of SynMap]]
+
<br> In order to accurately account for the "breaks" or discontinuities in the dotplot, we need to run GEvo analysis. GEvo uses multiple algorithms to run comparisons between the two genomic regions. More information on GEvo software tool can be found at: [[GEvo]]. The discontinuities in this syntenic dotplot represent the sites of insertions or deletions. To analyse each of these "breaks" in dotplot, use the locator of the dotplot to click on the green spot right before a "break". The locator will turn "red" once you have placed it on basepairs/green dots.  
[[Image:dotplot.png|frame|center|100px|Syntenic dotplot of strain B REL606 vs strain K12 DH10B]]
+
  
 +
After clicking, a new page of GEvo will appear displaying the sequence information corresponding to our region of interest in the dotplot. Click "Run GEvo Analysis!". This will allow you to visualize and compare the genetic make-up of strain B REL606 and strain K DH10B at a given region. In this case, our region of interest corresponds to a discontunity in our dotplot.[[Image:GEvo.png|thumb|center|700px]]
  
In order to accurately account for the "breaks" or discontinuties in the dotplot, we need to run GEvo analysis. GEvo uses multiple algorithms to run comparisons between the two genomic regions. These discontinuties in a dotplot represent the sites of insertions or deletions. More information on GEvo software tool can be found at: .On the dotplot diagram, use the locator to click on the green spot right before a "break". The locator will turn "red" once you have placed it on basepairs/green dots.
+
<br> Once GEvo analysis appears, we can begin to look for differences between the two genomes. Click on individual genes/green bars and its annotation will appear in a box. Repeat the above mentioned steps for each discontinuity in dotplot for individual analysis. [[Image:gene annotation1.png|thumb|center|700px]]. Notice the fragments of green line all over the dotplot. These represent translocation events in the ''E-coli'' strains we are examining. Place the locator on these and run GEvo to determine which genes were translocated.  
  
After clicking, a new page of GEvo (web add) will appear. Click "Run GEvo Analysis!". This will allow you to visualize and compare the genetic make-up of strain B REL606 and strain K DH10B at a given region. In this case, our region of interest corresponds to a discontunity in our dotplot.
+
<br> Notice the pink bars over the DNA segments. Click on these and it will connect to its syntenic region. A sliding window at the sides of the diagram can be used to magnify a region and enables us to view these genes at a higher resolution. Notice the edges of pink bars connecting syntenic regions. They may run parallel or cross each other. The latter represents an inversion event. [[Image:sliding window1.png|thumb|center|700px]] [[Image:inversion1.png|thumb|center|700px]]
[[Image:GEvo.png|frame|center|100px|Webpage of GEvo]]
+
  
 +
<br> Evidence for deletion and insertions can also be found on these genomes using GEvo. At several instants, you will find that the "breaks" in our dotplot corresponds to transposition. Several deletions and insertion events could be explained by transposon activity in these genomes. You can also distinguish the DNA segments containing different GC content relative to other parts of genomes. Under GEvo Configuration, click "Results Parameters" and select "Yes" for "Color wobble codon GC content". Click "Run GEvo Analysis!". The region containing different GC content relative to the rest of genome will appear red. [[Image:GC content1.png|thumb|center|700px]]
  
Once GEvo analysis appears, we can begin to look for differences between the two genomes. Click on individual genes/green bars and its annotation will appear in a box.
+
<br> Beware of the of the genes that may not seem syntenic (missing pink bars) at first. For locating the potential syntenic regions, change the sequence number on either genomes. Under GEvo Configuration, increase/decrease the number of sequences on left/right. Then click "Run GEvo Analysis!". [[Image:sequence-1.png|thumb|center|700px]]
Repeat the above mentioned steps for each discontinuity in dotplot for individual analysis. The translocation events can be observed from dotplot as well. Notice the fragments of green line all over the dotplot. Place the locator on these and run GEvo to determine which genes were translocated.
+
  
 
+
<br> The DNA segments can also be aligned against each other. This is particularly helpful when locating regions of direct repeats, inverted repeats and determining percent identities between paralogs and orthologs. To align multiple sequences simultaneously, click "+ Add Sequence", copy and paste the name/ID of the organism in the newly created box for additional sequences. Click "Run GEvo Analysis!". The resulting analysis will be color-coded distantly. Click on color-coded bars to find syntenic regions on each sequences. [[Image:addseq.png|thumb|center|700px]]
Click on the pink bars above a DNA segment and it will connect to its syntenic region. A sliding window can be used to magnify a region we want to analyze. Notice the edges of pink bars connecting syntenic regions. They may run parallel or cross each other. The latter represents an inversion event.
+
[[Image:analysis.png|thumb|center|700px]]
 
+
Detailed analysis of this syntenic dotplot can be found at [[Syntenic dotplot]]
 
+
Evidence for deletion and insertions can be located on these genomes using GEvo. At several instants, transposon insertions will account for deletions/insertions in bacterial genomes.
+
 
+
You can also distinguish the segment containing different GC content relative to other parts of genomes. Under GEvo Configuration, click "Results Parameters" and select "Yes" for "Color wobble codon GC content". Click "Run GEvo Analysis!". The region containing different GC content will appear red.
+
 
+
 
+
Beware of the of the genes that may not seem syntenic at first. For locating the potential syntenic regions, change the sequence number on either genomes. Under GEvo Configuration, increase/decrease the number of sequences on left/right. Then click "Run GEvo Analysis!"
+
 
+
 
+
The DNA segments can also be aligned against each other. This is particularly helpful when locating regions of direct repeats, inverted repeats and determining percent identities between paralogs. To align multiple sequences simultaneously, click "+ Add Sequence", copy and paste the name/ID of the organism in the newly created box for additional sequences. Click "Run GEvo Analysis!". The resulting analysis will be color-coded distantly. Click on those to find syntenic regions on each sequences.
+

Latest revision as of 19:37, 22 October 2009

In this exercise you will compare the genomes of two Escherichia coli strains, K12 DH10B and B REL606, using SynMap and GEvo. You will also observe the differences that have accumulated between these two genomes as the two E. coli lineages diverged. The computational tools used here can be applied to compare the genomes of any species. Between two closely related bacterial genomes, for instance, you may find several kinds of differences: transpositions, insertions, deletions, duplications, inversions and translocations.


First, we need to construct a syntenic dotplot of K12 DH10B and B REL606 using SynMap. Go to SynMap and search for E. coli strain K12 DH10B and E. coli strain B REL606 in the databases for Organism 1 and Organism 2, respectively. Click "Generate SynMap". The program lays the two genomes along the axes and marks regions of similarity between them as green dots on a syntenic dotplot. Together, these dots appear as a green line.
Generate synmap.png
Dotplot.png
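
To make the idea of a dotplot concrete, here is a minimal Python sketch that places a dot wherever two sequences share an identical short word (k-mer). It is a toy illustration only, not the method SynMap uses; the function name `kmer_dotplot`, the word size and both sequences are made up for this example.

```python
from collections import defaultdict


def kmer_dotplot(seq_x, seq_y, k=5):
    """Return sorted (x, y) start positions where both sequences share a k-mer."""
    # Index every k-mer of the x-axis sequence by its start position.
    index = defaultdict(list)
    for i in range(len(seq_x) - k + 1):
        index[seq_x[i:i + k]].append(i)

    # Look up every k-mer of the y-axis sequence in that index.
    dots = []
    for j in range(len(seq_y) - k + 1):
        for i in index.get(seq_y[j:j + k], []):
            dots.append((i, j))
    return sorted(dots)


# Hypothetical toy sequences: seq_b is seq_a with "TTTT" inserted after base 10.
seq_a = "ACGTACCGGTAGCTAGGATC"
seq_b = "ACGTACCGGTTTTTAGCTAGGATC"

dots = kmer_dotplot(seq_a, seq_b)
for x, y in dots:
    print(x, y)
```

Dots before the insertion lie on the main diagonal (x equals y), while dots after it are shifted by the length of the inserted sequence. This shift is what shows up as a "break" in a real dotplot.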


To account accurately for the "breaks" or discontinuities in the dotplot, we need to run a GEvo analysis. GEvo uses multiple algorithms to compare the two genomic regions. More information on the GEvo software tool can be found at: GEvo. The discontinuities in this syntenic dotplot represent sites of insertions or deletions. To analyse each of these breaks, use the dotplot's locator to click on the green spot right before a break. The locator will turn red once you have placed it on base pairs (green dots).
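
The reasoning behind reading a break as an insertion or deletion can be sketched in a few lines of Python. Assuming the dots are available as (x, y) coordinate pairs sorted along the x-axis (the coordinates below are invented, and `find_breaks` is a hypothetical helper, not part of SynMap or GEvo), a break is any place where neighbouring dots jump to a different diagonal:

```python
def find_breaks(dots, min_shift=1):
    """Report places where consecutive dots jump to a different diagonal.

    dots: (x, y) positions sorted by x. The diagonal offset of a dot is y - x;
    a change in offset between neighbours means one genome carries extra
    sequence at that point.
    """
    breaks = []
    for (x1, y1), (x2, y2) in zip(dots, dots[1:]):
        shift = (y2 - x2) - (y1 - x1)
        if abs(shift) >= min_shift:
            # Positive shift: extra sequence in the y-axis genome (an insertion
            # there or a deletion in the x-axis genome). Negative: the reverse.
            breaks.append({"after_dot": (x1, y1),
                           "before_dot": (x2, y2),
                           "shift": shift})
    return breaks


# Invented coordinates: the y-axis genome has 50 extra bases after position 200.
dots = [(0, 0), (100, 100), (200, 200), (300, 350), (400, 450)]
breaks = find_breaks(dots)
print(breaks)
```

The dot reported as `after_dot` corresponds to the green spot "right before a break" that the tutorial asks you to click.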

After clicking, a new GEvo page will appear, displaying the sequence information for the region of interest in the dotplot. Click "Run GEvo Analysis!". This lets you visualize and compare the genetic make-up of strain B REL606 and strain K12 DH10B at a given region. In this case, the region of interest corresponds to a discontinuity in the dotplot.
GEvo.png

Once the GEvo analysis appears, we can begin to look for differences between the two genomes. Click on an individual gene (green bar) and its annotation will appear in a box. Repeat the steps above for each discontinuity in the dotplot to analyse it individually.
Gene annotation1.png
Notice the fragments of green line scattered across the dotplot. These represent translocation events in the E. coli strains we are examining. Place the locator on these and run GEvo to determine which genes were translocated.

Notice the pink bars over the DNA segments. Click on one and it will be connected to its syntenic region. A sliding window at the sides of the diagram can be used to magnify a region so that its genes can be viewed at a higher resolution. Now look at the edges of the pink bars connecting syntenic regions: they may run parallel or cross each other. The latter represents an inversion event.
Sliding window1.png
Inversion1.png
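
The parallel-versus-crossed rule can be expressed as a short Python sketch. It assumes each connector is summarised by the midpoint of the matched region in each genome; the positions are invented and `classify_connectors` is a hypothetical helper, not a GEvo function.

```python
def classify_connectors(block_pairs):
    """Label each neighbouring pair of connectors as 'parallel' or 'crossed'.

    block_pairs: (position_in_genome_1, position_in_genome_2) midpoints of
    matched regions. Two neighbouring connectors cross when their order in
    genome 2 is the reverse of their order in genome 1.
    """
    ordered = sorted(block_pairs)  # order by position in genome 1
    labels = []
    for (_, b1), (_, b2) in zip(ordered, ordered[1:]):
        labels.append("crossed" if b2 < b1 else "parallel")
    return labels


# Invented midpoints: the third and fourth regions are swapped in genome 2.
pairs = [(100, 120), (200, 220), (300, 420), (400, 320), (500, 520)]
labels = classify_connectors(pairs)
print(labels)
```

A run of crossed connectors between otherwise parallel ones marks the segment whose order is reversed, which is the pattern to look for in the GEvo display.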

Evidence for deletions and insertions can also be found in these genomes using GEvo. In several instances, you will find that the "breaks" in the dotplot correspond to transposition, and several deletion and insertion events can be explained by transposon activity in these genomes. You can also pick out DNA segments whose GC content differs from that of the rest of the genome. Under GEvo Configuration, click "Results Parameters" and select "Yes" for "Color wobble codon GC content". Click "Run GEvo Analysis!". Regions whose GC content differs from the rest of the genome will appear red.
GC content1.png
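
For readers unfamiliar with the term, wobble codon GC content is the fraction of third codon positions occupied by G or C. The following Python sketch computes it for a single coding sequence; it only illustrates the quantity and is not GEvo's implementation. The function name and the example sequences are made up.

```python
def wobble_gc_content(cds):
    """Fraction of third codon positions that are G or C in a coding sequence."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3   # ignore an incomplete trailing codon
    third_bases = cds[2:usable:3]      # every third base, starting at index 2
    if not third_bases:
        return 0.0
    return sum(base in "GC" for base in third_bases) / len(third_bases)


# Made-up coding sequences: codons ATG GCC AAG TAA and ATG AAA TTT.
print(wobble_gc_content("ATGGCCAAGTAA"))
print(wobble_gc_content("ATGAAATTT"))
```

Genes whose value differs markedly from that of their neighbours are the kind of region this GEvo option is meant to highlight.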

Beware of genes that may not seem syntenic (missing pink bars) at first. To locate potential syntenic regions, change the amount of sequence shown for either genome: under GEvo Configuration, increase or decrease the amount of sequence on the left or right. Then click "Run GEvo Analysis!".
Sequence-1.png

The DNA segments can also be aligned against each other. This is particularly helpful for locating direct repeats and inverted repeats, and for determining percent identities between paralogs and orthologs. To align multiple sequences simultaneously, click "+ Add Sequence", then copy and paste the name/ID of the organism into the newly created box for the additional sequence. Click "Run GEvo Analysis!". The resulting analysis will be distinctly color-coded. Click on the color-coded bars to find the syntenic regions on each sequence.
Addseq.png
Analysis.png
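
Two of the quantities mentioned above can be illustrated with a short Python sketch: percent identity between two already-aligned sequences, and the reverse complement used to recognise inverted repeats. This is a sketch under stated assumptions, not GEvo's calculation: gap columns are simply skipped, which is only one of several common definitions of percent identity, and the helper names and sequences are invented.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of aligned, non-gap columns that carry identical bases."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # assumption: columns containing a gap are not counted
        compared += 1
        matches += a == b
    return 100.0 * matches / compared if compared else 0.0


def reverse_complement(seq):
    """Reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


# Invented alignment: 8 comparable columns, 7 of them identical.
print(percent_identity("ACGT-ACGT", "ACGTTACCT"))

# A segment and its inverted repeat are reverse complements of each other.
print(reverse_complement("AAGCT"))
```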

A detailed analysis of this syntenic dotplot can be found at Syntenic dotplot.