Difference between revisions of "Syntenic dotplot"

From CoGepedia
 
(40 intermediate revisions by 3 users not shown)
Line 1: Line 1:
[[Image:Dotplot.png|thumb|center|983px]]  
+
[[Image:Dotplot.png|thumb|right|600px| Syntenic dotplot of E. coli B strain REL606 (x-axis) and E. coli K12 strain DH10B (y-axis). The green line represents regions of similarity between the two genomes, while the discontinuities in this syntenic line (marked by numbered arrows) represent regions of genomic variation at a given locus between the two substrains of E. coli. Variations of this size (tens of kb) are usually the result of phage insertions, horizontal gene transfer events, deletions, and transposon activity.  More information about this comparison can be found [[Analysis of variations found in genomes of Escherichia coli strain K12 DH10B and strain B REL606 using SynMap and GEvo analysis | here]].  More examples of bacterial syntenic dotplots and [[x-alignments]] can be found [[x-alignments | here]]. This dotplot can be regenerated [http://genomevolution.org/CoGe/SynMap.pl?dsgid1=7454;dsgid2=4241;D=20;g=10;A=5;w=0;b=1;ft1=1;ft2=1;dt=geneorder here].]]
  
<br>
+
[[Image:Master 6807 8082.CDS-CDS.blastn geneorder D40 g20 A10.w1200.gene.ks.png|thumb|right|600px|Syntenic dotplot with Ks coloration of sorghum (x-axis) versus maize (y-axis). Genes are used for axis metrics; black lines separate chromosomes in each genome. Results can be regenerated at: https://genomevolution.org/r/dfjy.  Red syntenic lines are from the maize-specific [[whole genome duplication]] event and are orthologous to sorghum.  Purple lines are from the older pre-grass [[whole genome duplication]] event and are [[out-paralogs]].  More information about this analysis can be found [[Maize_Sorghum_Syntenic_dotplot | here]]. ]]
  
{| width="1000" cellspacing="1" cellpadding="1" border="1"
+
[[Image:Master 8154 8154.CDS-CDS.blastn.dag.go c4 D40 g20 A5.aligncoords.gcoords ct0.w2000.gene.ks.png|thumb|right|600px|Syntenic dotplot of poplar versus itself. Syntenic gene-pairs are colored by the [[synonymous mutation]] values. This reveals intragenomic synteny derived from a recent [[whole genome duplication]] event (dark blue) and the older [[eudicot paleohexaploidy]] event (green-cyan). This analysis can be regenerated at http://genomevolution.org/CoGe/SynMap.pl?dsgid1=8154;dsgid2=8154;c=4;D=40;g=20;A=5;Dm=;gm=;w=0;b=1;ft1=1;ft2=1;do1=1;do2=1;do=40;dt=geneorder;ks=1;am=g]]
|-
+
| Variation type<br>
+
| Difference in strain B REL606<br>
+
| Difference in strain K-12 DH10B<br>
+
| Evidence<br>
+
| Notes<br>
+
| Link leading to GEvo <br>
+
|-
+
| 1. Deletion<br>
+
| none<br>
+
| Deletion of ~18 genes including DNA <br>pol II, genes in metabolic pathway, thiamine ABC transporter<br><br>
+
| pseudogenes in DH10B at deletion site.<br><br>
+
| Possible additional insertion in DH10B as evidenced by <br>pseudogenes of yabP, RNA pol associated helicase and FruR that are not present in REL606<br><br>
+
| [http://tinyurl.com/ylg9qrk tinyurl.com/yexrzpb]<br>
+
|-
+
| 2. Insertion<br>
+
| Insertion of IS1 transposon<br>
+
| Insertion sequences and Prophage CP46 DNA insertion
+
| Prophage specific genes found in DH10B<br>
+
| Prophage DNA insertion and IS insertions have created pseudogenes in K-12 DH10B<br>
+
| [http://tinyurl.com/yjdqgzr tinyurl.com/yd2quy7]<br>
+
|-
+
| 3. Insertion in REL606 and DH10B<br>
+
| Insertion of IS1 sequence. Insertion of ~15 genes including lac operon and other metabolic enzymes genes
+
| Insertion of IS3 and IS2 sequences
+
|
+
Possible insertion in REL606 as evidenced by inverted repeats<br>
+
  
| Pseudogenes of yaiT and yaiX were created in DH10B by transposon insertions<br>
+
[[Image:Master 8154 8154.CDS-CDS.blastn.dag.go c4 D40 g20 A5.aligncoords.gcoords ct0.w2000.gene.ks.hist.png|thumb|right|600px|Histogram of the [[synonymous mutation]] (Ks) values (log10 transformed) of the syntenic gene pairs within poplar. Smaller values on the left imply younger gene pairs, and larger values on the right imply older gene pairs. The two middle peaks are from poplar's recent whole genome duplication event (blue) and a more ancient [[eudicot paleohexaploidy]] event (green-cyan). The peak on the far right, with untransformed Ks values of 50-100, is noise in the analysis, perhaps from the alignment of pseudogenes, mis-called syntenic gene pairs, and erroneous gene models. These colors correspond to the colors used in the syntenic dotplot shown above.]]
| [http://tinyurl.com/yldc83u http://tinyurl.com/yldc83u]<br>
+
|-
+
| 4. Insertion in REL606 and DNA duplication event in DH10B. <br>
+
| Prophage DNA and transposase insertion <br>
+
| Recent DNA duplication event&nbsp;&nbsp;
+
| 100% identity between paralogs and ~98% identity between syntenic region in REL606<br>
+
| Possible phage DNA insertion in REL606 as "hypothetical protein"&nbsp;genes&nbsp;were found near putative prophage tail component gene in REL606. <br>
+
| [http://tinyurl.com/yk7vjgq tinyurl.com/yea8bu6]<br>
+
|-
+
| 5. Insertion<br>
+
| Bacteriophage DNA insertion <br>
+
| IS2 sequence insertion<br>
+
| Pseudogenes at IS2 insertion site in DH10B. Phage specific genes were found in REL606<br>
+
| Possible phage DNA insertion in REL606 as "Hypothetical proteins" were found near phage specific genes <br>
+
| [http://tinyurl.com/yevlb2w tinyurl.com/yevlb2w]<br>
+
|-
+
| 6. Insertion<br>
+
| Prophage DNA insertion <br>
+
| none<br>
+
| Phage specific genes were found in REL606<br>
+
| none<br>
+
| [http://tinyurl.com/ybokuag tinyurl.com/ybokuag]<br>
+
|-
+
| 7. Insertion<br>
+
| none
+
| Prophage DNA insertion
+
| Phage specific genes found in DH10B<br>
+
| none<br>
+
| [http://tinyurl.com/yaxlh7o tinyurl.com/yaxlh7o]<br>
+
|-
+
| 8. Insertion and deletion<br>
+
| Transposon insertions and deletion of&nbsp;phenylacetic acid degradation genes <br>
+
| IS and Rac prophage DNA insertion
+
| Phage specific genes found in DH10B. IS insertions in REL606 might have created direct repeats and facilitated excision of phenylacetic acid degradation genes.<br>
+
| Rac prophage DNA disrupted by transposon insertion in DH10B
+
| [http://tinyurl.com/yccbmsq tinyurl.com/yccbmsq]<br>
+
|-
+
| 9a. Insertion
+
| 9a. none<span style="white-space: pre;" class="Apple-tab-span"> </span>
+
| 9a. Insertion of IS5 sequence
+
| 9a. none
+
| 9a. none
+
| 9a.&nbsp;[http://tinyurl.com/ylllc6u http://tinyurl.com/ylllc6u]
+
|-
+
| 9b. Insertion
+
| 9b. Insertion of IS1 transposon
+
| 9b. none
+
| 9b. none
+
| 9b. none
+
| 9b.&nbsp;[http://tinyurl.com/ygsqg2f tinyurl.com/ygsqg2f]
+
|-
+
| 9c. Insertion
+
| 9c. none
+
| 9c. Insertion of ABC transporter, flagella encoding genes and few other enzymes
+
| 9c. Inserted DNA segment in DH10B is bordered by direct repeats at both ends. 100% identity was found between the two repeats. <br>
+
| 9c. DR indicates transposon insertion in DH10B.&nbsp;
+
| 9c.&nbsp;[http://tinyurl.com/ykypc67 tinyurl.com/ykypc67]
+
|-
+
| 9d. Insertion
+
| 9d. IS2 insertion <br>
+
| 9d. none
+
| 9d. none
+
| 9d. none
+
| 9d.&nbsp;[http://tinyurl.com/ygfgtqy tinyurl.com/ygfgtqy]
+
|-
+
| 10. Insertion and deletion<br><br>
+
| Bacteriophage DNA insertion and IS1 transposon insertion. Deletion of metal resistance genes<br><br>
+
| Insertion of pili associated genes
+
| Insertion of pili associated genes in DH10B as evidenced by their different GC content
+
| No IRs were found to indicate insertion of genes conferring metal resistance in DH10B. Possible deletion of these genes in REL606 prior to IS1 transposon insertion. &nbsp;
+
|
+
[http://tinyurl.com/yammj7v http://tinyurl.com/yammj7v]  
+
  
[http://tinyurl.com/ykvvvk2 tinyurl.com/ykvvvk2]
+
'''Syntenic dotplots''' are a type of scatter-plot. Each axis represents a sequence laid end-to-end, and each dot in the scatter-plot represents a putative [[homologous]] match between the two sequences. Often, these dotplots are used for whole genome comparisons within the same genome or across two genomes from different taxa in order to identify [[synteny]]. Synteny is defined as two or more genomic regions that are derived from a common ancestral genomic region. The evidence for synteny is the identification of a set of homologous genes in two genomes that have a collinear arrangement. When such a pattern of gene-order conservation is discovered, the most parsimonious explanation is that the two regions are related through a common ancestor. While syntenic dotplots are useful for identifying related genomic regions, they are also useful for identifying genomic regions that have undergone an evolutionary change in one of the two genomes being compared. Examples of such events are:
 
+
*[[insertions]]
|-
+
*[[horizontal gene transfers]]
| 11. Insertion&nbsp;<br>
+
*[[deletions]]
| IS1 insertion followed by insertion of ParB family protein genes and recombinase.<br>
+
*duplications
| CP4-57 prophage DNA insertion<br>
+
*[[inversions]]  
| Different GC content of ParB family protein and recombinase genes indicates that these genes were acquired from exogenous DNA such as a plasmid, possibly by homologous crossover, as no IRs were found.&nbsp;
+
| IS1 insertion probably occurred independently of the integration of other genes in REL606.
+
CoGe's tool [[SynMap]] makes it easy to create a syntenic dotplot for any two genomes in CoGe.
| [http://tinyurl.com/ycdog2c http://tinyurl.com/ycdog2c]<br>
+
|-
+
| 12. Insertion, deletion and Inversion<br>
+
| IS1 insertion. Insertion of pyrophosphorylase and "hypothetical protein" genes
+
| IS5 and IS10 transposon insertion. Inversion of ornithine decarboxylase, M-type protein and bifunctional prepilin peptidase/methylase. Deletion of saframycin synthetase, capsule related genes, bio-film formation genes, anti-toxin system and type II secretory apparatus genes.&nbsp;
+
|
+
Insertion of pyrophosphorylase and "hypothetical protein" genes in REL606 as evidenced by their different GC content
+
 
+
Inversion in DH10B as evidenced by IR of IS10 transposon.  
+
 
+
Deletion of several genes in DH10B is evidenced by IS5 transposase transactivator.&nbsp;
+
 
+
|
+
Insertion of IS5 trans-activator transposase indicates possible deletion of several genes in DH10B. Also, no evidence of insertion in REL606, such as different GC content or DRs, was found
+
 
+
| [http://tinyurl.com/ycoagxh http://tinyurl.com/ycoagxh]<br>
+
|-
+
|
+
13a. Deletion
+
 
+
| 13a. none
+
|
+
13a. Deletion of putative adhesin<br>
+
 
+
| 13a. Insertion of HEAT repeat containing lyase at the site of deletion in DH10B may have created a pseudogene of adhesin, which was later deleted
+
| 13a. Insertion of HEAT repeat containing lyase is evidenced by its different GC content. Insertion may have created a pseudogene of adhesin, which was later deleted, in REL606 as evidenced by different GC content of capsule related protein. <br>
+
| 13a.[http://tinyurl.com/yjojy53 &nbsp;tinyurl.com/yjojy53]
+
|-
+
|
+
13b. Insertion<br>
+
 
+
| 13b. Insertion of IS1, waaL and waaV genes
+
|
+
13b. Insertion of ~5 rfa genes.<br>
+
 
+
|
+
<br>
+
13b. Insertion in both is evidenced by different GC content of the genes.
+
| 13b. IS1 insertion created pseudogene.
+
| 13b.[http://tinyurl.com/yj2yg5s http://tinyurl.com/yj2yg5s]
+
|-
+
|
+
13c. Insertion
+
 
+
and deletion<br>
+
 
+
|
+
13c. Insertion of IS30 transposon and several "hypothetical protein" genes.&nbsp;
+
 
+
|
+
13c. Deletion of ShiA-like and TrbC-like protein genes<br>
+
 
+
|
+
13c. Insertion in REL606 is evidenced by different GC content of genes. <br>
+
 
+
<br>
+
 
+
|
+
<br>
+
 
+
13c. Pseudogene in REL606 and different GC content of genes indicate possible insertion. No DRs were found to indicate insertion of the ShiA-like and TrbC-like genes; therefore, deletion of these genes in DH10B may have occurred<br>
+
 
+
|
+
<br>
+
 
+
<br>
+
 
+
13c.[http://tinyurl.com/yjzdyum &nbsp;tinyurl.com/yjzdyum]<br>
+
 
+
[http://tinyurl.com/ydkrcv8 <br>]  
+
 
+
|-
+
|
+
14a. Insertion
+
 
+
| 14a. Insertion of several transposons and secondary glycine betaine transporter
+
| 14a. Insertion of several transposons. Insertion of Kple2 phage-like element
+
| 14a. Insertion of transposons has integrated genes of different GC contents.
+
| 14a. none
+
| 14a. [http://tinyurl.com/yzyvunx tinyurl.com/yzyvunx]
+
|-
+
| 14b. Insertion
+
| 14b. Insertion of ~15 genes
+
| 14b. none
+
| 14b. Insertion in REL606 is evidenced by IR flanking the DNA segment containing several genes.
+
| 14b. Pseudogene created at the site of insertion in DH10B.
+
| 14b. [http://tinyurl.com/yly2b6u tinyurl.com/yly2b6u]
+
|-
+
|
+
 
+
 
+
14c. Deletion
+
 
+
|
+
14c. none
+
 
+
|
+
<br>
+
 
+
14c. Deletion of ~15 genes.
+
 
+
<br>
+
 
+
<br>
+
 
+
|
+
<br>
+
 
+
14c. Deletion in DH10B is evidenced by insertion of IS10R which&nbsp; may have facilitated excision of DNA by forming DRs
+
 
+
|
+
<br>
+
 
+
<br>
+
 
+
14c. none
+
 
+
|
+
<br>
+
 
+
<br>
+
 
+
14c. [http://tinyurl.com/yfhhsk6 tinyurl.com/yfhhsk6]
+
 
+
[http://tinyurl.com/ycwsmsl <br>]
+
 
+
|}
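Several rows of the table above cite a "different GC content" as evidence that a gene was acquired by insertion. The following is a minimal sketch of that kind of comparison, not the procedure actually used for the table: the function names <code>gc_content</code> and <code>gc_deviation</code> and the toy sequences are assumptions made for illustration, and a real analysis would use annotated sequences exported from GEvo or a similar tool.

```python
def gc_content(seq: str) -> float:
    """Return the fraction of G and C among the unambiguous bases A, C, G, T."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    # Ambiguous bases such as N are ignored; an empty sequence gives 0.0.
    return gc / total if total else 0.0


def gc_deviation(region: str, background: str) -> float:
    """Return GC content of the region minus GC content of the background."""
    return gc_content(region) - gc_content(background)


# Toy sequences (hypothetical, for illustration only).
background = "ATGCGCATTAGCGCATATGC" * 5
candidate = "ATATTTAAATATTTAATAAT"
print(f"background GC: {gc_content(background):.2f}")
print(f"candidate GC:  {gc_content(candidate):.2f}")
print(f"deviation:     {gc_deviation(candidate, background):+.2f}")
```

A large deviation from the genomic background is only suggestive; the table combines it with other evidence such as flanking repeats and pseudogenes.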
+
 
+
<br>
+

Latest revision as of 15:02, 24 July 2014

Syntenic dotplot of E. coli B strain REL606 (x-axis) and E. coli K12 strain DH10B (y-axis). The green line represents regions of similarity between the two genomes, while the discontinuities in this syntenic line (marked by numbered arrows) represent regions of genomic variation at a given locus between the two substrains of E. coli. Variations of this size (tens of kb) are usually the result of phage insertions, horizontal gene transfer events, deletions, and transposon activity. More information about this comparison can be found here. More examples of bacterial syntenic dotplots and x-alignments can be found here. This dotplot can be regenerated here.
Syntenic dotplot with Ks coloration of sorghum (x-axis) versus maize (y-axis). Genes are used for axis metrics; black lines separate chromosomes in each genome. Results can be regenerated at: https://genomevolution.org/r/dfjy. Red syntenic lines are from the maize-specific whole genome duplication event and are orthologous to sorghum. Purple lines are from the older pre-grass whole genome duplication event and are out-paralogs. More information about this analysis can be found here.
Syntenic dotplot of poplar versus itself. Syntenic gene-pairs are colored by the synonymous mutation values. This reveals intragenomic synteny derived from a recent whole genome duplication event (dark blue) and the older eudicot paleohexaploidy event (green-cyan). This analysis can be regenerated at http://genomevolution.org/CoGe/SynMap.pl?dsgid1=8154;dsgid2=8154;c=4;D=40;g=20;A=5;Dm=;gm=;w=0;b=1;ft1=1;ft2=1;do1=1;do2=1;do=40;dt=geneorder;ks=1;am=g
Histogram of the synonymous mutation (Ks) values (log10 transformed) of the syntenic gene pairs within poplar. Smaller values on the left imply younger gene pairs, and larger values on the right imply older gene pairs. The two middle peaks are from poplar's recent whole genome duplication event (blue) and a more ancient eudicot paleohexaploidy event (green-cyan). The peak on the far right, with untransformed Ks values of 50-100, is noise in the analysis, perhaps from the alignment of pseudogenes, mis-called syntenic gene pairs, and erroneous gene models. These colors correspond to the colors used in the syntenic dotplot shown above.
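
The log10 transformation used for the Ks histogram can be sketched as follows. This is an illustrative sketch only, not CoGe's implementation: the function name <code>log10_ks_histogram</code>, the bin width, and the example Ks values are all assumptions. Ks values of zero or below are skipped because their logarithm is undefined.

```python
import math
from collections import Counter


def log10_ks_histogram(ks_values, bin_width=0.5):
    """Count Ks values per bin on a log10 scale.

    Keys are the lower edge of each bin in log10 units.
    """
    counts = Counter()
    for ks in ks_values:
        if ks <= 0:
            continue  # log10 is undefined for zero or negative values
        log_ks = math.log10(ks)
        bin_start = math.floor(log_ks / bin_width) * bin_width
        counts[bin_start] += 1
    return dict(sorted(counts.items()))


# Hypothetical Ks values: two young pairs, one older pair, one noisy pair.
example = [0.2, 0.3, 1.5, 75.0, 0.0]
for bin_start, n in log10_ks_histogram(example, bin_width=1.0).items():
    print(f"log10(Ks) in [{bin_start:+.1f}, {bin_start + 1.0:+.1f}): {n}")
```

On this scale an untransformed Ks of 50-100 falls between roughly 1.7 and 2.0, which is why such values form a separate peak at the far right of the histogram.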

Syntenic dotplots are a type of scatter-plot. Each axis represents a sequence laid end-to-end, and each dot in the scatter-plot represents a putative homologous match between the two sequences. Often, these dotplots are used for whole genome comparisons within the same genome or across two genomes from different taxa in order to identify synteny. Synteny is defined as two or more genomic regions that are derived from a common ancestral genomic region. The evidence for synteny is the identification of a set of homologous genes in two genomes that have a collinear arrangement. When such a pattern of gene-order conservation is discovered, the most parsimonious explanation is that the two regions are related through a common ancestor. While syntenic dotplots are useful for identifying related genomic regions, they are also useful for identifying genomic regions that have undergone an evolutionary change in one of the two genomes being compared. Examples of such events are:

* insertions
* horizontal gene transfers
* deletions
* duplications
* inversions

CoGe's tool SynMap makes it easy to create a syntenic dotplot for any two genomes in CoGe.
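
The idea of a collinear arrangement of homologous genes can be illustrated with a small sketch. It assumes each homologous match is given as a pair of gene-order positions (x in one genome, y in the other) and greedily groups matches into runs that advance in a consistent direction with small gaps. The function name <code>collinear_runs</code> and the parameters <code>max_gap</code> and <code>min_pairs</code> are hypothetical, and this is a deliberately simplified stand-in, not the algorithm SynMap uses; in particular, a single noise dot between two collinear matches will split a run.

```python
def collinear_runs(pairs, max_gap=2, min_pairs=3):
    """Group (x, y) gene-order matches into collinear runs.

    A run is extended while x increases by at most max_gap and y moves by
    at most max_gap in one consistent direction (forward or inverted).
    Runs with fewer than min_pairs matches are discarded.
    """
    runs = []
    current = []
    direction = 0  # 0 = not yet known, +1 = forward, -1 = inverted
    for x, y in sorted(pairs):
        if current:
            prev_x, prev_y = current[-1]
            dx, dy = x - prev_x, y - prev_y
            step = (dy > 0) - (dy < 0)
            extends = (
                0 < dx <= max_gap
                and 0 < abs(dy) <= max_gap
                and direction in (0, step)
            )
            if extends:
                current.append((x, y))
                direction = step
                continue
            if len(current) >= min_pairs:
                runs.append(current)
        current = [(x, y)]
        direction = 0
    if len(current) >= min_pairs:
        runs.append(current)
    return runs


# Hypothetical matches: a forward run, an inverted run, and one isolated dot.
matches = [(1, 1), (2, 2), (3, 3), (4, 4),
           (10, 20), (11, 19), (12, 18),
           (30, 5)]
for run in collinear_runs(matches):
    print(run[0], "->", run[-1], f"({len(run)} pairs)")
```

In a dotplot, the first run would appear as a diagonal line, the second as an anti-diagonal line (an inversion), and the breaks between runs correspond to the kinds of discontinuities listed above.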