Maize v1 v2



These analyses compare the genomic sequence assemblies of maize B73 refgen versions 1 and 2. Maize was sequenced BAC by BAC, using BACs chosen to tile across all of maize's chromosomes. As a result, the relative order of most BACs was correctly determined both within and between chromosomes. However, the contig sequences within a BAC were often unordered, so their positions relative to one another are not necessarily correct. Version 1 of the maize assembly therefore contained many localized misassemblies, and version 2 aimed to correct many of these errors.
Please note that at the time of these analyses, no gene models or annotations were available for version 2 of the maize genome.
To determine the extent of these corrections, syntenic dotplots can be generated between two different versions of a genome. SynMap makes these comparisons easy to perform and provides a variety of visualization options to help identify assembly differences. Figure 1 shows a syntenic dotplot between maize genome assemblies refgen v1 and v2. In this dotplot, each syntenic region is drawn as a colored dot (dots form lines where their density is high). Dots are colored green if the two regions are in the same orientation and blue if they are in opposite orientations. There are two sets of syntenic lines in this dotplot: the strong lines that run mostly continuously through the chromosome-versus-chromosome grids from the lower-left corner to the upper-right corner, and several smaller regions with a lower density of dots. The latter regions are derived from the most recent whole genome duplication event in maize (for additional information, please see the maize versus sorghum dotplot and splitting the maize genome into its two ancestral genomes).
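The coloring rule described above can be sketched in a few lines of Python. This is only an illustration, not SynMap's actual implementation: the function name `dot_color`, the +1/-1 strand encoding, and the example coordinates are all assumptions made for this sketch.

```python
def dot_color(strand_v1: int, strand_v2: int) -> str:
    """Return the dot color for one syntenic pair.

    Strands are encoded as +1 (forward) or -1 (reverse); this encoding
    is an assumption of the sketch, not a SynMap convention.
    """
    if strand_v1 not in (1, -1) or strand_v2 not in (1, -1):
        raise ValueError("strand must be +1 or -1")
    # Same orientation -> green, opposite orientation -> blue.
    return "green" if strand_v1 == strand_v2 else "blue"


# Hypothetical syntenic pairs: (v1 position, v1 strand, v2 position, v2 strand)
pairs = [
    (1_000, 1, 1_050, 1),
    (2_000, 1, 2_100, -1),
    (3_000, -1, 3_200, -1),
]

colors = [dot_color(s1, s2) for _, s1, _, s2 in pairs]
print(colors)
```

Note that two regions that are both on the reverse strand still count as being in the same orientation, so the third pair is green.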
This dotplot reveals that the overall structure of these two assemblies is highly similar (for an example of comparing genome assemblies with many more differences, please see medicago version 1 versus version 2). There is a large, obvious inverted region on chromosome 3 (close-up in Figure 2), and there are several breaks in the syntenic line that mark areas where sequence was added to or removed from the assembly. Close examination also shows many blue dots intermixed with green. These point to regions where a small inversion was made between the two versions of the maize assembly. At this resolution, however, it is not possible to identify small movements of assembled pieces.
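One way to look for such small inversions programmatically is to walk along one assembly and report runs of consecutive opposite-orientation dots. The sketch below assumes a hypothetical list of syntenic pairs given as `(v1_pos, v2_pos, same_orientation)` tuples; the function name `inverted_runs` and the `min_dots` threshold are illustrative and are not part of SynMap.

```python
from itertools import groupby


def inverted_runs(pairs, min_dots=2):
    """Find runs of consecutive opposite-orientation syntenic pairs.

    pairs: iterable of (v1_pos, v2_pos, same_orientation) tuples
           (a hypothetical format chosen for this sketch).
    Returns a list of (v1_start, v1_end, n_dots) for each run of
    opposite-orientation pairs with at least `min_dots` members.
    """
    ordered = sorted(pairs, key=lambda p: p[0])  # walk along version 1
    runs = []
    for same, group in groupby(ordered, key=lambda p: p[2]):
        group = list(group)
        if not same and len(group) >= min_dots:
            runs.append((group[0][0], group[-1][0], len(group)))
    return runs


# Hypothetical data: a three-dot inverted block and one isolated blue dot.
example = [
    (100, 100, True),
    (200, 200, True),
    (300, 500, False),
    (400, 400, False),
    (500, 300, False),
    (600, 600, True),
    (700, 700, False),
]

print(inverted_runs(example))
```

Raising `min_dots` filters out isolated opposite-orientation dots, which are more likely to be noise than a true inverted contig.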
High-resolution analysis of these regions can show the details of these inversions as well as changes in the arrangement of contigs. Figure 3 uses GEvo to analyze a 1 Mb region of chromosome 3. Maize contains many highly repetitive sequences, which would severely obscure the results of such pairwise sequence analyses, so maize version 1 has all non-CDS sequences masked (top panel). These masked sequences are denoted by a purple background. Since maize version 2 has no gene annotations, we have to use its entire sequence. While difficult to see in this image, unsequenced regions are denoted with an orange background and usually represent breaks between contigs. As can be seen in Figure 3, several of the CDS regions from version 1 of maize have been moved relative to neighboring sequences and have also been inverted.
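The idea behind masking non-CDS sequence can be illustrated with a minimal Python sketch. It is not GEvo's implementation: the function name `mask_non_cds`, the 0-based half-open interval convention, and the use of `X` as the mask character are assumptions made for this example.

```python
def mask_non_cds(seq, cds_intervals, mask_char="X"):
    """Replace every base outside the given CDS intervals with mask_char.

    cds_intervals: list of (start, end) pairs, 0-based and half-open
                   (an assumed convention for this sketch).
    """
    masked = [mask_char] * len(seq)  # start fully masked
    for start, end in cds_intervals:
        if not 0 <= start <= end <= len(seq):
            raise ValueError(f"interval ({start}, {end}) is out of range")
        masked[start:end] = seq[start:end]  # restore the CDS bases
    return "".join(masked)


# Hypothetical 10 bp sequence with one annotated CDS at positions 2-4.
print(mask_non_cds("ACGTACGTAC", [(2, 5)]))
```

Because only annotated CDS bases survive, this approach requires gene models, which is why it could be applied to version 1 but not to version 2 in the analysis above.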
Keep in mind that SynMap and GEvo are linked together to make it relatively easy to move from whole-genome to high-resolution, sub-chromosome sequence analyses. SynMap produces whole-genome and chromosome-level syntenic dotplots, and clicking on a syntenic pair in a chromosome-level dotplot opens that pair in GEvo. This integrated linking between CoGe's tools is part of its design to create an open-ended analysis network.