How do I do an N-way comparison (where N is greater than 2)?
Simple! GEvo lets you add more sequences to your analysis. Pressing the "add" button adds a new sequence submission section to the web page, and pressing "remove" removes one:
Then, specifying sequences for your analysis is the same as before (see "Sequence Submission"). To run an analysis, specify the alignment algorithm and other options, and press "Go":
How do I interpret a three-way comparison?
Interpreting a three-way comparison is no different from interpreting a two-way comparison, except that there are three pair-wise comparisons to consider instead of only one. In the above example, there are three genomic regions. Just as with the two-way comparison, BLAST HSPs are drawn as colored boxes, where each color and number indicates which pair of regions produced the HSP. Since we have three sequences, three pair-wise bl2seq comparisons are generated (although this can be changed by specifying which sequences are "reference sequences"), and each pair-wise comparison gets its own color for its HSPs.
Now, when we look at the above example, we can see some very interesting patterns. To help highlight them, let's use Gobe's HSP line drawing feature:
Here, we are connecting the HSPs between the top two genomic regions. We can see that there are three HSPs, each overlapping with a gene in both regions, that are colinear. As discussed here, this is evidence that these regions are syntenic. When we do a similar visual comparison with the top sequence and bottom sequence, we see the same pattern, but with more genes represented from the top sequence:
And, when we do the same analysis with the middle and bottom genomic regions, we see the same pattern (albeit with an inverted translocation (HSP 19) and an inversion (HSPs 20 and 21)):
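The three comparisons walked through above (top vs. middle, top vs. bottom, middle vs. bottom) are simply every unordered pair of the submitted regions. As a rough sketch of that bookkeeping, not GEvo's actual code, the Python below enumerates the pairs for any number of regions. The function name, the region labels, and the way the `references` argument filters pairs (keeping only pairs that include a reference region) are assumptions made for illustration.

```python
from itertools import combinations


def pairwise_comparisons(regions, references=None):
    """Return the region pairs that would be aligned against each other.

    With no references, every unordered pair is compared, giving
    N * (N - 1) / 2 comparisons for N regions.
    With references (assumed behavior), only pairs that include at
    least one reference region are kept.
    """
    pairs = list(combinations(regions, 2))
    if references:
        refs = set(references)
        pairs = [pair for pair in pairs if refs.intersection(pair)]
    return pairs


# Hypothetical labels for the three regions in the example.
regions = ["top", "middle", "bottom"]

all_pairs = pairwise_comparisons(regions)
ref_pairs = pairwise_comparisons(regions, references=["bottom"])

print(len(all_pairs), all_pairs)
print(len(ref_pairs), ref_pairs)
```

With three regions this gives three pairs; with four regions it would give six, which is why marking reference sequences can be useful for keeping larger comparisons readable.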
Neat. However, what does this mean evolutionarily?
Good question. Before we can answer it, you need to know a bit about the genomic regions being compared. The top two genomic regions are from Arabidopsis thaliana and are thought to have been derived from its most recent tetraploidy event (which is estimated to have occurred 50-70 MYA). The bottom region is from Carica papaya (papaya). These plants are in the same order (Brassicales). Although there hasn't been an estimate (at least to the author's knowledge) of the divergence date of these plants, it predates the most recent genome duplication in Arabidopsis. It is also known that, since their divergence, papaya has not had any genome-wide duplication events.
If we look at all the regions of similarity among these three regions at once, we see an interesting pattern:
The single region of papaya contains nearly all the gene content of the two Arabidopsis regions, even though the two Arabidopsis regions don't share a similar number of genes. This supports the idea that papaya can be used as a surrogate pre-duplicate ancestral genome for Arabidopsis because it contains nearly all the gene content of these two Arabidopsis regions combined. With this pre-duplicate genome, we can then study how the Arabidopsis genome has evolved following its genome duplication event. From this image, we can support the model that following genome duplication, the majority of duplicated genes were lost from one homeolog (intra-specific duplicated chromosomal region) or the other, while a small fraction of the genes were retained in duplicate. In other words, following genome duplication, there is fractionation of gene content (i.e. loss of genes from a chromosomal region). For more information about the post-tetraploid genome evolution of Arabidopsis, please see this.
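The fractionation idea can be made concrete with a toy tally. The sketch below treats the papaya region as a stand-in for the pre-duplication gene order and classifies each of its genes by whether a match is present in one, both, or neither Arabidopsis homeolog. The gene labels (`g1` to `g6`), the set contents, and the function name are invented for illustration and do not come from the regions shown above; a real analysis would derive these sets from the HSPs.

```python
from collections import Counter


def classify_fractionation(ancestral, homeolog_a, homeolog_b):
    """Label each ancestral gene by where a copy survives after duplication."""
    labels = {}
    for gene in ancestral:
        in_a = gene in homeolog_a
        in_b = gene in homeolog_b
        if in_a and in_b:
            labels[gene] = "retained in duplicate"
        elif in_a:
            labels[gene] = "retained only in homeolog A"
        elif in_b:
            labels[gene] = "retained only in homeolog B"
        else:
            labels[gene] = "lost from both"
    return labels


# Hypothetical data: papaya genes as the ancestral proxy, and the
# subsets of them with a detectable match in each Arabidopsis region.
papaya_genes = ["g1", "g2", "g3", "g4", "g5", "g6"]
arabidopsis_a = {"g1", "g2", "g4"}
arabidopsis_b = {"g2", "g3", "g5", "g6"}

labels = classify_fractionation(papaya_genes, arabidopsis_a, arabidopsis_b)
summary = Counter(labels.values())

for gene in papaya_genes:
    print(gene, labels[gene])
print(dict(summary))
```

In this made-up example, most genes survive in only one homeolog and a single gene is retained in duplicate, which mirrors the qualitative pattern described above.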