Whole genome duplication

From CoGepedia

A whole genome duplication is exactly what it sounds like: an event which creates an organism with additional copies of the entire genome of a species.

Regular cells of most organisms that reproduce sexually contain two copies of their entire genome (one inherited from each parent), a state known as being diploid[1]. A whole genome duplication might occur when an organism inherits two copies of its genome from each parent (four copies total). A doubling from two to four copies of the genome is known as tetraploidy.

A general rule of thumb is that organisms with even ploidy numbers (2, 4, 6, 8) can often reproduce successfully, but organisms with odd ploidy numbers (3, 5, etc.) generally cannot reproduce sexually.
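The copy-number arithmetic in the two paragraphs above can be sketched in a few lines of Python. This is only an illustration of the rule of thumb, not a biological model: the function names are hypothetical, and real fertility depends on far more than whether the copy number is even.

```python
def offspring_copies(from_mother: int, from_father: int) -> int:
    """Total genome copies in an offspring, given the copies carried by each gamete."""
    return from_mother + from_father


def divides_evenly(ploidy: int) -> bool:
    """Rule of thumb only: an even copy number can be split into two equal halves."""
    return ploidy > 0 and ploidy % 2 == 0


diploid = offspring_copies(1, 1)      # usual case: one copy from each parent
tetraploid = offspring_copies(2, 2)   # two copies from each parent

for ploidy in (diploid, 3, tetraploid, 5):
    verdict = "even split possible" if divides_evenly(ploidy) else "no even split"
    print(ploidy, verdict)
```

The parity check stands in for the idea that an even number of genome copies can be halved cleanly when gametes are formed, while an odd number cannot.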


Examples of different levels of ploidy

These are examples of recent whole genome duplications.

  • Monoploid (one copy).
  • Diploid (two copies). Most sexually reproducing organisms, from
  • Triploid (three copies). Includes most cultivated bananas and seedless watermelons. Triploids generally cannot reproduce sexually, which can be valuable as a way of removing seeds. In the case of bananas, reproduction is accomplished by vegetative propagation. Seedless watermelons are created by mating tetraploid and diploid watermelons, creating offspring that receive two genome copies from one parent and only one copy from the other. Some varieties of apple are also triploid and must be fertilized by diploid pollen in order to develop fruit.
  • Tetraploid (four copies). Includes the most common variety of domesticated potato (Solanum tuberosum).
  • Pentaploid (five copies). Pentaploid individuals will generally be sterile.
  • Hexaploid (six copies). Bread wheat (Triticum aestivum) is an example of a hexaploid species.
  • Heptaploid (seven copies).
  • Octaploid (eight copies). Includes the most common variety of cultivated strawberry (F. × ananassa).
  • Nonaploid (nine copies).
  • Decaploid (ten copies).
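The naming scheme in the list, and the seedless watermelon cross described under the triploid entry, can be sketched as a small lookup plus one addition. The sketch assumes, as a simplification, that each parent passes on exactly half of its genome copies. The function names and the fallback label for unnamed levels are hypothetical.

```python
# Names taken from the list above, keyed by number of genome copies.
PLOIDY_NAMES = {
    1: "monoploid", 2: "diploid", 3: "triploid", 4: "tetraploid",
    5: "pentaploid", 6: "hexaploid", 7: "heptaploid", 8: "octaploid",
    9: "nonaploid", 10: "decaploid",
}


def gamete_copies(parent_ploidy: int) -> int:
    """Copies passed on by one parent, assuming an exact halving (a simplification)."""
    if parent_ploidy <= 0 or parent_ploidy % 2 != 0:
        raise ValueError("this sketch only handles parents with an even ploidy")
    return parent_ploidy // 2


def cross(ploidy_a: int, ploidy_b: int) -> tuple[int, str]:
    """Copy number and name of the offspring of two parents."""
    copies = gamete_copies(ploidy_a) + gamete_copies(ploidy_b)
    return copies, PLOIDY_NAMES.get(copies, "not named in this list")


# Tetraploid x diploid watermelon: two copies from one parent, one from the other.
print(cross(4, 2))  # (3, 'triploid')
```

Passing an odd ploidy to `gamete_copies` raises an error on purpose, mirroring the point that triploids and pentaploids generally cannot form balanced gametes.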

Paleopolyploidy (ancient whole genome duplications)

As of late 2012, all sequenced flowering plant species have at least one detected whole genome duplication in their evolutionary history. Many of the duplicate genes created by a whole genome duplication are later lost through fractionation.
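Fractionation can be pictured with a toy simulation. The sketch below assumes, purely for illustration, that every gene is duplicated by the event, that each duplicated pair independently loses one copy with a fixed probability, and that a pair never loses both copies. The function name and the parameter values are hypothetical, and real fractionation is not expected to be this uniform.

```python
import random


def fractionate(n_genes: int, loss_prob: float, seed: int = 0) -> tuple[int, int]:
    """Return (retained_pairs, singletons) after a toy round of fractionation.

    Every gene starts as a duplicated pair. With probability loss_prob a pair
    loses one of its two copies and becomes a singleton.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    retained_pairs = 0
    singletons = 0
    for _ in range(n_genes):
        if rng.random() < loss_prob:
            singletons += 1
        else:
            retained_pairs += 1
    return retained_pairs, singletons


pairs, singles = fractionate(n_genes=1000, loss_prob=0.8)
print(f"{pairs} genes still duplicated, {singles} genes back to a single copy")
```

Even in this simple model, a high loss probability leaves only a minority of genes as retained duplicates, which is why ancient duplications are harder to spot than recent ones.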

Phylogenetic tree of plant species with sequenced genomes, with ancient whole genome duplications marked. Grass species are the only sequenced representatives of the monocot plant lineage, and all published eudicot genomes come from species in the rosid family.

For details on every whole genome duplication referenced in the above figure, see the dedicated plant paleopolyploidy page.

Detecting paleopolyploidy

One common way to detect whole genome duplications is through syntenic dotplots. CoGe's SynMap tool provides an easy-to-use interface for generating syntenic dotplots between any two genomes stored in the CoGe database. For a detailed example of syntenic analysis between a genome containing a paleopolyploidy (ancient whole genome duplication) and an outgroup, read about the syntenic analysis of the maize and sorghum genomes.
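One signal a syntenic comparison against an outgroup can expose is syntenic depth: if most outgroup genes have two syntenic partners in the other genome, that is consistent with a whole genome duplication in the other genome's lineage. The following is a minimal sketch of that counting step only. It is not how SynMap works internally, and the function name and gene identifiers are invented for illustration.

```python
from collections import Counter


def syntenic_depth(pairs):
    """Count how many outgroup genes have 1, 2, 3, ... syntenic partners.

    pairs: iterable of (outgroup_gene, target_gene) syntenic gene pairs.
    Returns a Counter mapping depth -> number of outgroup genes at that depth.
    """
    partners = {}
    for outgroup_gene, target_gene in pairs:
        partners.setdefault(outgroup_gene, set()).add(target_gene)
    return Counter(len(targets) for targets in partners.values())


# Invented example: two outgroup genes match two target genes each,
# one outgroup gene matches a single target gene.
example_pairs = [
    ("out1", "tgtA1"), ("out1", "tgtB1"),
    ("out2", "tgtA2"), ("out2", "tgtB2"),
    ("out3", "tgtA3"),
]
print(syntenic_depth(example_pairs))  # Counter({2: 2, 1: 1})
```

Outgroup genes with only one partner are what would be expected where one duplicate copy has since been lost.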

This example shows the results of using SynMap to detect two ancient shared whole genome duplication events in the lineage of Arabidopsis.

Figure 1a: Syntenic dotplot between Arabidopsis thaliana and Arabidopsis lyrata. Syntenic gene pairs identified by DAGChainer have been colored based on their synonymous rate change as calculated by CODEML. Results can be regenerated here.
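Coloring syntenic gene pairs by their synonymous rate helps separate events of different ages, because pairs created by older events have had more time to accumulate synonymous changes. A rough way to see this outside of a dotplot is to bin the values and look for distinct clusters. The sketch below assumes the per-pair values have already been computed elsewhere; the function name, the bin width, and the numbers are made up for illustration.

```python
from collections import Counter


def rate_histogram(values, bin_width=0.5):
    """Bin synonymous rate values; returns {bin_index: count}, sorted by bin.

    Bin i covers the half-open range [i * bin_width, (i + 1) * bin_width).
    """
    if bin_width <= 0:
        raise ValueError("bin_width must be positive")
    counts = Counter()
    for value in values:
        if value < 0:
            raise ValueError("synonymous rate values cannot be negative")
        counts[int(value // bin_width)] += 1
    return dict(sorted(counts.items()))


# Invented values: a young cluster, an older cluster, and one much older pair.
example_values = [0.05, 0.08, 0.12, 0.75, 0.80, 0.95, 2.1]
for bin_index, count in rate_histogram(example_values).items():
    low = bin_index * 0.5
    print(f"{low:.1f} to {low + 0.5:.1f}: {'#' * count}")
```

Separate clusters in such a histogram would correspond to the differently colored sets of syntenic pairs in a dotplot like Figure 1a.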


  1. Di- meaning two, and ploid from the Greek word ploos, meaning folded.