Plasmodia comparative genomics
Plasmodium parasites have unusual genomic features that make them an interesting case study in comparative genomics. Over the last decade, the number of sequenced genomes from species of the genus Plasmodium has increased markedly. The availability of multiple Plasmodium genomes opens the possibility of exploring the genomic features of the genus and how they are shaped by evolutionary relationships. Open-ended comparative analysis workflows are therefore a useful approach to studying these parasites. Many of CoGe's tools and services can be used individually to perform comparative analyses, or in combination to evaluate evolutionary hypotheses. In the following pages, we highlight the use of these tools with Plasmodium spp. as a case study.
The genomes of two Plasmodium species, P. falciparum and P. knowlesi, are structurally very similar to one another:
|Organism||Chromosome count||Genome length||CDS count||Genome GC content||CDS GC content||CDS wobble position GC content||Non-coding GC content|
|Plasmodium falciparum||14||22,860,235 bp||5267||19.88%||23.72%||17.30%||14.58%|
|Plasmodium knowlesi||14||23,462,187 bp||5102||38.94%||40.23%||45.56%||35.12%|
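The GC columns in the table above can be illustrated with a short Python sketch. This is only a minimal illustration of how overall GC content and wobble (third codon position) GC content are typically computed, not the code CoGe itself runs; the function names `gc_fraction` and `wobble_gc_fraction` and the toy sequence are hypothetical and chosen for this example.

```python
def gc_fraction(seq):
    """Fraction of G or C among unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    gc = sum(seq.count(base) for base in "GC")
    total = gc + sum(seq.count(base) for base in "AT")
    return gc / total if total else 0.0


def wobble_gc_fraction(cds):
    """GC fraction at third codon positions of an in-frame CDS."""
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    return gc_fraction(cds[2:usable:3])


# Toy sequence invented for illustration; not taken from either genome.
toy_cds = "ATGAAATTTGGCTAA"
print(f"CDS GC content:    {gc_fraction(toy_cds):.2%}")
print(f"Wobble GC content: {wobble_gc_fraction(toy_cds):.2%}")
```

Applied to every CDS in a genome, the same two functions would give per-gene values that can be binned into a histogram or averaged per organism.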
Their physical structure is also very similar, as can be seen in a syntenic dotplot of their genomes. Their GC content, however, is very different: the P. falciparum genome is about 20% GC (about 24% in CDS), while the P. knowlesi genome is about 39% GC. Given the similarity of their genome structures, this change in GC content is relatively recent, having occurred after their lineages diverged between 10,000 and 2,000,000 years ago. The change in overall GC content is reflected in histograms of GC content for their respective CDS sequences, and in their underlying codon and amino acid usage.
Using syntenic gene pairs identified from the whole-genome syntenic dotplot, protein alignments were generated and back-translated to codon alignments, and the entire data set was used to calculate log-odds scores for codon substitutions. The resulting substitution matrix is not symmetric: the likelihood that a given codon in one species is replaced by another codon in the other species differs from the likelihood of the reverse substitution. This asymmetry reflects the apparent directionality of the GC content change.
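The back-translation and log-odds steps can be sketched in Python as follows. The source does not give the exact scoring formula, so this sketch assumes a common definition: the log2 ratio of the observed frequency of an aligned codon pair to the frequency expected from each species' codon frequencies in the aligned columns. The function names `back_translate` and `codon_log_odds` and the toy sequences are hypothetical.

```python
import math
from collections import Counter


def back_translate(aligned_protein, cds):
    """Thread an ungapped CDS onto one gapped protein alignment row."""
    codons, pos = [], 0
    for residue in aligned_protein:
        if residue == "-":
            codons.append("---")  # a protein gap becomes a codon-sized gap
        else:
            codons.append(cds[pos:pos + 3])
            pos += 3
    return codons


def codon_log_odds(pairs):
    """Score aligned codon pairs as log2(observed / expected).

    pairs: iterable of (codons_a, codons_b), two equal-length codon lists
    for species A and species B. Keys of the result are (codon_a, codon_b),
    so (x, y) and (y, x) are scored separately (assumed formula, see text).
    """
    joint, freq_a, freq_b = Counter(), Counter(), Counter()
    for codons_a, codons_b in pairs:
        for a, b in zip(codons_a, codons_b):
            if "-" in a or "-" in b or len(a) != 3 or len(b) != 3:
                continue  # skip gapped or incomplete columns
            joint[(a, b)] += 1
            freq_a[a] += 1
            freq_b[b] += 1
    total = sum(joint.values())
    return {
        (a, b): math.log2((n / total) / ((freq_a[a] / total) * (freq_b[b] / total)))
        for (a, b), n in joint.items()
    }


# Toy gene pair invented for illustration; not real Plasmodium data.
row_a = back_translate("MK-F", "ATGAAATTT")
row_b = back_translate("MKGF", "ATGAAGGGCTTC")
scores = codon_log_odds([(row_a, row_b)])
for (a, b), score in sorted(scores.items()):
    print(a, "->", b, round(score, 3))
```

Because the result is keyed by ordered pairs, the score for codon x in species A aligned to codon y in species B is independent of the score for y aligned to x, which is what allows the matrix to be asymmetric.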