Fish Comparative Genomics

From CoGepedia

Summary/Abstract

Introduction

  • Phylogeny of genomes with polyploidy events marked



Sample Datasets

To access sample datasets

Setting the hook: getting data loaded into CoGe

  • Loading new genomes into CoGe
    • Loading genomes into the Bio.ci (iPlant Data Store) for the first time
      • Sign up for an account or log in
      • Access the Data Store
      • Direct upload from your computer
        • Press the 'Upload' tab at the top left and select 'Simple upload from Desktop'
        • Find the genome FASTA file(s) and select them
        • Enjoy a picnic outside while they upload
      • Using FTP/HTTP links for uploading genomes in the Data Store
        • 1) Click the 'Upload' button at the top left of the screen
          • a) Select 'Import using URL'
          • b) Several blank fields appear where you can paste URLs for the Data Store to retrieve (in this case, genome FASTA/.fa files). Once the fields appear, open a new tab in your browser
        • 2) Copy the link location of the genome FASTA file from an FTP site or webpage
        • 3) Paste the URL into a blank field on the Data Store page
        • Enjoy some tea or coffee while they upload
    • Once the genomes have been uploaded into the Data Store
      • Select them and move them into the folder named 'coge_data'
      • The genome should now be visible in the CoGe platform
    • Direct upload


  • OR Use genomes already available in CoGe
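Before uploading a genome FASTA file as described above, it can save time to check locally that the file is well formed. The following is a minimal sketch, not part of CoGe or the Data Store: the function name `summarize_fasta` and the example records are assumptions made for illustration. It reports the number of sequences and the total length, and raises an error for common problems such as duplicate IDs or empty records.

```python
from typing import Dict, Iterable


def summarize_fasta(lines: Iterable[str]) -> Dict[str, int]:
    """Return sequence lengths keyed by record ID; raise ValueError on malformed input."""
    lengths: Dict[str, int] = {}
    current = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # tolerate blank lines
        if line.startswith(">"):
            # The record ID is the first whitespace-delimited token of the header.
            tokens = line[1:].split()
            if not tokens:
                raise ValueError("header line with no sequence ID")
            current = tokens[0]
            if current in lengths:
                raise ValueError(f"duplicate sequence ID: {current}")
            lengths[current] = 0
        else:
            if current is None:
                raise ValueError("sequence data found before the first header")
            lengths[current] += len(line)
    if not lengths:
        raise ValueError("no FASTA records found")
    empty = [name for name, n in lengths.items() if n == 0]
    if empty:
        raise ValueError(f"records with no sequence: {empty}")
    return lengths


# Hypothetical two-record example standing in for a real genome file.
example = """>chr1 example scaffold
ACGTACGTNN
ACGT
>chr2
GGCC
""".splitlines()

summary = summarize_fasta(example)
print(len(summary), "sequences,", sum(summary.values()), "bp total")
```

For a real file, an open file handle can be passed in place of the example lines, for instance `summarize_fasta(open("genome.fa"))`, where `genome.fa` is a placeholder file name.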



Sequenced fish genomes

Table of fish genomes publicly available in CoGe. Bold denotes species with annotations.
| Class | Species | Common name | Genome ID | Species notes | CoGe Genome Links |
|---|---|---|---|---|---|
| Hyperoartia (jawless fish) | Lethenteron camtschaticum | Arctic lamprey | 16905 | One of few extant jawless fish | unmasked, masked |
| | Petromyzon marinus | Sea Lamprey | 287 | One of few extant jawless fish | unmasked, masked |
| Chondrichthyes (cartilaginous fish) | Callorhinchus milii | Australian Ghostshark | 689 | Proposed model cartilaginous fish | unmasked, masked |
| Sarcopterygii (lobe-fin fish) | Latimeria chalumnae | Coelacanth | 3262 | Oldest extant Sarcopterygii | unmasked, masked |
| Actinopterygii (ray-fin fish) | Anguilla japonica | Japanese Eel | 13349 | Ecological model, Industry species | unmasked, masked |
| | Anoplopoma fimbria | Sablefish | 12760 | Industrial species | unmasked, masked |
| | Astyanax mexicanus | Mexican tetra | 13073 | Two forms: seeing, and cave-dwelling blind | unmasked, masked |
| | Cynoglossus semilaevis | Tongue Sole | 11788 | Industrial species | unmasked, masked |
| | Cyprinodon variegatus | Sheepshead minnow | 13078 | Toxicology model | unmasked, masked |
| | Danio rerio | Zebrafish | 50 | Model species | unmasked, masked |
| | Dicentrarchus labrax | European Seabass | 2659 | Industrial species | unmasked, masked |
| | Esox lucius | Northern Pike | 22932 | Angling species | unmasked, masked |
| | Gadus morhua | Atlantic cod | 2661 | Industrial species | unmasked, masked |
| | Gasterosteus aculeatus | Three-spined stickleback | 146 | Model species | unmasked, masked |
| | Haplochromis burtoni | Burton’s mouthbrooder | 3328 | Ecological model species, Aquarium species | unmasked, masked |
| | Labeotropheus fuelleborni | Blue mbuna | 2638 | | unmasked, masked |
| | Lepisosteus oculatus | Spotted Gar | 10597 | Lineage before Teleost whole genome duplication | unmasked, masked |
| | Maylandia zebra | Zebra Mbuna | 2640 | Ecological model species, aquarium fish | unmasked, masked |
| | Mchenga conophoros | | 2585 | | unmasked, masked |
| | Melanochromis auratus | Auratus Cichlid | 2639 | | unmasked, masked |
| | Neolamprologus brichardi | Princess Cichlid | 3329 | Aquarium fish | unmasked, masked |
| | Nothobranchius furzeri | Turquoise Killifish | 2642 | Short life span model species, metabolic diapause | unmasked, masked |
| | Nothobranchius kuhntae | Beira Killifish | 2643 | Short life model species | unmasked, masked |
| | Oncorhynchus mykiss | Rainbow Trout | Genoscope Salmon Database | Angling and industrial species | unmasked, masked |
| | Oreochromis niloticus | Nile Tilapia | 197 | Industrial species | unmasked, masked |
| | Oryzias latipes | Japanese Medaka | 542 | Model fish, Aquarium species | unmasked, masked |
| | Pimephales promelas | Fathead Minnow | 13167 | Industrial baitfish | unmasked, masked |
| | Poecilia formosa | Amazon Molly | 13072 | Gynogenesis: all female populations | unmasked, masked |
| | Poecilia reticulata | Guppy | 23338 | Model species, Aquarium fish | unmasked, masked |
| | Pundamilia nyererei | | 3330 | | unmasked, masked |
| | Salmo salar | Atlantic salmon | NCBI Bioproject | Angling species, Industrial species | unmasked, masked |
| | Sebastes aleutianus | Rougheye Rockfish | NCBI Bioproject | Angling species, Industrial species, Does not age/senesce | soft masked |
| | Sebastes nigrocinctus | Tiger Rockfish | 14568 | Angling species, long lived model, live bearer | masked |
| | Sebastes rubrivinctus | Flag Rockfish | 11458 | Angling species, Industrial species | unmasked, masked |
| | Stegastes partitus | Bicolor damselfish | 13077 | Medical model species | unmasked, masked |
| | Takifugu flavidus | Sansaifugu | 14185 | | unmasked, masked |
| | Takifugu rubripes | Torafugu | 63 | Shortest vertebrate genome known | unmasked, masked |
| | Tetraodon nigroviridis | Spotted Green Pufferfish | 191 | Model species, low amount of repetitive sequence | unmasked, masked |
| | Thunnus orientalis | Pacific Bluefin Tuna | 13314 | Industrial species | unmasked, masked |
| | Xiphophorus maculatus | Southern Platyfish | 10764 | Live bearer | unmasked, masked |



Links to CoGe Notebooks containing fish genomes

All annotated unmasked fish genomes available in CoGe

All unmasked fish genomes available in CoGe

Glossary

Casting the Line: analyses for comparing genomes

Whole genome syntenic analysis

  • SynMap (tutorial)
    • Identify Whole Genome Duplications
      • Example 1: Whole genome synteny analysis of Oncorhynchus mykiss (rainbow trout) compared to Takifugu rubripes. In this analysis, the synonymous mutation rate (Ks) was calculated for each syntenic gene pair (represented as a dot in the dotplot) to estimate its relative age. Blue/green dots indicate a more recent whole genome duplication, whereas red/orange dots are noise in the dataset. The histogram below the dotplot shows the distribution of the synonymous mutation rates.
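The Ks histogram described in Example 1 can be approximated outside SynMap once Ks values for syntenic gene pairs are available. The sketch below is illustrative only: the function name `ks_histogram`, the bin width, the cutoff and the Ks values are all assumptions, not SynMap defaults or real trout/fugu data. It bins Ks values into fixed-width bins and reports the most populated bin, which is where a peak from a more recent duplication would be expected to show up.

```python
import math
from collections import Counter
from typing import List, Tuple


def ks_histogram(
    ks_values: List[float], bin_width: float = 0.25, max_ks: float = 2.0
) -> List[Tuple[float, int]]:
    """Count Ks values in fixed-width bins; values outside [0, max_ks) are dropped."""
    if bin_width <= 0:
        raise ValueError("bin_width must be positive")
    counts: Counter = Counter()
    for ks in ks_values:
        if 0 <= ks < max_ks:
            counts[math.floor(ks / bin_width)] += 1
    n_bins = math.ceil(max_ks / bin_width)
    # Each entry is (lower edge of the bin, number of gene pairs in the bin).
    return [(i * bin_width, counts.get(i, 0)) for i in range(n_bins)]


# Hypothetical Ks values for syntenic gene pairs (not real data).
ks_values = [0.12, 0.15, 0.18, 0.20, 0.22, 0.30, 0.95, 1.40, 1.85, 2.60]

hist = ks_histogram(ks_values)
for start, count in hist:
    print(f"{start:4.2f} {'#' * count}")

# The most populated bin marks the dominant Ks peak in this toy dataset.
peak_start, peak_count = max(hist, key=lambda item: item[1])
print("peak bin starts at", peak_start, "with", peak_count, "pairs")
```

High Ks values are dropped here by a simple cutoff (`max_ks`); this is a design choice for the sketch, made because very large Ks estimates tend to be saturated and unreliable.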


Microsyntenic analysis

  • GEvo
    • Validate microsynteny
      • Example 1: Microsynteny analysis using GEvo to compare T. rubripes to O. mykiss. This analysis shows evidence of a whole genome duplication in O. mykiss (the Salmonid WGD) when compared to another Teleost fish that has had no WGD since the Teleost WGD.
    • Identify Conserved non-coding sequences (regulatory function)
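The reasoning behind Example 1 is that, after a lineage-specific whole genome duplication, one region in the unduplicated genome should match two regions in the duplicated genome. The sketch below illustrates that counting step under stated assumptions: the function name `syntenic_depth` and the region labels are hypothetical and are not GEvo output or real CoGe identifiers.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple


def syntenic_depth(pairs: Iterable[Tuple[str, str]]) -> Dict[str, int]:
    """Map each reference region to the number of distinct query regions syntenic to it."""
    partners: Dict[str, Set[str]] = defaultdict(set)
    for ref_region, query_region in pairs:
        partners[ref_region].add(query_region)
    return {ref: len(regions) for ref, regions in partners.items()}


# Hypothetical (reference region, query region) pairs; labels are placeholders.
pairs = [
    ("fugu_region_A", "trout_region_1"),
    ("fugu_region_A", "trout_region_2"),
    ("fugu_region_B", "trout_region_3"),
    ("fugu_region_B", "trout_region_4"),
    ("fugu_region_B", "trout_region_3"),  # repeated hit, counted once
    ("fugu_region_C", "trout_region_5"),
]

depth = syntenic_depth(pairs)
for region, n in sorted(depth.items()):
    print(region, n)
```

A depth of 2 for most reference regions would be consistent with a duplication in the query genome; regions with a depth of 1 may reflect loss of one duplicate copy or an incomplete assembly.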

Ortholog/paralog finding with synteny

  • SynFind
    • Identify orthologous regions across many species

Sinkers to Cast Further and Deeper: adding weight to genomes with additional data types

Adding new data (genomes, RNASeq, SNPs) to CoGe

Gene family analysis

  • CoGeBlast
    • Identify many gene family members within/across species
    • Extract sequences (nucleotide/protein)
    • Phylogenetic tree reconstruction using iPlant/iAnimal for multiple sequence alignment and tree building

Functional genomics

  • Adding/visualizing RNAseq data

Discussion/Conclusions