Fish Comparative Genomics: Difference between revisions

From CoGepedia
 
(75 intermediate revisions by 2 users not shown)
Line 14: Line 14:
<br>
<br>


== Sample Datasets ==
To access the sample datasets, use the following links:
*''Takifugu rubripes'' genome (http://de.iplantcollaborative.org/dl/d/05199A88-82ED-4D09-94AA-99CF3D8B64FE/Takifugu_rubripes_genome_NCBIv1.faa)
*''Takifugu rubripes'' genome annotation (http://de.iplantcollaborative.org/dl/d/989C12FD-D62C-48B5-8FB8-80706BCB028D/T_rubripes_annotation_GCF_000180615.1.gff.gz)
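For readers who prefer to fetch the sample files from a script rather than a browser, the following is a minimal sketch using only the Python standard library. The two URLs are copied from the list above; the helper names `filename_from_url` and `download` are hypothetical and not part of CoGe or the Data Store, and the sketch assumes the links are still served over plain HTTP.

```python
import os
import posixpath
import shutil
import urllib.request
from urllib.parse import urlparse

# Sample dataset URLs copied from the list above.
SAMPLE_URLS = [
    "http://de.iplantcollaborative.org/dl/d/05199A88-82ED-4D09-94AA-99CF3D8B64FE/Takifugu_rubripes_genome_NCBIv1.faa",
    "http://de.iplantcollaborative.org/dl/d/989C12FD-D62C-48B5-8FB8-80706BCB028D/T_rubripes_annotation_GCF_000180615.1.gff.gz",
]


def filename_from_url(url):
    """Return the last path component of a URL, used as the local file name."""
    return posixpath.basename(urlparse(url).path)


def download(url, dest_dir="."):
    """Stream one URL into dest_dir and return the local path (hypothetical helper)."""
    dest = os.path.join(dest_dir, filename_from_url(url))
    with urllib.request.urlopen(url) as response, open(dest, "wb") as out:
        shutil.copyfileobj(response, out)  # copy in chunks, not all at once
    return dest
```

Calling `download(url)` for each entry in `SAMPLE_URLS` would save both files to the current directory; streaming with `shutil.copyfileobj` avoids holding a whole genome file in memory.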


== Setting the hook: getting data loaded into CoGe  ==
== Setting the hook: getting data loaded into CoGe  ==


*Loading new genomes into CoGe  
*Loading new genomes into CoGe  
**Loading from the iPlant Data Store When loading a new genome
**Loading genomes into the Bio.ci (iPlant Data Store) for the first time
**Using FTP/HTTP links for uploading genomes  
***[http://user.iplantcollaborative.org/ Sign up for an account or log in]
***Access the [https://de.iplantcollaborative.org/de/ Data Store]
***Direct upload from your computer
****Press the upload tab on the top left and select 'Simple upload from Desktop'
****Find the genome FASTA file(s) and select them
****Enjoy a picnic outside while they upload
***Using FTP/HTTP links for uploading genomes in the Data Store
****1) In the top left of the screen click on the 'Upload' button
*****a) Select 'Import using URL'
*****b) This opens several blank fields where you can paste URLs for the Data Store to retrieve (in this case, genome FASTA/.fa files). Once the blank fields appear, open a new tab in your browser
****2) Copy the link location of the genome FASTA file from an FTP site or webpage
****3) Paste the URL into a blank field on the Data Store page
****Enjoy some tea or coffee while they upload
**Once the genomes have been uploaded into the Data Store
***Select them and move them into the folder named 'coge_data'
***The genome should now be visible in the CoGe platform
 
**Direct upload  
**Direct upload  
**NCBI Loader <br>


<br>  
**[[CoGe NCBI Loader]] <br>
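Before uploading a genome by any of the routes above, it can help to confirm that the FASTA file is well formed. The following is a minimal sketch, not a CoGe tool: `summarize_fasta` is a hypothetical helper that counts records and residues and flags records with no sequence. It makes no assumption about what the CoGe loader itself checks.

```python
import io


def summarize_fasta(handle):
    """Return (record_count, total_residues, problems) for a FASTA text stream."""
    count = 0
    total = 0
    problems = []
    current = None      # header of the record currently being read
    current_len = 0     # residues seen so far for that record
    for line_no, raw in enumerate(handle, start=1):
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Starting a new record: check that the previous one had sequence.
            if current is not None and current_len == 0:
                problems.append(f"empty record: {current}")
            current = line[1:]
            current_len = 0
            count += 1
        elif current is None:
            problems.append(f"line {line_no}: sequence data before any header")
        else:
            current_len += len(line)
            total += len(line)
    if current is not None and current_len == 0:
        problems.append(f"empty record: {current}")
    return count, total, problems


# Tiny in-memory example; for a real file use open("genome.fa") instead.
demo = io.StringIO(">chr1\nACGT\nAC\n>chr2\n>chr3\nGG\n")
print(summarize_fasta(demo))
```

The function reads line by line, so it should work on large genome files without loading them fully into memory.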


*Table of fish genomes publicly available in CoGe


{| width="1075" cellspacing="1" cellpadding="1" border="1" height="1387"
|-
| '''Class''' || '''Order''' || '''Genus species''' || '''Common name''' || '''Notes''' || '''CoGe Genome Links'''
|-
| Cephalaspidomorphi || Petromyzontiformes || Lethenteron camtschaticum || Arctic lamprey || One of few extant jawless fish ||
|-
| || || Petromyzon marinus || Sea lamprey || One of few extant jawless fish ||
|-
| Chondrichthyes || Chimaeriformes || Callorhinchus milii || Australian ghostshark || Proposed model cartilaginous fish ||
|-
| Sarcopterygii || Coelacanthiformes || Latimeria chalumnae || Coelacanth || Oldest known extant Sarcopterygii ||
|-
| Actinopterygii || Anguilliformes || Anguilla japonica || Japanese eel || Ecological model, Industry species ||
|-
| || Beloniformes || Oryzias latipes || Medaka || Model fish, Japanese pet fish ||
|-
| || Characiformes || Astyanax mexicanus || Mexican tetra || Two forms: seeing, and cave-dwelling blind ||
|-
| || Cichliformes || Haplochromis burtoni || Burton’s mouthbrooder || Ecological model species, Aquarium species ||
|-
| || || Labeotropheus fuelleborni || Blue mbuna || ||
|-
| || || Maylandia zebra || Zebra mbuna || Ecological model species, aquarium fish ||
|-
| || || Mchenga conophoros || || ||
|-
| || || Melanochromis auratus || Auratus cichlid || Phenotypic and ecological diversity ||
|-
| || || Neolamprologus brichardi || Princess cichlid || Aquarium fish ||
|-
| || || Oreochromis niloticus || Nile tilapia || Industrial species ||
|-
| || || Pundamilia nyererei || || ||
|-
| || Cypriniformes || Danio rerio || Zebrafish || Model species ||
|-
| || || Pimephales promelas || Fathead minnow || Industrial baitfish ||
|-
| || Cyprinodontiformes || Cyprinodon variegatus || Sheepshead minnow || Toxicology model ||
|-
| || || Nothobranchius furzeri || Turquoise killifish || Short life span model species, metabolic diapause ||
|-
| || || Nothobranchius kuhntae || Beira killifish || Short life model species ||
|-
| || || Poecilia formosa || Amazon molly || Gynogenesis: all female populations ||
|-
| || || Poecilia reticulata || Guppy || Model species, aquarium fish ||
|-
| || || Xiphophorus maculatus || Southern platyfish || Gives live birth ||
|-
| || Esociformes || Esox lucius || Northern pike || Angling species ||
|-
| || Gadiformes || Gadus morhua || Atlantic cod || Industrial species ||
|-
| || Perciformes || Anoplopoma fimbria || Sablefish || Industrial species ||
|-
| || || Dicentrarchus labrax || European seabass || Industrial species ||
|-
| || || Gasterosteus aculeatus || Three-spined stickleback || Model species ||
|-
| || || Sebastes nigrocinctus || Tiger rockfish || Angling species, long lived model, live bearer ||
|-
| || || Sebastes rubrivinctus || Flag rockfish || Angling species, industrial species ||
|-
| || || Stegastes partitus || Bicolor damselfish || Medical model species ||
|-
| || Pleuronectiformes || Cynoglossus semilaevis || Tongue sole || Industrial species ||
|-
| || Salmoniformes || Oncorhynchus mykiss || Rainbow trout || Angling species, Industrial species ||
|-
| || Semionotiformes || Lepisosteus oculatus || Spotted gar || ||
|-
| || Scombriformes || Thunnus orientalis || Pacific bluefin tuna || Industrial species ||
|-
| || Tetraodontiformes || Takifugu flavidus || Sansaifugu || ||
|-
| || || Tetraodon nigroviridis || Spotted green pufferfish || Model species, low amount of repetitive sequence ||
|-
| || || Takifugu rubripes || ‘Fugu’ pufferfish || Shortest vertebrate genome ||
|} <br>
*'''''OR''''' Use genomes already available in CoGe
**Select your favorite fish genome from the table below
**Or [https://genomevolution.org/CoGe/OrganismView.pl search CoGe] for other organisms using [[OrganismView]]


<br>  
<br>  


<br>  
<br>  
===Sequenced fish genomes===
{| width="1203" cellspacing="1" cellpadding="3" border="1"
|+ Table of fish genomes publicly available in CoGe. Bold denotes species with annotations.
|-
! scope="col" | '''Class'''
! scope="col" | '''Species'''
! scope="col" | '''Common name'''
! scope="col" | '''Genome ID'''
! scope="col" | '''Species notes'''
! scope="col" | '''CoGe Genome Links'''
|-
! rowspan="2" scope="col" |
Hyperoartia
(jawless fish)
| ''Lethenteron camtschaticum''
| Arctic lamprey
| [http://www.ncbi.nlm.nih.gov/genome/16905 16905]
| One of few extant jawless fish
| [https://genomevolution.org/CoGe/GenomeInfo.pl?gid=24836 unmasked] [https://genomevolution.org/CoGe/GenomeInfo.pl?gid=24963 masked]
|-
| '''''Petromyzon marinus'''''
| Sea Lamprey
| [http://www.ncbi.nlm.nih.gov/genome/287 287]
| One of few extant jawless fish
| [https://genomevolution.org/CoGe/GenomeInfo.pl?gid=12390 unmasked] [https://genomevolution.org/CoGe/GenomeInfo.pl?gid=24937 masked]
|-
! scope="col" | Chondrichthyes
(cartilaginous fish)
| '''''Callorhinchus milii'''''
| Australian Ghostshark
| [http://www.ncbi.nlm.nih.gov/genome/689 689]
| Proposed model cartilaginous fish
| [https://genomevolution.org/CoGe/GenomeInfo.pl?gid=25133 unmasked] [https://genomevolution.org/CoGe/GenomeInfo.pl?gid=25135 masked]
|-
! scope="col" | Sarcopterygii
(lobe-fin fish)
| '''''Latimeria chalumnae'''''
| Coelacanth
| [http://www.ncbi.nlm.nih.gov/genome/3262 3262]
| Oldest extant Sarcopterygii
| [https://genomevolution.org/CoGe/GenomeInfo.pl?gid=25005 unmasked] [https://genomevolution.org/CoGe/GenomeInfo.pl?gid=25041 masked]
|-
! rowspan="34" scope="col" |
Actinopterygii
 
(ray-fin fish)
 
| ''Anguilla japonica''
| Japanese Eel
| [http://www.ncbi.nlm.nih.gov/genome/13349 13349]
| Ecological model, Industry species
| [https://genomevolution.org/CoGe/GenomeInfo.pl?gid=24876 unmasked] [https://genomevolution.org/CoGe/GenomeInfo.pl?gid=24956 masked]
|-
| ''Anoplopoma fimbria''
| Sablefish
| [http://www.ncbi.nlm.nih.gov/genome/12760 12760]
| Industrial species
| [https://genomevolution.org/CoGe/GenomeInfo.pl?gid=24878 unmasked] [https://genomevolution.org/CoGe/GenomeInfo.pl?gid=25180 masked]
|-
| '''''Astyanax mexicanus'''''
| Mexican tetra
| [http://www.ncbi.nlm.nih.gov/genome/13073 13073]
| Two forms: seeing, and cave-dwelling blind
| [https://genomevolution.org/CoGe/GenomeInfo.pl?gid=25134 unmasked] [https://genomevolution.org/CoGe/GenomeInfo.pl?gid=25137 masked]
|-
| '''''Cynoglossus semilaevis'''''
| Tongue Sole
| [http://www.ncbi.nlm.nih.gov/genome/11788 11788]
| Industrial species
| [https://genomevolution.org/CoGe/GenomeInfo.pl?gid=25000 unmasked] [https://genomevolution.org/CoGe/GenomeInfo.pl?gid=25036 masked]
|-
| ''Cyprinodon variegatus''
| Sheepshead minnow
| [http://www.ncbi.nlm.nih.gov/genome/13078 13078]
| Toxicology model
| [https://genomevolution.org/CoGe/GenomeInfo.pl?gid=24911 unmasked] [https://genomevolution.org/CoGe/GenomeInfo.pl?gid=24958 masked]
|-
| '''''Danio rerio'''''
| Zebrafish
| [http://www.ncbi.nlm.nih.gov/genome/50 50]
| Model species
| [https://genomevolution.org/CoGe/GenomeInfo.pl?gid=25001 unmasked] [https://genomevolution.org/CoGe/GenomeInfo.pl?gid=25037 masked]
|-
| ''Dicentrarchus labrax''
| European Seabass
| [http://www.ncbi.nlm.nih.gov/genome/2659 2659]
| Industrial species
| [https://genomevolution.org/CoGe/GenomeInfo.pl?gid=24873 unmasked] [https://genomevolution.org/CoGe/GenomeInfo.pl?gid=25159 masked]
|-
| '''''Esox lucius'''''
| Northern Pike
| [http://www.ncbi.nlm.nih.gov/genome/22932 22932]
| Angling species
| [https://genomevolution.org/CoGe/GenomeInfo.pl?gid=25161 unmasked] [https://genomevolution.org/CoGe/GenomeInfo.pl?gid=25181 masked]
|-
| ''Gadus morhua''
| Atlantic cod
| [http://www.ncbi.nlm.nih.gov/genome/2661 2661]
| Industrial species
| [https://genomevolution.org/CoGe/GenomeInfo.pl?gid=14730 unmasked] [https://genomevolution.org/CoGe/GenomeInfo.pl?gid=24957 masked]
|-
| '''''Gasterosteus aculeatus'''''
| Three-spined stickleback
| [http://www.ncbi.nlm.nih.gov/genome/146 146]
| Model species
| [https://genomevolution.org/CoGe/GenomeInfo.pl?gid=23854 unmasked] [https://genomevolution.org/CoGe/GenomeInfo.pl?gid=25191 masked]
|-
| '''''Haplochromis burtoni'''''
| Burton’s mouthbrooder
| [http://www.ncbi.nlm.nih.gov/genome/3328 3328]
| Ecological model species, Aquarium species
| [https://genomevolution.org/CoGe/GenomeInfo.pl?gid=25002 unmasked] [https://genomevolution.org/CoGe/GenomeInfo.pl?gid=25062 masked]
|-
| ''Labeotropheus fuelleborni''
| Blue mbuna
| [http://www.ncbi.nlm.nih.gov/genome/2638 2638]
| <br>
| [https://genomevolution.org/CoGe/GenomeInfo.pl?gid=24912 unmasked] [https://genomevolution.org/CoGe/GenomeInfo.pl?gid=24961 masked]
|-
| '''''Lepisosteus oculatus'''''
| Spotted Gar
| [http://www.ncbi.nlm.nih.gov/genome/10597 10597]
| Lineage before Teleost whole genome duplication
| [https://genomevolution.org/CoGe/GenomeInfo.pl?gid=25003 unmasked] [https://genomevolution.org/CoGe/GenomeInfo.pl?gid=25032 masked]
|-
| '''''Maylandia zebra'''''
| Zebra Mbuna
| [http://www.ncbi.nlm.nih.gov/genome/2640 2640]
| Ecological model species, aquarium fish
| [https://genomevolution.org/CoGe/GenomeInfo.pl?gid=25004 unmasked] [https://genomevolution.org/CoGe/GenomeInfo.pl?gid=25046 masked]
|-
| ''Mchenga conophoros''
| <br>
| [http://www.ncbi.nlm.nih.gov/genome/2585 2585]
| <br>
| [https://genomevolution.org/CoGe/GenomeInfo.pl?gid=24914 unmasked] [https://genomevolution.org/CoGe/GenomeInfo.pl?gid=24919 masked]
|-
| ''Melanochromis auratus''
| Auratus Cichlid
| [http://www.ncbi.nlm.nih.gov/genome/2639 2639]
| <br>
| [https://genomevolution.org/CoGe/GenomeInfo.pl?gid=24913 unmasked] [https://genomevolution.org/CoGe/GenomeInfo.pl?gid=24964 masked]
|-
| '''''Neolamprologus brichardi'''''
| Princess Cichlid
| [http://www.ncbi.nlm.nih.gov/genome/3329 3329]
| Aquarium fish
| [https://genomevolution.org/CoGe/GenomeInfo.pl?gid=25006 unmasked] [https://genomevolution.org/CoGe/GenomeInfo.pl?gid=25044 masked]
|-
| ''Nothobranchius furzeri''
| Turquoise Killifish
| [http://www.ncbi.nlm.nih.gov/genome/2642 2642]
| Short life span model species, metabolic diapause
| [https://genomevolution.org/CoGe/GenomeInfo.pl?gid=15571 unmasked] [https://genomevolution.org/CoGe/GenomeInfo.pl?gid=24920 masked]
|-
| ''Nothobranchius kuhntae''
| Beira Killifish
| [http://www.ncbi.nlm.nih.gov/genome/2643 2643]
| Short life model species
| [https://genomevolution.org/CoGe/GenomeInfo.pl?gid=15184 unmasked] [https://genomevolution.org/CoGe/GenomeInfo.pl?gid=24921 masked]
|-
| ''Oncorhynchus mykiss''
| Rainbow Trout
| [http://www.genoscope.cns.fr/trout-ggb/data/ Genoscope Salmon Database]
| Angling and industrial species
| [https://genomevolution.org/CoGe/GenomeInfo.pl?gid=25073 unmasked] [https://genomevolution.org/coge/GenomeInfo.pl?gid=25267 masked]
|-
| '''''Oreochromis niloticus'''''
| Nile Tilapia
| [http://www.ncbi.nlm.nih.gov/genome/197 197]
| Industrial species
| [https://genomevolution.org/CoGe/GenomeInfo.pl?gid=25007 unmasked] [https://genomevolution.org/CoGe/GenomeInfo.pl?gid=25033 masked]
|-
| '''''Oryzias latipes'''''
| Japanese Medaka
| [http://www.ncbi.nlm.nih.gov/genome/542 542]
| Model fish, Aquarium species
| [https://genomevolution.org/CoGe/GenomeInfo.pl?gid=25008 unmasked] [https://genomevolution.org/CoGe/GenomeInfo.pl?gid=25031 masked]
|-
| ''Pimephales promelas''
| Fathead Minnow
| [http://www.ncbi.nlm.nih.gov/genome/13167 13167]
| Industrial baitfish
| [https://genomevolution.org/CoGe/GenomeInfo.pl?gid=24916 unmasked] [https://genomevolution.org/CoGe/GenomeInfo.pl?gid=24947 masked]
|-
| '''''Poecilia formosa'''''
| Amazon Molly
| [http://www.ncbi.nlm.nih.gov/genome/13072 13072]
| Gynogenesis: all female populations
| [https://genomevolution.org/CoGe/GenomeInfo.pl?gid=25009 unmasked] [https://genomevolution.org/CoGe/GenomeInfo.pl?gid=25063 masked]
|-
| '''''Poecilia reticulata'''''
| Guppy
| [http://www.ncbi.nlm.nih.gov/genome/23338 23338]
| Model species, Aquarium fish
| [https://genomevolution.org/CoGe/GenomeInfo.pl?gid=25010 unmasked] [https://genomevolution.org/CoGe/GenomeInfo.pl?gid=25047 masked]
|-
| '''''Pundamilia nyererei'''''
| <br>
| [http://www.ncbi.nlm.nih.gov/genome/3330 3330]
| <br>
| [https://genomevolution.org/CoGe/GenomeInfo.pl?gid=25011 unmasked] [https://genomevolution.org/CoGe/GenomeInfo.pl?gid=25055 masked]
|-
| '''''Salmo salar'''''
| Atlantic salmon
| NCBI Bioproject
| Angling species, Industrial species
| [https://genomevolution.org/coge/GenomeInfo.pl?gid=28938 unmasked] [https://genomevolution.org/coge/GenomeInfo.pl?gid=29012 masked]
|-
| '''''Sebastes aleutianus'''''
| Rougheye Rockfish
| [http://www.ncbi.nlm.nih.gov/bioproject?term=txid214485 NCBI Bioproject]
| Angling species, Industrial species, Does not age/senesce
| [https://genomevolution.org/CoGe/GenomeInfo.pl?gid=24053 soft masked]
|-
| '''''Sebastes nigrocinctus'''''
| Tiger Rockfish
| [http://www.ncbi.nlm.nih.gov/genome/14568 14568]
| Angling species, long lived model, live bearer
| [https://genomevolution.org/CoGe/GenomeInfo.pl?gid=24936 masked]
|-
| ''Sebastes rubrivinctus''
| Flag Rockfish
| [http://www.ncbi.nlm.nih.gov/genome/11458 11458]
| Angling species, Industrial species
| [https://genomevolution.org/CoGe/GenomeInfo.pl?gid=24886 unmasked] [https://genomevolution.org/CoGe/GenomeInfo.pl?gid=25162 masked]
|-
| '''''Stegastes partitus'''''
| Bicolor damselfish
| [http://www.ncbi.nlm.nih.gov/genome/13077 13077]
| Medical model species
| [https://genomevolution.org/CoGe/GenomeInfo.pl?gid=25012 unmasked] [https://genomevolution.org/CoGe/GenomeInfo.pl?gid=25054 masked]
|-
| ''Takifugu flavidus''
| Sansaifugu
| [http://www.ncbi.nlm.nih.gov/genome/14185 14185]
| <br>
| [https://genomevolution.org/CoGe/GenomeInfo.pl?gid=24910 unmasked] [https://genomevolution.org/CoGe/GenomeInfo.pl?gid=24924 masked]
|-
| '''''Takifugu rubripes'''''
| Torafugu
| [http://www.ncbi.nlm.nih.gov/genome/63 63]
| Shortest vertebrate genome known
| [https://genomevolution.org/CoGe/GenomeInfo.pl?gid=25013 unmasked] [https://genomevolution.org/CoGe/GenomeInfo.pl?gid=25052 masked]
|-
| '''''Tetraodon nigroviridis'''''
| Spotted Green Pufferfish
| [http://www.ncbi.nlm.nih.gov/genome/191 191]
| Model species, low amount of repetitive sequence
| [https://genomevolution.org/CoGe/GenomeInfo.pl?gid=23254 unmasked] [https://genomevolution.org/CoGe/GenomeInfo.pl?gid=24926 masked]
|-
| ''Thunnus orientalis''
| Pacific Bluefin Tuna
| [http://www.ncbi.nlm.nih.gov/genome/13314 13314]
| Industrial species
| [https://genomevolution.org/CoGe/GenomeInfo.pl?gid=24952 unmasked] [https://genomevolution.org/CoGe/GenomeInfo.pl?gid=24968 masked]
|-
| <br>
| '''''Xiphophorus maculatus'''''
| Southern Platyfish
| [http://www.ncbi.nlm.nih.gov/genome/10764 10764]
| Live bearer
| [https://genomevolution.org/CoGe/GenomeInfo.pl?gid=25014 unmasked] [https://genomevolution.org/CoGe/GenomeInfo.pl?gid=25053 masked]
|}
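Every CoGe link in the table above follows the same pattern: the GenomeInfo page address plus a `gid` query parameter. As a small illustration, the sketch below rebuilds such a link from a genome ID. The function name `genome_info_url` is hypothetical, and the two IDs shown are the ''Takifugu rubripes'' values taken from the table.

```python
from urllib.parse import urlencode

# Base address used by the GenomeInfo links in the table above.
GENOME_INFO_BASE = "https://genomevolution.org/CoGe/GenomeInfo.pl"


def genome_info_url(gid):
    """Build a GenomeInfo link for a CoGe genome ID (hypothetical helper)."""
    return f"{GENOME_INFO_BASE}?{urlencode({'gid': int(gid)})}"


# Genome IDs for Takifugu rubripes, copied from the table.
FUGU_GIDS = {"unmasked": 25013, "masked": 25052}
fugu_links = {label: genome_info_url(gid) for label, gid in FUGU_GIDS.items()}

for label, link in fugu_links.items():
    print(label, link)
```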
 
<br><br>


*Links to CoGe [[Notebooks]] containing these genomes<br>
== Links to CoGe [[Notebooks]] containing fish genomes ==
**[https://genomevolution.org/CoGe/NotebookView.pl?nid=890 All unmasked fish genomes available in CoGe]<br>  
**[https://genomevolution.org/CoGe/NotebookView.pl?nid=890 All unmasked fish genomes available in CoGe]<br>  
**[ All masked fish genomes available in CoGe]  
**[ All masked fish genomes available in CoGe]  
*Link to [[GenomeList]] for notebooks  
*Link to [[GenomeList]] for notebooks  
**[ GenomeList for all fish genomes in CoGe]
**[ GenomeList for all fish genomes in CoGe]
[https://genomevolution.org/r/egv4 All annotated unmasked fish genomes available in CoGe]
[https://genomevolution.org/r/egv5 All unmasked fish genomes available in CoGe]


== Glossary ==
== Glossary ==
Line 51: Line 345:
*[[In-Paralog]]
*[[In-Paralog]]


== Analyses ==
== Casting the Line: analyses for comparing genomes ==


=== Whole genome syntenic analysis ===
=== Whole genome syntenic analysis ===


*[[SynMap]]  
*[[SynMap]] [https://genomevolution.org/wiki/index.php/Maize_Sorghum_Syntenic_dotplot (tutorial)]
**Identify Whole Genome Duplications  
**Identify Whole Genome Duplications  
***'''Example 1:''' Whole genome synteny analysis of [https://genomevolution.org/r/eola ''Oncorhynchus mykiss'' (rainbow trout) compared to ''Takifugu rubripes'']. In this analysis, the [[synonymous mutation]] rate (Ks) has been calculated to estimate the relative age of each syntenic gene pair (represented as a dot) in the dotplot. Blue/green dots indicate gene pairs from a more recent whole genome duplication, whereas red/orange dots are noise in the dataset. The histogram below the dotplot shows the distribution of the synonymous mutation rates.
**Identify Synteny  
**Identify Synteny  
**Synonymous/nonsynonymous gene pair evolution
**Synonymous/nonsynonymous gene pair evolution
***[https://genomevolution.org/r/eq6b Non-synonymous mutation rates for syntenic gene pairs between ''O. mykiss'' and ''T. rubripes'']
***[https://genomevolution.org/r/eq69 Ka/Ks rates for syntenic gene pairs between ''O. mykiss'' and ''T. rubripes'']
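The Ks, Ka, and Ka/Ks values mentioned above can also be explored outside the dotplot. The following is a minimal sketch, not SynMap's own code, that assumes you already have per-gene-pair Ka and Ks values as plain numbers. It computes the Ka/Ks ratio, labels it with the usual interpretation (below 1 suggests purifying selection, above 1 suggests positive selection), and bins Ks values for a histogram. The function names, the `tolerance` around 1, and the choice of log10 bins are all assumptions of this sketch.

```python
import math
from collections import Counter


def ka_ks_ratio(ka, ks):
    """Return Ka/Ks, or None when Ks is not positive (ratio undefined)."""
    if ks <= 0:
        return None
    return ka / ks


def classify_selection(ratio, tolerance=0.1):
    """Label a Ka/Ks ratio; the tolerance around 1 is an arbitrary choice."""
    if ratio is None:
        return "undefined"
    if ratio < 1 - tolerance:
        return "purifying"
    if ratio > 1 + tolerance:
        return "positive"
    return "near neutral"


def log10_ks_histogram(ks_values, bin_width=0.5):
    """Count Ks values per log10 bin; non-positive Ks values are skipped."""
    bins = Counter()
    for ks in ks_values:
        if ks > 0:
            # Lower edge of the bin that contains log10(ks).
            bins[math.floor(math.log10(ks) / bin_width) * bin_width] += 1
    return dict(sorted(bins.items()))


# Made-up (Ka, Ks) pairs for illustration only, not real CoGe output.
pairs = [(0.02, 0.1), (0.3, 0.1), (0.1, 0.1), (0.1, 0.0)]
for ka, ks in pairs:
    print(ka, ks, classify_selection(ka_ks_ratio(ka, ks)))
```

Gene pairs from one duplication event tend to share similar Ks values, so they fall into the same few bins and appear as a peak in the histogram.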


=== Microsyntenic analysis ===
=== Microsyntenic analysis ===
Line 64: Line 363:
*[[GEvo]]  
*[[GEvo]]  
**Validate microsynteny  
**Validate microsynteny  
***'''Example 1:''' Microsynteny analysis using GEvo to compare [https://genomevolution.org/r/eoln ''T. rubripes'' to ''O. mykiss'']. This analysis shows evidence of a whole genome duplication in ''O. mykiss'' (the Salmonid WGD) when compared to another Teleost fish that has had no further WGD since the Teleost WGD.
**Identify Conserved non-coding sequences (regulatory function)
**Identify Conserved non-coding sequences (regulatory function)


Line 70: Line 371:
*[[SynFind]]  
*[[SynFind]]  
**Identify orthologous regions across many species
**Identify orthologous regions across many species
== Sinkers to Cast Further and Deeper: adding weight to genomes with additional data types ==
=== Adding new data (genomes, RNASeq, SNPs) to CoGe ===
*[[LoadGenome]]
*[[LoadExperiment]]
*Keeping data private and sharing with collaborators


=== Gene family analysis ===
=== Gene family analysis ===
Line 82: Line 391:
*Adding/visualizing RNAseq data
*Adding/visualizing RNAseq data


=== Adding new data (genomes, RNASeq, SNPs) to CoGe ===
== Discussion/Conclusions ==


*[[LoadGenome]]
*[[LoadExperiment]]
*Keeping data private and sharing with collaborators
== Discussion/Conclusions ==


<br>
<br>

Latest revision as of 18:26, 6 May 2016

Summary/Abstract

Introduction

  • Phylogeny of genomes with polyploidy events marked



Sample Datasets

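To access the sample datasets, use the following links:

  • Takifugu rubripes genome (http://de.iplantcollaborative.org/dl/d/05199A88-82ED-4D09-94AA-99CF3D8B64FE/Takifugu_rubripes_genome_NCBIv1.faa)
  • Takifugu rubripes genome annotation (http://de.iplantcollaborative.org/dl/d/989C12FD-D62C-48B5-8FB8-80706BCB028D/T_rubripes_annotation_GCF_000180615.1.gff.gz)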

Setting the hook: getting data loaded into CoGe

  • Loading new genomes into CoGe
    • Loading genomes into the Bio.ci (iPlant Data Store) for the first time
      • Sign up for an account or log in
      • Access the Data Store
      • Direct upload from your computer
        • Press the upload tab on the top left and select 'Simple upload from Desktop'
        • Find the genome FASTA file(s) and select them
        • Enjoy a picnic outside while they upload
      • Using FTP/HTTP links for uploading genomes in the Data Store
        • 1) In the top left of the screen click on the 'Upload' button
          • a) Select 'Import using URL'
          • b) This opens several blank fields where you can paste URLs for the Data Store to retrieve (in this case, genome FASTA/.fa files). Once the blank fields appear, open a new tab in your browser
        • 2) Copy the link location of the genome FASTA file from an FTP site or webpage
        • 3) Paste the URL into a blank field on the Data Store page
        • Enjoy some tea or coffee while they upload
    • Once the genomes have been uploaded into the Data Store
      • Select them and move them into the folder named 'coge_data'
      • The genome should now be visible in the CoGe platform
    • Direct upload


  • OR Use genomes already available in CoGe



Sequenced fish genomes

Table of fish genomes publicly available in CoGe. Bold denotes species with annotations.
Class Species Common name Genome ID Species notes CoGe Genome Links

Hyperoartia

(jawless fish)

Lethenteron camtschaticum Arctic lamprey 16905 One of few extant jawless fish unmasked masked
Petromyzon marinus Sea Lamprey 287 One of few extant jawless fish unmasked masked
Chondrichthyes

(cartilaginous fish)

Callorhinchus milii Australian Ghostshark 689 Proposed model cartilaginous fish unmasked masked
Sarcopterygii

(lobe-fin fish)

Latimeria chalumnae Coelacanth 3262 Oldest extant Sarcopterygii unmasked masked

Actinopterygii

(ray-fin fish)

Anguilla japonica Japanese Eel 13349 Ecological model, Industry species unmasked masked
Anoplopoma fimbria Sablefish 12760 Industrial species unmasked masked
Astyanax mexicanus Mexican tetra 13073 Two forms: seeing, and cave-dwelling blind unmasked masked
Cynoglossus semilaevis Tongue Sole 11788 Industrial species unmasked masked
Cyprinodon variegatus Sheepshead minnow 13078 Toxicology model unmasked masked
Danio rerio Zebrafish 50 Model species unmasked masked
Dicentrarchus labrax European Seabass 2659 Industrial species unmasked masked
Esox lucius Northern Pike 22932 Angling species unmasked masked
Gadus morhua Atlantic cod 2661 Industrial species unmasked masked
Gasterosteus aculeatus Three-spined stickleback 146 Model species unmasked masked
Haplochromis burtoni Burton’s mouthbrooder 3328 Ecological model species, Aquarium species unmasked masked
Labeotropheus fuelleborni Blue mbuna 2638 unmasked masked
Lepisosteus oculatus Spotted Gar 10597 Lineage before Teleost whole genome duplication unmasked masked
Maylandia zebra Zebra Mbuna 2640 Ecological model species, aquarium fish unmasked masked
Mchenga conophoros 2585 unmasked masked
Melanochromis auratus Auratus Cichlid 2639 unmasked masked
Neolamprologus brichardi Princess Cichlid 3329 Aquarium fish unmasked masked
Nothobranchius furzeri Turquoise Killifish 2642 Short life span model species, metabolic diapause unmasked masked
Nothobranchius kuhntae Beira Killifish 2643 Short life model species unmasked masked
Oncorhynchus mykiss Rainbow Trout Genoscope Salmon Database Angling and industrial species unmasked masked
Oreochromis niloticus Nile Tilapia 197 Industrial species unmasked masked
Oryzias latipes Japanese Medaka 542 Model fish, Aquarium species unmasked masked
Pimephales promelas Fathead Minnow 13167 Industrial baitfish unmasked masked
Poecilia formosa Amazon Molly 13072 Gynogenesis: all female populations unmasked masked
Poecilia reticulata Guppy 23338 Model species, Aquarium fish unmasked masked
Pundamilia nyererei 3330 unmasked masked
Salmo salar Atlantic salmon NCBI Bioproject Angling species, Industrial species unmasked masked
Sebastes aleutianus Rougheye Rockfish NCBI Bioproject Angling species, Industrial species, Does not age/senesce soft masked
Sebastes nigrocinctus Tiger Rockfish 14568 Angling species, long lived model, live bearer masked
Sebastes rubrivinctus Flag Rockfish 11458 Angling species, Industrial species unmasked masked
Stegastes partitus Bicolor damselfish 13077 Medical model species unmasked masked
Takifugu flavidus Sansaifugu 14185 unmasked masked
Takifugu rubripes Torafugu 63 Shortest vertebrate genome known unmasked masked
Tetraodon nigroviridis Spotted Green Pufferfish 191 Model species, low amount of repetitive sequence unmasked masked
Thunnus orientalis Pacific Bluefin Tuna 13314 Industrial species unmasked masked

Xiphophorus maculatus Southern Platyfish 10764 Live bearer unmasked masked



Links to CoGe Notebooks containing fish genomes

All annotated unmasked fish genomes available in CoGe

All unmasked fish genomes available in CoGe

Glossary

Casting the Line: analyses for comparing genomes

Whole genome syntenic analysis

  • SynMap (tutorial)
    • Identify Whole Genome Duplications
      • Example 1: Whole genome synteny analysis of Oncorhynchus mykiss (rainbow trout) compared to Takifugu rubripes. In this analysis, the synonymous mutation rate (Ks) has been calculated to estimate the relative age of each syntenic gene pair (represented as a dot) in the dotplot. Blue/green dots indicate gene pairs from a more recent whole genome duplication, whereas red/orange dots are noise in the dataset. The histogram below the dotplot shows the distribution of the synonymous mutation rates.


Microsyntenic analysis

  • GEvo
    • Validate microsynteny
      • Example 1: Microsynteny analysis using GEvo to compare T. rubripes to O. mykiss. This analysis shows evidence of a whole genome duplication in O. mykiss (the Salmonid WGD) when compared to another Teleost fish that has had no further WGD since the Teleost WGD.
    • Identify Conserved non-coding sequences (regulatory function)

Ortholog/paralog finding with synteny

  • SynFind
    • Identify orthologous regions across many species

Sinkers to Cast Further and Deeper: adding weight to genomes with additional data types

Adding new data (genomes, RNASeq, SNPs) to CoGe

Gene family analysis

  • CoGeBlast
    • Identify many gene family members within/across species
    • Extract sequences (nucleotide/protein)
    • Phylogenetic tree reconstruction using iPlant/iAnimal for multiple sequence alignment and tree building

Functional genomics

  • Adding/visualizing RNAseq data

Discussion/Conclusions