Difference between revisions of "How to add a genome from Phytozome or JGI"

From CoGepedia

Revision as of 10:03, 17 October 2014

One question we get frequently is how to add data from other genome repositories. This tutorial is for JGI, but should work for most places.

Instructions to load data

  1. Go to JGI/Phytozome
  2. Go to the download section for your genome of interest
  3. Select and download the assembly (fasta) and annotations (GFF)
    1. Note: Select the GFF file that contains both genes and exons ("xxx.genes_exons.gff.gz").
    2. Screen Shot 2014-10-16 at 2.16.04 PM.png
    3. Once downloaded, you can send these files to your iPlant Data Store and put them into the coge_data directory (to which CoGe has access), or you can upload them from your desktop. We recommend the iPlant Data Store because its file transfer methods are more robust. However, uploading from your desktop should be fine as long as you have decent bandwidth.
  4. Log into CoGe
  5. Go to your user profile page
  6. Press "Create" and select "New Genome"
    1. Screen Shot 2014-10-16 at 2.18.56 PM.png
  7. This will take you to LoadGenome (or open a popup for LoadGenome in your profile).
  8. Fill out the necessary information about the genome.
    1. Note: I like to keep a newly loaded genome private until I have had a chance to make sure the data loaded correctly. Then I make it public.
    2. Note: You can add a link to the data file. Unfortunately, JGI's download is behind an API, so a direct link cannot be generated.
    3. Screen Shot 2014-10-16 at 2.27.03 PM.png
  9. Add (upload) the assembly in fasta format.
    1. Screen Shot 2014-10-16 at 2.42.45 PM.png
    2. Screen Shot 2014-10-16 at 2.42.59 PM.png
  10. Press the red "Load Genome" button located at the bottom of the data selections
  11. A popup will appear stating that your genome is being loaded
    1. Screen Shot 2014-10-16 at 2.44.39 PM.png
  12. When the genome has been loaded successfully, you'll get a thumbs-up and a link to go to GenomeInfo
    1. Screen Shot 2014-10-16 at 3.49.53 PM.png
    2. GenomeInfo will appear in a popup in your profile page or in a new window. GenomeInfo lets you get detailed information about your genome, manage data about the genome, send it to CoGe's analysis tools, and associate new data to the genome.
  13. To add structural annotations to your genome, press "Load Gene Annotations"
    1. This will open LoadExperiment in a popup (or load it in a new window).
    2. LoadExperiment will ask you to add information about your genome and to upload a GFF file for the annotations.
  14. Add information about the annotation file and add your GFF file
    1. Screen Shot 2014-10-16 at 3.53.38 PM.png
    2. Note: The GFF file can be compressed with gzip.
  15. Press the red "Load Annotation" button to start loading the annotations
    1. Note: Loading these annotations may take a while
    2. Screen Shot 2014-10-16 at 3.56.57 PM.png
  16. You will get a thumbs-up when the loading has completed successfully
    1. Screen Shot 2014-10-17 at 8.56.03 AM.png
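
Before uploading the annotations (steps 3 and 14), it can help to confirm locally that the downloaded GFF really contains both gene and exon features. The following is a minimal sketch, not part of CoGe: the function names are hypothetical, and it assumes a standard tab-delimited GFF with nine columns, where column 3 holds the feature type. It reads plain or gzip-compressed files.

```python
import gzip
from collections import Counter


def open_maybe_gzip(path):
    """Open a plain-text or gzip-compressed file for reading as text."""
    if path.endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "r")


def count_gff_feature_types(path):
    """Count the feature types (column 3) found in a GFF file.

    Comment lines, blank lines, and lines without nine tab-separated
    columns are skipped.
    """
    counts = Counter()
    with open_maybe_gzip(path) as handle:
        for line in handle:
            if not line.strip() or line.startswith("#"):
                continue
            fields = line.rstrip("\n").split("\t")
            if len(fields) < 9:
                continue
            counts[fields[2]] += 1
    return counts


# Example usage (the file name is a placeholder for your download):
# counts = count_gff_feature_types("xxx.genes_exons.gff.gz")
# for feature_type, n in sorted(counts.items()):
#     print(feature_type, n)
```

If the counts show no "gene" or no "exon" entries, the wrong file was probably selected in step 3.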

Loading Complete

Testing and making the genome public

Time to make sure that the genome and annotations loaded correctly before making the genome public.

  1. Go to GenomeInfo for your newly loaded genome
    1. Screen Shot 2014-10-17 at 8.59.29 AM.png
  2. Check that there are sequences loaded by looking at the statistics for the genome
  3. Check that the annotations loaded correctly by clicking "Click for features" under features
    1. Screen Shot 2014-10-17 at 9.01.54 AM.png
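
To cross-check the statistics shown in GenomeInfo, you can compute the same kind of numbers from the FASTA file you uploaded. The sketch below is only an illustration: the function names are hypothetical, and it assumes that the GenomeInfo statistics include a sequence count and a total length, which you then compare by eye.

```python
import gzip


def open_maybe_gzip(path):
    """Open a plain-text or gzip-compressed file for reading as text."""
    if path.endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "r")


def fasta_stats(path):
    """Return (number of sequences, total sequence length) for a FASTA file."""
    n_seqs = 0
    total_len = 0
    with open_maybe_gzip(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                # Each header line starts a new sequence record.
                n_seqs += 1
            else:
                # Sequence lines contribute their length to the total.
                total_len += len(line)
    return n_seqs, total_len


# Example usage (the file name is a placeholder for your assembly):
# n_seqs, total_len = fasta_stats("assembly.fa.gz")
# print(n_seqs, "sequences,", total_len, "bp in total")
```

If the local numbers differ noticeably from what GenomeInfo reports, it is worth reloading the genome before making it public.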