2015 Plant Genome Evolution Workshop

Revision as of 22:53, 31 August 2015

Slides

  • Keynote:
  • PDF:
  • Powerpoint:

Register an account/Log in

  • Go to: http://user.iplantcollaborative.org
    • CoGe uses iPlant's Authentication and User Identity Management Service
    • After clicking on the confirmation link provided in the automated email, your account may take a few minutes to propagate to all of iPlant's Authentication Services.
  • Sign in (the link is in the top-right corner of any CoGe page)
  • NOTE: This wiki (CoGePedia) uses a different authentication than CoGe!

Load your own genome

If you are logged into CoGe with your user account, you can add new genomes to CoGe, keep them private, share them with collaborators, and make them fully public.

Small Genome (E. coli)

  • Search for Organism "Escherichia coli K12 strain K-12 substrain MG1655" (just type in "MG1655")
  • Set a version (e.g., "1")
  • Leave "Type:" as "unmasked"
  • Source: "CoGe" or "NCBI"
  • Leave as "Restricted"
  • Press "Next"
  • Select "FTP/HTTP" tab
  • Paste in the link below:
  • Press "Get"
  • Press "Next"
  • Review the data and associated information.
  • Press "Start Loading"
  • Note: The length of time it takes to load a genome depends on the load on the database and the number of chromosomes/contigs being loaded. For this example, it should take a minute or two.
  • Note: When finished, you can select what you want to do next from a drop-down menu:
    • Go to GenomeInfo
    • Load Annotations for the genome
    • Load Another Genome
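
Because load time depends on the number of chromosomes/contigs, it can help to count them before uploading. The following is a minimal sketch, assuming the genome file is plain-text FASTA; the function name summarize_fasta and the in-memory example sequences are hypothetical and are not part of CoGe.

```python
import io


def summarize_fasta(handle):
    """Return (record_count, total_bases) for FASTA text from a file-like object."""
    records = 0
    bases = 0
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            records += 1  # each header line starts a new chromosome/contig
        else:
            bases += len(line)  # sequence lines contribute to the total length
    return records, bases


# Tiny in-memory example standing in for a real genome file (hypothetical data).
example = io.StringIO(">chr1\nACGTACGT\nACGT\n>contig2\nGGCC\n")
print(summarize_fasta(example))  # (2, 16)
```

For a file on disk, pass an open text handle instead, e.g. `with open(path) as fh: summarize_fasta(fh)`.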

Medium Genome (Arabidopsis thaliana)

  • Search for Organism "Arabidopsis thaliana Col-0 (thale cress)" (just type in "col-0")
  • Set a version (e.g., "1")
  • Leave "Type:" as "unmasked"
  • Source: "CoGe" or "TAIR"
  • Leave as "Restricted"
  • Press "Next"
  • Select "FTP/HTTP" tab
  • Paste in the link below:
  • Press "Get"
  • Press "Next"
  • Review the data and associated information.
  • Press "Start Loading"
  • Note: The length of time it takes to load a genome depends on the load on the database and the number of chromosomes/contigs being loaded. For this example, it should take a minute or two.
  • Note: When finished, you can select what you want to do next from a drop-down menu:
    • Go to GenomeInfo
    • Load Annotations for the genome
    • Load Another Genome

Add Annotations

If you have structural gene models for your genome, you can integrate them. While many tools can use the full genome, some tools (and some features) require having structural gene models (e.g., CDS).
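
Before loading annotations, it may be worth checking that the file actually contains CDS features. The sketch below assumes the annotation is in tab-delimited GFF with nine columns, where the third column is the feature type; this page does not state which formats are accepted, so treat that as an assumption. The function name count_feature_types and the example records are hypothetical.

```python
from collections import Counter


def count_feature_types(lines):
    """Count feature types (column 3) in tab-delimited GFF lines."""
    counts = Counter()
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines, directives, and comments
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue  # ignore malformed rows in this sketch
        counts[fields[2]] += 1
    return counts


# Hypothetical records for a single gene with two CDS segments.
example = [
    "##gff-version 3",
    "chr1\texample\tgene\t1\t900\t.\t+\t.\tID=gene1",
    "chr1\texample\tmRNA\t1\t900\t.\t+\t.\tID=mrna1;Parent=gene1",
    "chr1\texample\tCDS\t1\t300\t.\t+\t0\tID=cds1;Parent=mrna1",
    "chr1\texample\tCDS\t601\t900\t.\t+\t0\tID=cds1;Parent=mrna1",
]
counts = count_feature_types(example)
print("CDS features found:", counts["CDS"])
```

A count of zero for "CDS" would suggest that tools requiring structural gene models may not work with that file.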

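Small genome (E. coli)

  • Go to LoadAnnotation
    • From GenomeInfo by pressing "Load Gene Annotations"
  • Note: The length of time it takes to load annotations depends on the load on the database and the number of annotations being loaded. For this example (and no load on the server), it should take ~3-5 minutes.

Large genome (Arabidopsis thaliana)

  • Go to LoadAnnotation
    • From GenomeInfo by pressing "Load Gene Annotations"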


Data Management

  • Go to your User Profile page: https://genomevolution.org/CoGe/User.pl
  • Select a genome by clicking on it.
  • Information about the genome will appear in the right panel
  • You can share a genome by clicking on the person icon
  • You can delete the genome by clicking on the trash can
  • Double-clicking the genome will open the Detailed View for it (GenomeInfo)
  • Share it with the person next to you
  • You can view genomes (and other data) that have been shared with you by clicking on "Shared with me" in the menu on the left.

Adding Experimental Data

EPIC-CoGe is an extension to CoGe that lets you add any type of functional genomics and diversity data sets to CoGe.

RNASeq Processing

  • Note: You can add private experimental data to public genomes (Mix and match public and private data)
  • Go to your User Profile page: https://genomevolution.org/coge/User.pl
  • Select "Create" -> "New Experiment"
  • Add experiment name: (e.g., "RNASeq-test")
  • Add description
  • Add version
  • Add source (e.g., "coge")
  • Keep restricted
  • Search for "Col-0"
    • Make sure to select the version with genome ID 16911
  • Press "Select Data File"
  • Select the "FTP/HTTP" tab
  • Copy in the following link:
  • Press "Get"
  • CoGe will automatically detect that it is a FASTQ file based on the file name extension
  • Leave the aligner set to "GSNAP", which is faster than Bowtie2
  • Press "Load Experiment"
    • Note: This FASTQ file is relatively small, and the whole pipeline takes around 2-3 minutes to complete
  • When finished, Load Experiment will create a notebook with three experiments
    • One for the BAM file (alignment)
    • One for reads mapped to nucleotide positions in the genome (read depth)
    • One for reads normalized to transcripts (FPKM)
  • Press "Notebook View" to view the notebook with all three experiments
  • Press "View" to visualize these data in the genome browser (JBrowse)
    • Due to the number of experiments (public) available for Arabidopsis Col-0, it may take JBrowse a while to load.
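
The FPKM values mentioned above normalize read counts by transcript length and sequencing depth. The following is a minimal sketch of the standard FPKM definition (fragments per kilobase of transcript per million mapped fragments); it is not taken from CoGe's pipeline, and the function name and example numbers are hypothetical.

```python
def fpkm(fragments_in_transcript, transcript_length_bp, total_mapped_fragments):
    """Fragments per kilobase of transcript per million mapped fragments."""
    if transcript_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("length and total mapped fragments must be positive")
    # Dividing by (length / 1e3) and (total / 1e6) is the same as multiplying by 1e9.
    return fragments_in_transcript * 1e9 / (transcript_length_bp * total_mapped_fragments)


# Hypothetical example: 500 fragments on a 2,000 bp transcript,
# out of 10 million mapped fragments in the whole experiment.
print(fpkm(500, 2000, 10_000_000))  # 25.0
```

Because of this normalization, a long transcript and a short transcript with the same expression level get comparable values even though the long one collects more reads.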

Find and visualize a genome

Analyses

  • Get the detailed view of your genome (GenomeInfo)
  • Under "Tools" and next to "Analyze", click on the link for "SynMap"
  • Your genome will automatically be selected in both genome slots
  • Search for another E. coli genome by typing "MG1655" into one of the Organism search boxes
    • The genome that is auto-selected from that search (ID 4242) works well for this analysis
  • Scroll to the bottom of the page and press the red button "Generate SynMap"
  • When the analysis is finished, press "Go" to see the results
  • Click on the dotplot to get a zoomed-in version of it.
  • Move the cursor over the green line in the dotplot and double-click when the cross-hairs turn red to launch GEvo for microsynteny analysis
  • Press "Run GEvo" to run GEvo
  • Note: This link will run a SynMap analysis for E. coli K12 substrains MG1655 and DH10B: https://genomevolution.org/r/daqz

Your History and Activity

  • If you are logged into CoGe, CoGe will record your activities. These are available for review in your User Profile page: https://genomevolution.org/CoGe/User.pl
  • Click on "Activity" in the menu on the left. This will give you an overview of the number of analyses you've run
  • Your previously run analyses can be viewed by clicking "Analyses". Clicking on an analysis will re-run it.
  • Your previously loaded data can be viewed by clicking "Data loading". Clicking on a previously loaded data set will open the detailed view for it.


Comparative Genomics

  • Arabidopsis thaliana vs. Arabidopsis lyrata (synonymous substitution values, Ks): http://genomevolution.org/r/d7e7
    • Go to SynMap: https://genomevolution.org/CoGe/SynMap.pl
    • Search for "Col-0" in one Organism search box and "lyrata" in the other.
    • Select the "Analysis Options" tab near the top of the screen
    • Under "CodeML" and next to "Calculate syntenic CDS pairs and color dots", select "Synonymous (Ks)"
    • Press "Generate SynMap"

Data files reference

CoGe Learning Material