Here you can find tutorials informing you on how to get the most out of CoGe's tools.
- 1 Original tutorials
- 2 Tutorials
- 2.1 How to search and assemble genes
- 2.2 How to assemble a contig-level de novo assembly with a reference genome using synteny
- 2.3 How to determine the structural changes between genome assemblies
- 2.4 How to do phylogenetics with CoGe
- 2.5 How to download a genome and its annotations
- 2.6 How to find syntenic regions between genomes
- 2.7 What to do with a genome in the early stages of assembly
- 2.8 Finding Inversions
- 2.9 Ortholog identification and conserved noncoding sequence (CNS) analysis
- 2.10 How to find rarely and frequently used codons in a genome
- 2.11 How to generate an amino acid usage table for an organism
- 2.12 How to determine the GC content of a genome or chromosome
- 2.13 Whole Genome Comparison and Analysis using SynMap and GEvo
- 2.14 How to extract all the gene sequences from a genomic region for export from CoGe
- 2.15 Identifying putative horizontal gene transfer events
- 2.16 How to find the size of a genome and the genomic features it contains
- 2.17 How to annotate a genome using CoGe's tools and links to other bioinformatic resources
- 2.18 How to load genomes into CoGe
- 2.19 How to perform a genomic rearrangement analysis
- 2.20 How to share genomes
- 2.21 Sharing data in iPlant's Data Store
- 2.22 Sending genomes to the iPlant Data Store
- 2.23 How to add a genome from Phytozome or JGI
- 2.24 How to favorite things (genomes, experiments) in CoGe so they appear at the top of lists in tools
- 3 Tutorials and lessons for high-school students
- 4 Published Papers (focused on using CoGe)
- 5 Workshops
- 5.1 SIP2010
- 5.2 2012 JCVI CoGe Plant Bioinformatics Workshop
- 5.3 2012 USDA Maricopa CoGe Plant Bioinformatics Workshop
- 5.4 2014 JCVI Summer Genomics Workshop
- 5.5 2015 PAG Computer Demo
- 5.6 2015 Plant Genome Evolution Workshop
- 5.7 2015 University of Chile
- 5.8 2016 Plant Reproduction
- 5.9 2016 Arizona State University
- 5.10 2016 Intro to Genome Management and Analysis with CoGe
- 6 Genomic Web Resources Linking to CoGe
- 7 Video Tutorials
- 7.1 OrganismView
- 7.2 FeatList
- 7.3 SeqView
- 7.4 MaizeGDB and CoGe's Maize-Sorghum Orthologies
- 7.5 Using CoGe and phylogeny.fr to quickly find homologs and build phylogenetic trees
- 7.6 Human-Chimp Whole Genome Comparison
- 7.7 Using GEvo to find a genomic region of interest, and extracting its sequence and genomic features
- 7.8 Using SynMap to compare two strains of Bacillus thuringiensis and characterize the breakpoints of an inversion (which may be due to a genome assembly error)
- 7.9 Using OrganismView and SynMap to find and compare closely related genomes
- 7.10 Comparing the chloroplast genomes of maize and sorghum
- 7.11 How to verify and compare maize gene models
- 7.12 How to visualize assembly and annotation changes between versions of the maize genome
- 7.13 How to compare different assembly versions of a genome
- 7.14 How to generate orthology gene lists between maize and another grass (e.g. Sorghum)
- 7.15 How to use CoGe's polymorphism tables for validating identified polymorphisms
- 7.16 How to use the iPlant Data Store to generate a quick-share link to share a genome file
- 7.17 How to use CoGe and SSWAP (iPlant's Semantic Web Portal: http://sswap.iplantcollaborative.org/)
- 7.18 Using CoGe's tools SynMap and GEvo to compare the genomes of two Phytophthora species
- 7.19 Using CoGe's tool SynFind to identify syntenic regions across multiple genomes. This example uses Phytophthora species.
- 7.20 Using CoGe's tool SynMap to compare two genomes of E. coli and analyze a syntenic discontinuity in more detail with GEvo
- 7.21 Using CoGe's tools SynFind and SynMap to find and mark a syntenic gene pair in a syntenic dotplot
- 7.22 Using the user-data management system to share data
- 7.23 Using the iPlant Datastore to add your data to CoGe
- 7.24 How to load experimental data into CoGe
- 7.25 How to load a private genome with annotations and experimental data, and view in JBrowse in under three minutes
- 7.26 How to RNA-Seq in CoGe in three minutes
- 7.27 EPIC-CoGe/GenomeView videos
You can find a list of CoGe's old tutorials here.
How to search and assemble genes
Contributed by David Nelson
How to assemble a contig-level de novo assembly with a reference genome using synteny
How to determine the structural changes between genome assemblies
- This is easy using SynMap. Just find your organism of interest, and select the two versions of its genome you wish to compare. Here are examples from:
- Grape genome versions 1 and 2
- Medicago genome versions 2 and 3
- Maize B73 refgen version 1 (with gene annotation) and version 2 (genomic sequence only)
How to do phylogenetics with CoGe
- You have a sequence of interest and you want to find homologs of it within and among various genomes in order to do phylogenetic tree reconstructions. CoGe can help. CoGeBlast helps you identify and evaluate homologs from any number of genomes, and is linked to FeatList for displaying information about a list of genomic features. FeatList plays a central role in managing lists of genomic features in CoGe and lets you select and send features to other programs in CoGe. One of them is FastaView, which generates fasta-formatted sequence data. FastaView is linked to phylogeny.fr, a web resource for generating multiple sequence alignments and phylogenetic tree reconstructions. phylogeny.fr has a very nice pipeline for automatically generating a decent phylogeny for a set of sequences, and FastaView's link to phylogeny.fr will automatically submit your sequences to its 'one-click' phylogenetic pipeline.
Here is the full tutorial: phylogenetics in CoGe
How to download a genome and its annotations
- This is easy using OrganismView. Just search for an organism and genome of interest and use the following links found in the "Genome information" section:
- "Download sequence in Fasta format" to download the entire genome's DNA sequence in fasta format
- "Download GFF file" to download all the genomic features in the genome and their annotations in GFF format
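Once the two files are downloaded, a quick sanity check is to total the sequence lengths and count the features by type. The following is a minimal sketch, not part of CoGe: the function names and the tiny in-memory example data are hypothetical, and it assumes a plain fasta file and a tab-delimited, nine-column GFF file.

```python
from collections import Counter
import io

def fasta_lengths(handle):
    """Return {sequence_id: length} for a fasta file-like object."""
    lengths = {}
    current = None
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first word of the header as the sequence ID.
            current = line[1:].split()[0]
            lengths[current] = 0
        elif current is not None:
            lengths[current] += len(line)
    return lengths

def gff_feature_counts(handle):
    """Count features by type (column 3) in a GFF file-like object."""
    counts = Counter()
    for line in handle:
        if line.startswith("#") or not line.strip():
            continue  # skip comments, directives, and blank lines
        columns = line.rstrip("\n").split("\t")
        if len(columns) >= 9:
            counts[columns[2]] += 1
    return counts

# Hypothetical stand-ins for the downloaded files.
example_fasta = io.StringIO(">chr1 example\nACGTACGT\nACGT\n>chr2\nGGCC\n")
example_gff = io.StringIO(
    "##gff-version 3\n"
    "chr1\tCoGe\tgene\t1\t8\t.\t+\t.\tID=g1\n"
    "chr1\tCoGe\tCDS\t1\t6\t.\t+\t0\tParent=g1\n"
    "chr2\tCoGe\tgene\t1\t4\t.\t-\t.\tID=g2\n"
)

lengths = fasta_lengths(example_fasta)
counts = gff_feature_counts(example_gff)
print("genome size (bp):", sum(lengths.values()))
print("features by type:", dict(counts))
```

For real files, replace the `io.StringIO` objects with `open("your_genome.fasta")` and `open("your_genome.gff")`; both file names are placeholders.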
How to find syntenic regions between genomes
What to do with a genome in the early stages of assembly
Finding Inversions
- Bacteria Genomic Inversion E. coli K12: This example shows an inversion between substrains of Escherichia coli K12, DH10B and W3110.
- Bacteria Genomic Inversion Shewanella baltica: This example shows an inversion between strains of Shewanella baltica, OS155 and OS185.
- X-alignments: Due to most bacterial inversions occurring symmetrically around the origin of replication, they create "X"-like patterns in syntenic dotplots.
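In a syntenic dotplot, an inversion shows up as a run of gene pairs whose position decreases in one genome while it increases in the other. The sketch below illustrates that idea on clean, made-up coordinates. It is not how SynMap detects inversions, real dotplots are noisier, and the function name and the `min_pairs` parameter are hypothetical.

```python
def inverted_runs(pairs, min_pairs=3):
    """Find runs of syntenic pairs whose genome-B position decreases
    while the genome-A position increases.

    pairs: list of (position_in_A, position_in_B) tuples.
    Returns a list of (first_A_position, last_A_position) per run.
    """
    ordered = sorted(pairs)  # walk along genome A
    runs = []
    start = 0
    for i in range(1, len(ordered) + 1):
        # Keep extending the current run while B keeps decreasing.
        if i < len(ordered) and ordered[i][1] < ordered[i - 1][1]:
            continue
        if i - start >= min_pairs:
            runs.append((ordered[start][0], ordered[i - 1][0]))
        start = i
    return runs

# Hypothetical gene-pair positions: genes 4 through 8 of genome A
# appear in reverse order in genome B.
example_pairs = [(1, 1), (2, 2), (3, 3), (4, 8), (5, 7),
                 (6, 6), (7, 5), (8, 4), (9, 9), (10, 10)]
runs = inverted_runs(example_pairs)
print(runs)
```

The `min_pairs` threshold keeps a single out-of-order gene pair from being reported as an inversion.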
Ortholog identification and conserved noncoding sequence (CNS) analysis
- Ramosa2 orthologs and CNSs: ramosa2 Encodes a LATERAL ORGAN BOUNDARY Domain Protein That Determines the Fate of Stem Cells in Branch Meristems of Maize. Special thanks to Devin O’Connor for writing this tutorial!
- Esteban Bortiri, George Chuck, Erik Vollbrecht, Torbert Rocheford, Rob Martienssen, and Sarah Hake. 2006 ramosa2 Encodes a LATERAL ORGAN BOUNDARY Domain Protein That Determines the Fate of Stem Cells in Branch Meristems of Maize. Plant Cell 18:574–585
How to find rarely and frequently used codons in a genome
- This is very straightforward using OrganismView. Just find your organism and generate a codon usage table.
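For readers who want to see what a codon usage table contains, here is a minimal sketch that counts codons in a set of in-frame coding sequences. It is not OrganismView's implementation; the function names and example sequences are hypothetical.

```python
from collections import Counter

def codon_usage(cds_sequences):
    """Count codons across in-frame coding sequences (CDSs)."""
    counts = Counter()
    for seq in cds_sequences:
        seq = seq.upper()
        usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
        for i in range(0, usable, 3):
            codon = seq[i:i + 3]
            if set(codon) <= set("ACGT"):  # skip codons with ambiguous bases
                counts[codon] += 1
    return counts

def per_thousand(counts):
    """Convert raw codon counts to frequency per 1000 codons."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {codon: 1000 * n / total for codon, n in counts.items()}

# Hypothetical coding sequences.
example_cds = ["ATGGCTGCTGCTTAA", "ATGGCTAAATAG"]
usage = codon_usage(example_cds)

# Codons sorted from most to least frequent.
for codon, n in usage.most_common():
    print(codon, n, round(per_thousand(usage)[codon], 1))
```

Rarely used codons are at the bottom of the sorted list and frequently used codons at the top.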
How to generate an amino acid usage table for an organism
- This is built into OrganismView and only takes a couple of clicks to generate. The steps are nearly identical to those for generating a codon usage table.
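An amino acid usage table is the same tally made after translating each codon. The sketch below assumes the standard genetic code (NCBI translation table 1) and in-frame coding sequences. It is an illustration, not OrganismView's code, and the names in it are hypothetical.

```python
from collections import Counter
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def amino_acid_usage(cds_sequences):
    """Count encoded amino acids (stop codons excluded) across CDSs."""
    counts = Counter()
    for seq in cds_sequences:
        seq = seq.upper()
        usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
        for i in range(0, usable, 3):
            aa = CODON_TABLE.get(seq[i:i + 3])  # None for ambiguous codons
            if aa is not None and aa != "*":
                counts[aa] += 1
    return counts

# Hypothetical coding sequences.
example_cds = ["ATGGCTGCTGCTTAA", "ATGGCTAAATAG"]
aa_usage = amino_acid_usage(example_cds)
total = sum(aa_usage.values())
for aa, n in aa_usage.most_common():
    print(aa, n, f"{100 * n / total:.1f}%")
```

Organisms that use a different genetic code (for example, some mitochondria) would need a different `AMINO_ACIDS` string.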
How to determine the GC content of a genome or chromosome
- This is easy using OrganismView. Just search for an organism and genome of interest and press the link "Click for percent GC content" located next to the length of the genome in the "Genome information" section. For small genomes, this is automatically calculated when the "Genome information" section is loaded.
Whole Genome Comparison and Analysis using SynMap and GEvo
- Analysis of variations found in genomes of Escherichia coli strain K12 DH10B and strain B REL606 using SynMap and GEvo analysis
- Maize Sorghum Syntenic dotplot: Since these lineages diverged ~11 MYA, maize has had a whole genome duplication event; before their divergence, their shared lineage also had a whole genome duplication event. Using SynMap's synonymous mutation overlay, it is easy to determine which syntenic regions are derived from the pre-grass whole genome duplication event and which from the one specific to maize.
How to extract all the gene sequences from a genomic region for export from CoGe
- There are times when you want to export sequences from CoGe to another informatics tool. CoGe makes it easy to find the sequences you want and format them for export: How to extract genomic features.
Identifying putative horizontal gene transfer events
- There are several genomic characteristics that can be used to identify genomic regions that are derived from a horizontal gene transfer event. One is to look for CDSs that have a different codon usage pattern than the neighboring genomic sequence. GenomeView provides a visual way to identify such anomalous genomic regions through one of its genomic visualization layers that colors CDSs based on the GC content of the codon wobble positions: Detecting horizontal gene transfer using GenomeView.
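The wobble-position GC content that this layer visualizes can also be computed directly. The sketch below calculates the GC fraction at third codon positions for each CDS and flags genes that deviate strongly from the median. The 0.3 deviation threshold, the function names, and the example sequences are all assumptions made for illustration; GenomeView's own coloring scheme is not reproduced here.

```python
from statistics import median

def gc3(cds):
    """Fraction of G or C at third (wobble) codon positions of an in-frame CDS."""
    third_positions = cds.upper()[2::3]
    if not third_positions:
        return 0.0
    return sum(base in "GC" for base in third_positions) / len(third_positions)

def flag_gc3_outliers(cds_by_gene, max_deviation=0.3):
    """Return gene IDs whose wobble GC differs from the median by more
    than max_deviation (an arbitrary, illustrative cutoff)."""
    values = {gene: gc3(seq) for gene, seq in cds_by_gene.items()}
    center = median(values.values())
    return [gene for gene, v in values.items() if abs(v - center) > max_deviation]

# Hypothetical neighboring genes; g4 is AT-rich at wobble positions.
example_genes = {
    "g1": "ATGGCCGCGGGCTGA",
    "g2": "ATGGCGCTGGACTAG",
    "g3": "ATGGCCCTCGAGTGA",
    "g4": "ATGAAATTTGATTAA",
}
flagged = flag_gc3_outliers(example_genes)
print(flagged)
```

A flagged gene is only a candidate. Highly expressed genes and other factors can also shift codon usage, so any hit needs follow-up.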
How to find the size of a genome and the genomic features it contains
- Want to know how big a genome is and get a breakdown of the number of genes it contains? Use OrganismView and just search for your organism of interest. It will automatically return these sets of information, and a whole lot more.
How to annotate a genome using CoGe's tools and links to other bioinformatic resources
This tutorial uses CoGe to annotate a Baculovirus genome. Here, CoGe is primarily used to retrieve and organize genomic sequence data, and its built-in links are used to:
- Build an Excel spreadsheet of all genomic features in a genome including their annotations and sequences.
- Link to NCBI's Blast resource for sequence searching
- Send protein sequences to ProSite for domain identification
The steps outlined here will map to any genome and work very well for a class project!
Thanks to Dr. Eric Haas-Stapleton for creating this tutorial.
How to load genomes into CoGe
More people are asking to install a local version of CoGe. For those of you who have one, or who have access to the main CoGe server, here are the directions: How to load a genome into CoGe.
How to perform a genomic rearrangement analysis
When configured to enforce a syntenic coverage depth of 1:1 (a one to one mapping of syntenic regions between two genomes), SynMap will generate a link to the genomic rearrangement analysis tool, GRIMM, and auto-populate its submission boxes with the analyzed genomes appropriately formatted.
Please see this tutorial: Genomic Rearrangement Analysis
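Rearrangement analyses of this kind typically work on signed gene-order permutations, where each syntenic block is a number and a minus sign marks a block on the opposite strand. The sketch below shows how a 1:1 set of syntenic blocks could be turned into such a permutation and how its breakpoints can be counted. The block records and field names are hypothetical, and the exact text that SynMap submits to GRIMM is not shown here.

```python
def signed_permutation(blocks):
    """Number blocks 1..n by their order in genome A, then list them in
    genome B order, negating blocks that lie on the opposite strand."""
    by_a = sorted(blocks, key=lambda b: b["start_a"])
    label = {b["id"]: i + 1 for i, b in enumerate(by_a)}
    by_b = sorted(blocks, key=lambda b: b["start_b"])
    return [label[b["id"]] if b["same_strand"] else -label[b["id"]] for b in by_b]

def count_breakpoints(perm):
    """Count adjacencies that differ from the identity permutation.
    The permutation is framed by 0 and n+1 before comparing neighbors."""
    extended = [0] + list(perm) + [len(perm) + 1]
    return sum(1 for x, y in zip(extended, extended[1:]) if y - x != 1)

# Hypothetical 1:1 syntenic blocks; b2 and b3 are inverted together in genome B.
example_blocks = [
    {"id": "b1", "start_a": 100, "start_b": 150, "same_strand": True},
    {"id": "b2", "start_a": 2000, "start_b": 5200, "same_strand": False},
    {"id": "b3", "start_a": 4000, "start_b": 3100, "same_strand": False},
    {"id": "b4", "start_a": 7000, "start_b": 7400, "same_strand": True},
]
perm = signed_permutation(example_blocks)
print(perm, count_breakpoints(perm))
```

A single inversion creates at most two breakpoints, which is what the example shows. The 1:1 coverage requirement matters because a permutation needs every block to occur exactly once in each genome.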
How to share genomes
This is a tutorial on how to share genomes with other users in CoGe.
Sharing data in iPlant's Data Store
This tutorial shows how to use the iPlant Data Store to share data. This is useful if you want to share data with the CoGe Team to help get your genome loaded into CoGe.
Sending genomes to the iPlant Data Store
This tutorial shows how to send a genome from CoGe to the iPlant Data Store (including marking up those data with all the metadata CoGe has on the genome).
How to add a genome from Phytozome or JGI
This tutorial walks through all the steps to integrate a genome from Phytozome/JGI into CoGe.
How to favorite things (genomes, experiments) in CoGe so they appear at the top of lists in tools
CoGe often has many versions of a genome, and this shows you how to mark your favorite so it always shows up on top.
Tutorials and lessons for high-school students
High-school student tutorials: These tutorials were designed in conjunction with Michael Nakashima.
Published Papers (focused on using CoGe)
These papers are put forth as complete tutorials with background information as to how to use CoGe to perform various tasks:
Maize genome analysis: Maydica
Download Open Access Article From Maydica: http://www.maydica.org/articles/56_183.pdf
Alternative Download from CoGe: http://genomevolution.org/r/4stu
Comparative genomics with maize and other grasses: from genes to genomes!
James C. Schnable and Eric Lyons
Abstract: Of all the major plant groups, the grasses, with the complete genomes of five species, are the best positioned to take advantage of comparative genomics to obtain insight into functional genetic elements. Of all the grasses, maize is the best characterized in terms of genetics, development, and evolution. We provide several examples of how the web-based comparative genomics system CoGe may be used to aid in the interpretation of the maize genome sequence. These examples include verifying gene models, identifying differences between genome assemblies, identifying conserved non-coding sequences, identifying syntenic regions between species and polyploidies, and identifying homeologs within maize and orthologs between maize and other grass genomes. In addition, a comprehensive list of orthologous gene sets is provided between maize and Sorghum, foxtail millet, rice, and Brachypodium.
Brassica genome analysis: Frontiers in Plant Genetics and Genomics
Download Open Access Article: http://www.frontiersin.org/plant_genetics_and_genomics/10.3389/fpls.2012.00172/abstract
Unleashing the genome of Brassica rapa
Haibao Tang and Eric Lyons
Abstract: The completion and release of the Brassica rapa genome is of great benefit to researchers of the Brassicas, Arabidopsis, and genome evolution. While its lineage is closely related to the model organism Arabidopsis thaliana, the Brassicas experienced a whole genome triplication subsequent to their divergence. This event contemporaneously created three copies of its ancestral genome, which had diploidized through the process of homeologous gene loss known as fractionation. By the fractionation of homeologous gene content and genetic regulatory binding sites, Brassica’s genome is well placed to use comparative genomic techniques to identify syntenic regions, homeologous gene duplications, and putative regulatory sequences. Here, we use the comparative genomics platform CoGe to perform several different genomic analyses with which to study structural changes of its genome and dynamics of various genetic elements. Starting with whole genome comparisons, the Brassica paleohexaploidy is characterized, syntenic regions with A. thaliana are identified, and the TOC1 gene in the circadian rhythm pathway from A. thaliana is used to find duplicated orthologs in B. rapa. These TOC1 genes are further analyzed to identify conserved non-coding sequences that contain cis-acting regulatory elements and promoter sequences previously implicated in circadian rhythmicity. Each “cookbook style” analysis includes a step-by-step walk-through with links to CoGe to quickly reproduce each step of the analytical process.
Workshops
- Announcement: http://www.plantgenomeevolution.com/coge-workshop.asp
Genomic Web Resources Linking to CoGe
MaizeGDB links to CoGe through its genome browser to help researchers find syntenic gene sets between maize and sorghum.
Video Tutorials
OrganismView is CoGe's tool for finding genomes for your organism of interest.
FeatList is CoGe's tool for managing lists of genomic features.
SeqView is CoGe's tool for generating primary sequence data in fasta format.
MaizeGDB and CoGe's Maize-Sorghum Orthologies
Researchers can now go directly from MaizeGDB's genome browser to view the same region within CoGe's GenomeView and quickly compare pre-called syntenic orthologous genes between maize and sorghum, as well as the homeologous gene in maize or, when no homeolog was found, the homeologous region in which we would have expected to find it.
Using CoGe and phylogeny.fr to quickly find homologs and build phylogenetic trees
CoGe's tools make it easy to search through genomes to find homologs to a sequence of interest. Once identified, these sequences can be manipulated in FastaView and sent to phylogeny.fr for multiple sequence alignment, phylogenetic tree reconstruction, and tree visualization.
You can download a high-resolution version of the video from http://genomevolution.com/CoGe/docs/video/CoGe-Phylogenetics.mov
Human-Chimp Whole Genome Comparison
This tutorial walks through using SynMap to do a whole genome comparison between human and chimp.
You can download a high-resolution version of the video from http://genomevolution.com/CoGe/docs/video/SynMap-human-chimp.mov
Using GEvo to find a genomic region of interest, and extracting its sequence and genomic features
You can download a high-resolution version of the video from http://genomevolution.com/CoGe/docs/video/GEvo-to-extract-sequence-features.mov
Using SynMap to compare two strains of Bacillus thuringiensis and characterize the breakpoints of an inversion (which may be due to a genome assembly error)
You can download a high-resolution version of the video from http://genomevolution.com/CoGe/docs/video/Bacillus_thuringiensis_SynMap-dotplot.mov
Comparing the chloroplast genomes of maize and sorghum
This video walks through comparing the genomes of maize and sorghum chloroplasts to identify individual polymorphisms/character states.
How to verify and compare maize gene models
Using walkthrough 1 from the Maydica CoGe article
How to visualize assembly and annotation changes between versions of the maize genome
Using walkthrough 2 from the Maydica CoGe article
How to compare different assembly versions of a genome
Using walkthrough 5 from the Maydica CoGe article
This example uses SynMap to compare the assembly differences between version 1 and version 2 of the maize genome.
How to generate orthology gene lists between maize and another grass (e.g. Sorghum)
Using walkthrough 6 from the Maydica CoGe article
A difficulty in identifying orthologous genes among the grasses is the grass-specific whole genome duplication event that happened prior to the radiation of all the grass lineages. The problem is compounded for maize due to its lineage-specific whole genome duplication event. When comparing the genomes of maize and sorghum, each region of the sorghum genome is orthologously syntenic to two regions of the maize genome and paralogously syntenic to two additional regions. This video walks through comparing the genomes of maize and sorghum using SynMap and its advanced analytical tools. Orthologous syntenic regions are identified by the relative evolutionary distance of syntenic gene pairs, measured with synonymous mutation rates, and by the quota align algorithm, which screens syntenic regions to enforce a specific mapping of syntenic regions between genomes.
For more information on the evolutionary history of the maize and sorghum genomes: Maize Sorghum Syntenic dotplot
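The reasoning behind using synonymous mutation rates can be sketched in a few lines: gene pairs in a syntenic block that arose from the more recent event have had less time to accumulate synonymous changes, so the block's median Ks is lower. In the example below, the Ks values, the cutoff of 0.5, and the function name are hypothetical illustrations, not values taken from SynMap or from the maize and sorghum comparison.

```python
from statistics import median

def classify_blocks(block_ks, cutoff):
    """Label each syntenic block by the median Ks of its gene pairs.

    block_ks: {block_id: [Ks of each syntenic gene pair in the block]}
    cutoff:   Ks value separating the younger from the older event
              (must be chosen by inspecting the Ks distribution).
    """
    labels = {}
    for block_id, values in block_ks.items():
        labels[block_id] = "younger" if median(values) < cutoff else "older"
    return labels

# Hypothetical Ks values for two syntenic blocks.
example_blocks = {
    "block_1": [0.18, 0.22, 0.20, 0.25],
    "block_2": [0.90, 1.10, 1.00],
}
labels = classify_blocks(example_blocks, cutoff=0.5)
print(labels)
```

Using the block median rather than individual gene pairs makes the label less sensitive to a few fast- or slow-evolving genes.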
How to use CoGe's polymorphism tables for validating identified polymorphisms
How to use CoGe and SSWAP (iPlant's Semantic Web Portal: http://sswap.iplantcollaborative.org/)
Using CoGe's tool SynFind to identify syntenic regions across multiple genomes. This example uses Phytophthora species.
Using CoGe's tool SynMap to compare two genomes of E. coli and analyze a syntenic discontinuity in more detail with GEvo
Using CoGe's tools SynFind and SynMap to find and mark a syntenic gene pair in a syntenic dotplot
Using the iPlant Datastore to add your data to CoGe
How to load experimental data into CoGe
This short video shows how to add SNP data from the 1001 Arabidopsis genome project to the Arabidopsis genome.
How to load a private genome with annotations and experimental data, and view in JBrowse in under three minutes
This short video shows how to:
- Load a genome from a fasta file
- Load structural annotations from a GFF file
- Load variation (SNP) data from a vcf file
- Load mapped reads from a BAM file
- Load expression data from a CSV file
How to RNA-Seq in CoGe in three minutes
This short video shows how to load, process, and visualize RNA-Seq data in CoGe:
- RNA-Seq Pipeline: Expression Analysis Pipeline
- Quantitation of reads per position along a genome
- Quantitation of FPKM for genomes with structural gene annotations
- Individually mapped reads
- All visualized in the EPIC-CoGe Browser (Based on JBrowse)
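For reference, FPKM (fragments per kilobase of transcript per million mapped fragments) normalizes a gene's read count by both gene length and sequencing depth. The sketch below applies the standard formula to made-up numbers; it does not reproduce the CoGe pipeline's own calculation, and the function name is hypothetical.

```python
def fpkm(fragment_count, gene_length_bp, total_mapped_fragments):
    """FPKM = fragments * 1e9 / (gene length in bp * total mapped fragments)."""
    if gene_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("gene length and total mapped fragments must be positive")
    return fragment_count * 1e9 / (gene_length_bp * total_mapped_fragments)

# Hypothetical gene: 500 fragments on a 2,000 bp gene in a library
# of 10 million mapped fragments.
example_value = fpkm(500, 2_000, 10_000_000)
print(example_value)
```

The factor of 1e9 combines the per-kilobase (1e3) and per-million (1e6) scalings.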