2011 BSA Workshop: Difference between revisions
Revision as of 19:35, 6 July 2011
genomevolution.org
Introduction
- Who has used CoGe?
- Preamble:
- Store any version of any genome from all of life
- Interconnected tools to analyze genomes at multiple levels of resolution: Open-ended Analyses!
- Your questions drive where you go and what analyses you perform; the tools should not dictate the questions you can ask.
- Emphasis on exploring genomes as a biologist would an organism
- General types of research questions:
- I am interested in a group of organisms. . .
- I am interested in a group of genes . . .
- CoGe has tools to help answer questions in light of genome structure, dynamics, and evolution
- Who are you?
- Name
- Genes and genomes of interest
- Anything particular you'd like to know by the end of the workshop
- Workshop Organization
- Overview of CoGe's tools using example analyses to understand how they are linked together
- Open QnA
Always ask questions
- What you need to run CoGe:
- Firefox
- Flash
- Enable Javascript
- Enable Popups (for CoGe only)
- Enable Cookies (if you have a CoGe user account)
- First, is anyone interested in comparing large genomes?
- CoGe can do large analyses, but depending on the size and complexity of the genomes, some analyses may take a while to run. However, CoGe caches the results of large analyses.
- CoGe Organization: Each tool is designed to do one thing and one thing well.
- Tools are linked to one another through URL/web-links that pop-up additional tabs.
- This creates an implicit record of each step of your analysis.
- Want to save where you are in an analysis? Copy the link and paste it into your notes.
- Most analyses generate URLs that you can save to regenerate the analysis.
- Data and analysis results are meant to be easily exported from CoGe.
- Download genomes and annotations
- Download whole genome comparison blast files
- Download syntenic gene-pairs between genomes
- Home Page
- Entrance Tools
- Connected Tools
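Because each analysis step is recorded as a link, it can help to split a saved link into the tool name and its parameters when writing it into your notes. The sketch below is only an illustration using Python's standard library: the function name `describe_coge_link` is hypothetical, and the meaning of individual parameters (such as `dsgid` or `chr`) is not documented here, so they are simply reported as found.

```python
from urllib.parse import urlsplit, parse_qsl


def describe_coge_link(url):
    """Split a saved link into (tool name, dict of query parameters).

    Hypothetical helper for note-taking; it does not contact CoGe.
    """
    parts = urlsplit(url)
    # Some links in this outline separate parameters with ';' instead of '&',
    # so normalize before parsing.
    query = parts.query.replace(";", "&")
    tool = parts.path.rsplit("/", 1)[-1]
    return tool, dict(parse_qsl(query))


if __name__ == "__main__":
    link = "http://genomevolution.org/CoGe/GenomeView.pl?z=6&x=10000&dsgid=4242&chr=1"
    tool, params = describe_coge_link(link)
    print(tool)
    for name, value in params.items():
        print(f"  {name} = {value}")
```

Storing the full link alongside this breakdown keeps the notes readable while still letting you paste the link back into a browser to regenerate the analysis.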
Starting with CoGe
- Home Page: http://genomevolution.org/CoGe/
- Info on the system: How many
- Organisms
- Genomes
- Nucleotides of genomic sequence
- Genome features (e.g. genes, mRNA, CDS, transposon, repeat region)
- Annotations
- Entrance tools
- OrganismView: Search for an organism by name or taxonomic description; get information about that genome. Links to downstream analyses.
- CoGeBlast: Blast sequences against any set of genomes in the system. You search for genomes and add them to a list. Great visuals for evaluating hits; automatic links to matching genomic features for additional downstream data retrieval and analyses
- FeatView: Search for genomic features by name; get information about that genomic feature (annotations, sequences, AT/GC content). Links to downstream analyses.
- SynMap: Pairwise whole genome comparisons with interactive and customizable syntenic dotplot visualizations. Links to downstream analyses.
- GEvo: High-resolution analysis of multiple genomic regions. Dynamic and interactive visualizations. Links to downstream analyses.
- Where to get more help
- [CoGePedia]
- OrganismView: Finding genomes and getting an overview of genomic data
- Start with bacterial genomes: small and fast to process, easier to visualize comparisons.
- Same methods work on any genome (though larger genomes may take longer to process).
- GenomeView: Visually inspecting genomes: http://genomevolution.org/CoGe/GenomeView.pl?z=6&x=10000&dsgid=4242&chr=1
- MG1655 and horizontal gene transfer (phage insertion at position 280,000)
- Use browser layer "Wobble GC usage" to visualize it
- Extract sequence and features
- Get annotations for feature list: http://genomevolution.org/CoGe/FeatList.pl?dsid=36725&chr=1&start=252944&stop=305680&gstid=1
- SynMap: Pair-wise whole genome comparison; syntenic dotplots
- E. coli DH10B and W3110: http://genomevolution.org/r/2vde
- What is the dotplot?
- Inversion
- Segmental duplications
- Insertions/deletions
- Saving analyses using links: "Regenerate this analysis . . ."
- Uncovering evolution: What happened at the central insertion?
- High resolution analysis with GEvo
- Extracting inserted region (SeqView)
- Extracting genomic features and annotations (FeatList)
- Adding another sequence to the region from NCBI: http://genomevolution.org/r/2ved
- Pick a sequence from around that region to explore a question that uses CoGeBlast:
- Number and location of transposons in the genome?
- Other genomes with insertion?
- Bacterial inversion
- Sequences involved with inversion
- http://genomevolution.org/r/2vmm
- http://genomevolution.org/r/2vml
- Merging GEvo analysis: http://genomevolution.org/r/2vmn
- X-alignments
- Crazy bacterial genome evolution or poor genome assembly: http://genomevolution.org/r/2vmo
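To make the "What is the dotplot?" question concrete, here is a minimal sketch of the underlying idea: every short word shared between two sequences becomes a dot at (position in A, position in B). Collinear regions then form a rising diagonal, and an inversion forms a falling one, because the matches are found on the reverse-complement strand. This is a toy exact-match illustration, not how SynMap computes its dotplots; the function names and the word size `k` are assumptions made for the example.

```python
def revcomp(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def dotplot_matches(a, b, k=4):
    """Return (i, j, strand) for each k-mer at a[i:] that occurs at b[j:].

    strand is '+' for a direct match and '-' for a reverse-complement match.
    """
    # Index every k-mer of b by its start position.
    index = {}
    for j in range(len(b) - k + 1):
        index.setdefault(b[j:j + k], []).append(j)

    hits = []
    for i in range(len(a) - k + 1):
        word = a[i:i + k]
        for j in index.get(word, []):
            hits.append((i, j, "+"))
        for j in index.get(revcomp(word), []):
            hits.append((i, j, "-"))
    return hits


if __name__ == "__main__":
    a = "GATTACAGGC"
    # Identical sequences: '+' dots along the main diagonal.
    print(dotplot_matches(a, a))
    # Fully inverted sequence: '-' dots along the anti-diagonal.
    print(dotplot_matches(a, revcomp(a)))
```

Insertions and deletions would appear as a diagonal that shifts sideways, and segmental duplications as extra parallel diagonals.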
Analyzing larger genomes with SynMap:
- Human Chimp: http://genomevolution.org/r/2vik
- Changing chromosome order on axis: http://genomevolution.org/r/2vin
- Showing all matches versus just syntenic matches: http://genomevolution.org/r/2vis
- Human Mouse: http://genomevolution.org/r/2vip
- What are those other "dots": measuring evolutionary distance with synonymous mutation rates (Ks): http://genomevolution.org/r/2viw (Importance of changing Ks color scheme)
- Merging GEvo analyses; Human-Chimp-Mouse
- Human-Chimp GEvo: http://genomevolution.org/r/2vj3
- Human-Mouse GEvo: http://genomevolution.org/r/2vj1
- Reverse complementing and masking non-CDS sequences!
- Merge: http://genomevolution.org/r/2vjc
- Arabidopsis thaliana v Arabidopsis lyrata: http://genomevolution.org/r/2veh
- Axis metrics: genes versus nucleotides
- Multiple coverage and whole genome duplication events
- Quota-align and setting coverage limits
- Synonymous mutations
- Sorghum versus Maize: http://genomevolution.org/r/2vej
- Shared versus independent Whole genome duplications
- Rice versus Brachypodium: http://genomevolution.org/r/2vii
- Nested chromosome insertions
- Other large genomes?
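The "Reverse complementing and masking non-CDS sequences" step in the Human-Chimp-Mouse example can be pictured with a small sketch. It assumes CDS locations are given as 0-based, half-open `(start, end)` intervals and that masked bases are replaced with `N`; both conventions, and the function names, are choices made for this illustration rather than a description of what GEvo does internally.

```python
COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string (case preserved)."""
    return seq.translate(COMPLEMENT)[::-1]


def mask_non_cds(seq, cds_intervals, mask_char="N"):
    """Replace every base outside the CDS intervals with mask_char.

    cds_intervals: iterable of 0-based, half-open (start, end) pairs.
    """
    keep = [False] * len(seq)
    for start, end in cds_intervals:
        # Clamp intervals to the sequence so out-of-range input is harmless.
        for pos in range(max(start, 0), min(end, len(seq))):
            keep[pos] = True
    return "".join(base if kept else mask_char for base, kept in zip(seq, keep))


if __name__ == "__main__":
    region = "AACCGGTT"
    print(reverse_complement("ATGC"))
    print(mask_non_cds(region, [(2, 4), (6, 8)]))
```

Masking everything except coding sequence restricts a comparison to matches between coding regions, which can make distant comparisons such as human versus mouse easier to read.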
Gene families
- CoGe Blast and identifying families
- Find a gene of interest in CoGe or get a sequence from elsewhere
- CoGeBlast to various plant genomes
- Use CoGeBlast to evaluate hits
- Select matching genome features to:
- Get sequences
- Send to http://phylogeny.fr for phylogenetic tree reconstruction
- Send to FeatList to manage list of genomic features.
- Use phylogenetic tree and FeatList to select and send genomic features to GEvo to analyze regions for evidence of synteny and classify according to:
- Orthologs
- Homeologs
- Transposition duplications
- Tandem duplications
So you have sequenced a genome and you have a pile of contigs. . .
- Syntenic path assembly in SynMap
- WGS sequence and de novo assembly of E. coli K12 to reference genome:
- Unsorted SynMap: http://genomevolution.org/r/2vjm
- Syntenic Path Assembly: http://genomevolution.org/r/2vjp
- Print out assembled sequence. Reload into CoGe. Gene model predictions. Lift-over annotations
- Something went wrong: When a sequencing sample was mixed up.
- OrganismView: http://genomevolution.org/CoGe/OrganismView.pl?dsgid=11499
- No SynMap to reference genome: http://genomevolution.org/CoGe/SynMap.pl?dsgid1=11499;dsgid2=782
- What is it? Link to NCBI blast through CoGeBlast: contig00001 hits E. coli
- SynMap with E. coli: http://genomevolution.org/r/2v88
- Lower syntenic region identification threshold: http://genomevolution.org/r/2vjz
- Use syntenic path assembly to see it is an E. coli genome: http://genomevolution.org/r/2vk1
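The idea behind ordering a pile of contigs against a reference can be sketched in a few lines: place each contig at the typical reference position of its syntenic matches, and flip it if most matches are on the opposite strand. This is a deliberately simplified illustration and should not be read as SynMap's Syntenic Path Assembly algorithm; the input tuple layout and the function name `order_contigs` are assumptions made for the example.

```python
from statistics import median


def order_contigs(hits):
    """Order and orient contigs along a reference.

    hits: iterable of (contig, ref_start, ref_end, strand), strand '+' or '-'.
    Returns a list of (contig, orientation) sorted by reference position.
    """
    by_contig = {}
    for contig, ref_start, ref_end, strand in hits:
        midpoint = (ref_start + ref_end) / 2
        by_contig.setdefault(contig, []).append((midpoint, strand))

    layout = []
    for contig, items in by_contig.items():
        # The median midpoint is less sensitive to a stray match than the mean.
        position = median(mid for mid, _ in items)
        minus = sum(1 for _, strand in items if strand == "-")
        orientation = "-" if minus > len(items) / 2 else "+"
        layout.append((position, contig, orientation))

    layout.sort()
    return [(contig, orientation) for _, contig, orientation in layout]


if __name__ == "__main__":
    example_hits = [
        ("c2", 5000, 6000, "+"),
        ("c1", 100, 900, "-"),
        ("c1", 1000, 1800, "-"),
        ("c3", 9000, 9500, "+"),
        ("c2", 6100, 7000, "+"),
    ]
    print(order_contigs(example_hits))
```

Contigs with no matches to the reference never appear in the layout, which mirrors the mixed-up sample example above: when nothing lines up, the reference is probably the wrong one.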
Advanced functionality
- GC content shifts using SynMap and Amino Acid/Codon log odds scoring matrices
- Human-mouse example:
- SynMap: http://genomevolution.org/r/2viw
- Substitution Matrix:
- Plasmodia example:
- SynMap: http://genomevolution.org/r/2vk9
- Substitution Matrix: http://genomevolution.org/CoGe/SynSub.pl?dsgid1=9636;dsgid2=2465 (Takes a while to load)
- Detecting mitochondria insertion in Arabidopsis thaliana:
- Auto-finding syntenic regions with SynFind:
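For the GC content shifts mentioned under advanced functionality, and the "Wobble GC usage" layer used earlier, the quantities involved are simple to compute by hand. The sketch below assumes an in-frame coding sequence that starts at the first base of a codon and ignores ambiguous bases; the function names are hypothetical, and CoGe's own calculation may differ in detail.

```python
def gc_fraction(seq):
    """Fraction of unambiguous bases (A, C, G, T) that are G or C."""
    seq = seq.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        return 0.0
    return (seq.count("G") + seq.count("C")) / counted


def wobble_gc_fraction(cds):
    """GC fraction at third (wobble) codon positions of an in-frame CDS."""
    # Ignore a trailing partial codon, if any.
    usable = len(cds) - len(cds) % 3
    third_positions = cds[2:usable:3]
    return gc_fraction(third_positions)


if __name__ == "__main__":
    cds = "ATGGCCAAA"  # codons: ATG GCC AAA
    print(gc_fraction(cds))
    print(wobble_gc_fraction(cds))
```

Because many third-position changes are synonymous, wobble GC can drift further than overall GC, which is why it is a useful signal for spotting horizontally transferred regions and lineage-specific compositional shifts.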