2011 Berkeley Workshop
genomevolution.org
Introduction
- Who has used CoGe?
- Preamble:
- Store any version of any genome from all of life
- Interconnected tools to analyze genomes at multiple levels of resolution
- Emphasis on exploring genomes as a biologist would an organism
- General types of research questions:
- I am interested in a group of organisms. . .
- I am interested in a group of genes . . .
- CoGe has tools to help answer questions in light of genome structure, dynamics, and evolution
- Who are you?
- Name
- Genes and genomes of interest
- Anything particular you'd like to know by the end of the workshop
- Workshop Organization
- Overview of CoGe's tools using example analyses to understand how they are linked together
- Open Q&A
Always ask questions
- First, is anyone interested in comparing large genomes?
- CoGe can do large analyses, but depending on the size and complexity of the genomes, some analyses may take a while to run. However, CoGe caches the results of large analyses.
- Home Page
- Info on the system
- Entrance tools
- Where to get more help
- OrganismView: Finding genomes and getting an overview of genomic data
- Start with bacterial genomes: small and fast to process, easier to visualize comparisons.
- GenomeView: Visually inspecting genomes: http://genomevolution.org/CoGe/GenomeView.pl?z=6&x=10000&dsgid=4242&chr=1
- MG1655 and horizontal gene transfer (phage insertion at position 280,000)
- Use browser layer "Wobble GC usage" to visualize
- Extract sequence and features
- Get annotations for feature list: http://genomevolution.org/CoGe/FeatList.pl?dsid=36725&chr=1&start=252944&stop=305680&gstid=1
- SynMap: Pair-wise whole genome comparison; syntenic dotplots
- E. coli DH10B and W3110: http://genomevolution.org/r/2vde
- What is the dotplot?
- Inversion
- Segmental duplications
- Insertions/deletions
- Saving analyses using links: "Regenerate this analysis . . ."
- Uncovering evolution: What happened at the central insertion?
- High resolution analysis with GEvo
- Extracting inserted region (SeqView)
- Extracting genomic features and annotations (FeatList)
- Adding another sequence to the region from NCBI: http://genomevolution.org/r/2ved
- Pick a sequence from around that region to explore a question that uses CoGeBlast:
- Number and location of transposons in the genome?
- Other genomes with insertion?
- Bacterial inversion
- Sequences involved with inversion
- http://genomevolution.org/r/2vmm
- http://genomevolution.org/r/2vml
- Merging GEvo analysis: http://genomevolution.org/r/2vmn
- X-alignments
- Crazy bacterial genome evolution or poor genome assembly: http://genomevolution.org/r/2vmo
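The GenomeView and FeatList links above carry their settings in the URL query string, which is what makes them shareable. As a minimal sketch (not part of CoGe itself), the snippet below pulls those parameters out of two links from this outline. The helper name `coge_link_params` is hypothetical, and treating `start`/`stop` as inclusive coordinates is an assumption.

```python
from urllib.parse import urlsplit, parse_qsl

def coge_link_params(url):
    """Return the query parameters of a CoGe tool link as a plain dict."""
    query = urlsplit(url).query
    # Some CoGe links in this outline separate parameters with ';' rather
    # than '&'. Normalise first so parsing does not depend on Python version.
    query = query.replace(";", "&")
    return dict(parse_qsl(query))

# Links copied from the workshop outline.
genome_view = ("http://genomevolution.org/CoGe/GenomeView.pl"
               "?z=6&x=10000&dsgid=4242&chr=1")
feat_list = ("http://genomevolution.org/CoGe/FeatList.pl"
             "?dsid=36725&chr=1&start=252944&stop=305680&gstid=1")

print(coge_link_params(genome_view))

params = coge_link_params(feat_list)
# Length of the extracted region, assuming start and stop are both inclusive.
region_length = int(params["stop"]) - int(params["start"]) + 1
print(region_length)
```

Keeping the parameters in a dict makes it easy to compare two saved links and see which setting changed between analyses.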
Analyzing larger genomes with SynMap:
- Human Chimp: http://genomevolution.org/r/2vik
- Changing chromosome order on axis: http://genomevolution.org/r/2vin
- Showing all matches versus just syntenic matches: http://genomevolution.org/r/2vis
- Human Mouse: http://genomevolution.org/r/2vip
- What are those other "dots"? Measuring evolutionary distance with synonymous mutation rates (Ks): http://genomevolution.org/r/2viw (Importance of changing Ks color scheme)
- Merging GEvo analyses; Human-Chimp-Mouse
- Human-Chimp GEvo: http://genomevolution.org/r/2vj3
- Human-Mouse GEvo: http://genomevolution.org/r/2vj1
- Reverse complementing and masking non-CDS sequences!
- Merge: http://genomevolution.org/r/2vjc
- Arabidopsis thaliana versus Arabidopsis lyrata: http://genomevolution.org/r/2veh
- Axis metrics: genes versus nucleotides
- Multiple coverage and whole genome duplication events
- Quota-align and setting coverage limits
- Synonymous mutations
- Sorghum versus Maize: http://genomevolution.org/r/2vej
- Shared versus independent whole genome duplications
- Rice versus Brachypodium: http://genomevolution.org/r/2vii
- Nested chromosome insertions
- Other large genomes?
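The Human-Chimp-Mouse GEvo step above relies on reverse complementing a sequence and masking non-CDS sequence before merging. The sketch below illustrates both operations on a plain string. It is not CoGe's implementation; the function names are hypothetical, and the 1-based inclusive interval convention is an assumption.

```python
# Complement table for unambiguous DNA bases plus N, in both cases.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")

def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]

def mask_non_cds(seq, cds_intervals, mask_char="N"):
    """Replace every base outside the given CDS intervals with mask_char.

    cds_intervals holds (start, stop) pairs, assumed 1-based and inclusive.
    """
    masked = [mask_char] * len(seq)
    for start, stop in cds_intervals:
        lo = max(start, 1) - 1          # convert to a 0-based slice start
        hi = min(stop, len(seq))        # clip to the sequence end
        masked[lo:hi] = seq[lo:hi]      # copy the coding bases back in
    return "".join(masked)

print(reverse_complement("ATGAAAC"))
print(mask_non_cds("ATGAAACCC", [(1, 3)]))
```

Masking everything except coding sequence keeps an alignment focused on genes, which is why it helps when comparing distantly related genomes.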
Gene families
- CoGe Blast and identifying families
- Find a gene of interest in CoGe or get a sequence from elsewhere
- CoGeBlast to various plant genomes
- Use CoGeBlast to evaluate hits
- Select matching genome features to:
- Get sequences
- Send to http://phylogeny.fr for phylogenetic tree reconstruction
- Send to FeatList to manage list of genomic features.
- Use phylogenetic tree and FeatList to select and send genomic features to GEvo to analyze regions for evidence of synteny and classify according to:
- Orthologs
- Homeologs
- Transposition duplications
- Tandem duplications
So you have sequenced a genome and you have a pile of contigs . . .
- Syntenic path assembly in SynMap
- WGS sequence and de novo assembly of E. coli K12 to reference genome:
- Unsorted SynMap: http://genomevolution.org/r/2vjm
- Syntenic Path Assembly: http://genomevolution.org/r/2vjp
- Print out assembled sequence. Reload into CoGe. Gene model predictions. Lift-over annotations
- Something went wrong: when a sequencing sample was mixed up.
- OrganismView: http://genomevolution.org/CoGe/OrganismView.pl?dsgid=11499
- No SynMap to reference genome: http://genomevolution.org/CoGe/SynMap.pl?dsgid1=11499;dsgid2=782
- What is it: Link to NCBI blast through CoGeBlast: contig00001 hits E. coli
- SynMap with E. coli: http://genomevolution.org/r/2v88
- Lower syntenic region identification threshold: http://genomevolution.org/r/2vjz
- Use syntenic path assembly to see it is an E. coli genome: http://genomevolution.org/r/2vk1
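The idea behind the syntenic path assembly step can be illustrated with a toy example: place each contig where its syntenic matches fall on the reference, then flip contigs whose matches run in reverse. This is a deliberately simplified sketch, not SynMap's algorithm. The function name, the input tuple layout, and the contig names and positions in the example are all hypothetical.

```python
from statistics import median

def order_contigs(hits):
    """Order and orient contigs along a reference.

    hits: iterable of (contig_name, ref_position, strand) tuples, one per
    syntenic match, where strand is +1 or -1 relative to the reference.
    Returns a list of (contig_name, orientation) in reference order.
    """
    by_contig = {}
    for name, ref_pos, strand in hits:
        by_contig.setdefault(name, []).append((ref_pos, strand))

    placed = []
    for name, matches in by_contig.items():
        # Anchor each contig at the median reference position of its matches.
        anchor = median(pos for pos, _ in matches)
        # Majority vote on strand decides whether to reverse complement.
        orientation = "+" if sum(s for _, s in matches) >= 0 else "-"
        placed.append((anchor, name, orientation))

    placed.sort()
    return [(name, orientation) for _, name, orientation in placed]

# Made-up matches for three contigs against a reference.
example_hits = [
    ("contig00003", 5000, 1), ("contig00003", 5200, 1),
    ("contig00001", 120000, -1), ("contig00001", 118000, -1),
    ("contig00002", 60000, 1),
]
print(order_contigs(example_hits))
```

Using the median rather than the first match keeps a single stray hit from misplacing a contig.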
Advanced functionality
- GC content shifts using SynMap and Amino Acid/Codon log odds scoring matrices
- Human-mouse example:
- SynMap: http://genomevolution.org/r/2viw
- Substitution Matrix:
- Plasmodia example:
- SynMap: http://genomevolution.org/r/2vk9
- Substitution Matrix: http://genomevolution.org/CoGe/SynSub.pl?dsgid1=9636;dsgid2=2465 (takes a while to load)
- Detecting mitochondria insertion in Arabidopsis thaliana:
- Auto-finding syntenic regions with SynFind:
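The GC content shifts item in this section, like the "Wobble GC usage" layer used earlier in GenomeView, concerns GC content at the third (wobble) codon position. As a rough sketch of that quantity, the function below computes the fraction of G or C bases at third positions of an in-frame coding sequence. The name `gc3` is hypothetical, and this is not presented as how CoGe computes its layer or matrices.

```python
def gc3(cds):
    """Fraction of G or C at third codon positions of an in-frame CDS."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3    # ignore a trailing partial codon
    third = cds[2:usable:3]             # every third base, starting at base 3
    if not third:
        return 0.0
    return sum(base in "GC" for base in third) / len(third)

# Codons: ATG GCC AAA TGA, so the third positions are G, C, A, A.
print(gc3("ATGGCCAAATGA"))
```

Comparing this value between orthologous genes in two genomes is one simple way to see a GC shift at synonymous sites.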