2011 Berkeley Workshop

From CoGepedia

Revision as of 21:06, 18 April 2011

Introduction

  1. Who has used CoGe?
  2. Preamble:
    1. Store any version of any genome from all of life
    2. Interconnected tools to analyze genomes at multiple levels of resolution
    3. Emphasis on exploring genomes as a biologist would an organism
  3. General types of research questions:
    1. I am interested in a group of organisms. . .
    2. I am interested in a group of genes . . .
    3. CoGe has tools to help answer questions in light of genome structure, dynamics, and evolution
  4. Who are you?
    1. Name
    2. Genes and genomes of interest
    3. Anything particular you'd like to know by the end of the workshop
  5. Workshop Organization
    1. Overview of CoGe's tools using example analyses to understand how they are linked together
    2. Open Q&A

Always ask questions

  1. First, is anyone interested in comparing large genomes?
    1. CoGe can run large analyses, but depending on the size and complexity of the genomes, some may take a while to run. However, CoGe caches the results of large analyses, so a completed analysis does not need to be recomputed.
  2. Home Page
  3. OrganismView: Finding genomes and getting an overview of genomic data
    1. Start with bacterial genomes: small and fast to process, and comparisons are easier to visualize.
  4. GenomeView: Visually inspecting genomes: MG1655 and horizontal gene transfer
  5. SynMap: Pair-wise whole genome comparison; syntenic dotplots
    1. E. coli DH10B and W3110: http://genomevolution.org/r/2vde
    2. What is the dotplot?
    3. Inversion
    4. Segmental duplications
    5. Insertions/deletions
    6. Saving analyses using links: "Regenerate this analysis . . ."
    7. Uncovering evolution: What happened at the central insertion?
      1. High resolution analysis with GEvo
      2. Extracting inserted region (SeqView)
      3. Extracting genomic features and annotations (FeatList)
      4. Adding another sequence to the region from NCBI: http://genomevolution.org/r/2ved
  6. Pick a sequence from around that region to explore a question that uses CoGeBlast:
    1. Number and location of transposons in the genome?
    2. Other genomes with insertion?
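
To make the "What is the dotplot?" step concrete, here is a minimal Python sketch of the idea behind a dotplot: every position pair where the two sequences share a short exact word gets a dot. This illustrates the concept only. It is not how SynMap computes its plots, and the function name `kmer_dots` and the word size `k` are assumptions chosen for the example.

```python
from collections import defaultdict


def kmer_dots(seq_a, seq_b, k=4):
    """Return sorted (i, j) pairs where seq_a[i:i+k] == seq_b[j:j+k].

    Hypothetical helper for illustration; plotting i on the x axis and j
    on the y axis gives a simple dotplot.
    """
    # Index every k-mer of seq_b by its start position.
    index = defaultdict(list)
    for j in range(len(seq_b) - k + 1):
        index[seq_b[j:j + k]].append(j)

    # Look up each k-mer of seq_a and record the matching position pairs.
    dots = []
    for i in range(len(seq_a) - k + 1):
        for j in index.get(seq_a[i:i + k], []):
            dots.append((i, j))
    return sorted(dots)


if __name__ == "__main__":
    reference = "ACGGTCATTG"
    shifted = "TT" + reference  # same sequence with a 2 bp insertion at the start
    print(kmer_dots(reference, reference))  # dots on the main diagonal
    print(kmer_dots(reference, shifted))    # diagonal displaced by the insertion
```

In such a plot, collinear regions appear as diagonal lines, an insertion or deletion shifts the diagonal, and an inversion would show up as a line with the opposite slope once reverse-strand matches are included.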

Analyzing larger genomes with SynMap:

  1. Human versus Chimp: http://genomevolution.org/r/2vik
    1. Changing chromosome order on axis: http://genomevolution.org/r/2vin
    2. Showing all matches versus just syntenic matches: http://genomevolution.org/r/2vis
  2. Human versus Mouse: http://genomevolution.org/r/2vip
    1. What are those other "dots"? Measuring evolutionary distance with synonymous mutation rates
    2. Merging GEvo analyses; Human-Chimp-Mouse
      1. Human-Chimp GEvo: http://genomevolution.org/r/2vj3
      2. Human-Mouse GEvo: http://genomevolution.org/r/2vj1
        1. Reverse complementing and masking non-CDS sequences!
      3. Merge: http://genomevolution.org/r/2vjc
  3. Arabidopsis thaliana versus Arabidopsis lyrata: http://genomevolution.org/r/2veh
    1. Axis metrics: genes versus nucleotides
    2. Multiple coverage and whole genome duplication events
    3. Quota-align and setting coverage limits
    4. Synonymous mutations
  4. Sorghum versus Maize: http://genomevolution.org/r/2vej
    1. Shared versus independent whole genome duplications
  5. Rice versus Brachypodium: http://genomevolution.org/r/2vii
    1. Nested chromosome insertions
  6. Other large genomes?
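
The "Reverse complementing and masking non-CDS sequences" step in the Human-Mouse GEvo analysis can be illustrated with a short Python sketch. It shows what the two operations do to a sequence, not how GEvo implements them. The function names, the 0-based half-open interval convention, and the use of `N` as the mask character are assumptions made for the example.

```python
# Translation table for complementing DNA bases (other characters, such as N, pass through).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def mask_non_cds(seq, cds_intervals, mask_char="N"):
    """Replace every base outside the given CDS intervals with mask_char.

    cds_intervals: iterable of (start, end) pairs, 0-based and half-open
    (an assumed convention for this sketch).
    """
    masked = [mask_char] * len(seq)
    for start, end in cds_intervals:
        # Copy the coding bases back into the otherwise masked sequence.
        masked[start:end] = seq[start:end]
    return "".join(masked)


if __name__ == "__main__":
    region = "AACCGGTT"
    print(reverse_complement("ATGC"))        # GCAT
    print(mask_non_cds(region, [(2, 4)]))    # NNCCNNNN
```

Masking everything except coding sequence keeps the comparison focused on genes, which helps when the two genomes are distant enough that non-coding matches are mostly noise.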

Gene families

  1. CoGe Blast and identifying families
    1. Find a gene of interest in CoGe or get a sequence from elsewhere
    2. CoGeBlast to various plant genomes
      1. Use CoGeBlast to evaluate hits
      2. Select matching genome features to:
        1. Get sequences
        2. Send to http://phylogeny.fr for phylogenetic tree reconstruction
        3. Send to FeatList to manage list of genomic features.
    3. Use phylogenetic tree and FeatList to select and send genomic features to GEvo to analyze regions for evidence of synteny and classify according to:
      1. Orthologs
      2. Homeologs
      3. Transposition duplications
      4. Tandem duplications
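
Of the four classes above, tandem duplications are the easiest to recognize from gene positions alone: family members that sit next to each other on the same chromosome. The following Python sketch groups such neighbors. It is an illustration under stated assumptions and not a CoGe function; the name `tandem_clusters`, the input tuple layout, and the `max_intervening` threshold are all hypothetical. Distinguishing orthologs, homeologs, and transposed duplicates still requires the synteny evidence examined in GEvo.

```python
def tandem_clusters(genes, max_intervening=1):
    """Group gene family members that lie in tandem on a chromosome.

    genes: list of (name, chromosome, gene_index) tuples, where gene_index
    is the gene's rank along its chromosome (assumed input format).
    max_intervening: how many unrelated genes may sit between two members
    of the same cluster (an assumed threshold).
    Returns a list of clusters, each a list of gene names, size >= 2.
    """
    ordered = sorted(genes, key=lambda g: (g[1], g[2]))
    clusters, current = [], []
    for gene in ordered:
        if current:
            _, prev_chrom, prev_idx = current[-1]
            adjacent = (gene[1] == prev_chrom
                        and gene[2] - prev_idx - 1 <= max_intervening)
            if not adjacent:
                # Close the running cluster; keep it only if it has 2+ members.
                if len(current) > 1:
                    clusters.append([g[0] for g in current])
                current = []
        current.append(gene)
    if len(current) > 1:
        clusters.append([g[0] for g in current])
    return clusters


if __name__ == "__main__":
    family = [
        ("g1", "chr1", 10), ("g2", "chr1", 11), ("g3", "chr1", 13),
        ("g4", "chr1", 40), ("g5", "chr2", 41), ("g6", "chr2", 42),
    ]
    print(tandem_clusters(family))  # [['g1', 'g2', 'g3'], ['g5', 'g6']]
```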

So you have sequenced a genome and you have a pile of contigs. . .

  1. Syntenic path assembly in SynMap
    1. WGS sequence and de novo assembly of E. coli K12 to reference genome:
      1. Unsorted SynMap: http://genomevolution.org/r/2vjm
      2. Syntenic Path Assembly: http://genomevolution.org/r/2vjp
    2. Print out assembled sequence. Reload into CoGe. Gene model predictions. Lift-over annotations
  2. Something went wrong: When a sequencing sample was mixed up.
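
The idea behind a syntenic path assembly, ordering and orienting contigs by where their syntenic matches fall on a reference genome, can be sketched in a few lines of Python. This is a simplified illustration and not SynMap's algorithm. The function name `syntenic_path`, the input format, and the choice of median position and majority strand are assumptions made for the example.

```python
from statistics import median


def syntenic_path(hits):
    """Order and orient contigs along a reference genome.

    hits: iterable of (contig, ref_position, strand) tuples, one per
    syntenic match, with strand given as +1 or -1 (assumed input format).
    Returns a list of (contig, orientation) in reference order.
    """
    by_contig = {}
    for contig, ref_pos, strand in hits:
        by_contig.setdefault(contig, []).append((ref_pos, strand))

    layout = []
    for contig, matches in by_contig.items():
        # Place the contig at the median reference position of its matches.
        position = median(p for p, _ in matches)
        # Orient it by the majority strand of its matches.
        orientation = "+" if sum(s for _, s in matches) >= 0 else "-"
        layout.append((position, contig, orientation))

    return [(contig, orientation) for _, contig, orientation in sorted(layout)]


if __name__ == "__main__":
    hits = [
        ("c3", 900, 1),
        ("c1", 100, -1), ("c1", 150, -1),
        ("c2", 500, 1), ("c2", 520, -1), ("c2", 540, 1),
    ]
    print(syntenic_path(hits))  # [('c1', '-'), ('c2', '+'), ('c3', '+')]
```

Contigs with no matches to the reference do not appear in the result. A contig that lands somewhere unexpected is worth a closer look, since it may point to a real rearrangement or to a sample problem like the mix-up discussed above.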