Generic workshop

From CoGepedia

genomevolution.org

Introduction

  1. Who has used CoGe?
  2. Preamble:
    1. Store any version of any genome from all of life
    2. Interconnected tools to analyze genomes at multiple levels of resolution
    3. Emphasis on exploring genomes as a biologist would an organism
  3. General types of research questions:
    1. I am interested in a group of organisms. . .
    2. I am interested in a group of genes . . .
    3. CoGe has tools to help answer questions in light of genome structure, dynamics, and evolution
  4. Who are you?
    1. Name
    2. Genes and genomes of interest
    3. Anything particular you'd like to know by the end of the workshop
  5. Workshop Organization
    1. Overview of CoGe's tools using example analyses to understand how they are linked together
    2. Open Q&A

Always ask questions

  1. First, is anyone interested in comparing large genomes?
    1. CoGe can do large analyses, but depending on the size and complexity of the genomes, some analyses may take a while to run. However, CoGe caches the results of large analyses.
  2. Home Page
    1. Info on the system
    2. Entrance tools
    3. Where to get more help

Introduction to some of the main tools

  1. OrganismView: Finding genomes and getting an overview of genomic data
    1. Start with bacterial genomes: small and fast to process, easier to visualize comparisons.
  2. GenomeView: Visually inspecting genomes: http://genomevolution.org/CoGe/GenomeView.pl?z=6&x=10000&dsgid=4242&chr=1
    1. MG1655 and horizontal gene transfer (phage insertion at position 280,000)
    2. Use browser layer "Wobble GC usage" to visualize
    3. Extract sequence and features
    4. Get annotations for feature list: http://genomevolution.org/CoGe/FeatList.pl?dsid=36725&chr=1&start=252944&stop=305680&gstid=1
  3. SynMap: Pair-wise whole genome comparison; syntenic dotplots
    1. E. coli DH10B and W3110: http://genomevolution.org/r/2vde
    2. What is the dotplot?
    3. Inversion
    4. Segmental duplications
    5. Insertions/deletions
    6. Saving analyses using links: "Regenerate this analysis . . ."
    7. Uncovering evolution: What happened at the central insertion?
      1. High resolution analysis with GEvo
      2. Extracting inserted region (SeqView)
      3. Extracting genomic features and annotations (FeatList)
      4. Adding another sequence to the region from NCBI: http://genomevolution.org/r/2ved
  4. Pick a sequence from around that region to explore a question that uses CoGeBlast:
    1. Number and location of transposons in the genome?
    2. Other genomes with insertion?
  5. Bacterial inversion
    1. Sequences involved with inversion
      1. http://genomevolution.org/r/2vmm
      2. http://genomevolution.org/r/2vml
      3. Merging GEvo analysis: http://genomevolution.org/r/2vmn
    2. X-alignments
    3. Crazy bacterial genome evolution or poor genome assembly: http://genomevolution.org/r/2vmo
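
The "Wobble GC usage" layer mentioned under GenomeView reflects GC content at the third (wobble) position of codons. A minimal sketch of that measure, assuming an in-frame coding sequence as input; the function name and the toy sequences are hypothetical and are not CoGe's own code or data from MG1655:

```python
def wobble_gc(cds: str) -> float:
    """Fraction of G or C bases at third codon positions.

    Assumes `cds` starts in frame; trailing bases that do not
    complete a codon are ignored.
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3   # drop an incomplete final codon
    third = cds[2:usable:3]            # every third base, starting at index 2
    if not third:
        raise ValueError("sequence is shorter than one codon")
    return sum(base in "GC" for base in third) / len(third)


# Hypothetical toy sequences for illustration only
print(wobble_gc("ATGGCCGAC"))  # codons ATG GCC GAC
print(wobble_gc("ATGAAATTA"))  # codons ATG AAA TTA
```

A horizontally transferred region such as a phage insertion often stands out because its wobble GC differs from that of the surrounding host genes.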

Analyzing larger genomes with SynMap:

  1. Human Chimp: http://genomevolution.org/r/4hs9
    1. Changing chromosome order on axis: http://genomevolution.org/r/4hsa
    2. Showing all matches versus just syntenic matches: http://genomevolution.org/r/4hsb
  2. Human Mouse: http://genomevolution.org/r/4hsc
    1. What are those other "dots"? Measuring evolutionary distance with synonymous mutation rates (Ks): http://genomevolution.org/r/4hsd (Importance of changing Ks color scheme)
    2. Merging GEvo analyses; Human-Chimp-Mouse
      1. Human-Chimp GEvo: http://genomevolution.org/r/2vj3
      2. Human-Mouse GEvo: http://genomevolution.org/r/2vj1
        1. Reverse complementing and masking non-CDS sequences!
      3. Merge: http://genomevolution.org/r/2vjc
  3. Arabidopsis thaliana v Arabidopsis lyrata: http://genomevolution.org/r/2veh
    1. Axis metrics: genes versus nucleotides
    2. Multiple coverage and whole-genome duplication events
    3. Quota-align and setting coverage limits
    4. Synonymous mutations
  4. Sorghum versus Maize: http://genomevolution.org/r/2vej
    1. Shared versus independent whole-genome duplications
  5. Rice versus Brachypodium: http://genomevolution.org/r/2vii
    1. Nested chromosome insertions
  6. Other large genomes?
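
The Human-Mouse GEvo step above relies on reverse complementing a region and masking non-CDS sequence. The two operations can be sketched as follows, assuming 0-based, half-open CDS coordinates; the function names and example sequence are hypothetical and do not represent GEvo's implementation:

```python
COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def mask_non_cds(seq: str, cds_ranges, mask_char: str = "N") -> str:
    """Replace every base outside the given CDS ranges with `mask_char`.

    `cds_ranges` is a list of (start, end) pairs, 0-based and half-open
    (an assumption made for this sketch).
    """
    masked = [mask_char] * len(seq)
    for start, end in cds_ranges:
        masked[start:end] = seq[start:end]  # copy the CDS bases back in
    return "".join(masked)


# Hypothetical toy region with one CDS at positions 2..6
region = "AACCGGTT"
print(mask_non_cds(region, [(2, 6)]))
print(reverse_complement("AACCGT"))
```

Masking everything except coding sequence keeps the comparison focused on genes, which helps when the species are distant enough that non-coding matches are mostly noise.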

Gene families

  1. CoGeBlast and identifying families
    1. Find a gene of interest in CoGe or get a sequence from elsewhere
    2. CoGeBlast to various plant genomes
      1. Use CoGeBlast to evaluate hits
      2. Select matching genome features to:
        1. Get sequences
        2. Send to http://phylogeny.fr for phylogenetic tree reconstruction
        3. Send to FeatList to manage list of genomic features.
    3. Use phylogenetic tree and FeatList to select and send genomic features to GEvo to analyze regions for evidence of synteny and classify according to:
      1. Orthologs
      2. Homeologs
      3. Transposition duplications
      4. Tandem duplications

So you have sequenced a genome and you have a pile of contigs. . .

  1. Syntenic path assembly in SynMap
    1. WGS sequence and de novo assembly of E. coli K12 to reference genome:
      1. Unsorted SynMap: http://genomevolution.org/r/2vjm
      2. Syntenic Path Assembly: http://genomevolution.org/r/2vjp
    2. Print out the assembled sequence, reload it into CoGe, predict gene models, and lift over annotations
  2. Something went wrong: When a sequencing sample was mixed up.
    1. OrganismView: http://genomevolution.org/CoGe/OrganismView.pl?dsgid=11499
    2. No SynMap to reference genome: http://genomevolution.org/CoGe/SynMap.pl?dsgid1=11499;dsgid2=782
    3. What is it? Link to NCBI BLAST through CoGeBlast: contig00001 hits E. coli
    4. SynMap with E. coli: http://genomevolution.org/r/2v88
      1. Lower syntenic region identification threshold: http://genomevolution.org/r/2vjz
      2. Use syntenic path assembly to see it is an E. coli genome: http://genomevolution.org/r/2vk1
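
The idea behind the syntenic path assembly can be sketched as ordering and orienting contigs by where their syntenic matches fall on the reference genome. The sketch below is a simplified illustration under stated assumptions, not SynMap's actual algorithm: the input format (contig name, reference position, strand as +1 or -1), the use of the median position, and the function name are all hypothetical.

```python
from collections import defaultdict
from statistics import median


def order_contigs(hits):
    """Order and orient contigs along a reference.

    `hits` is an iterable of (contig, ref_position, strand) tuples,
    with strand given as +1 or -1. Contigs are sorted by the median
    reference position of their hits and oriented by majority strand.
    """
    positions = defaultdict(list)
    strand_votes = defaultdict(int)
    for contig, ref_pos, strand in hits:
        positions[contig].append(ref_pos)
        strand_votes[contig] += strand
    ordered = sorted(positions, key=lambda c: median(positions[c]))
    return [(c, "+" if strand_votes[c] >= 0 else "-") for c in ordered]


# Hypothetical hits for three contigs against a reference
hits = [
    ("contig2", 500, 1), ("contig2", 600, 1),
    ("contig1", 100, -1), ("contig1", 150, -1),
    ("contig3", 900, 1),
]
print(order_contigs(hits))
```

Contigs with no syntenic matches never appear in the output, which mirrors the mixed-up sample case above: without shared synteny there is nothing to anchor the assembly to.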

Advanced functionality

  1. GC content shifts using SynMap and Amino Acid/Codon log odds scoring matrices
    1. Human-mouse example:
      1. SynMap: http://genomevolution.org/r/2viw
      2. Substitution Matrix:
    2. Plasmodia example:
      1. SynMap: http://genomevolution.org/r/2vk9
      2. Substitution Matrix: http://genomevolution.org/CoGe/SynSub.pl?dsgid1=9636;dsgid2=2465 (takes a while to load)
  2. Detecting mitochondrial insertion in Arabidopsis thaliana:
    1. SynMap: http://genomevolution.org/r/2vk8
    2. GEvo: http://genomevolution.org/r/2vke
  3. Auto-finding syntenic regions with SynFind:
    1. http://genomevolution.org/r/2v2b

Where to Learn More

CoGe's Tutorials

  • How-to articles:
    • Maydica: Several workflows and analyses using the genome of maize.