OrganismView

From CoGepedia
Revision as of 11:44, 8 March 2012 by Elyons (Talk | contribs) (Organization and Information)

OrganismView is CoGe's tool for finding the genome of an organism of interest and getting an overview of its genomic information.

Introduction

CoGe is designed to store multiple versions of any genome, from multiple organisms in all domains of life, in any state of assembly and annotation. This includes bacteria, archaea, eukaryotes, organelles, viruses, and sub-genomes such as plasmids. The genomic sequence can also exist in different states: partially assembled, fully assembled, completely unmasked, masked for repeats, etc. Different sets of genomic features and annotations can exist as well. OrganismView lets users get detailed information about the genomes available for a given organism, and provides links to other tools in CoGe for extracting and visualizing various types of genomic information.

Getting Started

How OrganismView appears when first loaded. You can search for your organism by name (Genus species) or by description (Linnaean lineage).

Most organisms in CoGe use the scientific binomen (i.e. Genus species; e.g. Escherichia coli) for their name and full Linnaean lineage for their description (e.g. Bacteria; Proteobacteria; Gammaproteobacteria; Enterobacteriales; Enterobacteriaceae; Escherichia).

To search for an organism, type any part of its name or description in OrganismView's "Organism Name" or "Organism Description" search box, respectively. OrganismView searches for anything that matches and displays the matching organisms in a selectable list below the header "Organisms:". The small number next to the "Organisms:" header is the number of organisms whose name or description matched your search term. Next, scroll through the list and select your organism. Information about it will automatically appear in the other sections of OrganismView.
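The matching described above can be illustrated with a minimal Python sketch. It is not CoGe's implementation: the `Organism` class, the `search_organisms` function, and the case-insensitive substring comparison are all assumptions made for illustration, and the lineages in the sample data are abbreviated.

```python
from dataclasses import dataclass


@dataclass
class Organism:
    name: str         # scientific binomen, e.g. "Escherichia coli"
    description: str  # Linnaean lineage (abbreviated in the sample data)


def search_organisms(organisms, name_term="", description_term=""):
    """Return organisms whose name and description contain the given terms.

    Assumption: matching is a case-insensitive substring test.
    An empty term matches every organism.
    """
    name_term = name_term.lower()
    description_term = description_term.lower()
    return [
        org for org in organisms
        if name_term in org.name.lower()
        and description_term in org.description.lower()
    ]


# Hypothetical sample data, not a dump of CoGe's database.
organisms = [
    Organism("Escherichia coli", "Bacteria; Proteobacteria; Gammaproteobacteria"),
    Organism("Arabidopsis thaliana", "Eukaryota; Viridiplantae; Brassicaceae"),
    Organism("Arabidopsis lyrata", "Eukaryota; Viridiplantae; Brassicaceae"),
]

matches = search_organisms(organisms, name_term="arabid")
# The count corresponds to the small number shown next to "Organisms:".
print(len(matches), [org.name for org in matches])
```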

Organization and Information

Searching for organisms whose name contains 'arabid'.

When an organism is selected, various types of information are shown in varying degrees of scope (listed largest to smallest):

  • Organism: top level list of organisms
  • Genome: whole genome information. For those interested in CoGe's database, this refers to the dataset group table.
  • Dataset: a given genome is composed of one or more datasets. Different genomic resources organize genomic information differently, and datasets allow CoGe to represent how an organism's genome was acquired. For example, each chromosome may come from a separate data file.
  • Chromosome: the list of chromosomes for a selected dataset.

OrganismView lays out the above information from the top to the bottom of the screen. For each scope level, a selectable list is shown on the left of the screen, and information about the current selection is shown on the right.
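The nesting of these scope levels can be pictured as a small data model. The following is only a sketch: the class and field names are hypothetical and do not reflect CoGe's actual database schema, and the lengths in the example are invented.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Chromosome:
    name: str
    length: int  # length in base pairs


@dataclass
class Dataset:
    name: str  # e.g. the data file a sequence came from
    chromosomes: List[Chromosome] = field(default_factory=list)


@dataclass
class Genome:
    version: str
    sequence_type: str  # e.g. "unmasked" or "masked"
    datasets: List[Dataset] = field(default_factory=list)

    def chromosome_count(self) -> int:
        """Number of chromosomes across all datasets of this genome."""
        return sum(len(d.chromosomes) for d in self.datasets)

    def total_length(self) -> int:
        """Summed length of every chromosome in every dataset."""
        return sum(c.length for d in self.datasets for c in d.chromosomes)


@dataclass
class Organism:
    name: str
    description: str
    genomes: List[Genome] = field(default_factory=list)


# Invented example: one genome built from two separate data files.
genome = Genome(
    version="1",
    sequence_type="unmasked",
    datasets=[
        Dataset("chromosome.fasta", [Chromosome("chr1", 1000)]),
        Dataset("plasmid.fasta", [Chromosome("plasmid1", 200)]),
    ],
)
organism = Organism("Escherichia coli", "Bacteria; Proteobacteria", [genome])
print(genome.chromosome_count(), genome.total_length())
```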

Organism Information

Shows the name and description for the selected organism.

Genome Information

Overview of the genome:

  • Chromosome count (will be very high for partially assembled genomes)
  • Sequence type: Unmasked sequence, masked sequence
  • Total length: The combined length of all datasets making up this genome, which may include plasmids, organelles, etc., depending on how the "genome" was defined by whoever sequenced it. Percent GC is calculated automatically for genomes smaller than 10 megabases; for larger genomes, the user can click a link to calculate percent GC content.
  • Non-coding sequence: A link that will calculate the length and GC content of non-protein coding sequence
  • Download sequence in Fasta format: Download the entire genome's sequence
  • Download GFF file: Download all the genomic features and their annotations in GFF format
    • GFF CDS; Names Only: Only extracts gene, mRNA, and CDS features. No annotations are included.
    • GFF CDS; with Annotations: Only extracts gene, mRNA, and CDS features. Annotations are included.
    • GFF All; Names Only: Extracts all features. No annotations are included.
    • GFF All; with Annotations: Extracts all features. Annotations are included.
  • Click for Features: Link that generates a summary table of all features in the genome, displayed as a feature list below the "Chromosome Information" section
  • OrganismView Link: This URL will regenerate the page with the selected genome pre-loaded. Useful for saving the information for later or sending it to someone else
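The four GFF download variants above combine two choices: which feature types to keep, and whether to keep annotations. The sketch below illustrates that idea on tab-separated GFF lines. It is not CoGe's exporter: `filter_gff`, `CDS_TYPES`, and `NAME_KEYS` are hypothetical names, and treating "Names Only" as keeping just the `ID`, `Name`, and `Parent` attributes is an assumption.

```python
# Feature types kept by the "GFF CDS" variants, per the list above.
CDS_TYPES = {"gene", "mRNA", "CDS"}
# Assumption: "Names Only" keeps identifying attributes and drops the rest.
NAME_KEYS = {"ID", "Name", "Parent"}


def filter_gff(lines, cds_only=False, names_only=False):
    """Filter GFF lines by feature type and optionally strip annotations."""
    kept = []
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        if line.startswith("#"):
            kept.append(line)  # keep directives and comments as-is
            continue
        cols = line.split("\t")
        if len(cols) != 9:
            continue  # skip malformed records
        if cds_only and cols[2] not in CDS_TYPES:
            continue
        if names_only:
            attrs = [
                a for a in cols[8].split(";")
                if a.split("=", 1)[0].strip() in NAME_KEYS
            ]
            cols[8] = ";".join(attrs)
        kept.append("\t".join(cols))
    return kept


# Invented sample records for illustration.
gff = [
    "##gff-version 3",
    "chr1\texample\tgene\t1\t900\t.\t+\t.\tID=gene1;Name=abc;Note=hypothetical protein",
    "chr1\texample\tCDS\t1\t900\t.\t+\t0\tID=cds1;Parent=gene1;product=kinase",
    "chr1\texample\ttRNA\t1000\t1075\t.\t-\t.\tID=trna1;Note=tRNA-Ala",
]

# Roughly analogous to "GFF CDS; Names Only".
for record in filter_gff(gff, cds_only=True, names_only=True):
    print(record)
```

Calling `filter_gff(gff)` with both flags left at `False` returns every record unchanged, which corresponds to "GFF All; with Annotations".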

Additional Genome Options

  • Add to Genome List: If you click this button, the genome will be added to a list of genomes. A popup box will appear with your current list of genomes.
  • Owner functions: These buttons appear only if you are the owner of the genome
    • Make Genome Private: Remove the genome from public view. Only users in user groups which have access to the genome may see the genome.
    • Make Genome Public: Makes the genome viewable by anyone.
    • Edit Genome Info: Allows the owner of the genome to change its name, description, and version. The owner may also add a message to be displayed when the genome is viewed, and specify a link to additional information about the genome. Pressing this button opens a popup dialog box for modifying this information.

Dataset Information

Chromosome Information

Genomic Data

GC content

  • Total
  • Non-coding
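As a rough illustration of what the total and non-coding GC figures mean, here is a minimal Python sketch. It is not CoGe's code: the function names are hypothetical, and excluding ambiguous or masked bases (such as N or X) from the denominator is an assumption.

```python
def gc_percent(seq):
    """Percent G+C among unambiguous bases (A, C, G, T).

    Assumption: ambiguous or masked bases are ignored entirely.
    """
    seq = seq.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        return 0.0
    gc = seq.count("G") + seq.count("C")
    return 100.0 * gc / counted


def noncoding_sequence(seq, coding_intervals):
    """Return seq with protein-coding positions removed.

    coding_intervals: list of (start, end) pairs, 1-based and inclusive.
    """
    is_coding = [False] * len(seq)
    for start, end in coding_intervals:
        for i in range(max(start - 1, 0), min(end, len(seq))):
            is_coding[i] = True
    return "".join(b for b, coding in zip(seq, is_coding) if not coding)


# Invented toy sequence with one coding region covering positions 1-6.
seq = "ATGCGCNNATAT"
noncoding = noncoding_sequence(seq, [(1, 6)])

print(gc_percent(seq))                     # total GC: 40.0
print(len(noncoding), gc_percent(noncoding))  # non-coding length and GC: 6 0.0
```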

Feature Lists

Links

Genome Viewer

Get Sequence

Linking to OrganismView

It is relatively easy to link directly into OrganismView to search for an organism or retrieve a specific organism. Please see Linking to OrganismView for more information.