Difference between revisions of "MotifView"

From CoGepedia

Revision as of 16:13, 23 December 2011

MotifView - A motif viewing tool
Motifview image.png

MotifView at work
Software company: CoGe Team
Analysis Type: Compare multiple genomic regions for motifs
Working state: Testing
Tools Utilized: blastn, LAGAN

MotifView is a tool that visualizes motifs in compared genomic regions.


Introduction

MotifView uses visual and algorithmic tools to display motifs within multiple genomic regions. It shares many functions with GEvo, so sequences from any number of organisms can be compared using a variety of sequence comparison algorithms.

On this page we provide only a brief description of options that are shared with GEvo. If a description or direction is unclear, please follow the links to the corresponding sections of the GEvo instructions.

The embedded videos demonstrate the sections that follow. You can follow the text along with a video, or use either one on its own.

MotifView basics

Screen-shot of where a MotifView analysis is configured. Four genomic regions have been specified by gene name, dataset and the amount of additional upstream/downstream sequence
  1. Select genomic regions to analyze
  2. Select a sequence alignment algorithm appropriate for the sequences and area of interest
  3. Select motifs to visualize
  4. Press "Find Motifs!" button

To alternate between these options to configure an analysis, select the appropriate tab.

Sequence Submission

Manual Submission

Select the "Sequence Submission" tab to open these options. Here, you can specify a sequence submission box for each sequence that will be submitted for a MotifView analysis. This is also where you can adjust the amount of sequence analyzed, select which sequences are analyzed, reverse complement a sequence, mask a sequence according to the genomic features it contains, and change the display order of sequences.
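The reverse complement and masking options mentioned above can be sketched in a few lines of Python. This is only an illustration of the two operations, not MotifView's own code: the function names, the use of "N" as the masking character, and the 0-based half-open coordinates are all assumptions.

```python
# Illustrative sketch only; names and conventions are assumptions, not MotifView's API.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A/C/G/T/N, either case)."""
    return seq.translate(_COMPLEMENT)[::-1]


def mask_regions(seq, regions, mask_char="N"):
    """Replace each (start, end) region with mask_char.

    Regions are assumed to be 0-based and half-open; parts of a region that
    fall outside the sequence are ignored.
    """
    chars = list(seq)
    for start, end in regions:
        for i in range(max(start, 0), min(end, len(chars))):
            chars[i] = mask_char
    return "".join(chars)


if __name__ == "__main__":
    print(reverse_complement("ATGCN"))
    print(mask_regions("ACGTACGT", [(2, 5)]))
```

Masked positions can no longer match a motif or contribute to an alignment, which is why masking by feature type narrows an analysis to the remaining sequence.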

The different options for submitting and modifying sequences to be visualized can be found here.

Merging Analyses

You may sometimes want to merge a MotifView analysis with a previously existing GEvo analysis. To do this, copy a GEvo link into the text box next to the text "Merge Previous GEvo Analysis (paste in URL)", located at the top of the sequence submission tab, then press the "Merge" button. The sequences specified in the pasted URL will appear as new sequence submission boxes, configured as specified in the link (extra up/downstream sequence, reverse complement, masked, etc.).

Alignment Algorithms

While many alignment algorithms exist, not all are suitable for the analysis available in MotifView. MotifView compares genes at a scale that makes BlastN and LAGAN the most suitable choices. The options and suitability of the available algorithms are discussed here.

Select Motifs

This tab allows the user to define how and which motifs will be found and analyzed. The "Select Motifs" tab contains four pull down options when choosing motifs for analysis.

Select-motifs.png

Choose TFBS Motif

Users can manually enter a motif in the section "Search for User-Defined Motifs". Colors are assigned to motifs automatically, but users can specify their own color by appending it to the motif, separated by a colon. For example:

CACGTG:Red
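To make the `motif:color` convention concrete, here is a minimal Python sketch that splits such an entry and locates exact matches of the motif in a sequence. The function names, the `"auto"` placeholder for an automatically assigned color, and the forward-strand exact-match search are assumptions for illustration; MotifView's actual parsing and search may differ.

```python
# Illustrative sketch only; not MotifView's implementation.
def parse_motif_entry(entry, default_color="auto"):
    """Split a 'MOTIF:Color' entry into (motif, color).

    The color part is optional; default_color is a hypothetical stand-in
    for a color assigned automatically by the tool.
    """
    motif, _, color = entry.strip().partition(":")
    return motif.upper(), (color or default_color)


def find_motif(sequence, motif):
    """Return 0-based start positions of exact forward-strand matches.

    Overlapping matches are reported because the search resumes one
    base after each hit.
    """
    sequence = sequence.upper()
    hits = []
    start = sequence.find(motif)
    while start != -1:
        hits.append(start)
        start = sequence.find(motif, start + 1)
    return hits


if __name__ == "__main__":
    motif, color = parse_motif_entry("CACGTG:Red")
    print(motif, color, find_motif("AACACGTGTTCACGTG", motif))
```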

Select from Comprehensive List of Motifs

It's also possible to browse the full list of motifs in our database and add them to be analyzed. In the window presenting the full list, motifs appear by name, then sequence. Information on highlighted motifs will pop up on pressing the "Get Motif Info" button.

Once selected, motifs appear in the "Selected Motifs" window, where they can be deselected individually or the list can be cleared entirely to start a new one.

Select Motifs from Categories

Additionally, there is a choice of provided motif categories: Stress and Families. Toggling any category will pull down a list of motifs linked to that stress or transcription factor family. If desired, a range of motifs not confined to categories is available below the categories. Users can also select or deselect all options in a category.

Once motifs are chosen, press the "Find Motifs!" button above the tabs to begin analysis.

Demo MotifView Analysis

Below is a demo of a basic MotifView analysis. In it we illustrate how to submit a region to be analyzed, how to choose an algorithm, how to make relevant changes to the graphics, and how to choose motifs.

Sequence submission

Sequence Submission (Fig. 1)

Enter the genomic region:

  1. Enter a gene accession number in the box labeled "Name:". In this case, we've chosen AT3G11580 and AT5G06250, two homeologs with annotations that will be seen later. A list that identifies which datasets contain what annotations can be found here.
  2. Choose datasets to be analyzed. When you enter the accession number, pull down menus will be populated with datasets that contain that gene, including genomic datasets, type of DNA, etc. This example requires that we use Arabidopsis TAIR V8 that has been masked for repeats.

Additionally, you may define how many base pairs flank each genomic region. This will become more relevant when refining an analysis.
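As a rough sketch of what adding flanking base pairs means in coordinate terms, the Python function below extends a gene's interval to the left and right and clips the result to the chromosome. The function name, the 1-based inclusive coordinates, and the example numbers are assumptions chosen for illustration, not values taken from MotifView.

```python
# Illustrative sketch only; coordinates and names are assumptions.
def extend_region(gene_start, gene_end, left, right, chrom_length):
    """Extend a 1-based inclusive gene interval by flanking base pairs.

    'left' and 'right' are the numbers of extra base pairs requested on
    each side; the result is clipped to [1, chrom_length].
    """
    start = max(1, gene_start - left)
    end = min(chrom_length, gene_end + right)
    return start, end


if __name__ == "__main__":
    # Hypothetical gene at 1000..2000 on a 10,000 bp chromosome.
    print(extend_region(1000, 2000, left=500, right=300, chrom_length=10000))
```

Widening the flanks pulls more intergenic sequence into the panel, which matters when the motifs or conserved sequences of interest lie outside the gene itself.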

Algorithm

Demo-algorithm3.png
  1. Next, choose the alignment algorithm from the pulldown menu next to "Alignment Algorithm:". While many alignment algorithms exist, MotifView analyzes DNA within a very small defined region, so this example uses "BlastN", which works best when analyzing small regions.

Results Parameters

There are many options available for ease of use when viewing the analysis. In this example the most relevant options address annotations.

  1. The gene pair we've selected includes annotations for CNSs, gene spaces, and PIL5 sites. We want to see these annotations in the final results, so all three boxes are checked.
  2. When a wide selection of motifs is chosen, the number of motifs in the imaging panel can be overwhelming, so we chose the option to view only motifs that overlap our annotations.
Demo-results-parameters5.png

Select Motifs

Demo-motif-select-stress2.png

Though users can define their own motifs or select from our full list of motifs as shown above, we're illustrating how categories of motifs can be analyzed.

  1. Toggle "Select Motifs from Stress Categories". The expanding window allows the user to choose from stresses associated with motifs including Chemical/Oxidative/Pathogen, Cold, Drought/Heat, Hypoxia, Light, Nutrient, Salt, Water, and Unspecified stresses.
  2. In this case, toggle the Cold Stress category.
  3. To illustrate how one can search for a range of motifs, Select All motifs in the Cold Stress category.

MotifView Panel

Below is the image of the analysis performed. Shown in the panel are:

  1. HSPs: A high-scoring segment pair, or HSP, is a pair of aligned subsegments, one from each of two sequences, that share significant similarity. In this case, the HSPs have been toggled to show the regions of similarity between the gene pair.
  2. Genomic features: The gene is shown with exons painted gold, introns painted grey, and non-coding regions painted blue. Notice that the gene space is also highlighted by the yellow background underlying the gene and other annotations.
  3. Motifs: These annotations are painted on as diamonds. It's important to realize that the diamonds don't represent the real size of the motifs. Because motifs are so small, they must be drawn artificially large or they wouldn't be visible at all. Notice how the green motif appears to have an HSP associated with a PIL5 site on its homeolog.
  4. CNSs: Conserved non-coding sequences are very prevalent in this gene pair and can be differentiated from the PIL5 sites by being colored half purple.
  5. PIL5 sites: One type of annotation, PIL5 sites are transcription factor binding sites. Notice how some sites have associated HSPs that denote sequence similarity with sites on the homeolog.
Demo-panel3.png

Modifying result graphics

Show preloaded annotations

An important feature when using MotifView is the ability to view other features such as CNSs. In the "Results Parameters" section there is the option to show preloaded annotations in the panel, including CNSs, genespace and PIL5 sites.

Further, one can restrict the display to motifs that overlap a preloaded annotation. This is especially important because motifs are painted larger in the panel than their true size. Painting them at true size would make them invisible, but the enlarged representation can make motifs appear to overlap other features when they do not. Restricting visible motifs to those that actually overlap annotations eliminates this error.
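The difference between true overlap and apparent overlap can be illustrated with a small Python sketch. It assumes closed intervals in base-pair coordinates and a hypothetical minimum glyph width; the function names and numbers are made up for illustration and do not describe how MotifView draws or filters motifs internally.

```python
# Illustrative sketch only; names, coordinates, and widths are assumptions.
def overlaps(a_start, a_end, b_start, b_end):
    """True if two closed intervals share at least one position."""
    return a_start <= b_end and b_start <= a_end


def drawn_extent(start, end, min_width):
    """Interval a motif glyph would cover if drawn at least min_width wide."""
    if end - start + 1 >= min_width:
        return start, end
    center = (start + end) // 2
    half = min_width // 2
    return center - half, center + half


def motifs_overlapping_annotations(motifs, annotations):
    """Keep only motifs whose true coordinates overlap some annotation."""
    return [
        m for m in motifs
        if any(overlaps(m[0], m[1], a[0], a[1]) for a in annotations)
    ]


if __name__ == "__main__":
    annotations = [(90, 120)]            # hypothetical annotated site
    motifs = [(100, 105), (125, 130)]    # hypothetical 6 bp motifs
    print(motifs_overlapping_annotations(motifs, annotations))
    # The second motif misses the annotation, but its enlarged glyph would not:
    print(overlaps(*drawn_extent(125, 130, min_width=40), 90, 120))
```

Filtering on true coordinates rather than drawn extents is what keeps the enlarged glyphs from suggesting overlaps that do not exist.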

Other useful graphics modifications

Example MotifView result with HSPs, genomic features, CNSs, PIL5 sites, genespace, and motifs drawn. Note that the motifs shown are restricted to those that overlap with genespace and PIL5 sites.
Other useful graphic modifications in the Results Parameters tab

Showing contigs.

Turning on labels for HSPs.

Drawing feature names on features

Expanding Overlapping Features and Regions of Sequence Similarity

Refining an analysis

Once a MotifView analysis has run, any of the analysis parameters can be changed and the analysis re-run after pressing the "Clear all previous analysis" button. Some existing parameters will remain and others will have to be selected again.

The common parameters changed are:

  • The extent of the genomic region analyzed. The amount by which the panel extends beyond the gene in question can be changed in the "Left sequence" and "Right sequence" boxes on the "Sequence Submission" tab. Changes to these boxes will remain in the new analysis.
  • Reverse complementing sequences. This change will remain after the previous analysis is cleared.
  • The datasets for the sequence submissions, the algorithm, and the selected motifs. These are reset when the previous analysis is cleared, so they will have to be defined again.

Linking to GEvo

Linking to GEvo is easy! Please see this page on how.

Tutorials

References/Downloads

For a list of all datasets with annotations, click here

For a list of all TFBS motifs used in Spangler et al. (2011), "Evidence for Conserved Noncoding Sequence Functions in Arabidopsis thaliana", New Phytologist, click here

For a list of all TFBS motifs used in this site, click here

Frequently Asked Questions

Bug Report

Progress on bugs can be found here.