EPIC-CoGe

From CoGepedia

Revision as of 11:25, 15 September 2016


Funding for the EPIC-CoGe Browser Project is provided by the Gordon and Betty Moore Foundation and the USDA/NIFA.

Overview

EPIC-CoGe provides data management and visualization tools to let anyone:

  • integrate new genomes into CoGe
  • add functional (e.g., RNASeq, methylation) and diversity (e.g., SNPs, haplotypes) data to those genomes
  • visualize them in an interactive genome browser (JBrowse)
  • perform advanced data analysis.

This project is an extension to CoGe that:

  • Stores functional and diversity data for all genomes in CoGe
    • Gene expression, RNASeq, methylation, SNPs, etc.
  • Provides dynamic visualization through CoGe's GenomeView
  • Provides data management tools for these data to easily:
    • Add new functional and diversity datasets/experiments
    • Keep them private or make them fully public
    • Share them with collaborators
    • Integrate with public data

Reference

EPIC CoGe Reference

Tutorial

See the EPIC-CoGe Tutorial for videos and step-by-step instructions for getting started.

Try EPIC-CoGe

This link will take you to EPIC-CoGe loaded with Arabidopsis thaliana: http://genomevolution.org/r/939v.

Videos

Screenshots

Visualization of Diversity and Functional Genomics Data. Note that SNP density is viewed as histograms.
Visualization of Diversity and Functional Genomics Data. Note that individual SNPs are visualized at higher zoom levels.

Long-term vision

  • While originally developed for Arabidopsis, EPIC-CoGe now extends to all genomes in CoGe
  • Permit users to upload their own data, keep it private, share it with collaborators, and make it public upon publication
  • Expand the data storage engine to include all types of quantitative genomic data, including:
    • Expression profiles
    • RNASeq
    • Copy number variation
    • SNPs
    • QTLs

Summary

How eukaryotic organisms regulate mRNA levels is a fundamental question in biology. Most of the early attention was focused on the study of gene transcription, while posttranscriptional mechanisms have only recently gained recognition for their regulatory importance. These epigenetic regulatory pathways control mRNA levels both transcriptionally and posttranscriptionally, and pioneering work in Arabidopsis thaliana has helped define these processes. For this reason, there is a wealth of epigenomic information already available for this model plant. However, it is almost entirely unusable to the wider research community due to the computationally intensive procedures needed to leverage these data resources. We will therefore develop an easy-to-use web-based system to store, access, and visualize Arabidopsis epigenetic data in a comparative genomics context: the EPIC-CoGe Browser.

The EPIC-CoGe Browser will consist of four major subsystems:

  • A data storage subsystem that can store thousands of epigenetic experiments and provide rapid access to those data.
  • A web-based visualization subsystem that permits overlaying and partitioning of epigenetics data on genomic data.
  • A user interface subsystem to allow researchers to find and select sets of epigenetic experiments for visualization.
  • A user interface subsystem to allow researchers to customize how to mesh and visualize their selected epigenetics experiments.

The EPIC-CoGe Browser will synthesize existing investments from three NSF-funded projects: EPIC, CoGe, and the iPlant Collaborative. EPIC, whose mission is “reading the second [genetic] code [of life by] mapping epigenomes to understand plant growth, development and adaptation to the environment,” is currently funded as a Research Coordination Network. Its primary goal has been to coordinate the research activities of the international community and develop a whitepaper to drive this effort. However, this community currently lacks a computational browser to access and visualize epigenetic data. Its research interests are also diverse: while much of the epigenetic community originally focused on the model plant system Arabidopsis thaliana, the community's research interests span all plants, including those of agronomic importance for global food safety and sustainability. To achieve such broad applicability, the EPIC-CoGe Browser requires scalable computing resources and data management systems.

The iPlant Collaborative is a large investment by the NSF to create cyberinfrastructure (CI) for the plant research community. Cyberinfrastructure is made up of extensible, scalable, and capable computing resources, and “domain expertise”, which includes computer science, mathematics, statistics, algorithms, and all disciplines of plant biology. iPlant is building and deploying the software systems necessary to connect supercomputing resources (XSEDE) to computational biologists, bench biologists, field biologists, and plant breeders. The comparative genomics platform, CoGe, is part of the “powered by iPlant” program. CoGe utilizes iPlant’s CI in order to achieve the scalability necessary to serve the entire comparative genomics community for all domains of life (CoGe currently makes available 16,500 genomes from ~13,000 organisms). In addition, CoGe provides a suite of web-based tools for in-depth analyses and comparisons of genomic data. The EPIC-CoGe Browser will be an extension of CoGe and likewise a member of the Powered by iPlant program to access the required scalable and capable computational resources.

Year one of this project will focus on public epigenetics data for Arabidopsis thaliana and on developing the four subsystems described above. As the technology for amassing epigenetics data easily and inexpensively continues to improve and more plant species are investigated, the need for the EPIC-CoGe Browser will continue to grow. Year two of the project will focus on catering to the needs of the epigenetic research community by: 1. providing researchers with more data management and collaboration tools, 2. supporting additional organisms, and 3. supporting advanced comparative analyses and publication-quality images. Data management and collaboration tools are required for ongoing research with pre-publication data. These systems will permit researchers to add their own data to EPIC-CoGe, share those data among a group of researchers, and restrict public access, while still being able to engage the broader community to solicit help and analytical expertise. EPIC-CoGe will engage the rice and maize research communities in order to expand the EPIC-CoGe Browser’s capabilities into additional species, specifically those with agronomic and food safety importance. Because EPIC-CoGe is based on the CoGe system, which inherently supports thousands of organisms, these examples will permit the expansion of EPIC-CoGe to all domains of life. In addition, CoGe provides many tools for comparative genomics, and the data visualizations of EPIC-CoGe will be adapted for use in these tools. This synthesis of data and analytical tools will permit information from well-studied plants to be leveraged for less understood plants.

In order to best meet the needs of the plant epigenetic research community, year two will also focus on soliciting feedback from scientists through online questionnaires, discussion forums, and workshops. The workshops will be held at national and international conferences that the Co-PIs regularly attend: Gregory at the International Conference on Arabidopsis Research, and Lyons at the Annual Maize Genetics Conference.

By leveraging these resources provided by the NSF, the support of the Gordon and Betty Moore Foundation for the first two years of EPIC-CoGe Browser development will create the synergistic glue required to make epigenetic data available to the widest group of international researchers. Such support will be leveraged for the long-term viability of all three projects through new funding opportunities from domestic, international, and industrial partnerships. Currently, Co-PIs Lyons and Gregory have a proposal with the NSF Plant Genome Research Program to provide support for this project starting in year three, focused on rice epigenetic data. In addition, a functional prototype EPIC-CoGe Browser has been deployed: http://genomevolution.org/CoGe/GenomeView.pl?gid=16911&viewer=JBrowse

If you have any feedback, please email the CoGe Team.

Support from GBMF will ensure the completion of the goals outlined above and provide much needed resources for the international epigenetics community.

Adding data

You can add data to EPIC-CoGe. All you need is:

Then use LoadExperiment to add your data to a genome in CoGe.

Support

This web site is funded by the Gordon and Betty Moore Foundation through Grant GBMF3383 to Eric Lyons. This project is a collaboration between the labs of Eric Lyons at the University of Arizona and Brian Gregory at the University of Pennsylvania.