CNS Discovery Pipeline
Revision as of 03:09, 29 September 2013
The CNS Discovery Pipeline is a suite of software developed in the Freeling lab. It identifies orthologs or homeologs, then compares the regions surrounding those genes to find stretches of noncoding DNA that have retained greater-than-expected sequence similarity. Such conserved noncoding sequences (CNS) are inferred to be under functional constraint.
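To make the idea of "greater than expected sequence similarity" concrete, here is a toy sketch in Python. It is not the pipeline's algorithm. It assumes two noncoding sequences that have already been aligned to equal length, and it reports the stretches in which every sliding window meets an identity threshold. The function name `conserved_windows` and the `window` and `min_identity` parameters are hypothetical, chosen only for this illustration.

```python
from typing import List, Tuple


def conserved_windows(seq_a: str, seq_b: str, window: int = 15,
                      min_identity: float = 0.9) -> List[Tuple[int, int]]:
    """Toy scan (NOT the pipeline's method): return merged half-open
    intervals [start, end) covered by windows whose fraction of matching
    aligned positions is at least min_identity."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be pre-aligned to equal length")
    n = len(seq_a)
    if n < window:
        return []
    # 1 where the aligned bases match, 0 otherwise; gap characters never match
    match = [int(a == b and a != "-")
             for a, b in zip(seq_a.upper(), seq_b.upper())]
    hits: List[Tuple[int, int]] = []
    matches_in_window = sum(match[:window])
    for start in range(n - window + 1):
        if start > 0:
            # Slide the window one position: add the new base, drop the old one
            matches_in_window += match[start + window - 1] - match[start - 1]
        if matches_in_window / window >= min_identity:
            end = start + window
            if hits and start <= hits[-1][1]:
                # Overlaps or touches the previous hit, so extend it
                hits[-1] = (hits[-1][0], end)
            else:
                hits.append((start, end))
    return hits


if __name__ == "__main__":
    a = "ACGTACGTACTTTTTTTTTT"
    b = "ACGTACGTACGGGGGGGGGG"
    print(conserved_windows(a, b, window=5, min_identity=1.0))
```

A real analysis would also need to mask coding sequence and judge similarity against a background expectation; both steps are left out of this sketch.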
The CNS Discovery Pipeline was originally created by Brent Pedersen. His version of the software is available on GitHub: https://github.com/brentp/find_cns
Ongoing development of the CNS Discovery Pipeline is now handled by Gina Turco. Download the latest version of the source code from https://github.com/gturco/find_cns
A more detailed explanation of the CNS Discovery Pipeline has been published in Frontiers in Plant Science (Turco et al. 2013, listed below): http://www.frontiersin.org/Journal/10.3389/fpls.2013.00170/full
Example
The above example compares the coding sequence surrounding two homeologs from the Arabidopsis alpha tetraploidy, AT1G03170 and AT4G02810. Manually identified CNS, based on BLAST searches, are marked below the dotted line in teal; CNS identified by the CNS Discovery Pipeline are drawn above the dotted line in purple. This example analysis can be regenerated using this link: http://genomevolution.org/r/4bpo
Publications Utilizing the CNS Discovery Pipeline
This list may be out of date.
- Woodhouse, M.R. et al. (2010). Following tetraploidy in maize, a short deletion mechanism removed genes preferentially from one of the two homeologs. PLoS Biol 8: e1000409.
- Schnable, J.C. et al. (2011). Dose-sensitivity, conserved noncoding sequences and duplicate gene retention through multiple tetraploidies in the grasses. Front. Plant Sci. 2:2.
- Turco, G. et al. (2013). Automated conserved non-coding sequence (CNS) discovery reveals differences in gene content and promoter evolution among grasses. Front. Plant Sci. 4:170.
Public CNS Datasets
CNS loaded in CoGe
- turco,pedersen,and freeling 2011 automated thaliana_v10 thaliana_v10 cns (dsgid:16746)
- turco,pedersen,and freeling 2011 automated thaliana_v9 thaliana_v9 cns (dsgid:8084)
- turco,pedersen,and freeling 2011 automated thaliana_v8 thaliana_v8 cns (dsgid:19494,39598)
- thaliana_v8 thaliana_v8 cns Golden CNS (dsgid:19494,39598)
- turco,pedersen,and freeling 2012 automated rice maize cns (dsgid:11266,42128)
- turco,pedersen,and freeling 2012 automated rice sorghum cns (dsgid:11822,11821)
- turco,pedersen,and freeling 2012 automated sorghum sorghum (dsgid:11821)
- turco,pedersen,and freeling 2012 automated setaria setaria (dsgid:19491)
- turco,pedersen,and freeling 2012 automated rice setaria cns (dsgid:11822,19491)
- turco,pedersen,and freeling 2011 automated 2012 sorghum brachy cns (dsgid:8120)
- turco,pedersen,and freeling 2011 automated 2012 rice brachy cns (dsgid:8120)
- turco,pedersen,and freeling 2012 pan-grass cns (sorghum,setaria,rice,maize,brachy) (dsgid:16896)
Rice-Sorghum orthologous CNS
These CNS were identified by comparing syntenic orthologs in rice (TIGR5) and sorghum (v1.4) using version 2.0 of the CNS Discovery Pipeline. The citation for this dataset is Schnable, J.C. et al. (2011), listed above.