Ancestral Reconstruction Pipeline

From CoGepedia
Revision as of 15:36, 24 April 2014 by Elyons (talk | contribs)

This page documents the Ancestral Reconstruction Pipeline by Chunfang Zheng.

The pipeline is driven by her batch script, batchFile.txt:


# compile TestGetGenomes, which gets gene pairs from SynMap output
javac TestGetGenomes.java
# run with the config file
# config file: number of genomes and number of genome pairs (see the inputInfo example file)
java TestGetGenomes data/inputInfoCoGe.txt


javac TestGetContigInput.java
java TestGetContigInput data/inputInfoAGRP.txt
cd outputFiles
python contigInput_8400_9050_10997_19515.py > contigOutput.txt
cd ..
javac TestGetContigOutputAndScaffoldInput.java
java TestGetContigOutputAndScaffoldInput data/inputInfoAGRP.txt
cd outputFiles
python scaffoldInput1.py > scaffoldOutput1.txt
python scaffoldInput2.py > scaffoldOutput2.txt
python scaffoldInput3.py > scaffoldOutput3.txt
python scaffoldInput4.py > scaffoldOutput4.txt
python scaffoldInput5.py > scaffoldOutput5.txt
python scaffoldInput6.py > scaffoldOutput6.txt
python scaffoldInput7.py > scaffoldOutput7.txt
cd ..
javac TestScaffoldOutput.java
java TestScaffoldOutput
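
The seven scaffold runs in the batch script differ only in their index, so they could be generated in a loop rather than written out by hand. The following is a minimal Python sketch, not part of the original pipeline. It assumes the scripts are always named scaffoldInput<i>.py and write to scaffoldOutput<i>.txt inside outputFiles, as in the script above. The helper names `scaffold_jobs` and `run_scaffold_jobs` are hypothetical, and the number of scaffold scripts (7 here) presumably depends on the input data.

```python
import subprocess
from pathlib import Path


def scaffold_jobs(count):
    """Return (script, output) file-name pairs for scaffold steps 1..count."""
    return [
        (f"scaffoldInput{i}.py", f"scaffoldOutput{i}.txt")
        for i in range(1, count + 1)
    ]


def run_scaffold_jobs(count, workdir="outputFiles", python="python"):
    """Run each scaffold script from workdir, redirecting stdout to its output file.

    Mirrors `python scaffoldInputN.py > scaffoldOutputN.txt` in the batch script.
    `python` is the interpreter command; which Python version the generated
    scripts need is not stated on this page, so it is left configurable.
    """
    workdir = Path(workdir)
    for script, output in scaffold_jobs(count):
        with open(workdir / output, "w") as out:
            # check=True stops the loop at the first failing script
            subprocess.run([python, script], cwd=workdir, stdout=out, check=True)
```

Calling `run_scaffold_jobs(7)` from the pipeline's top-level directory would stand in for the seven `python scaffoldInput... > scaffoldOutput...` lines and the surrounding `cd outputFiles` / `cd ..`.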


Example inputInfo file (describes the input from CoGe):

# number of genomes and number of genome-pair files listed below
numberOfGenomes	4
numberOfGenomePairs	9

8400	9050	data/8400_9050.CDS-CDS.last.tdd10.cs0.filtered.dag.all.go_D20_g10_A5.aligncoords.Dm0.ma1.qac3.3.40.gcoords
10997	8400	data/10997_8400.CDS-CDS.last.tdd10.cs0.filtered.dag.all.go_D20_g10_A5.aligncoords.Dm0.ma1.qac3.3.40.gcoords
10997	9050	data/10997_9050.CDS-CDS.last.tdd10.cs0.filtered.dag.all.go_D20_g10_A5.aligncoords.Dm0.ma1.qac3.3.40.gcoords
10997	19515	data/10997_19515.CDS-CDS.last.tdd10.cs0.filtered.dag.all.go_D20_g10_A5.aligncoords.Dm0.ma1.qac3.1.40.gcoords
19515	8400	data/19515_8400.CDS-CDS.last.tdd10.cs0.filtered.dag.all.go_D20_g10_A5.aligncoords.Dm0.ma1.qac1.3.40.gcoords
19515	9050	data/19515_9050.CDS-CDS.last.tdd10.cs0.filtered.dag.all.go_D20_g10_A5.aligncoords.Dm0.ma1.qac1.3.40.gcoords
8400	8400	data/8400_8400.CDS-CDS.last.tdd10.filtered.dag.all.go_D20_g10_A5.aligncoords.Dm0.ma1.qac3.3.40.gcoords
9050	9050	data/9050_9050.CDS-CDS.last.tdd10.filtered.dag.all.go_D20_g10_A5.aligncoords.Dm0.ma1.qac3.3.40.gcoords
10997	10997	data/10997_10997.CDS-CDS.last.tdd10.filtered.dag.all.go_D20_g10_A5.aligncoords.Dm0.ma1.qac3.3.40.gcoords
8400	3
9050	3
10997	3
19515	1
data/subGenomeRegions.txt
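
To make the layout of the example file easier to check, here is a minimal Python sketch of a reader for it. It is not the pipeline's own parser (that lives in the Java classes), and the line roles are inferred from the example alone: two header lines, three-field lines for genome pairs (two genome IDs and a SynMap output path), two-field lines giving one integer per genome, and a final single-field path. The meaning of the per-genome integer is not stated on this page. In the example it matches the ratios in the `qac` part of the file names (for instance `qac3.1` for 10997 against 19515), so it may be a per-genome copy count, but that is a guess. The function name `parse_input_info` and the dictionary keys are hypothetical.

```python
def parse_input_info(text):
    """Parse inputInfo-style text into a dictionary (layout inferred from the example)."""
    info = {"pairs": [], "genome_values": {}, "extra_paths": []}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split()
        if fields[0] in ("numberOfGenomes", "numberOfGenomePairs"):
            info[fields[0]] = int(fields[1])
        elif len(fields) == 3:
            # genome ID, genome ID, path to the SynMap output for that pair
            info["pairs"].append((fields[0], fields[1], fields[2]))
        elif len(fields) == 2:
            # genome ID and its integer value (meaning assumed, see lead-in)
            info["genome_values"][fields[0]] = int(fields[1])
        else:
            # single-field line, e.g. data/subGenomeRegions.txt
            info["extra_paths"].append(fields[0])

    # Check the declared counts against what was actually listed.
    if info.get("numberOfGenomePairs") != len(info["pairs"]):
        raise ValueError("numberOfGenomePairs does not match the pair lines")
    if info.get("numberOfGenomes") != len(info["genome_values"]):
        raise ValueError("numberOfGenomes does not match the per-genome lines")
    return info
```

Run on the example above, the two checks would pass: nine pair lines are listed against `numberOfGenomePairs 9`, and four per-genome lines against `numberOfGenomes 4`.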