Load Genome Script

From CoGepedia
Latest revision as of 17:57, 16 February 2015

The load genome script, scripts/load_genome.pl, allows genomes to be created from FASTA files via the backend.

The following records must already exist in the database before running this script:

  • an organism to specify in the organism_id parameter
  • a user to specify in the user_id parameter
  • a genomic_sequence_type to specify in the type_id parameter

Usage:

perl load_genome.pl -name <string> -desc <string> -fasta_files <file1>,<file2>,...<fileN> ...

Required parameters:

  • fasta_files
    • comma-separated list of FASTA files
  • staging_dir
    • temporary staging directory for processing files; use "." for the current directory
  • install_dir
    • permanent installation directory for genome files
    • should match SEQDIR in the configuration file
    • example: /opt/apache2/coge/data/genomic_sequence/
  • user_id
    • ID of the user to associate with the genome
  • organism_id
    • ID of an organism that already exists in the database
  • source_name
    • Name of data source, e.g. the lab that generated the sequence data
  • config
    • CoGe configuration file (web/coge.conf)
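
The required parameters above can be combined into a single call. The following is a minimal dry-run sketch, not a tested invocation: the FASTA file names, the user and organism IDs, and the source name are placeholders, and running from the repository root (so that scripts/load_genome.pl and web/coge.conf resolve) is an assumption. It prints the command instead of executing it.

```shell
#!/bin/sh
# Dry-run sketch: assemble a load_genome.pl call with every required parameter.
# Every value here is a placeholder, not a real ID or path from a CoGe install.

FASTA_FILES="chr1.fasta,chr2.fasta"   # hypothetical files; comma-separated, no spaces
STAGING_DIR="."                       # current directory
INSTALL_DIR="/opt/apache2/coge/data/genomic_sequence/"   # should match SEQDIR
USER_ID=1                             # assumed to exist in the database
ORGANISM_ID=1                         # assumed to exist in the database
SOURCE_NAME="Example Lab"             # hypothetical data source
CONFIG="web/coge.conf"

# Store the command in the positional parameters so that values containing
# spaces (such as SOURCE_NAME) remain single arguments.
set -- perl scripts/load_genome.pl \
    -fasta_files "$FASTA_FILES" \
    -staging_dir "$STAGING_DIR" \
    -install_dir "$INSTALL_DIR" \
    -user_id "$USER_ID" \
    -organism_id "$ORGANISM_ID" \
    -source_name "$SOURCE_NAME" \
    -config "$CONFIG"

# Print rather than execute; replace this line with "$@" to run the command.
echo "$@"
```

Keeping each value in its own variable makes it easy to check the IDs against the database before swapping the final echo for a real run.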

Optional parameters:

  • name
    • String name of the genome
  • desc
    • String description of the genome
  • link
    • URL to the data source or publication
  • version
    • Version of the genome data
  • type_id
    • ID of a genomic_sequence_type record; defaults to 1 ("unmasked")
  • source_desc
    • Description of the data source
  • restricted
    • Flag to make the genome private (1) or public (0); defaults to 0 (public)
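
Because these parameters are optional, a wrapper script can append each one only when a value is set. The sketch below illustrates that pattern under assumptions: the variable names and values are hypothetical, and passing -restricted with an explicit 1 or 0 is inferred from the description above rather than confirmed against the script. The resulting flags would be added to a call that already carries the required parameters.

```shell
#!/bin/sh
# Sketch: build the optional load_genome.pl flags, skipping any that are unset.
# All values are placeholders chosen for illustration.

NAME="Example genome"
DESC=""            # left empty on purpose, so -desc is not emitted
VERSION="1"
RESTRICTED="1"     # 1 = private, 0 = public

# Start from an empty argument list and append flag/value pairs.
set --
if [ -n "$NAME" ]; then set -- "$@" -name "$NAME"; fi
if [ -n "$DESC" ]; then set -- "$@" -desc "$DESC"; fi
if [ -n "$VERSION" ]; then set -- "$@" -version "$VERSION"; fi
if [ -n "$RESTRICTED" ]; then set -- "$@" -restricted "$RESTRICTED"; fi

# Show the flags that would be added to the command line.
echo "optional flags: $*"
```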