Commit 1ae344e2 authored by Andy Yates's avatar Andy Yates
Browse files

ENSCORESW-132. Missed out the command line options for logic name adding/skipping

parent 02cd9284
......@@ -52,7 +52,7 @@
</code></pre><p>If you do override the config then you should use the package name for your overridden config in the upcoming example commands.</p><h2 id="Environment">Environment</h2><h3 id="PERL5LIB">PERL5LIB</h3><ul><li>ensembl</li><li>ensembl-hive</li><li>bioperl</li></ul><h3 id="PATH">PATH</h3><ul><li>ensembl-hive/scripts</li><li>faToTwoBit (if not using a custom location)</li><li>xdformat (if not using a custom location)</li></ul><h3 id="ENSEMBLCVSROOTDIR">ENSEMBL_CVS_ROOT_DIR</h3><p>Set to the base checkout of Ensembl. We should be able to add <strong>ensembl-hive/sql</strong> onto this path to find the SQL directory for hive e.g.</p><pre><code> export ENSEMBL_CVS_ROOT_DIR=$HOME/work/ensembl-checkouts
</code></pre><h3 id="ENSADMINPSW">ENSADMIN_PSW</h3><p>Give the password to use to log into a database server e.g.</p><pre><code> export ENSADMIN_PSW=wibble
</code></pre><h2 id="CommandLineArguments">Command Line Arguments</h2><p>Where <strong>Multiple Supported</strong> is supported we allow the user to specify the parameter more than once on the command line. For example species is one of these options e.g. </p><pre><code>-species human -species cele -species yeast
</code></pre><table><tr><th>Name </th><th> Type</th><th>Multiple Supported</th><th> Description</th><th>Default</th><th> Required</th></tr><tr><td><code>-registry</code></td><td>String</td><td>No</td><td>Location of the Ensembl registry to use with this pipeline</td><td>-</td><td><strong>YES</strong></td></tr><tr><td><code>-base_path</code></td><td>String</td><td>No</td><td>Location of the dumps</td><td>-</td><td><strong>YES</strong></td></tr><tr><td><code>-pipeline_db -host=</code></td><td>String</td><td>No</td><td>Specify a host for the hive database e.g. <code>-pipeline_db -host=myserver.mysql</code></td><td>See hive generic config</td><td><strong>YES</strong></td></tr><tr><td><code>-pipeline_db -dbname=</code></td><td>String</td><td>No</td><td>Specify a different database to use as the hive DB e.g. <code>-pipeline_db -dbname=my_dumps_test</code></td><td>Uses pipeline name by default</td><td><strong>NO</strong></td></tr><tr><td><code>-ftp_dir</code></td><td>String</td><td>No</td><td>Location of the current FTP directory with the previous release&#8217;s files. We will use this to copy DNA files from one release to another. If not given we do not do any reuse</td><td>-</td><td><strong>NO</strong></td></tr><tr><td><code>-species</code></td><td>String</td><td>Yes</td><td>Specify one or more species to process. Pipeline will only <em>consider</em> these species. Use <strong>-force_species</strong> if you want to force a species run</td><td>-</td><td><strong>NO</strong></td></tr><tr><td><code>-force_species</code></td><td>String</td><td>Yes</td><td>Specify one or more species to force through the pipeline. This is useful to force a dump when you know reuse will not do the <em>"right thing"</em></td><td>-</td><td><strong>NO</strong></td></tr><tr><td><code>-dump_types</code></td><td>String</td><td>Yes</td><td>Specify each type of dump you want to produce. Supported values are <strong>dna</strong>, <strong>cdna</strong> and <strong>ncrna</strong></td><td>All</td><td><strong>NO</strong></td></tr><tr><td><code>-db_types</code></td><td>String</td><td>Yes</td><td>The database types to use. Supports the normal db adaptor groups e.g. <strong>core</strong>, <strong>otherfeatures</strong> etc.</td><td>core</td><td><strong>NO</strong></td></tr><tr><td><code>-release</code></td><td>Integer</td><td>No</td><td>The release to dump</td><td>Software version</td><td><strong>NO</strong></td></tr><tr><td><code>-previous_release</code></td><td>Integer</td><td>No</td><td>The previous release to use. Used to calculate reuse</td><td>Software version minus 1</td><td><strong>NO</strong></td></tr><tr><td><code>-blast_servers</code></td><td>String</td><td>Yes</td><td>The servers to copy blast indexes to</td><td>-</td><td><strong>NO</strong></td></tr><tr><td><code>-blast_genomic_dir</code></td><td>String</td><td>No</td><td>Location to copy the DNA blast indexes to</td><td>-</td><td><strong>NO</strong></td></tr><tr><td><code>-blast_genes_dir</code></td><td>String</td><td>No</td><td>Location to copy the DNA gene (cdna, ncrna and protein) indexes to</td><td>-</td><td><strong>NO</strong></td></tr><tr><td><code>-scp_user</code></td><td>String</td><td>No</td><td>User to perform the SCP as. Defaults to the current user</td><td>Current user</td><td><strong>NO</strong></td></tr><tr><td><code>-scp_identity</code></td><td>String</td><td>No</td><td>The SSH identity file to use when performing SCPs. Normally used in conjunction with <strong>-scp_user</strong></td><td>-</td><td><strong>NO</strong></td></tr><tr><td><code>-no_scp</code></td><td>Boolean</td><td>No</td><td>Skip SCP altogether</td><td>0</td><td><strong>NO</strong></td></tr><tr><td><code>-pipeline_name</code></td><td>String</td><td>No</td><td>Name to use for the pipeline</td><td>fasta_dump_$release</td><td><strong>NO</strong></td></tr><tr><td><code>-wublast_exe</code></td><td>String</td><td>No</td><td>Location of the WUBlast indexing binary</td><td>xdformat</td><td><strong>NO</strong></td></tr><tr><td><code>-blat_exe</code></td><td>String</td><td>No</td><td>Location of the Blat indexing binary</td><td>faToTwoBit</td><td><strong>NO</strong></td></tr><tr><td><code>-port_offset</code></td><td>Integer</td><td>No</td><td>The offset of the ports to use when generating blat indexes. This figure is added onto the web database species ID</td><td>30000</td><td><strong>NO</strong></td></tr><tr><td><code>-email</code></td><td>String</td><td>No</td><td>Email to send pipeline summaries to upon its successful completion</td><td>$USER@sanger.ac.uk</td><td><strong>NO</strong></td></tr></table><h2 id="ExampleCommands">Example Commands</h2><h3 id="Toloadusenormally">To load use normally:</h3><pre><code> init_pipeline.pl Bio::EnsEMBL::Pipeline::PipeConfig::FASTA_conf \
</code></pre><table><tr><th>Name </th><th> Type</th><th>Multiple Supported</th><th> Description</th><th>Default</th><th> Required</th></tr><tr><td><code>-registry</code></td><td>String</td><td>No</td><td>Location of the Ensembl registry to use with this pipeline</td><td>-</td><td><strong>YES</strong></td></tr><tr><td><code>-base_path</code></td><td>String</td><td>No</td><td>Location of the dumps</td><td>-</td><td><strong>YES</strong></td></tr><tr><td><code>-pipeline_db -host=</code></td><td>String</td><td>No</td><td>Specify a host for the hive database e.g. <code>-pipeline_db -host=myserver.mysql</code></td><td>See hive generic config</td><td><strong>YES</strong></td></tr><tr><td><code>-pipeline_db -dbname=</code></td><td>String</td><td>No</td><td>Specify a different database to use as the hive DB e.g. <code>-pipeline_db -dbname=my_dumps_test</code></td><td>Uses pipeline name by default</td><td><strong>NO</strong></td></tr><tr><td><code>-ftp_dir</code></td><td>String</td><td>No</td><td>Location of the current FTP directory with the previous release&#8217;s files. We will use this to copy DNA files from one release to another. If not given we do not do any reuse</td><td>-</td><td><strong>NO</strong></td></tr><tr><td><code>-species</code></td><td>String</td><td>Yes</td><td>Specify one or more species to process. Pipeline will only <em>consider</em> these species. Use <strong>-force_species</strong> if you want to force a species run</td><td>-</td><td><strong>NO</strong></td></tr><tr><td><code>-force_species</code></td><td>String</td><td>Yes</td><td>Specify one or more species to force through the pipeline. This is useful to force a dump when you know reuse will not do the <em>"right thing"</em></td><td>-</td><td><strong>NO</strong></td></tr><tr><td><code>-dump_types</code></td><td>String</td><td>Yes</td><td>Specify each type of dump you want to produce. Supported values are <strong>dna</strong>, <strong>cdna</strong> and <strong>ncrna</strong></td><td>All</td><td><strong>NO</strong></td></tr><tr><td><code>-db_types</code></td><td>String</td><td>Yes</td><td>The database types to use. Supports the normal db adaptor groups e.g. <strong>core</strong>, <strong>otherfeatures</strong> etc.</td><td>core</td><td><strong>NO</strong></td></tr><tr><td><code>-process_logic_names</code></td><td>String</td><td>Yes</td><td>Provide a set of logic names whose models should be dumped</td><td>-</td><td><strong>NO</strong></td></tr><tr><td><code>-skip_logic_names</code></td><td>String</td><td>Yes</td><td>Provide a set of logic names to skip when creating dumps. These are evaluated <strong>after</strong> <code>-process_logic_names</code></td><td>core</td><td><strong>NO</strong></td></tr><tr><td><code>-release</code></td><td>Integer</td><td>No</td><td>The release to dump</td><td>Software version</td><td><strong>NO</strong></td></tr><tr><td><code>-previous_release</code></td><td>Integer</td><td>No</td><td>The previous release to use. Used to calculate reuse</td><td>Software version minus 1</td><td><strong>NO</strong></td></tr><tr><td><code>-blast_servers</code></td><td>String</td><td>Yes</td><td>The servers to copy blast indexes to</td><td>-</td><td><strong>NO</strong></td></tr><tr><td><code>-blast_genomic_dir</code></td><td>String</td><td>No</td><td>Location to copy the DNA blast indexes to</td><td>-</td><td><strong>NO</strong></td></tr><tr><td><code>-blast_genes_dir</code></td><td>String</td><td>No</td><td>Location to copy the DNA gene (cdna, ncrna and protein) indexes to</td><td>-</td><td><strong>NO</strong></td></tr><tr><td><code>-scp_user</code></td><td>String</td><td>No</td><td>User to perform the SCP as. Defaults to the current user</td><td>Current user</td><td><strong>NO</strong></td></tr><tr><td><code>-scp_identity</code></td><td>String</td><td>No</td><td>The SSH identity file to use when performing SCPs. Normally used in conjunction with <strong>-scp_user</strong></td><td>-</td><td><strong>NO</strong></td></tr><tr><td><code>-no_scp</code></td><td>Boolean</td><td>No</td><td>Skip SCP altogether</td><td>0</td><td><strong>NO</strong></td></tr><tr><td><code>-pipeline_name</code></td><td>String</td><td>No</td><td>Name to use for the pipeline</td><td>fasta_dump_$release</td><td><strong>NO</strong></td></tr><tr><td><code>-wublast_exe</code></td><td>String</td><td>No</td><td>Location of the WUBlast indexing binary</td><td>xdformat</td><td><strong>NO</strong></td></tr><tr><td><code>-blat_exe</code></td><td>String</td><td>No</td><td>Location of the Blat indexing binary</td><td>faToTwoBit</td><td><strong>NO</strong></td></tr><tr><td><code>-port_offset</code></td><td>Integer</td><td>No</td><td>The offset of the ports to use when generating blat indexes. This figure is added onto the web database species ID</td><td>30000</td><td><strong>NO</strong></td></tr><tr><td><code>-email</code></td><td>String</td><td>No</td><td>Email to send pipeline summaries to upon its successful completion</td><td>$USER@sanger.ac.uk</td><td><strong>NO</strong></td></tr></table><h2 id="ExampleCommands">Example Commands</h2><h3 id="Toloadusenormally">To load use normally:</h3><pre><code> init_pipeline.pl Bio::EnsEMBL::Pipeline::PipeConfig::FASTA_conf \
-pipeline_db -host=my-db-host -base_path /path/to/dumps -registry reg.pm
</code></pre><h3 id="Runasubsetofspeciesnoforcingsupportsregistryaliases">Run a subset of species (no forcing &amp; supports registry aliases):</h3><pre><code> init_pipeline.pl Bio::EnsEMBL::Pipeline::PipeConfig::FASTA_conf \
-pipeline_db -host=my-db-host -species anolis -species celegans -species human \
......
......@@ -125,6 +125,8 @@ bc. -species human -species cele -species yeast
|@-force_species@|String|Yes|Specify one or more species to force through the pipeline. This is useful to force a dump when you know reuse will not do the _"right thing"_|-|*NO*|
|@-dump_types@|String|Yes|Specify each type of dump you want to produce. Supported values are *dna*, *cdna* and *ncrna*|All|*NO*|
|@-db_types@|String|Yes|The database types to use. Supports the normal db adaptor groups e.g. *core*, *otherfeatures* etc.|core|*NO*|
|@-process_logic_names@|String|Yes|Provide a set of logic names whose models should be dumped|-|*NO*|
|@-skip_logic_names@|String|Yes|Provide a set of logic names to skip when creating dumps. These are evaluated *after* @-process_logic_names@|core|*NO*|
|@-release@|Integer|No|The release to dump|Software version|*NO*|
|@-previous_release@|Integer|No|The previous release to use. Used to calculate reuse|Software version minus 1|*NO*|
|@-blast_servers@|String|Yes|The servers to copy blast indexes to|-|*NO*|
......
......@@ -30,6 +30,10 @@ sub default_options {
force_species => [],
process_logic_names => [],
skip_logic_names => [],
release => software_version(),
previous_release => (software_version() - 1),
......@@ -95,6 +99,10 @@ sub pipeline_analyses {
{
-logic_name => 'DumpDNA',
-module => 'Bio::EnsEMBL::Pipeline::FASTA::DumpFile',
-parameters => {
process_logic_names => $self->o('process_logic_names'),
skip_logic_names => $self->o('skip_logic_names'),
},
-can_be_empty => 1,
-flow_into => {
1 => 'ConcatFiles'
......
Markdown is supported
0% or .
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment