</code></pre><p>If you override the config, use the package name of your overridden config in the example commands below.</p><h2 id="Environment">Environment</h2><h3 id="PERL5LIB">PERL5LIB</h3><ul><li>ensembl</li><li>ensembl-hive</li><li>bioperl</li></ul><h3 id="PATH">PATH</h3><ul><li>ensembl-hive/scripts</li><li>faToTwoBit (if not using a custom location)</li><li>xdformat (if not using a custom location)</li></ul><h3 id="ENSEMBLCVSROOTDIR">ENSEMBL_CVS_ROOT_DIR</h3><p>Set to the base checkout of Ensembl. Appending <strong>ensembl-hive/sql</strong> to this path must give the hive SQL directory, e.g.</p><pre><code> export ENSEMBL_CVS_ROOT_DIR=$HOME/work/ensembl-checkouts
</code></pre><h3 id="ENSADMINPSW">ENSADMIN_PSW</h3><p>Set to the password used to log in to the database server, e.g.</p><pre><code> export ENSADMIN_PSW=wibble
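</code></pre><p>Taken together, the environment described above might be set up roughly as follows. This is only a sketch: it assumes that ensembl, ensembl-hive and bioperl are all checked out directly under <code>$ENSEMBL_CVS_ROOT_DIR</code> and that the Perl modules of the first two live in a <code>modules</code> subdirectory. The <code>bioperl</code> directory name is hypothetical, so adjust the paths to match your own checkouts.</p><pre><code> # Assumed layout: every checkout sits directly under $ENSEMBL_CVS_ROOT_DIR
 export PERL5LIB=$ENSEMBL_CVS_ROOT_DIR/ensembl/modules:$ENSEMBL_CVS_ROOT_DIR/ensembl-hive/modules:$ENSEMBL_CVS_ROOT_DIR/bioperl:$PERL5LIB
 # faToTwoBit and xdformat must also be on PATH unless custom locations are given
 export PATH=$ENSEMBL_CVS_ROOT_DIR/ensembl-hive/scripts:$PATH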
</code></pre><h2 id="CommandLineArguments">Command Line Arguments</h2><p>Where the <strong>Multiple Supported</strong> column says Yes, the parameter may be given more than once on the command line. For example, <code>-species</code> is one of these options:</p><pre><code>-species human -species cele -species yeast
</code></pre><table><tr><th>Name </th><th> Type</th><th>Multiple Supported</th><th> Description</th><th>Default</th><th> Required</th></tr><tr><td><code>-registry</code></td><td>String</td><td>No</td><td>Location of the Ensembl registry to use with this pipeline</td><td>-</td><td><strong>YES</strong></td></tr><tr><td><code>-base_path</code></td><td>String</td><td>No</td><td>Location of the dumps</td><td>-</td><td><strong>YES</strong></td></tr><tr><td><code>-pipeline_db -host=</code></td><td>String</td><td>No</td><td>Specify a host for the hive database e.g. <code>-pipeline_db -host=myserver.mysql</code></td><td>See hive generic config</td><td><strong>YES</strong></td></tr><tr><td><code>-pipeline_db -dbname=</code></td><td>String</td><td>No</td><td>Specify a different database to use as the hive DB e.g. <code>-pipeline_db -dbname=my_dumps_test</code></td><td>Uses pipeline name by default</td><td><strong>NO</strong></td></tr><tr><td><code>-ftp_dir</code></td><td>String</td><td>No</td><td>Location of the current FTP directory with the previous release’s files. We will use this to copy DNA files from one release to another. If not given we do not do any reuse</td><td>-</td><td><strong>NO</strong></td></tr><tr><td><code>-species</code></td><td>String</td><td>Yes</td><td>Specify one or more species to process. Pipeline will only <em>consider</em> these species. Use <strong>-force_species</strong> if you want to force a species run</td><td>-</td><td><strong>NO</strong></td></tr><tr><td><code>-force_species</code></td><td>String</td><td>Yes</td><td>Specify one or more species to force through the pipeline. This is useful to force a dump when you know reuse will not do the <em>"right thing"</em></td><td>-</td><td><strong>NO</strong></td></tr><tr><td><code>-dump_types</code></td><td>String</td><td>Yes</td><td>Specify each type of dump you want to produce. 
Supported values are <strong>dna</strong>, <strong>cdna</strong> and <strong>ncrna</strong></td><td>All</td><td><strong>NO</strong></td></tr><tr><td><code>-db_types</code></td><td>String</td><td>Yes</td><td>The database types to use. Supports the normal db adaptor groups e.g. <strong>core</strong>, <strong>otherfeatures</strong> etc.</td><td>core</td><td><strong>NO</strong></td></tr><tr><td><code>-release</code></td><td>Integer</td><td>No</td><td>The release to dump</td><td>Software version</td><td><strong>NO</strong></td></tr><tr><td><code>-previous_release</code></td><td>Integer</td><td>No</td><td>The previous release to use. Used to calculate reuse</td><td>Software version minus 1</td><td><strong>NO</strong></td></tr><tr><td><code>-blast_servers</code></td><td>String</td><td>Yes</td><td>The servers to copy blast indexes to</td><td>-</td><td><strong>NO</strong></td></tr><tr><td><code>-blast_genomic_dir</code></td><td>String</td><td>No</td><td>Location to copy the DNA blast indexes to</td><td>-</td><td><strong>NO</strong></td></tr><tr><td><code>-blast_genes_dir</code></td><td>String</td><td>No</td><td>Location to copy the DNA gene (cdna, ncrna and protein) indexes to</td><td>-</td><td><strong>NO</strong></td></tr><tr><td><code>-scp_user</code></td><td>String</td><td>No</td><td>User to perform the SCP as. Defaults to the current user</td><td>Current user</td><td><strong>NO</strong></td></tr><tr><td><code>-scp_identity</code></td><td>String</td><td>No</td><td>The SSH identity file to use when performing SCPs. 
Normally used in conjunction with <strong>-scp_user</strong></td><td>-</td><td><strong>NO</strong></td></tr><tr><td><code>-no_scp</code></td><td>Boolean</td><td>No</td><td>Skip SCP altogether</td><td>0</td><td><strong>NO</strong></td></tr><tr><td><code>-pipeline_name</code></td><td>String</td><td>No</td><td>Name to use for the pipeline</td><td>fasta_dump_$release</td><td><strong>NO</strong></td></tr><tr><td><code>-wublast_exe</code></td><td>String</td><td>No</td><td>Location of the WUBlast indexing binary</td><td>xdformat</td><td><strong>NO</strong></td></tr><tr><td><code>-blat_exe</code></td><td>String</td><td>No</td><td>Location of the Blat indexing binary</td><td>faToTwoBit</td><td><strong>NO</strong></td></tr><tr><td><code>-port_offset</code></td><td>Integer</td><td>No</td><td>The offset of the ports to use when generating blat indexes. This figure is added onto the web database species ID</td><td>30000</td><td><strong>NO</strong></td></tr><tr><td><code>-email</code></td><td>String</td><td>No</td><td>Email address to send pipeline summaries to on successful completion</td><td>$USER@sanger.ac.uk</td><td><strong>NO</strong></td></tr></table><h2 id="ExampleCommands">Example Commands</h2><h3 id="Toloadusenormally">To load for normal use:</h3><pre><code> init_pipeline.pl Bio::EnsEMBL::Pipeline::PipeConfig::FASTA_conf \
</code></pre><h3 id="Runasubsetofspeciesnoforcingsupportsregistryaliases">Run a subset of species (no forcing &amp; supports registry aliases):</h3><pre><code> init_pipeline.pl Bio::EnsEMBL::Pipeline::PipeConfig::FASTA_conf \