Matthieu Muffato committed
<?xml version="1.0" ?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>beekeeper.pl</title>
<meta http-equiv="content-type" content="text/html; charset=utf-8" />
<link rev="made" href="mailto:root@localhost" />
</head>

<body style="background-color: white">



<h1 id="NAME">NAME</h1>

<pre><code>    beekeeper.pl</code></pre>

<h1 id="DESCRIPTION">DESCRIPTION</h1>

<pre><code>    The Beekeeper is in charge of interfacing between the Queen and a compute resource or &#39;compute farm&#39;.
    Its job is to initialize/sync the eHive database (via the Queen), ask the Queen whether any workers are needed,
    and send the requested number of workers to open machines via the runWorker.pl script.

    It is also responsible for interfacing with the Queen to identify workers which died
    unexpectedly so that she can free the dead workers and reclaim unfinished jobs.</code></pre>

<h1 id="USAGE-EXAMPLES">USAGE EXAMPLES</h1>

<pre><code>        # Usually run after the pipeline has been created to calculate the internal statistics necessary for eHive functioning
    beekeeper.pl -url mysql://username:secret@hostname:port/ehive_dbname -sync

        # Do not run any additional Workers, just check for the current status of the pipeline:
    beekeeper.pl -url mysql://username:secret@hostname:port/ehive_dbname

        # Run the pipeline in automatic mode (-loop), run all the workers locally (-meadow_type LOCAL) and allow for 3 parallel workers (-total_running_workers_max 3)
    beekeeper.pl -url mysql://username:secret@hostname:port/long_mult_test -meadow_type LOCAL -total_running_workers_max 3 -loop

        # Run in automatic mode, but restrict it to running only the &#39;fast_blast&#39; analysis
    beekeeper.pl -url mysql://username:secret@hostname:port/long_mult_test -logic_name fast_blast -loop

        # Restrict the normal execution to one iteration only - can be used for testing a newly set up pipeline
    beekeeper.pl -url mysql://username:secret@hostname:port/long_mult_test -run

        # Reset failed &#39;buggy_analysis&#39; jobs to &#39;READY&#39; state, so that they can be run again
    beekeeper.pl -url mysql://username:secret@hostname:port/long_mult_test -reset_failed_jobs_for_analysis buggy_analysis

        # Do a cleanup: find and bury dead workers, reclaim their jobs
    beekeeper.pl -url mysql://username:secret@hostname:port/long_mult_test -dead</code></pre>
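<p>The examples above suggest a natural order for a newly created pipeline: synchronise, try a single iteration, then switch to automatic mode. A minimal sketch of that sequence follows. The helper name <code>beekeeper_sequence</code> is hypothetical, and it only prints the commands instead of running them.</p>

```shell
#!/bin/sh
# Hypothetical helper: prints (does not execute) a typical beekeeper.pl sequence.
# The URL is the placeholder used in the usage examples.
beekeeper_sequence() {
    url="$1"
    # 1. Calculate the internal statistics after the pipeline has been created.
    echo "beekeeper.pl -url $url -sync"
    # 2. Run one iteration only, to test the newly set up pipeline.
    echo "beekeeper.pl -url $url -run"
    # 3. Switch to automatic mode.
    echo "beekeeper.pl -url $url -loop"
}

beekeeper_sequence "mysql://username:secret@hostname:port/long_mult_test"
```

<p>Printing the commands first makes it easy to review them before pasting them into a terminal.</p>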

<h1 id="OPTIONS">OPTIONS</h1>

<h2 id="Connection-parameters">Connection parameters</h2>

<pre><code>    -reg_conf &lt;path&gt;       : path to a Registry configuration file
    -reg_type &lt;string&gt;     : type of the registry entry (&#39;hive&#39;, &#39;core&#39;, &#39;compara&#39;, etc - defaults to &#39;hive&#39;)
    -reg_alias &lt;string&gt;    : species/alias name for the Hive DBAdaptor
    -url &lt;url string&gt;      : URL defining where the hive database is located</code></pre>
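<p>The usage examples write the <code>-url</code> value as <code>mysql://username:secret@hostname:port/dbname</code>. As a sketch, the URL can be assembled from its parts in the shell. The helper name <code>ehive_url</code> and all values are placeholders, and 3306 is assumed only because it is the default MySQL port.</p>

```shell
#!/bin/sh
# Hypothetical helper: assemble an eHive database URL from its parts.
# Arguments: username password hostname port dbname
ehive_url() {
    printf 'mysql://%s:%s@%s:%s/%s\n' "$1" "$2" "$3" "$4" "$5"
}

# Placeholder values; substitute the credentials of your own database.
url="$(ehive_url username secret hostname 3306 ehive_dbname)"

# Print the resulting status-check command instead of running it.
echo "beekeeper.pl -url $url"
```

<p>Keeping the URL in one variable avoids retyping the credentials for every beekeeper.pl call.</p>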

<h2 id="Configs-overriding">Configs overriding</h2>

<pre><code>    -config_file &lt;string&gt;  : JSON file (given as an absolute path) that overrides the default configuration (more than one may be given)</code></pre>

<h2 id="Looping-control">Looping control</h2>

<pre><code>    -loop                  : run autonomously, loops and sleeps
    -max_loops &lt;num&gt;       : perform at most this number of loops in autonomous mode
    -keep_alive            : do not stop when there are no more jobs to do - carry on looping
    -job_id &lt;job_id&gt;       : run 1 iteration for this job_id
    -run                   : run 1 iteration of automation loop
    -sleep &lt;num&gt;           : when looping, sleep &lt;num&gt; minutes (default 1 min)</code></pre>
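<p>To illustrate how <code>-run</code>, <code>-max_loops</code> and <code>-sleep</code> relate, the sketch below approximates <code>-loop -max_loops N -sleep M</code> with repeated single iterations driven from the shell. This is not how beekeeper.pl is implemented, and unlike <code>-loop</code> it does not notice when no jobs are left. The names <code>external_loop</code> and <code>RUNNER</code> are hypothetical; by default the commands are only printed.</p>

```shell
#!/bin/sh
# RUNNER is a hypothetical hook: the default "echo" prints each command
# instead of executing it. Setting RUNNER=env would execute it for real.
RUNNER="${RUNNER:-echo}"

# Arguments: url max_loops sleep_minutes (whole minutes in this sketch)
external_loop() {
    url="$1"
    max_loops="$2"
    sleep_minutes="$3"
    i=0
    while [ "$i" -lt "$max_loops" ]; do
        # One iteration of the automation loop.
        $RUNNER beekeeper.pl -url "$url" -run
        i=$((i + 1))
        # -sleep is given in minutes, the shell's sleep takes seconds.
        sleep $((sleep_minutes * 60))
    done
}

# Three iterations with no pause, so the demonstration finishes immediately.
external_loop "mysql://username:secret@hostname:port/long_mult_test" 3 0
```

<p>In practice <code>-loop</code> is the simpler choice; the sketch only makes the meaning of the individual options explicit.</p>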

<h2 id="Current-Meadow-control">Current Meadow control</h2>

<pre><code>    -meadow_type &lt;string&gt;               : the desired Meadow class name, such as &#39;LSF&#39; or &#39;LOCAL&#39;
    -total_running_workers_max &lt;num&gt;    : max # workers to be running in parallel
    -submit_workers_max &lt;num&gt;           : max # workers to create per loop iteration
    -submission_options &lt;string&gt;        : passes &lt;string&gt; to the Meadow submission command as &lt;options&gt; (formerly lsf_options)
    -submit_log_dir &lt;dir&gt;               : record submission output+error streams into files under the given directory (to see why some workers fail after submission)</code></pre>
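<p>Because <code>-submission_options</code> takes a single string, a value that contains spaces has to be quoted in the shell so that it arrives as one argument. The sketch below demonstrates the difference with a throwaway helper, <code>count_args</code> (hypothetical). The LSF options shown are illustrative values and do not come from this manual.</p>

```shell
#!/bin/sh
# Illustration only: report how many arguments a command would receive.
count_args() { echo "$#"; }

# Illustrative LSF options (queue name and memory limit are assumptions).
opts="-q long -M 4000"

# Quoted: the flag plus one value, so 2 arguments.
count_args -submission_options "$opts"

# Unquoted: the value is split into four words, so 5 arguments.
count_args -submission_options $opts
```

<p>With the quoted form, a full call might look like <code>beekeeper.pl -url mysql://username:secret@hostname:port/long_mult_test -meadow_type LSF -submission_options "-q long -M 4000" -loop</code>.</p>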

<h2 id="Worker-control">Worker control</h2>

<pre><code>    -job_limit &lt;num&gt;            : #jobs to run before worker can die naturally
    -life_span &lt;num&gt;            : life_span limit for each worker
    -logic_name &lt;string&gt;        : restrict the pipeline stat/runs to this analysis logic_name
    -retry_throwing_jobs &lt;0|1&gt;  : if a job dies *knowingly*, should we retry it by default?
    -can_respecialize &lt;0|1&gt;     : allow workers to re-specialize into another analysis (within resource_class) after their previous analysis was exhausted
    -hive_log_dir &lt;path&gt;        : directory where stdout/stderr of the hive is redirected
    -debug &lt;debug_level&gt;        : set debug level of the workers</code></pre>

<h2 id="Other-commands-options">Other commands/options</h2>

<pre><code>    -help                  : print this help
    -versions              : report both Hive code version and Hive database schema version
    -dead                  : detect all unaccounted dead workers and reset their jobs for resubmission
    -alldead               : tell the database all workers are dead (no checks are performed in this mode, so be very careful!)
    -balance_semaphores    : set all semaphore_counts to the numbers of unDONE fan jobs (emergency use only)
    -no_analysis_stats     : don&#39;t show status of each analysis
    -worker_stats          : show status of each running worker
    -failed_jobs           : show all failed jobs
    -reset_job_id &lt;num&gt;    : reset a job back to READY so it can be rerun
    -reset_failed_jobs_for_analysis &lt;logic_name&gt;
                           : reset FAILED jobs of an analysis back to READY so they can be rerun
    -reset_all_jobs_for_analysis &lt;logic_name&gt;
                           : reset ALL jobs of an analysis back to READY so they can be rerun</code></pre>
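<p>Several of these options are often used together when recovering a pipeline. As a sketch, the helper below (hypothetical name <code>recovery_commands</code>) prints a clean-up command followed by one reset command per analysis. The analysis names are the ones used in the usage examples, and nothing is executed.</p>

```shell
#!/bin/sh
# Hypothetical helper: print a clean-up and retry sequence for several analyses.
# Arguments: url logic_name [logic_name ...]
recovery_commands() {
    url="$1"
    shift
    # First detect dead workers so that their jobs are reset for resubmission.
    echo "beekeeper.pl -url $url -dead"
    # Then reset the FAILED jobs of each named analysis back to READY.
    for logic_name in "$@"; do
        echo "beekeeper.pl -url $url -reset_failed_jobs_for_analysis $logic_name"
    done
}

recovery_commands "mysql://username:secret@hostname:port/long_mult_test" buggy_analysis fast_blast
```

<p>Reviewing the printed list before running it is a cheap safeguard, since resetting jobs changes the state stored in the database.</p>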

<h1 id="LICENSE">LICENSE</h1>

<pre><code>    Copyright [1999-2014] Wellcome Trust Sanger Institute and the EMBL-European Bioinformatics Institute

    Licensed under the Apache License, Version 2.0 (the &quot;License&quot;); you may not use this file except in compliance with the License.
    You may obtain a copy of the License at

         http://www.apache.org/licenses/LICENSE-2.0

    Unless required by applicable law or agreed to in writing, software distributed under the License
    is distributed on an &quot;AS IS&quot; BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    See the License for the specific language governing permissions and limitations under the License.</code></pre>

<h1 id="CONTACT">CONTACT</h1>

<pre><code>    Please subscribe to the Hive mailing list:  http://listserver.ebi.ac.uk/mailman/listinfo/ehive-users  to discuss Hive-related questions or to be notified of our updates</code></pre>


</body>

</html>