Commit aad3c9c8 authored by Matthieu Muffato

Many corrections suggested by @ens-emily

parent 82fae692
@@ -5,10 +5,10 @@ Runnable API
eHive exposes an interface for Runnables (jobs) to interact with the
system:
-- query their own parameters. See :ref:`parameters-in-jobs`
-- control its own execution and report issues
-- run system commands
-- trigger some *dataflow* events (e.g. create new jobs)
+- query their own parameters (see :ref:`parameters-in-jobs`),
+- control its own execution and report issues,
+- run system commands,
+- trigger some *dataflow* events (e.g. create new jobs).
Reporting and logging
@@ -18,7 +18,7 @@ Jobs can log messages to the standard output with the
``$self->say_with_header($message, $important)`` method. However they are only printed
when the *debug* mode is enabled (see below) or when the ``$important`` flag is switched on.
They will also be prefixed with a standard prefix consisting of the
-runtime context (worker, role, job).
+runtime context (Worker, Role, Job).
The debug mode is controlled by the ``--debug X`` option of
:ref:`script-beekeeper` and :ref:`script-runWorker`. *X* is an integer,
@@ -29,16 +29,16 @@ check whether it is 0 or not.
(so that the messages are printed on the standard output) but also stores
them in the database (in the ``log_message`` table).
-To indicate that a job has to be terminated earlier (i.e. before reaching
+To indicate that a Job has to be terminated earlier (i.e. before reaching
the end of ``write_output``), you can call:
-- ``$self->complete_early($message)`` to mark the job as *DONE*
+- ``$self->complete_early($message)`` to mark the Job as *DONE*
(successful run) and record the message in the database. Beware that this
will trigger the *autoflow*.
- ``$self->complete_early($message, $branch_code)`` is a variation of the
above that will replace the autoflow (branch 1) with a dataflow on the
-branch given
-- ``$self->throw($message)`` to log a failed attempt. The job may be given
+branch given.
+- ``$self->throw($message)`` to log a failed attempt. The Job may be given
additional retries following the analysis' *max_retry_count* parameter,
or is marked as *FAILED* in the database.
@@ -59,7 +59,7 @@ around this method).
meta-characters and delimiters such as ``>`` (to redirect the output to a
file), ``;`` (to separate two commands that have to be run sequentially)
or ``|`` (a pipe) and will be quoted and joined and passed to ``system``
-as a single string
+as a single string.
#. A hashref of options. Accepted options are:
- ``use_bash_pipefail``: Normally, the exit status of a pipeline (e.g.
@@ -67,27 +67,27 @@ around this method).
errors in the first command are not captured. With the option turned
on, the exit status of the pipeline will capture errors in any command
of the pipeline, and will only be 0 if *all* the commands exit
-successfully
+successfully.
- ``use_bash_errexit``: Exit immediately if a command fails. This is
mostly useful for cases like ``cmd1; cmd2`` where by default, ``cmd2``
-would always be executed, regardless of the exit status of ``cmd1``
+would always be executed, regardless of the exit status of ``cmd1``.
- ``timeout``: the maximum number of seconds the command is allowed to
run for. The exit status will be set to -2 if the command had to be
-aborted
+aborted.
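
The two bash-related options above presumably map onto bash's own ``pipefail`` and ``errexit`` shell options (an assumption based on their names; this diff does not show the implementation). As a minimal sketch in plain bash, independent of eHive, the difference in exit status looks like this:

```shell
# Plain bash illustration, not eHive code.

# Default behaviour: the pipeline's exit status is that of the last command,
# so the failure of `false` goes unnoticed.
default_status=0
bash -c 'false | true' || default_status=$?

# With pipefail, a failure anywhere in the pipeline makes the status non-zero.
pipefail_status=0
bash -c 'set -o pipefail; false | true' || pipefail_status=$?

# Default behaviour for `cmd1; cmd2`: cmd2 runs even though cmd1 failed.
sequence_output=$(bash -c 'false; echo cmd2-ran')

# With errexit, the shell stops at the first failing command.
errexit_status=0
errexit_output=$(bash -c 'set -o errexit; false; echo cmd2-ran') || errexit_status=$?

echo "pipeline: default=$default_status pipefail=$pipefail_status"
echo "sequence: default='$sequence_output' errexit='$errexit_output' (status $errexit_status)"
```

Here ``false`` and ``true`` merely stand in for a failing and a succeeding command.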
During their execution, jobs may certainly have to use temporary files.
eHive provides a directory that will exist throughout the lifespan of the
-worker with the ``$self->worker_temp_directory`` method. The directory is created
-the first time the method is called, and deleted when the worker ends. It is the Runnable's
+Worker with the ``$self->worker_temp_directory`` method. The directory is created
+the first time the method is called, and deleted when the Worker ends. It is the Runnable's
responsibility to leave the directory in a clean-enough state for the next
-job (by removing some files, for instance), or to clean it up completely
+Job (by removing some files, for instance), or to clean it up completely
with ``$self->cleanup_worker_temp_directory``.
By default, this directory will be put under /tmp, but it can be overridden
by adding a ``worker_temp_directory_name`` method to the runnable. This can
be used to:
-- use a faster filesystem (although /tmp is usually local to the machine)
+- use a faster filesystem (although /tmp is usually local to the machine),
- use a network filesystem (needed for distributed applications, e.g. over
MPI). See :ref:`worker_temp_directory_name-mpi` in the :ref:`howto-mpi` section.
@@ -107,9 +107,9 @@ $branch_number)`` method.
The payload ``$data`` must be of one of these types:
-- Hash-reference that maps parameter names (strings) to their values.
-- Array-reference of hash-references of the above type
-- ``undef`` to propagate the job's input_id
+- hash-reference that maps parameter names (strings) to their values,
+- array-reference of hash-references of the above type,
+- ``undef`` to propagate the job's input_id.
The branch number defaults to 1 and can be skipped. Generally speaking, it
has to be an integer.
@@ -121,6 +121,6 @@ to easily generate events. The method takes two arguments:
#. The path to a file containing one JSON object per line. Each line can be
prefixed with a branch number (and some whitespace), which will override
the default branch number.
-#. The default branch number (defaults to 1 too)
+#. The default branch number (defaults to 1 too).
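
The file format described above can be illustrated with a small sample. The parameter names, the values and the parsing loop below are hypothetical. They only mimic the described format in plain shell and are not eHive's own parser. The sketch also assumes non-negative branch numbers.

```shell
# Hypothetical sample file: one JSON object per line, optionally prefixed
# with a branch number and some whitespace.
default_branch=1
events_file=$(mktemp)
cat > "$events_file" <<'EOF'
{"species": "human"}
2 {"species": "mouse"}
3   {"species": "zebrafish", "chunk": 7}
EOF

# Split each line into (branch, payload); lines without a numeric prefix
# fall back to the default branch.
parsed=$(
    while IFS= read -r line; do
        case $line in
            [0-9]*)
                branch=${line%%[!0-9]*}                          # leading digits
                payload=${line#"$branch"}                        # drop the prefix
                payload=${payload#"${payload%%[![:space:]]*}"}   # drop the whitespace
                ;;
            *)
                branch=$default_branch
                payload=$line
                ;;
        esac
        printf 'branch=%s payload=%s\n' "$branch" "$payload"
    done < "$events_file"
)
rm -f "$events_file"

printf '%s\n' "$parsed"
```

In this sample, the first line would go to the default branch, while the other two name their branch explicitly.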