eHive installation and setup

eHive dependencies

The eHive system depends on the following components, which you may need to download and install first:
  1. Perl 5.14 or higher
  2. A database engine of your choice. eHive keeps its state in a database, so you will need
    1. a server installed on the machine where you want to maintain the state of your pipeline(s) and
    2. clients installed on the machines where the jobs are to be executed.
    At the moment, the following database options are available:
  3. Perl DBI API version 1.6 or higher -- the Perl database interface, which must include a driver for the database engine chosen above.
  4. Perl libraries for visualisation (optional but recommended). They can be found on CPAN:
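
A quick way to see whether the Perl-side requirements listed above are already met is to ask Perl directly. The following is a minimal sketch, assuming a POSIX shell with perl on the PATH; the variable names perl_status, dbi_status and dbi_drivers are illustrative only. It checks the two version thresholds named above and lists whichever DBI drivers happen to be installed, without assuming any particular database engine.

```shell
# Perl 5.14 or higher: "use 5.014" aborts on older interpreters.
if perl -e 'use 5.014;' 2>/dev/null; then
    perl_status="OK ($(perl -e 'print $^V'))"
else
    perl_status="missing or older than 5.14"
fi

# DBI 1.6 or higher: "use DBI 1.6" aborts if DBI is absent or too old.
if perl -e 'use DBI 1.6;' 2>/dev/null; then
    dbi_status="OK ($(perl -MDBI -e 'print $DBI::VERSION'))"
    # List the database drivers DBI can see on this machine.
    dbi_drivers="$(perl -MDBI -e 'print join(", ", DBI->available_drivers)')"
else
    dbi_status="missing or older than 1.6"
    dbi_drivers="(unknown)"
fi

echo "Perl:        $perl_status"
echo "DBI:         $dbi_status"
echo "DBI drivers: $dbi_drivers"
```

If the driver for your chosen database engine does not appear in the last line, it still needs to be installed.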

Installing eHive code

Check out the repository by cloning it from GitHub.

All eHive pipelines require the ensembl-hive repository, which can be found on GitHub. It is therefore assumed that Git is installed on your system; if it is not, install it before continuing.

To download the repository, move to a suitable directory and run the following on the command line:

        git clone -b version/2.2

This will create an ensembl-hive directory containing all the code and documentation.
If you cd into the ensembl-hive directory and run ls, you should see something like the following:

        Changelog  docs  hive_config.json  modules  scripts  sql  t

The major directories here are:
  modules -- contains all the eHive modules, which are written in Perl
  scripts -- contains various scripts that are key to initialising, running and debugging the pipeline
  sql -- contains the SQL used to build a standard pipeline database
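
To confirm that a clone is complete, one could test for the three directories described above. This is a minimal sketch; the function name check_ehive_checkout is hypothetical and not part of eHive, and the path passed to it is assumed to be wherever the repository was cloned.

```shell
# Hypothetical helper (not part of eHive): succeeds only if the given
# directory contains the modules, scripts and sql subdirectories.
check_ehive_checkout() {
    checkout_dir="$1"
    for sub in modules scripts sql; do
        if [ ! -d "$checkout_dir/$sub" ]; then
            echo "missing directory: $checkout_dir/$sub" >&2
            return 1
        fi
    done
    echo "checkout looks complete: $checkout_dir"
}

# Example call, assuming the clone is in the current directory.
check_ehive_checkout ./ensembl-hive || echo "checkout is incomplete or not found"
```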

Optional configuration of the system:

You may find it convenient (although it is not necessary) to add "ensembl-hive/scripts" to your $PATH variable to make it easier to run the eHive scripts.

Also, if you are developing the code and not just running ready-made pipelines, you may find it convenient to add "ensembl-hive/modules" to your $PERL5LIB variable.
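
Both optional settings could be made in a shell start-up file such as ~/.bashrc. The following is a sketch only: the variable EHIVE_DIR is an illustrative name, and its value assumes the repository was cloned into your home directory, so adjust it to your own location.

```shell
# Assumed clone location; change this to where ensembl-hive actually lives.
EHIVE_DIR="$HOME/ensembl-hive"

# Make the eHive scripts runnable by name.
export PATH="$EHIVE_DIR/scripts:$PATH"

# Only needed for development: let Perl find the eHive modules.
# The ${VAR:+...} form avoids a trailing colon when PERL5LIB is unset.
export PERL5LIB="$EHIVE_DIR/modules${PERL5LIB:+:$PERL5LIB}"
```

Prepending, rather than appending, means this checkout takes precedence over any other copy of the scripts or modules already on those paths.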