diff --git a/README.md b/README.md index 215e969d88912d5363290f9b2e7f02157834a7f5..f71c99babec4772f197551ee04de07ddbd450014 100644 --- a/README.md +++ b/README.md @@ -1,9 +1,9 @@ -EnsEMBL Hive -============ +eHive +===== -EnsEMBL Hive is a system for running computation pipelines on distributed computing resources - clusters, farms or grids. +eHive is a system for running computation pipelines on distributed computing resources - clusters, farms or grids. -The name "Hive" comes from the way pipelines are processed by a swarm of autonomous agents. +The name comes from the way pipelines are processed by a swarm of autonomous agents. Blackboard, Jobs and Workers ---------------------------- @@ -28,7 +28,7 @@ However in some sense an Analysis also acts as a "container" for them. PipeConfig file defines Analyses and dependency rules of the pipeline --------------------------------------------------------------------- -Hive pipeline databases are molded according to PipeConfig files which are Perl modules conforming to a special interface. +eHive pipeline databases are molded according to PipeConfig files which are Perl modules conforming to a special interface. A PipeConfig file defines the stucture of the pipeline, which is a graph whose nodes are Analyses (with their code, parameters and resource requirements) and edges are various dependency rules: * Dataflow rules define how data that flows out of an Analysis can be used to trigger creation of Jobs in other Analyses @@ -44,12 +44,17 @@ There are also other parameters of Analyses that control, for example: * what should be autimatically done with a Job if it needs more memory/time, etc. 
+Available documentation +----------------------- +* eHive dependencies, installation and setup [(on GitHub)](http://htmlpreview.github.io/?https://github.com/Ensembl/ensembl-hive/blob/master/docs/install.html) [(local)](docs/install.html) +* eHive database schema [(on GitHub)](http://htmlpreview.github.io/?https://github.com/Ensembl/ensembl-hive/blob/master/docs/hive_schema.html) [(local)](docs/hive_schema.html) + Contact us (mailing list) ------------------------- -EnsEMBL Hive was originally conceived and used within EnsEMBL Compara group +eHive was originally conceived and used within the EnsEMBL Compara group for running Comparative Genomics pipelines, but since then it has been separated into a separate software tool and is used in many projects both in Genome Campus, Cambridge and outside. -There is a Hive users' mailing list for questions, suggestions, discussions and announcements. +There is an eHive users' mailing list for questions, suggestions, discussions and announcements. To subscribe to it please visit: http://listserver.ebi.ac.uk/mailman/listinfo/ehive-users diff --git a/docs/install.html b/docs/install.html new file mode 100644 index 0000000000000000000000000000000000000000..beef43a0333fc9e7bc66bac4885dde2a7208f90e --- /dev/null +++ b/docs/install.html @@ -0,0 +1,131 @@ +<html> + <head> + <title>eHive installation and setup</title> + </head> +<body> + + <center><h1>eHive installation and setup</h1></center> + <hr width=50% /> + + <center><h2>eHive dependencies</h2></center> + + The eHive system depends on the following components that you may need to download and install first: + <ol> + <li>Perl 5.10 <a href=http://www.perl.org/get.html>or higher</a></li> + <li>A database engine of your choice. 
eHive keeps its state in a database, so you will need + <ol> + <li>a server installed on the machine where you want to maintain the state of your pipeline(s) and</li> + <li>clients installed on the machines where the jobs are to be executed.</li> + </ol> + At the moment, the following database options are available: + <ul> + <li>MySQL 5.1 <a href=http://dev.mysql.com/downloads/>or higher</a></li> + <li>SQLite 3.6 <a href=http://www.sqlite.org/download.html>or higher</a></li> + <li>PostgreSQL 9.2 <a href=http://www.postgresql.org/download/>or higher</a></li> + </ul> + </li> + <li>Perl DBI API version 1.6 <a href=http://dbi.perl.org/>or higher</a> -- + the Perl database interface, which must include a driver for the database engine you chose above. + </li> + <li>Perl libraries for visualisation (optional but recommended). They can be found on CPAN: + <ul> + <li><a href=http://search.cpan.org/~rsavage/GraphViz/lib/GraphViz.pm>GraphViz</a> (needed for generate_graph.pl and the GUI)</li> + <li><a href=http://search.cpan.org/dist/Chart-Gnuplot/lib/Chart/Gnuplot.pm>Chart::Gnuplot</a> (needed for generate_timeline.pl)</li> + </ul> + </li> + + + </ol> + + <hr width=50% /> + + <center><h2>Installing EnsEMBL core and eHive code</h2></center> + +<h3>Create a directory for EnsEMBL repositories:</h3> + +It is advisable to keep a dedicated directory where EnsEMBL-related packages will be deployed. +DBI modules can be installed system-wide by the system administrator, +but you will benefit from full (read+write) access to the EnsEMBL files/directories, +so it is best to install them under your home directory. 
For example, + +<pre> + $ mkdir $HOME/ensembl_main +</pre> + +<h3>Set a variable pointing at this directory:</h3> + +<ul> +<li><i>using bash syntax:</i> +<pre> + $ export ENSEMBL_CVS_ROOT_DIR="$HOME/ensembl_main"<i> + # + # (for best results, append this line to your ~/.bashrc or ~/.bash_profile configuration file)</i> +</pre></li> + +<li><i>using [t]csh syntax:</i> +<pre> + $ setenv ENSEMBL_CVS_ROOT_DIR "$HOME/ensembl_main"<i> + # + # (for best results, append this line to your ~/.cshrc or ~/.tcshrc configuration file)</i> +</pre></li> +</ul> + +<h3>Change into your EnsEMBL codebase directory:</h3> + +<pre> + $ cd $ENSEMBL_CVS_ROOT_DIR +</pre> + +<h3>Check out the EnsEMBL repositories by cloning them from GitHub:</h3> + +<ol> +<li><i>EnsEMBL core API:</i> +<pre> + $ git clone https://github.com/Ensembl/ensembl.git +</pre></li> +<li><i>eHive code:</i> +<pre> + $ git clone https://github.com/Ensembl/ensembl-hive.git +</pre></li> +</ol> + +<h3>Add new packages to the PERL5LIB variable:</h3> + +<ul> +<li><i>using bash syntax:</i> +<pre> + $ export PERL5LIB=${PERL5LIB}:${ENSEMBL_CVS_ROOT_DIR}/ensembl/modules + $ export PERL5LIB=${PERL5LIB}:${ENSEMBL_CVS_ROOT_DIR}/ensembl-hive/modules<i> + # + # (for best results, append these lines to your ~/.bashrc or ~/.bash_profile configuration file)</i> +</pre></li> + +<li><i>using [t]csh syntax:</i> +<pre> + $ setenv PERL5LIB ${PERL5LIB}:${ENSEMBL_CVS_ROOT_DIR}/ensembl/modules + $ setenv PERL5LIB ${PERL5LIB}:${ENSEMBL_CVS_ROOT_DIR}/ensembl-hive/modules<i> + # + # (for best results, append these lines to your ~/.cshrc or ~/.tcshrc configuration file)</i> +</pre></li> +</ul> + +<h3>Add eHive scripts' path to the PATH variable (optional but useful):</h3> + +<ul> +<li><i>using bash syntax:</i> +<pre> + $ export PATH=$PATH:$ENSEMBL_CVS_ROOT_DIR/ensembl-hive/scripts<i> + # + # (for best results, append this line to your ~/.bashrc or ~/.bash_profile configuration file)</i> +</pre></li> + +<li><i>using [t]csh syntax:</i> +<pre> + $ set path 
= ( $path ${ENSEMBL_CVS_ROOT_DIR}/ensembl-hive/scripts )<i> + # + # (for best results, append this line to your ~/.cshrc or ~/.tcshrc configuration file)</i> +</pre></li> +</ul> + +</body> +</html>