ensembl-hive  2.1
Bio::EnsEMBL::Hive::PipeConfig::LongMult_conf Class Reference

Public Member Functions

public pipeline_create_commands ()
 
public pipeline_wide_parameters ()
 
public pipeline_analyses ()
 
Public Member Functions inherited from Bio::EnsEMBL::Hive::PipeConfig::HiveGeneric_conf
public default_options ()
 
public pipeline_create_commands ()
 
public pipeline_wide_parameters ()
 
public resource_classes ()
 
public pipeline_analyses ()
 
public beekeeper_extra_cmdline_options ()
 
public hive_meta_table ()
 
public pre_options ()
 
public dbconn_2_mysql ()
 
public dbconn_2_pgsql ()
 
public db_connect_command ()
 
public db_execute_command ()
 
public dbconn_2_url ()
 
public pipeline_url ()
 
public db_cmd ()
 
public pipeline_name ()
 
public process_options ()
 
public overridable_pipeline_create_commands ()
 
public run_pipeline_create_commands ()
 
public add_objects_from_config ()
 
public useful_commands_legend ()
 
Public Member Functions inherited from Bio::EnsEMBL::Hive::DependentOptions
public new ()
 
public use_cases ()
 
public load_cmdline_options ()
 
public root ()
 
public is_fully_substituted_string ()
 
public is_fully_substituted_structure ()
 
public hash_leaves ()
 
public o ()
 
public substitute ()
 
public merge_from_rules ()
 
public process_options ()
 

Detailed Description

Synopsis

# initialize the database and build the pipeline graph in it (this also prints the value of EHIVE_URL):
init_pipeline.pl Bio::EnsEMBL::Hive::PipeConfig::LongMult_conf -password <mypass>
# optionally also seed it with your specific values:
seed_pipeline.pl -url $EHIVE_URL -logic_name take_b_apart -input_id '{ "a_multiplier" => "12345678", "b_multiplier" => "3359559666" }'
# run the pipeline:
beekeeper.pl -url $EHIVE_URL -loop

Description

    This is the PipeConfig file for the long multiplication pipeline example.
    The main point of this pipeline is to provide an example of how to write Hive Runnables and link them together into a pipeline.

    Please refer to the Bio::EnsEMBL::Hive::PipeConfig::HiveGeneric_conf module to understand the interface implemented here.

    The setting: let's assume we are given two very long numbers to multiply,
    so long that they do not fit into the registers of the CPU and have to be multiplied digit by digit.
    For the purposes of this example we also assume this task is very computationally intensive and has to be done in parallel.

    The long multiplication pipeline consists of three "analyses" (types of tasks):
        'take_b_apart', 'part_multiply' and 'add_together', which we use to exemplify various features of the Hive.

          A 'take_b_apart' job takes in two string parameters, 'a_multiplier' and 'b_multiplier',
          takes the second one apart into digits, finds which distinct digits it contains,
          and creates several jobs of the 'part_multiply' analysis and one job of the 'add_together' analysis.

          A 'part_multiply' job takes in 'a_multiplier' and 'digit', multiplies them and accumulates the result in the 'partial_product' accumulator.

          An 'add_together' job waits for the first two analyses to complete,
          takes in 'a_multiplier', 'b_multiplier' and the 'partial_product' hash, and writes the final result into the 'final_result' table.

    Please see the Runnable modules themselves for the implementation details.
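The three steps described above can be sketched in plain Perl, outside of the Hive. This is only an illustration of the arithmetic, not the code of the actual Runnables: the subroutine names simply mirror the analysis names, the way partial products are shifted and summed is an assumption of this sketch, and Math::BigInt (a core module) stands in for whatever the Runnables do internally.

```perl
use strict;
use warnings;
use Math::BigInt;

# 'take_b_apart' step: split b_multiplier into digits and keep each distinct digit once.
sub take_b_apart {
    my ($b_multiplier) = @_;
    my %seen;
    return grep { !$seen{$_}++ } split //, $b_multiplier;
}

# 'part_multiply' step: multiply a_multiplier by a single digit.
sub part_multiply {
    my ($a_multiplier, $digit) = @_;
    return Math::BigInt->new($a_multiplier)->bmul($digit)->bstr;
}

# 'add_together' step: for every digit position of b_multiplier, look up the
# partial product for that digit, shift it by appending zeros, and sum.
sub add_together {
    my ($b_multiplier, $partial_product) = @_;    # hashref: digit => product
    my $sum    = Math::BigInt->bzero;
    my @digits = reverse split //, $b_multiplier; # least significant digit first
    for my $position (0 .. $#digits) {
        my $shifted = $partial_product->{ $digits[$position] } . ('0' x $position);
        $sum->badd(Math::BigInt->new($shifted));
    }
    return $sum->bstr;
}

# Same input values as in the seed_pipeline.pl line of the Synopsis.
my ($a_multiplier, $b_multiplier) = ('12345678', '3359559666');

my %partial_product = map { $_ => part_multiply($a_multiplier, $_) }
                      take_b_apart($b_multiplier);
my $final_result    = add_together($b_multiplier, \%partial_product);

print "$a_multiplier x $b_multiplier = $final_result\n";
```

Only one multiplication is done per distinct digit, which is why 'take_b_apart' looks for distinct digits rather than creating one job per digit position.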

Member Function Documentation

public Bio::EnsEMBL::Hive::PipeConfig::LongMult_conf::pipeline_analyses ( )
    Description : Implements the pipeline_analyses() interface method of Bio::EnsEMBL::Hive::PipeConfig::HiveGeneric_conf, which defines the structure of the pipeline: analyses, jobs, rules, etc.
                  Here it defines three analyses:
                      'take_b_apart' that is auto-seeded with a pair of jobs (to check the commutativity of multiplication).
                      Each job will dataflow (create more jobs) via branch #2 into 'part_multiply' and via branch #1 into 'add_together'.
                      'part_multiply' with jobs fed from take_b_apart#2.
                        It multiplies the input parameters 'a_multiplier' and 'digit' and dataflows the 'partial_product' parameter into branch #1.
                      'add_together' with jobs fed from take_b_apart#1.
                        It adds together results of partial multiplication computed by 'part_multiply'.
                        These results are accumulated in 'partial_product' hash.
                        Until the hash is complete, the corresponding 'add_together' job is blocked by a semaphore.
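The blocking behaviour described above can be pictured with a small stand-alone simulation. It is a sketch under stated assumptions, not the Hive's implementation: the %funnel structure, the semaphore_count field and both subroutines are hypothetical names invented here, and a simple counter stands in for the semaphore.

```perl
use strict;
use warnings;

# Hypothetical stand-in for one 'add_together' job and the semaphore guarding it.
my @digits = (3, 5, 6, 9);                    # distinct digits found by 'take_b_apart'
my %funnel = (
    semaphore_count => scalar(@digits),       # one count per pending 'part_multiply' job
    partial_product => {},                    # accumulator hash, keyed by digit
);

# The funnel job may only run once every fan job has reported back.
sub funnel_is_blocked {
    my ($funnel) = @_;
    return $funnel->{semaphore_count} > 0;
}

# Called when one 'part_multiply' job finishes: store its result, release one count.
sub fan_job_done {
    my ($funnel, $digit, $product) = @_;
    $funnel->{partial_product}{$digit} = $product;
    $funnel->{semaphore_count}--;
}

my @blocked_history;
for my $digit (@digits) {
    push @blocked_history, funnel_is_blocked(\%funnel) ? 1 : 0;
    fan_job_done(\%funnel, $digit, 12345678 * $digit);
}
my $runnable_now = funnel_is_blocked(\%funnel) ? 0 : 1;

print "blocked before each fan job: @blocked_history\n";
print "runnable after the last fan job: $runnable_now\n";
```

The point of the design is that 'add_together' never has to poll: it becomes runnable exactly when the last 'part_multiply' job has contributed its entry to the hash.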
 
public Bio::EnsEMBL::Hive::PipeConfig::LongMult_conf::pipeline_create_commands ( )
    Description : Implements the pipeline_create_commands() interface method of Bio::EnsEMBL::Hive::PipeConfig::HiveGeneric_conf, which lists the commands that will create and set up the Hive database.
                  In addition to the standard creation of the database and populating it with Hive tables and procedures, it also creates two pipeline-specific tables used by Runnables to communicate.
 
public Bio::EnsEMBL::Hive::PipeConfig::LongMult_conf::pipeline_wide_parameters ( )
    Description : Interface method that should return a hash of pipeline_wide_parameter_name->pipeline_wide_parameter_value pairs.
                  The value does not have to be a scalar; it can be any Perl structure (it will be stringified and de-stringified automatically).
                  Please see existing PipeConfig modules for examples.
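As a rough illustration of the shape such a return value can take, the sketch below returns a hash reference holding one scalar and one nested structure, then round-trips each value through a string. The parameter names are hypothetical, and Data::Dumper with a string eval is used here only as a stand-in to show the idea of stringifying and de-stringifying; it is not claimed to be the mechanism the Hive uses.

```perl
use strict;
use warnings;
use Data::Dumper;

# Hypothetical parameter names; only the name => value shape comes from the description.
sub pipeline_wide_parameters {
    return {
        'take_time'    => 1,                                          # a plain scalar
        'digit_groups' => { 'small' => [2, 3], 'large' => [8, 9] },   # a nested structure
    };
}

# Stand-in stringifier: compact, deterministic Data::Dumper output.
sub to_string {
    my ($value) = @_;
    local $Data::Dumper::Indent   = 0;
    local $Data::Dumper::Terse    = 1;
    local $Data::Dumper::Sortkeys = 1;
    return Dumper($value);
}

# Stand-in de-stringifier: string eval, acceptable only for trusted strings.
sub from_string {
    my ($string) = @_;
    return eval "my \$value = $string; \$value";
}

my $params   = pipeline_wide_parameters();
my %stored   = map { $_ => to_string($params->{$_}) } keys %$params;
my %restored = map { $_ => from_string($stored{$_}) } keys %stored;

print "$_ => $stored{$_}\n" for sort keys %stored;
```

After the round trip, %restored holds the same scalar and the same nested hash of arrays as the original, which is the property the description relies on.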
 

The documentation for this class was generated from the following file: