=pod 

=head1 NAME

    Bio::EnsEMBL::Hive::RunnableDB::LongMult::DigitFactory

=head1 SYNOPSIS

    Please refer to the Bio::EnsEMBL::Hive::PipeConfig::LongMult_conf pipeline configuration file
    to understand how this particular example pipeline is configured and run.

=head1 DESCRIPTION

    'LongMult::DigitFactory' is the first step of the LongMult example pipeline that multiplies two long numbers.

    It takes apart the second multiplier and creates several 'LongMult::PartMultiply' jobs
    that correspond to the different digits of the second multiplier.

    It also "flows into" one 'LongMult::AddTogether' job that will wait until 'LongMult::PartMultiply' jobs
    complete and will arrive at the final result.
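
    As a minimal standalone sketch of the digit-splitting step (the value of $b_multiplier below is a
    hypothetical input chosen for illustration, not one taken from the pipeline configuration),
    each distinct digit other than 0 and 1 yields one sub-task hash:

        use strict;
        use warnings;

        my $b_multiplier = '327358788';     # hypothetical input, for illustration only
        my %digit_hash   = ();
        foreach my $digit (split(//, $b_multiplier)) {
            next if (($digit eq '0') or ($digit eq '1'));   # 0 and 1 are skipped, as in fetch_input()
            $digit_hash{$digit}++;
        }

            # one parameter hash per distinct digit (sorted here only to make the order predictable):
        my @sub_tasks = map { +{ 'digit' => $_ } } sort keys %digit_hash;

        print scalar(@sub_tasks), " sub-tasks\n";   # the distinct digits kept here are 2, 3, 5, 7 and 8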

=head1 LICENSE

    Copyright [1999-2015] Wellcome Trust Sanger Institute and the EMBL-European Bioinformatics Institute
    Copyright [2016-2018] EMBL-European Bioinformatics Institute

    Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License.
    You may obtain a copy of the License at

         http://www.apache.org/licenses/LICENSE-2.0

    Unless required by applicable law or agreed to in writing, software distributed under the License
    is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    See the License for the specific language governing permissions and limitations under the License.

=head1 CONTACT

    Please subscribe to the Hive mailing list:  http://listserver.ebi.ac.uk/mailman/listinfo/ehive-users  to discuss Hive-related questions or to be notified of our updates

=cut


package Bio::EnsEMBL::Hive::RunnableDB::LongMult::DigitFactory;

use strict;

use base ('Bio::EnsEMBL::Hive::Process');


=head2 param_defaults

    Description : Implements param_defaults() interface method of Bio::EnsEMBL::Hive::Process that defines module defaults for parameters.

=cut

sub param_defaults {

    return {
        'take_time' => 0,   # how much time run() method will spend in sleeping state
    };
}


=head2 fetch_input

    Description : Implements fetch_input() interface method of Bio::EnsEMBL::Hive::Process that is used to read in parameters and load data.
                  Here the task of fetch_input() is to read in the second multiplier, split it into digits and create a set of input_ids that will be used later.

    param('b_multiplier'):  The second long number (a string of digits - doesn't have to fit a register)

    param('take_time'):     How much time to spend sleeping (seconds).

=cut

sub fetch_input {
    my $self = shift @_;

    my $b_multiplier    = $self->param_required('b_multiplier');

    my %digit_hash = ();
    foreach my $digit (split(//,$b_multiplier)) {
        next if (($digit eq '0') or ($digit eq '1'));
        $digit_hash{$digit}++;
    }

        # parameter hashes of partial multiplications to be computed:
    my @sub_tasks = map { { 'digit' => $_ } } keys %digit_hash;

        # store them for future use:
    $self->param('sub_tasks', \@sub_tasks);
}


=head2 run

    Description : Implements run() interface method of Bio::EnsEMBL::Hive::Process that is used to perform the main bulk of the job (minus input and output).
                  Here we don't have any real work to do, just input and output, so run() just spends some time waiting.

=cut

sub run {
    my $self = shift @_;

    sleep( $self->param('take_time') );
}


=head2 write_output

    Description : Implements write_output() interface method of Bio::EnsEMBL::Hive::Process that is used to deal with the job's output after execution.
                  Here we dataflow all the partial multiplication jobs whose input_ids were generated in fetch_input() into branch 2 (the "fan out"),
                  and also dataflow the original task down branch 1 (creating the "funnel job").

=cut

sub write_output {  # nothing to write out, but some dataflow to perform:
    my $self = shift @_;

    my $sub_tasks = $self->param('sub_tasks');

        # "fan out" into branch#2 first, branch#1 will be created if we wire it (and we do)
    $self->dataflow_output_id($sub_tasks, 2);

    $self->warning(scalar(@$sub_tasks).' multiplication jobs have been created');     # warning messages get recorded into 'log_message' table

## extra information sent to the funnel will extend its stack:
#    $self->dataflow_output_id( { 'different_digits' => scalar(@$sub_tasks) } , 1);
}

1;