- Nov 22, 2004
Jessica Severin authored
- Nov 20, 2004
Jessica Severin authored
Jessica Severin authored
no longer does Queen::synchronize_hive as part of the autonomous loop; the -sync option allows the user to manually trigger a hard sync. Also removed the default display of the full hive status and added a -status option which prints this full status. Also removed the adjustment of the needed worker count for 'pending' workers: LSF will sometimes leave jobs in a pending state for no apparent reason (a newly bsub'ed job will run while an older pending job stays pending), and the current 'pending' count also didn't differentiate between lsf_beekeeper-submitted jobs and manually submitted jobs. This pending adjustment isn't a critical subsystem, so I've removed it for now. If a runWorker starts (after a long pend) and there is no work left, it will die immediately. I may rewrite a smarter 'pending' adjustment in the future.
Jessica Severin authored
and distributed manner as it interacts with the workers over the course of its life. When a runWorker.pl script starts and asks a queen to create a worker, the queen has a list of known analyses which are 'above the surface', where full hive analysis has been done and the number of needed workers has been calculated. A full sync requires joining data between the analysis, analysis_job, analysis_stats, and hive tables. When this reached 10e7 jobs, 10e4 analyses, and 10e3 workers, a full hard sync took minutes, and it was clear this bit of the system wasn't scaling and wasn't going to make it to the next order of magnitude. This occurred in the compara blastz pipeline between mouse and rat.

Now there are some analyses 'below the surface' that have partial synchronization. These analyses have been flagged as having 'x' new jobs (AnalysisJobAdaptor updating analysis_stats on job insert). If no analysis is found to assign to the newly created worker, the queen will dip below the surface and start checking the analyses with the highest probability of needing the most workers. This incremental sync is also done in Queen::get_num_needed_workers: when calculating ahead a total worker count, this routine will also dip below the surface until the hive reaches its currently defined worker saturation.

A beekeeper is no longer a required component for the system to function. If workers can get onto CPUs, the hive will run. The beekeeper is now mainly a user display program showing the status of the hive. There is no longer any central process doing work, and one hive can potentially scale beyond 10e9 jobs in graphs of 10e6 analysis nodes and 10e6 running workers.
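A minimal Perl sketch of that worker-assignment idea, assuming hypothetical Queen-style method names (fetch_synced_stats_needing_workers, fetch_unsynced_stats_by_estimated_need and synchronize_analysis_stats are illustrative, not the actual 2004 API):

    # Try analyses "above the surface" first: their stats are already
    # fresh, so assigning a worker from them costs nothing extra.
    sub choose_analysis_for_new_worker {
        my ($self) = @_;

        foreach my $stats (@{ $self->fetch_synced_stats_needing_workers }) {
            return $stats->analysis if $stats->num_required_workers > 0;
        }

        # Otherwise dip "below the surface": visit unsynced analyses in
        # order of estimated need and sync them one at a time until one
        # turns out to need a worker.
        foreach my $stats (@{ $self->fetch_unsynced_stats_by_estimated_need }) {
            $self->synchronize_analysis_stats($stats);  # per-analysis incremental sync
            return $stats->analysis if $stats->num_required_workers > 0;
        }
        return undef;  # no work left anywhere in the hive
    }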
- Nov 19, 2004
Jessica Severin authored
Analyses that have gone the most time since their last update are at the top of the returned list.
Jessica Severin authored
When jobs are inserted into the analysis_job table, the analysis_stats table for the given analysis is updated by incrementing total_job_count and unclaimed_job_count and setting the status to 'LOADING'. If the analysis is 'BLOCKED', this incremental update does not happen; when an analysis_stats entry is 'BLOCKED' and then unblocked, a resync is automatically triggered, so this partial progress update is not needed.
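A sketch of that incremental update, using the column names given above (the connection details and exact SQL are illustrative, not the actual AnalysisJobAdaptor code):

    use strict;
    use warnings;
    use DBI;

    # Hypothetical connection; real code goes through the hive's adaptors.
    my $dbh = DBI->connect('dbi:mysql:ensembl_hive', 'user', 'pass',
                           { RaiseError => 1 });

    # Bump the counters for one newly stored job and flag the analysis
    # as 'LOADING'; BLOCKED analyses are skipped and resynced on unblock.
    sub increment_stats_on_job_insert {
        my ($dbh, $analysis_id) = @_;
        $dbh->do(q{
            UPDATE analysis_stats
            SET    total_job_count     = total_job_count + 1,
                   unclaimed_job_count = unclaimed_job_count + 1,
                   status              = 'LOADING'
            WHERE  analysis_id = ?
              AND  status != 'BLOCKED'
        }, undef, $analysis_id);
    }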
Jessica Severin authored
Won't sync unless some jobs have been loaded.
Jessica Severin authored
also changed default to 'LOADING' so that it can trigger a sync
- Nov 18, 2004
Jessica Severin authored
Jessica Severin authored
- Nov 17, 2004
Jessica Severin authored
than one status
Jessica Severin authored
AnalysisStats object now holds the new status.
Jessica Severin authored
- Nov 16, 2004
Jessica Severin authored
stored a truncated batch_size (when job_limit < batch_size). Fixed with a 'smart' worker->batch_size method which returns the lesser of analysis->stats->batch_size and worker->job_limit.
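A sketch of such an accessor, assuming the Worker holds its analysis' stats and an optional job_limit (illustrative, not the exact 2004 method):

    # Return the effective batch size: the analysis' configured
    # batch_size, capped by this worker's job_limit when one is set
    # and is smaller.
    sub batch_size {
        my ($self) = @_;
        my $batch_size = $self->analysis->stats->batch_size;
        my $job_limit  = $self->job_limit;
        if (defined $job_limit and $job_limit < $batch_size) {
            $batch_size = $job_limit;
        }
        return $batch_size;
    }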
Jessica Severin authored
- Nov 10, 2004
Jessica Severin authored
- Nov 09, 2004
Jessica Severin authored
failed job to the failed worker.
Jessica Severin authored
added 'SYNCHING' and 'LOADING' to analysis_stats.status
Jessica Severin authored
Jessica Severin authored
added disconnect_if_idle before the sleep
Jessica Severin authored
Jessica Severin authored
The synchronization of the analysis_stats summary statistics was done by the beekeeper at the top of its loop. For graphs with 40,000+ analyses this centralized syncing became a bottleneck. The new system allows the Queen attached to each worker process to synchronize its analysis. Syncing happens when a worker 'checks in' and when it dies; the sync on 'check in' only updates if the stats are more than 60 seconds out of date, to prevent over-syncing. The beekeeper still needs to do whole-system syncs when a subsection has finished and the next section needs to be 'unblocked'. For homology this will happen twice in a 16-hour run.
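A sketch of the check-in throttle described above, assuming the stats object exposes a seconds_since_last_update accessor (the method names are illustrative):

    # Only resync this worker's analysis if its stats have gone stale;
    # stats updated within the last 60 seconds are left alone.
    sub sync_on_check_in {
        my ($self, $stats) = @_;
        return if $stats->seconds_since_last_update < 60;
        $self->synchronize_analysis_stats($stats);
    }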
Jessica Severin authored
to turn this on like this anymore
Jessica Severin authored
which takes a hash ref as a parameter and returns a string which can be eval'ed back into the hash: $hash_ref = eval(encode_hash($hash_ref));
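A sketch of one way such an encoder could work (this naive version assumes a flat hash of simple scalar values and does not escape quotes; the real implementation may differ):

    use strict;
    use warnings;

    # Serialize a flat hash ref into a Perl-parsable string.
    sub encode_hash {
        my ($hash_ref) = @_;
        my @pairs;
        foreach my $key (sort keys %$hash_ref) {
            my $value = $hash_ref->{$key};
            push @pairs, defined($value) ? "'$key'=>'$value'" : "'$key'=>undef";
        }
        return '{' . join(',', @pairs) . '}';
    }

    # Round trip, as in the commit message:
    my $hash_ref = eval( encode_hash({ adid => 365902 }) );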
- Nov 05, 2004
Jessica Severin authored
allows simple $stats->update_status('DONE');
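A sketch of what such a convenience method might look like (the accessor and adaptor call are illustrative, not the actual AnalysisStats code):

    # Set the new status on the in-memory object and push it straight
    # to the analysis_stats table, so callers need only one line.
    sub update_status {
        my ($self, $status) = @_;
        $self->status($status);
        $self->adaptor->update_status($self) if $self->adaptor;
    }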
- Nov 04, 2004
Jessica Severin authored
- Oct 27, 2004
Jessica Severin authored
so that branch_code is set explicitly rather than relying on the return value of the write_output method. Switched Worker.pm to use this value.
- Oct 20, 2004
Jessica Severin authored
changed to varchar(255) (but dropped joining to the analysis_data table). If modules need more than 255 characters of input_id, they can pass the analysis_data_id via the varchar(255); example: {adid=>365902}
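A sketch of how a module could dereference such an input_id, assuming the analysis_data table stores the full text in a data column (the helper name and parsing are illustrative):

    use strict;
    use warnings;
    use DBI;

    # Illustrative connection; real code goes through the hive's adaptors.
    my $dbh = DBI->connect('dbi:mysql:ensembl_hive', 'user', 'pass',
                           { RaiseError => 1 });

    # If input_id is an {adid=>NNN} reference, fetch the full input from
    # analysis_data; otherwise the varchar(255) already holds the input.
    sub resolve_input_id {
        my ($dbh, $input_id) = @_;
        if ($input_id =~ /^\{\s*adid\s*=>\s*(\d+)\s*\}$/) {
            my ($data) = $dbh->selectrow_array(
                'SELECT data FROM analysis_data WHERE analysis_data_id = ?',
                undef, $1);
            return $data;
        }
        return $input_id;
    }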
Jessica Severin authored
Jessica Severin authored
workers can change batch_size as they run.
Jessica Severin authored
- Oct 19, 2004
Jessica Severin authored
1) input_id is the command; 2) input_id is formatted like '{did=>123}', where did is shorthand for analysis_data_id and the real command is stored in the analysis_data table.
Jessica Severin authored
All STDOUT and STDERR from the command are automatically captured and redirected to files (locations stored in the analysis_job_file table). Very simple idea, but might prove useful.
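A sketch of that capture in plain Perl (the paths and command are illustrative; the real code would also record the locations in analysis_job_file):

    use strict;
    use warnings;

    my $job_id = 123;                        # illustrative job id
    my $cmd    = 'echo hello';               # the job's command line
    my $stdout = "/tmp/job_${job_id}.out";   # locations like these would
    my $stderr = "/tmp/job_${job_id}.err";   # be stored in analysis_job_file

    # Redirect both streams to files so nothing the command prints is lost.
    system("$cmd > $stdout 2> $stderr") == 0
        or warn "command exited with status $?";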
- Oct 18, 2004
Jessica Severin authored
all parts of the Hive system. Allows one to have a single use/include: use Bio::EnsEMBL::Hive;
- Oct 15, 2004
Jessica Severin authored
Use DataflowRule and DataflowRuleAdaptor instead.
- Oct 12, 2004
Jessica Severin authored
on all analyses, not just the ones with entries in the analysis_job table. New logic is also faster.
- Oct 08, 2004
Jessica Severin authored
logic as the -loop option, but returns right away). Also modified check_for_dead to take into account jobs with 'EXIT' status.
- Oct 06, 2004
Jessica Severin authored
Jessica Severin authored
table. Doing a join on analysis_job.input_analysis_data_id=analysis_data.analysis_data_id gives the same performance as having analysis_job.input_id in the table, rather than doing a second query.
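A sketch of that join via DBI (the connection details and the analysis_data.data column name are assumptions):

    use strict;
    use warnings;
    use DBI;

    # Hypothetical connection; real code goes through the hive's adaptors.
    my $dbh = DBI->connect('dbi:mysql:ensembl_hive', 'user', 'pass',
                           { RaiseError => 1 });

    # Fetch jobs together with their (moved) input_id in one query,
    # instead of a second per-job lookup against analysis_data.
    my $sth = $dbh->prepare(q{
        SELECT aj.analysis_job_id, ad.data AS input_id
        FROM   analysis_job aj
        JOIN   analysis_data ad
          ON   aj.input_analysis_data_id = ad.analysis_data_id
        WHERE  aj.analysis_id = ?
    });
    $sth->execute(1);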
- Oct 05, 2004
Jessica Severin authored