This project is mirrored from https://github.com/Ensembl/ensembl-hive.git.

Oct 20, 2004

Jessica Severin authored
changed to varchar(255) (but dropped joining to the analysis_data table). If modules need more than 255 characters of input_id, they can pass the analysis_data_id via the varchar(255), e.g. {adid=>365902}.
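As an illustration of that convention: a minimal sketch, assuming a MySQL hive database with the analysis_data table and a plain DBI handle; the helper name store_long_input is hypothetical, not the module's actual code.

```perl
use strict;
use warnings;
use DBI;

# Connection details are placeholders.
my $dbh = DBI->connect('DBI:mysql:database=hive;host=localhost',
                       'user', 'pass', { RaiseError => 1 });

# Hypothetical helper: if an input_id would not fit in varchar(255),
# park it in analysis_data and hand back a short '{adid=>N}' reference.
sub store_long_input {
    my ($dbh, $input_id) = @_;
    return $input_id if length($input_id) <= 255;

    $dbh->do('INSERT INTO analysis_data (data) VALUES (?)', undef, $input_id);
    my $adid = $dbh->last_insert_id(undef, undef, 'analysis_data', undef);
    return "{adid=>$adid}";    # e.g. {adid=>365902}
}
```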

Jessica Severin authored

Jessica Severin authored
workers can change batch_size as they run.

Jessica Severin authored

Oct 19, 2004

Jessica Severin authored
1) input_id is the command; 2) input_id is formatted like '{did=>123}', where did is shorthand for analysis_data_id and the real command is stored in the analysis_data table.
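The reading side of that indirection might look like the sketch below (same assumed $dbh as above; the regex and helper are illustrative, not the module's actual code).

```perl
# Resolve an input_id of the form '{did=>123}' back into the real
# command stored in analysis_data; anything else is the command itself.
sub resolve_command {
    my ($dbh, $input_id) = @_;
    if ($input_id =~ /^\{did=>(\d+)\}$/) {
        my ($cmd) = $dbh->selectrow_array(
            'SELECT data FROM analysis_data WHERE analysis_data_id = ?',
            undef, $1);
        return $cmd;
    }
    return $input_id;
}
```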

Jessica Severin authored
All STDOUT and STDERR from the command are automatically captured and redirected to files (locations stored in the analysis_job_file table). Very simple idea, but might prove useful.
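The capture idea can be sketched in plain Perl as below; the file names are assumptions, and the real Worker additionally records the locations in analysis_job_file.

```perl
# Duplicate the current handles, point STDOUT/STDERR at per-job files,
# run the command, then restore the originals.
my ($job_id, $cmd) = (123, 'echo hello');

open my $saved_out, '>&', \*STDOUT or die "can't dup STDOUT: $!";
open my $saved_err, '>&', \*STDERR or die "can't dup STDERR: $!";
open STDOUT, '>', "job_$job_id.out" or die "can't redirect STDOUT: $!";
open STDERR, '>', "job_$job_id.err" or die "can't redirect STDERR: $!";

system($cmd);    # inherits the redirected handles

open STDOUT, '>&', $saved_out or die "can't restore STDOUT: $!";
open STDERR, '>&', $saved_err or die "can't restore STDERR: $!";
```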

Oct 18, 2004

Jessica Severin authored
all parts of the Hive system. Allows one to have a single use/include: use Bio::EnsEMBL::Hive;

Oct 15, 2004

Jessica Severin authored
Use DataflowRule and DataflowRuleAdaptor instead.

Oct 12, 2004

Jessica Severin authored
on all analyses, not just the ones with entries in the analysis_job table. The new logic is also faster.

Oct 08, 2004

Jessica Severin authored
logic as the -loop option, but returns right away). Also modified check_for_dead to take into account jobs with 'EXIT' status.

Oct 06, 2004

Jessica Severin authored

Jessica Severin authored
table. Doing a join on analysis_job.input_analysis_data_id = analysis_data.analysis_data_id gives the same performance as having analysis_job.input_id in the table, rather than a second query.
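The join in question, sketched with DBI (same assumed $dbh as above; table and column names are taken from the message, the surrounding code is illustrative).

```perl
# One query returns the job together with its externally stored input,
# instead of fetching the job and then issuing a second lookup.
my $sql = q{
    SELECT j.analysis_job_id, d.data
    FROM   analysis_job  j
    JOIN   analysis_data d ON j.input_analysis_data_id = d.analysis_data_id
    WHERE  j.analysis_job_id = ?
};
my ($job_id, $input_id) = $dbh->selectrow_array($sql, undef, 42);
```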

Oct 05, 2004

Jessica Severin authored

Jessica Severin authored
Removed the select before store (made a new method store_if_needed if that functionality is required by users) and added an option in AnalysisJobAdaptor::CreateNewJob to pass input_analysis_data_id, so if it is already known, CreateNewJob will be as fast as before. Plus there are no limits on the size of the input_id string.
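A hedged sketch of the fast path described here; CreateNewJob and the module name follow the message, but the exact spelling of the named parameters, including -input_analysis_data_id, is an assumption.

```perl
use Bio::EnsEMBL::Hive::DBSQL::AnalysisJobAdaptor;

# $analysis is assumed to be an already-fetched analysis object.
# Passing the known analysis_data_id lets CreateNewJob skip the
# select-before-store round trip.
my $job_id = Bio::EnsEMBL::Hive::DBSQL::AnalysisJobAdaptor->CreateNewJob(
    -input_id               => '{adid=>365902}',
    -analysis               => $analysis,
    -input_analysis_data_id => 365902,
);
```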

Oct 04, 2004

Jessica Severin authored
if one isn't in the database

Sep 30, 2004

Jessica Severin authored
input_analysis_data_id int(10), which joins to the analysis_data table. Added output_analysis_data_id int(10) for storing output_id. The external analysis_data.data is LONGTEXT, which will allow much longer parameter sets to be passed around than was previously possible. AnalysisData will also allow processes to manually store 'other' data and pass it around via ID reference now.

Jessica Severin authored

Sep 27, 2004

Jessica Severin authored
isn't a corresponding Analysis entry in the analysis table

Jessica Severin authored

Jessica Severin authored

Sep 23, 2004

Jessica Severin authored

Sep 22, 2004

Jessica Severin authored

Jessica Severin authored
Essentially a mini filesystem, so that data that would normally be stored in NFS files and referenced via a path can now be stored in the database and referenced via a dbID. Data is a LONGTEXT. Can be used to store configuration data, parameter strings, BLOSUM matrix data, uuencoded binary data, ...
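Usage then looks like moving a file into the database; a sketch with the same assumed $dbh (the file name and SQL are illustrative, not the adaptor's actual API).

```perl
# Slurp a matrix file that would otherwise live on NFS and park it in
# analysis_data; workers can then dereference the dbID instead of a path.
open my $fh, '<', 'BLOSUM62.txt' or die "can't read matrix: $!";
my $matrix = do { local $/; <$fh> };
close $fh;

$dbh->do('INSERT INTO analysis_data (data) VALUES (?)', undef, $matrix);
my $dbID = $dbh->last_insert_id(undef, undef, 'analysis_data', undef);
```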

Aug 31, 2004

Ian Longden authored

Aug 27, 2004

Ian Longden authored
Added get_available_adaptors to get the pairs of name/adaptor modules. This should make it easier to read, etc.
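This follows the usual Ensembl DBAdaptor pattern of mapping short names to the modules that implement them; a sketch of its shape (the specific pairs listed are assumptions).

```perl
# The DBAdaptor can instantiate adaptors generically from this map,
# e.g. a get_AnalysisJobAdaptor call looks up the 'AnalysisJob' entry.
sub get_available_adaptors {
    return {
        'AnalysisJob'  => 'Bio::EnsEMBL::Hive::DBSQL::AnalysisJobAdaptor',
        'AnalysisData' => 'Bio::EnsEMBL::Hive::DBSQL::AnalysisDataAdaptor',
        'Queen'        => 'Bio::EnsEMBL::Hive::Queen',
    };
}
```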

Aug 14, 2004

Jessica Severin authored

Aug 11, 2004

Jessica Severin authored

Jessica Severin authored
an analysis if one of its conditions is not fulfilled. Needed for the case when the system is done and new data is flowed through the system (progressive runs).

Jessica Severin authored

Aug 10, 2004

Jessica Severin authored
each cycle of the loop. This is to even out the start/stop waves, to make it easier for others to get jobs started on LSF, and to reduce the startup MySQL load that can happen when 700 workers all birth at once. Defaults to 50 (every 5 minutes), but can be altered with the -wlimit option.
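The throttling loop reduces to something like the sketch below; get_num_needed_workers and submit_workers are hypothetical stand-ins for the actual Queen and LSF calls.

```perl
use strict;
use warnings;

sub get_num_needed_workers { return 700 }    # hypothetical Queen query
sub submit_workers { print "submitting $_[0] workers\n" }    # hypothetical bsub wrapper

my $wlimit = 50;    # per the message: default cap, overridable with -wlimit
while (1) {
    my $needed = get_num_needed_workers();
    submit_workers($needed > $wlimit ? $wlimit : $needed);
    sleep 5 * 60;    # one cycle every 5 minutes
}
```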

Aug 09, 2004

Jessica Severin authored
worker request by the number of jobs PENDing, to prevent excessive queuing of workers.

Aug 07, 2004

Jessica Severin authored
If not a clean exit, it will record it as a FATALITY and reset its jobs right away.

Aug 06, 2004

Jessica Severin authored
should also specify logic_name or analysis_id so that the input_id is run on the correct analysis. Doesn't insert the job into the database. Designed for testing RunnableDBs, but may prove useful in other contexts.

Jessica Severin authored
added methods reset_job and global_cleanup to Bio::EnsEMBL::Pipeline::RunnableDB via category extension. The Worker calls global_cleanup on its runnableDB after all jobs are done.

Jessica Severin authored

Aug 04, 2004

Jessica Severin authored

Aug 03, 2004

Jessica Severin authored
turn disconnect ON when there will be lots of them and they have moments when there will be little DB activity. The new disconnect system disconnects so much that it's slower than before, so it must be used sparingly.

Jessica Severin authored
output jobs) is that it needs a fast database, so don't disconnect_when_inactive.
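The flag being toggled is the standard Ensembl DBConnection switch; a minimal sketch assuming a connected DBAdaptor in $dba whose dbc method returns the connection.

```perl
# Many mostly-idle workers: allow dropping the connection between uses,
# at the cost of reconnect overhead.
$dba->dbc->disconnect_when_inactive(1);

# A chatty process (e.g. one creating many output jobs) needs the
# connection held open, so leave the flag off.
$dba->dbc->disconnect_when_inactive(0);
```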

Jessica Severin authored

Jessica Severin authored
created new() methods where needed; replaced throw and rearrange as needed.