- 04 Mar, 2005 6 commits
-
-
Jessica Severin authored
-
Jessica Severin authored
added columns hive_id and retry. Allows the user to join to failed workers in the hive table, and to see which retry level the job was at when the STDOUT/STDERR files were generated. The columns are set at the beginning of the job run, and the entries for 'empty' files are deleted at job end.
-
Jessica Severin authored
in a more filesystem-friendly manner (creates a 256-layer hash which distributes the directories evenly and reduces concurrent directory modification). Also reordered how the job output files are saved (done at the beginning, right after redirection starts, and at the end, right before it's closed).
-
Jessica Severin authored
any problems related to setting undef or '0' values.
-
Jessica Severin authored
made the worse (Tim Cutts). This will do until we figure this out.... I like the '>/dev/null + rerun failed jobs manually with debug' option personally :)
-
Jessica Severin authored
-
- 03 Mar, 2005 8 commits
-
-
Jessica Severin authored
each digit becomes a directory, with a final directory created from the full hive_id:
hive_id=1234 => <base_dir>/1/2/3/4/hive_id_1234/
hive_id=12 => <base_dir>/1/2/hive_id_12/
this should distribute the output directories
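A minimal sketch of the per-digit scheme described above, assuming a helper of this shape (the function name and base-directory handling are illustrative, not the actual Worker code):

```perl
use strict;
use warnings;
use File::Path qw(mkpath);
use File::Spec;

# hive_id=1234 => <base_dir>/1/2/3/4/hive_id_1234/
# Each digit of the hive_id becomes one directory level, with a final
# hive_id_<id> directory holding that worker's STDOUT/STDERR files.
sub worker_output_dir {
    my ($base_dir, $hive_id) = @_;
    my @digits = split //, $hive_id;
    my $dir = File::Spec->catdir($base_dir, @digits, "hive_id_$hive_id");
    mkpath($dir) unless -d $dir;
    return $dir;
}

# e.g. worker_output_dir('/hive_output', 12) => '/hive_output/1/2/hive_id_12'
```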
-
Jessica Severin authored
is calculated. If batch_size > 0, use batch_size; otherwise use the avg_msec_per_job equation.
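A rough sketch of that two-branch choice; the target-time constant and the function itself are assumptions made for illustration, since the commit does not spell out the actual equation:

```perl
use POSIX qw(ceil);

# Hypothetical target: make each claimed batch last roughly two minutes.
my $TARGET_BATCH_MSEC = 2 * 60 * 1000;

sub effective_batch_size {
    my ($batch_size, $avg_msec_per_job) = @_;
    return $batch_size if $batch_size > 0;   # explicit analysis setting wins
    return 1 unless $avg_msec_per_job;       # no timing data collected yet
    return ceil($TARGET_BATCH_MSEC / $avg_msec_per_job);
}
```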
-
Jessica Severin authored
to RunnableDB to allow full benefit of dataflow graph capabilities.
- Removed from Extension.pm the branch_code, analysis_job_id, and reset_job extensions to RunnableDB (no longer trying to shoe-horn hive 'extra' functions into them).
- Bio::EnsEMBL::Hive::Process mirrors some of the RunnableDB interface (new, analysis, fetch_input, run, write_output) but uses a new job interface (input_job, dataflow_output_id) instead of input_id. It provides a convenience method $self->input_id which redirects to $self->input_job->input_id to simplify porting.
- Changed Worker to only use the hive 'extended' functions if the processing module isa(Bio::EnsEMBL::Hive::Process). This still allows all RunnableDB modules to be used (or any object which implements a minimal 'RunnableDB interface': new, input_id, db, fetch_input, run, write_output).
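A minimal sketch of a module written against the new Process interface described above (the package name, parameter handling, and dataflow branch number are hypothetical):

```perl
package My::ExampleProcess;    # hypothetical module name

use strict;
use warnings;
use base ('Bio::EnsEMBL::Hive::Process');

sub fetch_input {
    my $self = shift;
    # input_id is the convenience method that redirects to
    # $self->input_job->input_id, which eases porting from RunnableDB.
    $self->{'my_input'} = $self->input_id;
    return 1;
}

sub run {
    my $self = shift;
    # ... do the real work here ...
    return 1;
}

sub write_output {
    my $self = shift;
    # Flow a new job into the next analysis (branch code 1 assumed).
    $self->dataflow_output_id($self->{'my_input'}, 1);
    return 1;
}

1;
```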
-
Jessica Severin authored
reordered where the blocking checks are done (added, deleted, moved).
-
Jessica Severin authored
-
Jessica Severin authored
-
Jessica Severin authored
needed workers after this worker is done. Useful in debugging one's dataflow and blocking_ctrl graphs by running one worker at a time (like stepping in a debugger)
-
Jessica Severin authored
-
- 02 Mar, 2005 3 commits
-
-
Jessica Severin authored
a job that has been flowed into an analysis/process
-
Jessica Severin authored
-
Jessica Severin authored
-
- 23 Feb, 2005 8 commits
-
-
Jessica Severin authored
-
Jessica Severin authored
added option -no_pend which ignores the pending_count when figuring out how many workers to submit. Removed some superfluous calls to Queen::get_num_running_workers.
-
Jessica Severin authored
-
Jessica Severin authored
when debugging an analysis which fails and would increment the retry_count.
-
Jessica Severin authored
-
Jessica Severin authored
-
Jessica Severin authored
-
Jessica Severin authored
to be promoted to 'DONE'
-
- 22 Feb, 2005 1 commit
-
-
Jessica Severin authored
-
- 21 Feb, 2005 1 commit
-
-
Jessica Severin authored
needed to better manage the hive system's load on the database housing all the hive-related tables (in case the database is overloaded by multiple users).
- Added the analysis_stats.sync_lock column (and correspondingly in the Object and Adaptor).
- Added the Queen::safe_synchronize_AnalysisStats method, which wraps the synchronize_AnalysisStats method and does various checks and locks to ensure that only one worker is trying to do a 'synchronize' on a given analysis at any given moment.
- Cleaned up the API between Queen and Worker so that the worker only talks directly to the Queen, rather than getting the underlying database adaptor.
- Added the analysis_job columns runtime_msec and query_count to provide more data on how the jobs hammer a database (queries/sec).
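A sketch of the kind of check-and-lock a safe_synchronize wrapper can do with a sync_lock column; only the analysis_stats.sync_lock column comes from the commit above, the SQL and surrounding logic are illustrative assumptions:

```perl
use DBI;

# Atomically try to claim the per-analysis lock: only the worker whose
# UPDATE actually flips sync_lock from 0 to 1 goes on to synchronize.
sub try_sync_lock {
    my ($dbh, $analysis_id) = @_;
    my $rows = $dbh->do(
        'UPDATE analysis_stats SET sync_lock = 1 ' .
        'WHERE analysis_id = ? AND sync_lock = 0',
        undef, $analysis_id);
    return ($rows == 1);    # DBI returns '0E0' (true!) when nothing changed
}

# After synchronizing, release the lock again:
#   $dbh->do('UPDATE analysis_stats SET sync_lock = 0 WHERE analysis_id = ?',
#            undef, $analysis_id);
```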
-
- 17 Feb, 2005 2 commits
-
-
Jessica Severin authored
called when worker dies to replace itself in the needed_workers count, since it's decremented when it's born, and it's counted as living (and subtracted) as long as it's running. This guarantees that another worker will quickly be created after this one dies (and not need to wait for a synch to happen).
-
Jessica Severin authored
-
- 16 Feb, 2005 8 commits
-
-
Jessica Severin authored
is when there are lots of workers 'WORKING', so as to avoid them falling over each other. The 'WORKING' state only exists in the middle of a large run. When the last worker dies, the state is 'ALL_CLAIMED', so the sync on death will happen properly. As the last pile of workers die they will all do a synch, but that's OK since the system needs to be properly synched when the last one dies, as there won't be anybody left to do it. Also added a 10-minute check for the case where the state is already 'SYNCHING', to deal with a worker dying in the middle of 'SYNCHING'.
-
Jessica Severin authored
-
Jessica Severin authored
-
Jessica Severin authored
so as to reduce the synchronization frequency.
-
Jessica Severin authored
-
Jessica Severin authored
call lower down isn't needed. Also needed to move the printing of the analysis_stats up higher to better display with the new printing order. Now -loop -analysis_stats looks right.
-
Jessica Severin authored
added a check/set of the status to 'SYNCHING' right before the synch procedure, so as to prevent multiple workers from trying to synch at the same time.
-
Jessica Severin authored
-
- 14 Feb, 2005 1 commit
-
-
Jessica Severin authored
will return 0E0 if 'zero rows are inserted', which perl interprets as true, so I need to check for it explicitly. Also, the store method now returns 1 on 'new insert' and '0' on 'already stored'.
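The 0E0 behaviour is a general DBI property; a small illustration of the check, assuming an INSERT IGNORE-style statement and a hypothetical table (not the adaptor's actual code):

```perl
# execute() returns the string '0E0' when zero rows were affected:
# numerically it is 0, but as a boolean it is true, so a plain
# "if ($rows)" test would mistake an ignored INSERT for a new one.
sub store_row {
    my ($dbh, $id, $value) = @_;
    my $sth = $dbh->prepare(
        'INSERT IGNORE INTO my_table (my_id, my_value) VALUES (?, ?)');
    my $rows = $sth->execute($id, $value);
    return ($rows == 0) ? 0    # already stored, nothing inserted
                        : 1;   # new insert
}
```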
-
- 10 Feb, 2005 2 commits
-
-
Jessica Severin authored
complete an analysis. If no job has been run yet (0 msec), it will assume 1 job per worker up to the hive_capacity (maximum parallelization). Also changed worker->process_id to be the pid of the process, not the ppid.
-
Jessica Severin authored
if it runs properly, the job looks like a normally claimed/fetched/run job
-