This project is mirrored from https://:*****@github.com/Ensembl/ensembl-hive.git.
- 03 Mar, 2005 3 commits
-
-
Jessica Severin authored
is calculated. If batch_size>0 use batch_size, else use avg_msec_per_job equation.
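A minimal sketch of the batch-size rule described above, assuming an explicit batch_size always wins and the fallback divides a target batch wall time by the observed average job time. The 5-minute target and the 100 msec floor are illustrative assumptions, not values taken from the commit.

```perl
use POSIX qw(ceil);

# Sketch only: choose a batch size from an explicit setting or from avg job runtime.
sub choose_batch_size {
    my ($batch_size, $avg_msec_per_job) = @_;

    return $batch_size if $batch_size and $batch_size > 0;   # explicit value wins

    my $target_batch_msec = 5 * 60 * 1000;            # assumed target wall time per batch
    $avg_msec_per_job = 100 unless $avg_msec_per_job; # assumed floor when no jobs have run yet

    return ceil($target_batch_msec / $avg_msec_per_job);
}
```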
-
Jessica Severin authored
reordered where the blocking checks are done (added, deleted, moved).
-
Jessica Severin authored
-
- 02 Mar, 2005 1 commit
-
-
Jessica Severin authored
a job that has been flowed into an analysis/process
-
- 23 Feb, 2005 3 commits
-
-
Jessica Severin authored
-
Jessica Severin authored
-
Jessica Severin authored
-
- 21 Feb, 2005 1 commit
-
-
Jessica Severin authored
needed to better manage the hive system's load on the database housing all the hive-related tables (in case the database is overloaded by multiple users). Added the analysis_stats.sync_lock column (and correspondingly in the Object and Adaptor). Added the Queen::safe_synchronize_AnalysisStats method, which wraps the synchronize_AnalysisStats method and performs various checks and locks to ensure that only one worker is trying to do a 'synchronize' on a given analysis at any given moment. Cleaned up the API between Queen and Worker so that the worker only talks directly to the Queen, rather than getting the underlying database adaptor. Added analysis_job columns runtime_msec and query_count to provide more data on how the jobs hammer a database (queries/sec).
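A hedged sketch of the locking idea behind Queen::safe_synchronize_AnalysisStats: claim the per-analysis sync_lock with a single conditional UPDATE so that only one worker runs the expensive synchronize at a time. The table and column names follow the commit message; the SQL and the dbc accessor are illustrative, not the historical code.

```perl
# Sketch: acquire sync_lock atomically, then synchronize; bail out if another worker holds it.
sub safe_synchronize_AnalysisStats {
    my ($self, $stats) = @_;

    my $sql  = "UPDATE analysis_stats SET sync_lock=1 ".
               "WHERE sync_lock=0 AND analysis_id=" . $stats->analysis_id;
    my $rows = $self->dbc->do($sql);           # 1 affected row means we got the lock
    return $stats unless $rows and $rows == 1;

    $self->synchronize_AnalysisStats($stats);  # assumed to clear sync_lock when done
    return $stats;
}
```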
-
- 17 Feb, 2005 1 commit
-
-
Jessica Severin authored
called when a worker dies to replace itself in the needed_workers count, since the count is decremented when the worker is born and the worker is counted as living (and subtracted) as long as it is running. This guarantees that another worker will quickly be created after this one dies, without needing to wait for a synch to happen.
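An illustrative version of that accounting (the accessor names are assumptions, not the historical Queen API): on death, the worker puts one back into the required-workers count so a replacement can be scheduled before the next full synch.

```perl
# Sketch: re-increment the needed-workers count when a worker dies.
sub register_worker_death {
    my ($self, $worker) = @_;

    my $stats = $worker->analysis->stats;                          # illustrative accessor chain
    $stats->num_required_workers($stats->num_required_workers + 1);
    $stats->update();                                              # persist the adjusted count
}
```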
-
- 16 Feb, 2005 4 commits
-
-
Jessica Severin authored
is when there are lots of workers 'WORKING', so as to avoid them falling over each other. The 'WORKING' state only exists in the middle of a large run. When the last worker dies the state is 'ALL_CLAIMED', so the synch on death will happen properly. As the last pile of workers die they will all do a synch, but that's OK since the system needs to be properly synched when the last one dies, as there won't be anybody left to do it. Also added a 10-minute check on the 'SYNCHING' state to deal with the case where a worker dies in the middle of 'SYNCHING'.
-
Jessica Severin authored
-
Jessica Severin authored
so as to reduce the synchronization frequency.
-
Jessica Severin authored
added a check/set of the status to 'SYNCHING' right before the synch procedure, so as to prevent multiple workers from trying to synch the same analysis at the same time.
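A sketch of that check-and-set, including the 10-minute stale-lock escape from the commit above. The seconds_since_last_update accessor is assumed for illustration; this is the shape of the guard, not the original routine.

```perl
# Sketch: only one worker should enter the synch; a stale 'SYNCHING' (>10 min) is overridden.
sub maybe_synchronize {
    my ($self, $stats) = @_;

    if ($stats->status eq 'SYNCHING') {
        return if $stats->seconds_since_last_update < 10 * 60;   # someone else is (recently) on it
    }
    $stats->status('SYNCHING');
    $stats->update();

    $self->synchronize_AnalysisStats($stats);
}
```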
-
- 10 Feb, 2005 1 commit
-
-
Jessica Severin authored
complete an analysis. If no job has been run yet (0 msec), it assumes 1 job per worker up to the hive_capacity (maximum parallelization). Also changed worker->process_id to be the pid of the process, not the ppid.
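A rough sketch of that worker estimate. The names mirror the commit message, but the calculation (worker lifespan divided by average job time, falling back to one job per worker, capped at hive_capacity) is a guess at the shape of the routine, not the original code.

```perl
use POSIX qw(ceil);

# Sketch: estimate how many workers are needed to finish the unclaimed jobs of an analysis.
sub estimate_needed_workers {
    my ($unclaimed_jobs, $avg_msec_per_job, $hive_capacity, $worker_lifespan_msec) = @_;

    my $workers;
    if ($avg_msec_per_job > 0) {
        my $jobs_per_worker = $worker_lifespan_msec / $avg_msec_per_job;
        $workers = ceil($unclaimed_jobs / $jobs_per_worker);
    } else {
        $workers = $unclaimed_jobs;    # no timing data yet: assume 1 job per worker
    }
    $workers = $hive_capacity if $hive_capacity > 0 and $workers > $hive_capacity;
    return $workers;
}
```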
-
- 11 Jan, 2005 1 commit
-
-
Jessica Severin authored
-
- 08 Jan, 2005 1 commit
-
-
Abel Ureta-Vidal authored
In synchronize_AnalysisStats method, added a POSIX::ceil when setting the num_required_workers for an AnalysisStats object
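The POSIX::ceil simply rounds the worker estimate up rather than down, e.g.:

```perl
use POSIX qw(ceil);

# 10 outstanding jobs at 4 jobs per worker need 3 workers, not 2.
my $num_required_workers = ceil(10 / 4);   # 2.5 -> 3
```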
-
- 14 Dec, 2004 1 commit
-
-
Jessica Severin authored
-
- 09 Dec, 2004 1 commit
-
-
Jessica Severin authored
-
- 25 Nov, 2004 2 commits
-
-
Jessica Severin authored
'total jobs' count so one can calculate a 'progress bar'
-
Jessica Severin authored
and print_running_worker_status.
-
- 24 Nov, 2004 1 commit
-
-
Jessica Severin authored
used by lsf_beekeeper to decide when it needs to do a hard resync.
-
- 20 Nov, 2004 1 commit
-
-
Jessica Severin authored
and distributed manner as it interacts with the workers over the course of its life. When a runWorker.pl script starts and asks a queen to create a worker, the queen has a list of known analyses which are 'above the surface', where full hive analysis has been done and the number of needed workers has been calculated.

Full synch requires joining data between the analysis, analysis_job, analysis_stats, and hive tables. When this reached 10e7 jobs, 10e4 analyses, and 10e3 workers, a full hard sync took minutes, and it was clear this bit of the system wasn't scaling and wasn't going to make it to the next order of magnitude. This occurred in the compara blastz pipeline between mouse and rat.

Now there are some analyses 'below the surface' that have partial synchronization. These analyses have been flagged as having 'x' new jobs (AnalysisJobAdaptor updating analysis_stats on job insert). If no analysis is found to assign to the newly created worker, the queen will dip below the surface and start checking the analyses with the highest probability of needing the most workers. This incremental sync is also done in Queen::get_num_needed_workers: when calculating ahead a total worker count, this routine will also dip below the surface until the hive reaches its currently defined worker saturation.

A beekeeper is no longer a required component for the system to function. If workers can get onto cpus, the hive will run. The beekeeper is now mainly a user display program showing the status of the hive. There is no longer any central process doing work, and one hive can potentially scale beyond 10e9 jobs in graphs of 10e6 analysis nodes and 10e6 running workers.
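A very condensed sketch of the "dip below the surface" idea: hand out a fully-synced analysis if one already needs workers, otherwise incrementally sync the unsynced analyses that look most promising, one at a time. All method names here are illustrative, not the historical Queen API.

```perl
# Sketch: pick an analysis for a newly created worker, syncing lazily when needed.
sub pick_analysis_for_new_worker {
    my ($self) = @_;

    # "above the surface": already synced and known to need workers
    foreach my $stats (@{ $self->fetch_synced_stats_needing_workers }) {
        return $stats->analysis if $stats->num_required_workers > 0;
    }

    # "below the surface": sync incrementally, most promising analyses first
    foreach my $stats (@{ $self->fetch_unsynced_stats_by_expected_need }) {
        $self->safe_synchronize_AnalysisStats($stats);
        return $stats->analysis if $stats->num_required_workers > 0;
    }
    return undef;   # nothing needs a worker right now
}
```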
-
- 09 Nov, 2004 2 commits
-
-
Jessica Severin authored
-
Jessica Severin authored
The synchronization of the analysis_stats summary statistics was done by the beekeeper at the top of its loop. For graphs with 40,000+ analyses this centralized syncing became a bottleneck. The new system allows the Queen attached to each worker process to synchronize its analysis. Syncing happens when a worker 'checks in' and when it dies. The sync on 'check in' only updates if the stats are more than 60 secs out of date, to prevent over-syncing. The beekeeper still needs to do whole-system syncs when a subsection has finished and the next section needs to be 'unblocked'. For homology this will happen 2 times in a 16 hour run.
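A sketch of the "only if stale" rule on check-in; the stats accessor and the seconds_since_last_update method are assumptions used for illustration.

```perl
# Sketch: on worker check-in, re-sync the analysis stats only if they are >60s out of date.
sub check_in {
    my ($self, $worker) = @_;

    my $stats = $worker->analysis->stats;            # illustrative accessor chain
    if ($stats->seconds_since_last_update > 60) {
        $self->synchronize_AnalysisStats($stats);
    }
}
```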
-
- 20 Oct, 2004 1 commit
-
-
Jessica Severin authored
workers can change batch_size as they run.
-
- 12 Oct, 2004 1 commit
-
-
Jessica Severin authored
on all analyses, not just the ones with entries in the analysis_job table. The new logic is also faster.
-
- 11 Aug, 2004 2 commits
-
-
Jessica Severin authored
-
Jessica Severin authored
an analysis if one of its conditions is not fulfilled. Needed for the case when the system is done and new data is flowed through the system (progressive runs).
-
- 06 Aug, 2004 1 commit
-
-
Jessica Severin authored
-
- 03 Aug, 2004 1 commit
-
-
Jessica Severin authored
created new() methods where needed, replaced throw, rearrange as needed
-
- 16 Jul, 2004 2 commits
-
-
Jessica Severin authored
-
Jessica Severin authored
on failure and retry_count>=5. Also changed the Queen analysis summary to classify an analysis as 'DONE' when all jobs are either DONE or FAILED, and hence allow the processing to proceed forward.
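An illustrative version of the two rules in this commit (the method and counter names are assumptions): give up on a job after five attempts, and treat an analysis as finished once every job is either DONE or FAILED so downstream analyses can be unblocked.

```perl
# Sketch: retry-limited failure handling, plus the "all DONE or FAILED" completion test.
sub job_failed {
    my ($job) = @_;
    $job->retry_count($job->retry_count + 1);
    $job->status($job->retry_count >= 5 ? 'FAILED' : 'READY');   # READY = back in the queue
}

sub analysis_is_finished {
    my ($stats) = @_;
    return ($stats->done_job_count + $stats->failed_job_count) >= $stats->total_job_count;
}
```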
-
- 15 Jul, 2004 1 commit
-
-
Jessica Severin authored
to prevent excessive workers from saturating the system.
-
- 14 Jul, 2004 1 commit
-
-
Jessica Severin authored
setting (e.g. lsf job_id and array_index).
-
- 13 Jul, 2004 3 commits
-
-
Jessica Severin authored
workers currently running
-
Jessica Severin authored
the analysis has effectively no system load and an unlimited number of workers of that analysis can run at any one time.
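A sketch of how such an "unlimited" setting could be interpreted when deciding whether an analysis is saturated. The -1 sentinel and the accessor names are assumptions for illustration (the commit title is truncated here).

```perl
# Sketch: a negative hive_capacity means the per-analysis worker limit is simply not applied.
sub analysis_is_saturated {
    my ($stats, $num_living_workers) = @_;

    return 0 if $stats->hive_capacity == -1;               # assumed "unlimited" sentinel
    return $num_living_workers >= $stats->hive_capacity;
}
```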
-
Jessica Severin authored
added get_num_needed_workers method which does a load analysis between the living workers and the workers needed to complete the available jobs. Returns a simple count which a beekeeper can use to allocate workers on computers. The workers are created without specific analyses but get assigned one by the Queen when they are created.
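A hedged sketch of the whole-hive count a beekeeper would use: for each analysis, the workers needed are those required minus those already alive, never exceeding the spare hive capacity. The accessors and adaptor calls are illustrative, not the original method.

```perl
# Sketch: total number of additional workers the beekeeper should submit.
sub get_num_needed_workers {
    my ($self) = @_;

    my $total = 0;
    foreach my $stats (@{ $self->fetch_all_AnalysisStats }) {
        my $living = $self->count_living_workers_for($stats->analysis_id);
        my $needed = $stats->num_required_workers - $living;
        if ($stats->hive_capacity > 0 and $needed > $stats->hive_capacity - $living) {
            $needed = $stats->hive_capacity - $living;     # respect per-analysis capacity
        }
        $total += $needed if $needed > 0;
    }
    return $total;   # a plain count; workers get an analysis assigned by the Queen at birth
}
```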
-
- 09 Jul, 2004 1 commit
-
-
Jessica Severin authored
Also added functionality so that runWorker can be run without specifying an analysis. The create_new_worker method will now query the AnalysisStats adaptor for a 'needed worker' analysis when the analysis_id is undef. This simplifies the API interface between the Queen and the beekeepers: now the beekeeper only needs to receive a count of workers. The workers can still be run with explicit analyses for testing, or for situations where one wants to manually control the processing. Now one can simply do bsub -JW[1-100] runWorker -url mysql://ensadmin:<pass>@ecs2:3361/compara_hive_jess_23 to create 100 workers which will become whatever analysis needs to be done.
-
- 17 Jun, 2004 1 commit
-
-
Jessica Severin authored
(was still using the original idea of setting batch_size as a constant in the RunnableDB class). Need to deprecate that idea.
-
- 16 Jun, 2004 1 commit
-
-
Jessica Severin authored
state of all jobs on the hive. Added get_analysis method to AnalysisStats, which loads the analysis for the analysis_id. Modified print_stats to produce a more descriptive table of stats.
-