This project is mirrored from https://github.com/Ensembl/ensembl-hive.git.
- Jan 13, 2005
Jessica Severin authored
Initially used to manually re-run a job with runWorker.pl -job_id
-
Jessica Severin authored
and Analysis::RunnableDB superclasses
-
Jessica Severin authored
-
- Jan 12, 2005
Jessica Severin authored
properly handle RunnableDBs that throw exceptions in the fetch_input stage.
-
- Jan 11, 2005
Jessica Severin authored
changed INSERT syntax to be more SQL compliant
-
Jessica Severin authored
changed INSERT syntax to be more SQL compliant
-
Jessica Severin authored
-
Jessica Severin authored
changed INSERT syntax to be more SQL compliant
-
Jessica Severin authored
-
Jessica Severin authored
-
- Jan 08, 2005
Abel Ureta-Vidal authored
Set OUTPUT_AUTOFLUSH=1 to get info immediately when in -loop mode. Also print out the time when the next loop will occur
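A minimal sketch of the change this commit describes: enabling autoflush so that status lines appear immediately while looping, and reporting when the next loop will occur. The loop interval and message wording are illustrative, not the actual beekeeper.pl code.

```perl
# Assumed illustration: $| is Perl's $OUTPUT_AUTOFLUSH variable;
# setting it to 1 flushes STDOUT after every print, so -loop mode
# output is visible immediately instead of buffered.
$| = 1;

my $loop_interval_minutes = 1;    # illustrative interval, not the real default
my $next_loop_time = time() + $loop_interval_minutes * 60;
print "next loop at ", scalar localtime($next_loop_time), "\n";
```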
-
Abel Ureta-Vidal authored
In synchronize_AnalysisStats method, added a POSIX::ceil when setting the num_required_workers for an AnalysisStats object
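A minimal sketch of the rounding fix described above: using POSIX::ceil so that a fractional worker count is rounded up and a remainder of jobs still gets a worker. The variable names and batch size are illustrative, not taken from the real synchronize_AnalysisStats code.

```perl
use POSIX qw(ceil);

my $unclaimed_job_count = 10;
my $batch_size          = 3;    # jobs each worker claims per batch (assumed)

# Without ceil, integer truncation of 10/3 would request only 3 workers,
# leaving one job's worth of work without a worker; ceil rounds up to 4.
my $num_required_workers = ceil($unclaimed_job_count / $batch_size);
```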
-
- Jan 07, 2005
Will Spooner authored
-
- Jan 06, 2005
Jessica Severin authored
-
- Dec 14, 2004
Jessica Severin authored
-
Jessica Severin authored
-
- Dec 13, 2004
Jessica Severin authored
used to always store a full URL
-
- Dec 10, 2004
Jessica Severin authored
-
- Dec 09, 2004
Jessica Severin authored
in AnalysisStatsAdaptor::fetch_by_analysis_id
-
Jessica Severin authored
modified Bio::EnsEMBL::Analysis::stats to not do any exception catching. If AnalysisStatsAdaptor->fetch_by_analysis_id fails, there is something very wrong and the exception should propagate out and cause the program to fail
-
Jessica Severin authored
-
- Nov 30, 2004
Jessica Severin authored
Bio::EnsEMBL::Analysis::RunnableDB via namespace extension syntax so that hive system can use analysis.modules that inherit from Bio::EnsEMBL::Analysis::RunnableDB
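A hypothetical illustration of the inheritance arrangement described above: an analysis module deriving from Bio::EnsEMBL::Analysis::RunnableDB so the hive system can drive it. A tiny stub stands in for the real base class here so the sketch is self-contained; the derived module name and method bodies are made up, not real pipeline code.

```perl
package Bio::EnsEMBL::Analysis::RunnableDB;    # stub, NOT the real module
sub new         { my $class = shift; return bless {}, $class; }
sub fetch_input { }
sub run         { }

package MyPipeline::ExampleAnalysis;           # hypothetical analysis module
our @ISA = ('Bio::EnsEMBL::Analysis::RunnableDB');
sub run { my $self = shift; $self->{ran} = 1; }    # override one stage

package main;
my $runnable = MyPipeline::ExampleAnalysis->new();  # inherited constructor
$runnable->fetch_input();                           # inherited stage
$runnable->run();                                   # overridden stage
```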
-
- Nov 25, 2004
Jessica Severin authored
and gives the user a good overview of where processing stands. Added -analysis_stats and -worker_stats, which give full statistics on all analyses and all running workers respectively.
-
Jessica Severin authored
'total jobs' count so one can calculate a 'progress bar'
-
Jessica Severin authored
and print_running_worker_status.
-
- Nov 24, 2004
Jessica Severin authored
-
Jessica Severin authored
-
Jessica Severin authored
used by lsf_beekeeper to decide when it needs to do a hard resync.
-
Jessica Severin authored
to run with, e.g. $worker->run($job); This job can be pulled from the database or created on the fly. This is to accommodate debug modes of runWorker.pl
-
- Nov 22, 2004
Jessica Severin authored
-
- Nov 20, 2004
Jessica Severin authored
-
Jessica Severin authored
no longer does Queen::synchronize_hive as part of the autonomous loop; the -sync option allows the user to manually trigger a hard sync. Also removed the default display of the full hive status and added the option -status, which will print this full status. Also removed adjusting the needed worker count for 'pending' workers: LSF will sometimes leave jobs in a pending state for no apparent reason (a newly bsub'ed job will run while an older pending job stays pending), and the current 'pending' count also didn't differentiate between lsf_beekeeper-submitted jobs and manually submitted jobs. This pending adjustment isn't a critical subsystem, so I've removed it for now. If a runWorker starts (after a long pend) and there is no work left, it will die immediately. I may rewrite a smarter 'pending' adjustment in the future.
-
Jessica Severin authored
and distributed manner as it interacts with the workers over the course of its life. When a runWorker.pl script starts and asks a queen to create a worker, the queen has a list of known analyses which are 'above the surface', where full hive analysis has been done and the number of needed workers has been calculated.

Full sync requires joining data between the analysis, analysis_job, analysis_stats, and hive tables. When this reached 10e7 jobs, 10e4 analyses, and 10e3 workers, a full hard sync took minutes, and it was clear this bit of the system wasn't scaling and wasn't going to make it to the next order of magnitude. This occurred in the compara blastz pipeline between mouse and rat.

Now there are some analyses 'below the surface' that have partial synchronization. These analyses have been flagged as having 'x' new jobs (AnalysisJobAdaptor updating analysis_stats on job insert). If no analysis is found to assign to the newly created worker, the queen will dip below the surface and start checking the analyses with the highest probability of needing the most workers. This incremental sync is also done in Queen::get_num_needed_workers: when calculating ahead a total worker count, this routine will also dip below the surface until the hive reaches its currently defined worker saturation.

A beekeeper is no longer a required component for the system to function. If workers can get onto cpus, the hive will run. The beekeeper is now mainly a user display program showing the status of the hive. There is no longer any central process doing work, and one hive can potentially scale beyond 10e9 jobs in graphs of 10e6 analysis nodes and 10e6 running workers.
-
- Nov 19, 2004
Jessica Severin authored
the most time since last update are at the top of the returned list
-
Jessica Severin authored
When jobs are inserted into the analysis_job table, the analysis_stats table for the given analysis is updated by incrementing the total_job_count and unclaimed_job_count and setting the status to 'LOADING'. If the analysis is 'BLOCKED', this incremental update does not happen. When an analysis_stats is 'BLOCKED' and then unblocked, this will automatically trigger a resync, so this partial progress update is not needed.
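An in-memory sketch of the rule this commit describes: on job insert, bump the two counters and flip the status to 'LOADING', unless the analysis is 'BLOCKED'. The hash stands in for a row of the analysis_stats table; the field names follow the commit message, but the subroutine name is made up for illustration.

```perl
# Hypothetical stand-in for one analysis_stats row.
my %stats_row = (
    status              => 'READY',
    total_job_count     => 0,
    unclaimed_job_count => 0,
);

sub register_inserted_jobs {
    my ($stats, $num_new_jobs) = @_;
    # Blocked analyses skip the incremental update; unblocking
    # triggers a full resync instead.
    return if $stats->{status} eq 'BLOCKED';
    $stats->{total_job_count}     += $num_new_jobs;
    $stats->{unclaimed_job_count} += $num_new_jobs;
    $stats->{status} = 'LOADING';
}

register_inserted_jobs(\%stats_row, 5);
```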
-
Jessica Severin authored
sync unless some jobs have been loaded.
-
Jessica Severin authored
also changed default to 'LOADING' so that it can trigger a sync
-
- Nov 18, 2004
Jessica Severin authored
-
Jessica Severin authored
-
- Nov 17, 2004
Jessica Severin authored
than one status
-