  1. Oct 16, 2009
  2. Sep 23, 2009
  3. Jul 13, 2009
  4. Apr 03, 2009
  5. Feb 15, 2009
  6. May 28, 2008
  7. Nov 16, 2007
  8. Oct 12, 2006
  9. Sep 04, 2006
  10. Jun 12, 2006
  11. Oct 01, 2005
  12. Aug 16, 2005
    • added system for job-level blocking/unblocking. This is a very fine-grained · faead1e0
      Jessica Severin authored
      control structure in which a process/program has been made aware of the job(s)
      it is responsible for controlling.  This is facilitated via a job URL:
         mysql://ia64e:3306/jessica_compara32b_tree/analysis_job?dbID=6065355
      AnalysisJobAdaptor::CreateNewJob now returns this URL on job creation.
      When a job is dataflowed, an array of these URLs is returned (one for each rule).
      Jobs can now be dataflowed from a Process subclass with blocking enabled.
      A job can be fetched directly with one of these URLs.
      A command-line utility, ehive_unblock.pl, has been added to unblock a job by its URL.
      To unblock a job do:
         Bio::EnsEMBL::Hive::URLFactory->fetch($url)->update_status('READY');
      This is primarily useful in asynchronous split process/parsing situations.
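      Below is a minimal sketch of how a controlling process might unblock jobs it created, assuming it kept the URLs returned at creation time and a Perl environment with the hive modules installed; only the URLFactory->fetch(...)->update_status('READY') call is taken from this commit, the surrounding loop and variable names are illustrative.

         use strict;
         use warnings;
         use Bio::EnsEMBL::Hive::URLFactory;

         # Job URLs of the form mysql://host:port/dbname/analysis_job?dbID=NNN,
         # e.g. collected from CreateNewJob when the jobs were created blocked.
         my @blocked_job_urls = @ARGV;

         foreach my $url (@blocked_job_urls) {
             # Fetch the job directly via its URL and mark it ready to run,
             # as ehive_unblock.pl does.
             my $job = Bio::EnsEMBL::Hive::URLFactory->fetch($url);
             $job->update_status('READY') if $job;
         }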
  13. Aug 11, 2005
  14. Aug 09, 2005
  15. Jun 13, 2005
  16. Mar 04, 2005
  17. Mar 02, 2005
  18. Feb 23, 2005
  19. Feb 21, 2005
    • YAHRF (Yet Another Hive ReFactor).....chapter 1 · 7675c31c
      Jessica Severin authored
      needed to better manage the hive system's load on the database housing all
      the hive-related tables (in case the database is overloaded by multiple users).
      Added analysis_stats.sync_lock column (and correspondingly in the Object and Adaptor).
      Added Queen::safe_synchronize_AnalysisStats method which wraps over the
        synchronize_AnalysisStats method and does various checks and locks to ensure
        that only one worker is trying to do a 'synchronize' on a given analysis at
        any given moment.
      Cleaned up API between Queen/Worker so that worker only talks directly to the
        Queen, rather than getting the underlying database adaptor.
      Added analysis_job columns runtime_msec, query_count to provide more data on
        how the jobs hammer a database (queries/sec).
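      Returning to the sync_lock column added above: a minimal sketch of the lock-claim idea behind safe_synchronize_AnalysisStats, assuming a connected DBI handle $dbh; this is illustrative, not the actual Queen code.

         sub try_claim_sync_lock {
             my ($dbh, $analysis_id) = @_;

             # Atomically flip sync_lock from 0 to 1; the affected-row count tells
             # this worker whether it won the race to synchronize the analysis.
             my $rows = $dbh->do(
                 'UPDATE analysis_stats SET sync_lock = 1
                   WHERE analysis_id = ? AND sync_lock = 0',
                 undef, $analysis_id);

             return ($rows && $rows == 1);   # exactly one row updated => lock held
         }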
  20. Feb 10, 2005
  21. Feb 04, 2005
    • oops, forgot to comment out debug line... · 3f5d31fb
      Jessica Severin authored
    • added OODB logic to analysis_job.input_id · 48c12a6e
      Jessica Severin authored
      analysis_job.input_id is kept as varchar(255) to allow UNIQUE(analysis_id,input_id),
      but the adaptor now has logic so that if the input_id in the AnalysisJob object exceeds
      the 255-character limit, it is stored in (and fetched from) the analysis_data table.  The input_id
      in the analysis_job table becomes '_ext_input_analysis_data_id ##', a unique
      internal marker that tells the fetch routine to retrieve the 'real' input_id
      from the analysis_data table.
      NO MORE 255-character limit on input_id, and it is completely transparent to the API user.
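      A sketch of the store-side overflow logic described above, assuming a connected DBI handle $dbh; apart from the '_ext_input_analysis_data_id' marker and the 255-character limit, the names here are illustrative rather than the actual adaptor code.

         sub input_id_for_storage {
             my ($dbh, $input_id) = @_;

             # Short input_ids fit directly into analysis_job.input_id (varchar(255)).
             return $input_id if length($input_id) <= 255;

             # Oversized input_ids go into analysis_data; the job row only keeps a
             # marker holding the analysis_data_id, which the fetch code resolves
             # back into the 'real' input_id.
             $dbh->do('INSERT INTO analysis_data (data) VALUES (?)', undef, $input_id);
             my $data_id = $dbh->last_insert_id(undef, undef, 'analysis_data', 'analysis_data_id');
             return "_ext_input_analysis_data_id $data_id";
         }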
  22. Feb 01, 2005
  23. Jan 18, 2005
  24. Jan 13, 2005
  25. Jan 11, 2005
  26. Nov 22, 2004
  27. Nov 19, 2004
    • Change for distributed smart Queen system. · c05ce49d
      Jessica Severin authored
      When jobs are inserted into the analysis_job table, the analysis_stats table
      for the given analysis is updated by incrementing total_job_count and
      unclaimed_job_count and setting the status to 'LOADING'.
      If the analysis is 'BLOCKED', this incremental update does not happen.
      When an analysis_stats entry is 'BLOCKED' and then unblocked, this automatically
      triggers a resync, so this partial progress update is not needed.
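      A sketch of that incremental bookkeeping, assuming a connected DBI handle $dbh; the column and status names come from the message above, the SQL itself is illustrative.

         sub bump_stats_after_job_insert {
             my ($dbh, $analysis_id, $new_jobs) = @_;

             # Skip the partial update for BLOCKED analyses; unblocking triggers
             # a full resync anyway.
             $dbh->do(
                 q{UPDATE analysis_stats
                      SET total_job_count     = total_job_count + ?,
                          unclaimed_job_count = unclaimed_job_count + ?,
                          status              = 'LOADING'
                    WHERE analysis_id = ? AND status != 'BLOCKED'},
                 undef, $new_jobs, $new_jobs, $analysis_id);
         }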
  28. Nov 09, 2004
  29. Oct 20, 2004
    • switched back to analysis_job.input_id · 77675743
      Jessica Severin authored
      changed to varchar(255) (but dropped the join to the analysis_data table).
      If modules need more than 255 characters of input_id,
      they can pass the analysis_data_id via the varchar(255); example: {adid=>365902}
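      A sketch of how a consuming module might resolve such an input_id, assuming a connected DBI handle $dbh; the {adid=>NNN} convention is from the message above, the parsing and fetch code is illustrative.

         sub resolve_input_id {
             my ($dbh, $input_id) = @_;

             # An input_id like '{adid=>365902}' points at a row in analysis_data.
             if ($input_id =~ /adid\s*=>\s*(\d+)/) {
                 my ($data) = $dbh->selectrow_array(
                     'SELECT data FROM analysis_data WHERE analysis_data_id = ?',
                     undef, $1);
                 return $data;
             }
             return $input_id;   # ordinary input_ids are used as-is
         }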
  30. Oct 06, 2004
  31. Oct 05, 2004
    • Second insert into analysis_data for job creation added extra overhead. · f7182485
      Jessica Severin authored
      Removed the select-before-store (added a new method, store_if_needed, for cases where that functionality is required)
      and added an option in AnalysisJobAdaptor::CreateNewJob to pass input_analysis_data_id,
      so if it is already known, CreateNewJob will be as fast as before.  Plus there are no longer
      any limits on the size of the input_id string.
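      A sketch of the select-before-store behaviour now isolated in store_if_needed, assuming a connected DBI handle $dbh; the method name is from the message above, its body here is illustrative rather than the actual adaptor code.

         sub store_if_needed {
             my ($dbh, $data) = @_;

             # Reuse an existing analysis_data row if an identical blob is already
             # stored (this extra SELECT is the overhead CreateNewJob now avoids
             # when input_analysis_data_id is passed in directly)...
             my ($existing_id) = $dbh->selectrow_array(
                 'SELECT analysis_data_id FROM analysis_data WHERE data = ?',
                 undef, $data);
             return $existing_id if $existing_id;

             # ...otherwise insert the blob and return the new id.
             $dbh->do('INSERT INTO analysis_data (data) VALUES (?)', undef, $data);
             return $dbh->last_insert_id(undef, undef, 'analysis_data', 'analysis_data_id');
         }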
  32. Sep 30, 2004
    • modified analysis_job table : replaced input_id varchar(100) with · 2be90ea9
      Jessica Severin authored
      input_analysis_data_id int(10), which joins to the analysis_data table.
      added output_analysis_data_id int(10) for storing the output_id.
      The external analysis_data.data column is LONGTEXT, which allows much longer
      parameter sets to be passed around than was previously possible.
      AnalysisData will also allow processes to manually store 'other' data and
      pass it around by ID reference.
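      A rough MySQL equivalent of the analysis_job change described above, reconstructed from the commit message and run via an assumed DBI handle; the historical DDL may have differed in detail.

         sub apply_analysis_job_schema_change {
             my ($dbh) = @_;
             # analysis_data.data is LONGTEXT, as noted above, so these ids can
             # stand in for arbitrarily large input/output parameter strings.
             for my $sql (
                 q{ALTER TABLE analysis_job DROP COLUMN input_id},
                 q{ALTER TABLE analysis_job ADD COLUMN input_analysis_data_id  INT(10) NOT NULL},
                 q{ALTER TABLE analysis_job ADD COLUMN output_analysis_data_id INT(10)},
             ) {
                 $dbh->do($sql);   # apply each schema change in turn
             }
         }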
  33. Aug 03, 2004
  34. Aug 02, 2004
  35. Jul 21, 2004
  36. Jul 16, 2004
  37. Jul 08, 2004
    • added hive_id index to analysis_job table to help with dead_worker · 27403dda
      Jessica Severin authored
      job resetting.  This allowed direct UPDATE..WHERE.. SQL to be used.
      Also changed the retry_count system: retry_count is only incremented
      for jobs that failed (status in ('GET_INPUT','RUN','WRITE_OUTPUT')).
      Jobs that were CLAIMED by the dead worker are just reset without
      incrementing the retry_count, since no attempt was made to run them.
      Also, the fetching of claimed jobs now has an 'ORDER BY retry_count'
      so that jobs that have failed are at the bottom of the list of jobs
      to process.  This allows the 'bad' jobs to filter themselves out.
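      A sketch of the dead-worker reset described above, assuming a connected DBI handle $dbh; the statuses and columns are from the message, but the exact SQL (including setting hive_id back to NULL and status to 'READY') is an assumption, not the adaptor's actual statements.

         sub reset_jobs_of_dead_worker {
             my ($dbh, $dead_hive_id) = @_;

             # Jobs the dead worker had actually started: count the failed attempt.
             $dbh->do(
                 q{UPDATE analysis_job
                      SET status = 'READY', hive_id = NULL, retry_count = retry_count + 1
                    WHERE hive_id = ? AND status IN ('GET_INPUT','RUN','WRITE_OUTPUT')},
                 undef, $dead_hive_id);

             # Jobs it had merely CLAIMED: reset without touching retry_count.
             $dbh->do(
                 q{UPDATE analysis_job
                      SET status = 'READY', hive_id = NULL
                    WHERE hive_id = ? AND status = 'CLAIMED'},
                 undef, $dead_hive_id);
         }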
  38. Jun 16, 2004