This project is mirrored from https://github.com/Ensembl/ensembl-hive.git.
  1. 02 Mar, 2005 1 commit
  2. 23 Feb, 2005 1 commit
  3. 21 Feb, 2005 1 commit
    • YAHRF (Yet Another Hive ReFactor).....chapter 1 · 7675c31c
      Jessica Severin authored
      Needed to better manage the hive system's load on the database housing all
      the hive-related tables (in case the database is overloaded by multiple users).
      Added the analysis_stats.sync_lock column (and correspondingly in the Object and Adaptor).
      Added the Queen::safe_synchronize_AnalysisStats method, which wraps the
        synchronize_AnalysisStats method and performs various checks and locks to ensure
        that only one worker is trying to do a 'synchronize' on a given analysis at
        any given moment.
      Cleaned up the API between Queen/Worker so that the worker only talks directly to the
        Queen, rather than getting the underlying database adaptor.
      Added analysis_job columns runtime_msec and query_count to provide more data on
        how the jobs hammer a database (queries/sec).
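The sync_lock idea in this commit can be sketched as follows. This is an illustrative Python/SQLite sketch of the locking pattern only, not the actual Perl Queen API; the table layout and function names here are assumptions. A worker may synchronize an analysis only if it atomically flips sync_lock from 0 to 1; every other concurrent worker is refused.

```python
import sqlite3

def try_acquire_sync_lock(db, analysis_id):
    """Atomically claim the right to synchronize one analysis.

    The UPDATE only matches a row whose sync_lock is still 0, so at most
    one caller sees rowcount == 1 for a given analysis at any moment.
    """
    cur = db.execute(
        "UPDATE analysis_stats SET sync_lock = 1 "
        "WHERE analysis_id = ? AND sync_lock = 0",
        (analysis_id,),
    )
    return cur.rowcount == 1  # True only for the single winner

def release_sync_lock(db, analysis_id):
    db.execute(
        "UPDATE analysis_stats SET sync_lock = 0 WHERE analysis_id = ?",
        (analysis_id,),
    )

# Hypothetical minimal schema, just enough to show the pattern.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE analysis_stats "
           "(analysis_id INTEGER PRIMARY KEY, sync_lock INTEGER DEFAULT 0)")
db.execute("INSERT INTO analysis_stats (analysis_id, sync_lock) VALUES (1, 0)")

first = try_acquire_sync_lock(db, 1)   # this worker wins the lock
second = try_acquire_sync_lock(db, 1)  # a concurrent worker is refused
release_sync_lock(db, 1)               # sync done, lock freed
```

The point of the conditional UPDATE is that the check and the lock-set happen in one statement, so no two workers can both pass the check before either sets the lock.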
  4. 10 Feb, 2005 1 commit
  5. 04 Feb, 2005 2 commits
    • oops, forgot to comment out a debug line... · 3f5d31fb
      Jessica Severin authored
    • added OODB logic to analysis_job.input_id · 48c12a6e
      Jessica Severin authored
      Keep analysis_job.input_id as varchar(255) to allow UNIQUE(analysis_id, input_id),
      but added logic in the adaptor so that if the input_id in an AnalysisJob object
      exceeds the 255-character limit, it is stored in / fetched from the analysis_data
      table.  The input_id in the analysis_job table becomes '_ext_input_analysis_data_id ##',
      a unique internal value that tells the fetch routine to get the 'real' input_id
      from the analysis_data table.
      There is no longer a 255-character limit on input_id, and the change is completely
      transparent to the API user.
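The overflow scheme described above can be sketched in a few lines. This is a hypothetical Python model, not the real adaptor code: a dict stands in for the analysis_data table, and the sentinel string format is taken from the commit message.

```python
# In-memory stand-in for the analysis_data table: id -> payload.
analysis_data = {}
_next_id = [0]

def store_input_id(value):
    """Store inline if it fits in varchar(255); overflow otherwise."""
    if len(value) <= 255:
        return value                      # stored directly in analysis_job
    _next_id[0] += 1
    analysis_data[_next_id[0]] = value    # overflow row in analysis_data
    # Sentinel value stored in analysis_job.input_id instead of the payload.
    return f"_ext_input_analysis_data_id {_next_id[0]}"

def fetch_input_id(stored):
    """Transparently dereference the sentinel back to the real input_id."""
    if stored.startswith("_ext_input_analysis_data_id "):
        return analysis_data[int(stored.split()[-1])]
    return stored

short = store_input_id("{chunk => 1}")   # fits inline
long = store_input_id("x" * 1000)        # overflows to analysis_data
```

Callers only ever see the real input_id; whether it was stored inline or in the overflow table is invisible to them, which is the "completely transparent" property the commit describes.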
  6. 01 Feb, 2005 1 commit
  7. 18 Jan, 2005 2 commits
  8. 13 Jan, 2005 1 commit
  9. 11 Jan, 2005 1 commit
  10. 22 Nov, 2004 1 commit
  11. 19 Nov, 2004 1 commit
    • Change for distributed smart Queen system. · c05ce49d
      Jessica Severin authored
      When jobs are inserted into the analysis_job table, the analysis_stats row
      for the given analysis is updated by incrementing total_job_count
      and unclaimed_job_count and setting the status to 'LOADING'.
      If the analysis is 'BLOCKED', this incremental update does not happen:
      when an analysis_stats is 'BLOCKED' and then unblocked, a resync is
      triggered automatically, so the partial progress update is not needed.
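The update rule above is small enough to sketch directly. This is an illustrative Python model with assumed field names (a dict per analysis_stats row), not the actual adaptor code.

```python
def on_job_inserted(stats):
    """Incremental stats update applied when a job is inserted.

    BLOCKED analyses are skipped entirely: unblocking triggers a full
    resync, which recomputes the counts anyway.
    """
    if stats["status"] == "BLOCKED":
        return stats
    stats["total_job_count"] += 1
    stats["unclaimed_job_count"] += 1
    stats["status"] = "LOADING"
    return stats

# Two hypothetical analysis_stats rows.
stats = {"status": "READY", "total_job_count": 5, "unclaimed_job_count": 2}
blocked = {"status": "BLOCKED", "total_job_count": 5, "unclaimed_job_count": 2}

on_job_inserted(stats)    # counts bumped, status -> LOADING
on_job_inserted(blocked)  # untouched
```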
  12. 09 Nov, 2004 1 commit
  13. 20 Oct, 2004 1 commit
    • switched back to analysis_job.input_id · 77675743
      Jessica Severin authored
      Changed to varchar(255) (but dropped the join to the analysis_data table).
      If modules need more than 255 characters of input_id,
      they can pass the analysis_data_id via the varchar(255), e.g. {adid=>365902}.
  14. 06 Oct, 2004 1 commit
  15. 05 Oct, 2004 1 commit
    • Second insert into analysis_data for job_creation added extra overhead. · f7182485
      Jessica Severin authored
      Removed the select-before-store (added a new method, store_if_needed, for users who
      require that functionality) and added an option in AnalysisJobAdaptor::CreateNewJob
      to pass input_analysis_data_id, so if it is already known, CreateNewJob will be as
      fast as before.  Plus there are no limits on the size of the input_id string.
  16. 30 Sep, 2004 1 commit
    • modified analysis_job table : replaced input_id varchar(100) with · 2be90ea9
      Jessica Severin authored
      input_analysis_data_id int(10), which joins to the analysis_data table.
      Added output_analysis_data_id int(10) for storing output_id.
      External analysis_data.data is LONGTEXT, which allows much longer
      parameter sets to be passed around than was previously possible.
      AnalysisData also allows processes to manually store 'other' data and
      pass it around by ID reference now.
  17. 03 Aug, 2004 1 commit
  18. 02 Aug, 2004 1 commit
  19. 21 Jul, 2004 1 commit
  20. 16 Jul, 2004 1 commit
  21. 08 Jul, 2004 1 commit
    • added hive_id index to analysis_job table to help with dead_worker · 27403dda
      Jessica Severin authored
      job resetting.  This allowed direct UPDATE ... WHERE ... SQL to be used.
      Also changed the retry_count system: retry_count is only incremented
      for jobs that failed (status in ('GET_INPUT','RUN','WRITE_OUTPUT')).
      Jobs that were CLAIMED by the dead worker are just reset without
      incrementing the retry_count, since they were never actually attempted.
      Also, the fetching of claimed jobs now has an 'ORDER BY retry_count'
      so that jobs that have failed are at the bottom of the list of jobs
      to process.  This allows the 'bad' jobs to filter themselves out.
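The reset-and-reorder policy above can be sketched as follows. This is an illustrative Python model using dicts for analysis_job rows; the field names mirror the commit message, but the functions themselves are assumptions, not the real Perl code.

```python
# Statuses that mean the job actually started running before the worker died.
ATTEMPTED = {"GET_INPUT", "RUN", "WRITE_OUTPUT"}

def reset_dead_worker_jobs(jobs, dead_hive_id):
    """Release the dead worker's jobs.

    Jobs caught mid-run get retry_count incremented; jobs that were merely
    CLAIMED are reset for free, since they were never attempted.
    """
    for job in jobs:
        if job["hive_id"] != dead_hive_id:
            continue
        if job["status"] in ATTEMPTED:
            job["retry_count"] += 1
        job["status"] = "READY"
        job["hive_id"] = None

def claim_order(jobs):
    """Model of SELECT ... ORDER BY retry_count: repeat failures sink down."""
    return sorted((j for j in jobs if j["status"] == "READY"),
                  key=lambda j: j["retry_count"])

# Hypothetical jobs: worker 7 died while one job ran and one was only claimed.
jobs = [
    {"job_id": 1, "hive_id": 7, "status": "RUN", "retry_count": 0},
    {"job_id": 2, "hive_id": 7, "status": "CLAIMED", "retry_count": 0},
    {"job_id": 3, "hive_id": None, "status": "READY", "retry_count": 2},
]
reset_dead_worker_jobs(jobs, dead_hive_id=7)
```

After the reset, claiming in retry_count order hands out the never-failed job first and pushes the twice-failed job to the back, which is how the 'bad' jobs filter themselves out.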
  22. 16 Jun, 2004 1 commit
  23. 09 Jun, 2004 1 commit
  24. 07 Jun, 2004 1 commit
    • complete switch over to new DataflowRule design. Dataflow rules use · e45d4761
      Jessica Severin authored
      URLs to specify analysis objects from MySQL databases distributed
      across a network.  AnalysisJobAdaptor was switched to create jobs with
      a class method that gets the db connection from the analysis object that
      is passed.  Thus the system now exists in a distributed state.
      The dataflow rule also implements branching via the branch_code.
      SimpleRule will be deprecated.
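The branching idea can be sketched as a mapping from branch codes to downstream analysis URLs. This is a hypothetical Python sketch; the URL strings and rule table below are invented for illustration, not taken from the hive schema.

```python
# Hypothetical dataflow rules for one analysis: branch_code -> target URLs.
# Targets may live in other MySQL databases and are addressed by URL.
dataflow_rules = {
    1: ["mysql://host_a/hive_db?analysis=blast"],       # default branch
    2: ["mysql://host_b/other_hive?analysis=cleanup"],  # alternate branch
}

def flow_output(branch_code):
    """Return the analyses that should receive jobs on this branch."""
    return dataflow_rules.get(branch_code, [])

# A job emitting on branch 2 flows to the cleanup analysis on host_b;
# an unmapped branch flows nowhere.
```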
  25. 04 Jun, 2004 1 commit
  26. 02 Jun, 2004 1 commit
  27. 27 May, 2004 1 commit
  28. 25 May, 2004 1 commit