    Jessica Severin authored · 27403dda
    added hive_id index to analysis_job table to help with dead_worker
    job resetting.  This allowed direct UPDATE..WHERE.. SQL to be used.
    Also changed the retry_count system: retry_count is only incremented
    for jobs that failed (status in ('GET_INPUT','RUN','WRITE_OUTPUT')).
    Jobs that were CLAIMED by the dead worker are just reset without
    incrementing the retry_count, since they were never actually run.
    Also the fetching of claimed jobs now has an 'ORDER BY retry_count'
    so that jobs that have failed are at the bottom of the list of jobs
    to process.  This allows the 'bad' jobs to filter themselves out.
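
The reset and ordering rules described in this commit message can be sketched roughly as follows. This is a minimal illustration using Python's standard `sqlite3` module, not the actual contents of tables.sql or the worker code. Only `analysis_job`, `hive_id`, `retry_count`, and the status values `CLAIMED`, `GET_INPUT`, `RUN`, and `WRITE_OUTPUT` come from the message; the `analysis_job_id` column, the index name, the `READY` status, the use of `hive_id=0` for an unowned job, and the function names are all assumptions made for the example.

```python
import sqlite3


def make_db():
    """Create an in-memory stand-in for the analysis_job table (schema is assumed)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """CREATE TABLE analysis_job (
               analysis_job_id INTEGER PRIMARY KEY,          -- assumed column name
               hive_id         INTEGER NOT NULL DEFAULT 0,   -- 0 = unowned (assumption)
               status          TEXT    NOT NULL DEFAULT 'READY',
               retry_count     INTEGER NOT NULL DEFAULT 0
           )"""
    )
    # The index on hive_id lets the UPDATE..WHERE statements below find a
    # worker's jobs directly. The index name is hypothetical.
    conn.execute("CREATE INDEX hive_id_idx ON analysis_job (hive_id)")
    return conn


def reset_dead_worker_jobs(conn, dead_hive_id):
    """Release every job held by a dead worker."""
    # Jobs that were only CLAIMED never started: reset them, retry_count untouched.
    conn.execute(
        "UPDATE analysis_job SET status='READY', hive_id=0 "
        "WHERE hive_id=? AND status='CLAIMED'",
        (dead_hive_id,),
    )
    # Jobs caught mid-run count as a failed attempt: reset and bump retry_count.
    conn.execute(
        "UPDATE analysis_job "
        "SET status='READY', hive_id=0, retry_count=retry_count+1 "
        "WHERE hive_id=? AND status IN ('GET_INPUT','RUN','WRITE_OUTPUT')",
        (dead_hive_id,),
    )
    conn.commit()


def fetch_claimed_jobs(conn, hive_id):
    """Return a worker's claimed job ids, previously failed jobs last."""
    rows = conn.execute(
        "SELECT analysis_job_id FROM analysis_job "
        "WHERE hive_id=? AND status='CLAIMED' "
        "ORDER BY retry_count, analysis_job_id",
        (hive_id,),
    ).fetchall()
    return [row[0] for row in rows]
```

Because the `CLAIMED` jobs are released first, the second UPDATE no longer matches them, so only jobs that had actually started have their retry_count incremented.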
tables.sql 9.27 KB