    New Meadow/Valley interface that detects slow workers and marks them as pending · 14fcdaa5
    Matthieu Muffato authored
    This fixes @danstaines's issue:
    [Feb 11th at 09:49]
    I've got a problem with hive where I supply a registry, and the registry takes
    a couple of minutes to load (this is normal with 40k bacteria unfortunately).
    
    The registry takes a long time to load, so the worker is running from the LSF
    point of view but is not yet registered in the database (because the DBAdaptor
    object is not ready yet). Because of this discrepancy, beekeeper thinks that
    nothing is running and keeps on submitting new workers (beyond the analysis
    capacity).
    
    My fix is to count those workers as pending. The Valley/Meadow had one method
    to get the running workers and another to get the pending workers, so I had
    to unify them into a single method, which also saves us one call to _bjobs_.
    The Valley then adjusts the counts into a complex hash that contains accurate
    information about all the workers. This super-hash is carried around in
    various places so that all the decisions are consistent with each other.
    Overall, the Valley now does more and the Meadows are a bit shorter.
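
The counting rule described above can be sketched roughly as follows. This is a minimal illustration in Python, not the actual eHive code: the function names, the "RUN"/"PEND" status strings, and the shape of the inputs (one scheduler listing plus the set of process ids already registered in the database) are all assumptions made for the example.

```python
from collections import Counter


def reconcile_worker_counts(scheduler_status, registered_ids):
    """Count workers per status, treating unregistered running workers as pending.

    scheduler_status: hypothetical dict of process id -> status string, as
                      obtained from a single scheduler query.
    registered_ids:   hypothetical set of process ids already known to the
                      database.
    """
    counts = Counter()
    for process_id, status in scheduler_status.items():
        # A worker the scheduler reports as running, but which has not
        # registered itself yet, is still starting up: count it as pending.
        if status == "RUN" and process_id not in registered_ids:
            status = "PEND"
        counts[status] += 1
    return dict(counts)


def workers_to_submit(counts, analysis_capacity):
    """How many more workers may be submitted without exceeding the capacity."""
    occupied = counts.get("RUN", 0) + counts.get("PEND", 0)
    return max(0, analysis_capacity - occupied)


# Two workers are running according to the scheduler, but only one of them
# has registered in the database so far; a third is still queued.
scheduler_status = {"101[1]": "RUN", "101[2]": "RUN", "101[3]": "PEND"}
registered_ids = {"101[1]"}

counts = reconcile_worker_counts(scheduler_status, registered_ids)
print(counts)
print(workers_to_submit(counts, analysis_capacity=4))
```

Because both decisions read the same reconciled counts, a slow-starting worker is never mistaken for a free slot.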
    
    The test suite has been updated accordingly.