Experimental/worker temp dir

Marek Szuba requested to merge experimental/worker_temp_dir into master

Created by: muffato

Use case

This addresses issues that several teams have recently encountered with the temporary space allocated on the EBI farm: /tmp is smaller than it used to be, /scratch is the advised replacement for /tmp, and teams sometimes need a much bigger space or a shared space. Overall, this demonstrates a need to make the default temporary directory used by Workers configurable.

Description

Here I introduce a new command-line option for workers, worker_base_temp_dir (similar to worker_log_dir), which sets the base directory to use instead of /tmp. Command-line options can be set at the resource-class level, so the option can easily be applied to a single analysis. A global default can also be set in the JSON configuration file.
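To illustrate the precedence described above (command-line option first, then the global default from the JSON configuration, then /tmp), here is a minimal sketch in Python. It is not the code from this branch: the function names and the JSON key are hypothetical, and the actual layout of the eHive configuration file may differ.

```python
import json
import os
import tempfile


def resolve_base_temp_dir(cli_value, config_path=None, fallback="/tmp"):
    """Pick the base temp directory: CLI option, then JSON default, then fallback.

    The JSON key "worker_base_temp_dir" is an assumption made for this sketch.
    """
    if cli_value:
        return cli_value
    if config_path and os.path.exists(config_path):
        with open(config_path) as fh:
            config = json.load(fh)
        configured = config.get("worker_base_temp_dir")
        if configured:
            return configured
    return fallback


def make_worker_temp_dir(base_dir, worker_id):
    """Create a private directory for one worker under the chosen base directory."""
    os.makedirs(base_dir, exist_ok=True)
    # mkdtemp creates a uniquely named directory and returns its absolute path
    return tempfile.mkdtemp(prefix=f"worker_{worker_id}_", dir=base_dir)
```

With this shape, a resource class that passes the option explicitly always wins over the global default, which matches the intent of applying the option to a single analysis.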

I have removed the possibility of defining the temp directory name in the Runnable itself. Runnables that want a specific path now have to either use that path directly or override worker_temp_directory (I would advise the former). As far as I could see, the only Runnables that were doing this are in Compara, and I'll update them. Note that this is a breaking change to the GuestLanguage interface, which I fix here too.

I have introduced a new column in the worker table to track the temp directory actually used, which allows beekeeper to properly clean up leftover data from killed workers during garbage collection. I have also moved the cleanup function from Valley to Meadow because the meadow is already known at that point and there is no need to query the Valley.
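The cleanup step that the new column makes possible could look roughly like the Python sketch below. It is only an illustration under stated assumptions: the record type, the field name temp_directory_name and the "DEAD" status value are all hypothetical stand-ins, not the schema or code from this branch.

```python
import os
import shutil
from collections import namedtuple

# Hypothetical view of a row in the worker table; field names are assumptions.
WorkerRecord = namedtuple("WorkerRecord", ["worker_id", "status", "temp_directory_name"])


def cleanup_dead_worker_temp_dirs(workers):
    """Remove the recorded temp directory of every dead worker.

    Returns the list of paths that were actually removed.
    """
    removed = []
    for worker in workers:
        # Skip live workers and workers that never recorded a temp directory
        if worker.status != "DEAD" or not worker.temp_directory_name:
            continue
        path = worker.temp_directory_name
        if os.path.isdir(path):
            shutil.rmtree(path, ignore_errors=True)
            removed.append(path)
    return removed
```

Because the path is read from the stored record instead of being recomputed, the cleanup still finds the right directory when different resource classes use different base directories.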

Possible Drawbacks

External users who maintain their own GuestLanguage wrappers will have to update them.

Testing

Have you added/modified unit tests to test the changes?

Yes

If so, do the tests pass/fail?

Yes

Have you run the entire test suite and no regression was detected?

Yes

Comments

As I will be away for the next three weeks, I won't be able to follow up on your review immediately. I still wanted to open a PR to give you time to review the code and start discussing the changes. If you think it is appropriate, you can also tell the eHive reps about this branch, as they may want to give it a try too. Finally, feel free to modify the branch yourselves if you want to merge this feature quickly.