Performance improvements to init_pipeline.pl/generate_graph.pl
Created by: muffato
My test is Compara's Protein-Tree pipeline. With the proposed changes, loading the PipeConfig file ($pipeconfig_object->add_objects_from_config($pipeline)) now takes 0.7 seconds instead of 16 seconds.
The main improvements come from:
- storing new subs in AUTOLOAD: 16 seconds -> 4 seconds
- caching DataflowRule::unitargets: 4 seconds -> 0.7 seconds
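For readers unfamiliar with the two techniques, here is a minimal sketch of both. It is not the code in this PR: the Demo::Accessors package, its fields, and the sorted_targets method are hypothetical names made up for illustration. The first part shows an AUTOLOAD that stores the generated sub in the symbol table so it only runs once per method name; the second shows a computed value cached on the object.

```perl
use strict;
use warnings;

package Demo::Accessors;   # hypothetical class, not part of eHive

our $autoload_calls = 0;   # counts how often the slow AUTOLOAD path runs
our $compute_calls  = 0;   # counts how often the cached value is computed

sub new {
    my ($class, %args) = @_;
    return bless { %args }, $class;
}

sub AUTOLOAD {
    our $AUTOLOAD;
    my $name = $AUTOLOAD;
    $name =~ s/.*:://;                 # strip the package prefix
    return if $name eq 'DESTROY';      # never generate a destructor

    $autoload_calls++;

    # Build the accessor once and store it in the symbol table, so that
    # later calls are resolved by normal method lookup and skip AUTOLOAD.
    my $accessor = sub {
        my $obj = shift;
        $obj->{$name} = shift if @_;
        return $obj->{$name};
    };
    {
        no strict 'refs';
        *{$AUTOLOAD} = $accessor;
    }
    goto &$accessor;                   # run the new sub with the original @_
}

# Cache a derived value on the object instead of recomputing it on each call.
sub sorted_targets {
    my $self = shift;
    return $self->{_sorted_targets} //= do {
        $compute_calls++;
        [ sort @{ $self->{targets} || [] } ];
    };
}

package main;

my $rule = Demo::Accessors->new( targets => [ 'b', 'a' ] );
$rule->name('first');      # goes through AUTOLOAD and installs the sub
$rule->name('second');     # calls the installed sub directly
$rule->sorted_targets for 1 .. 3;

print "AUTOLOAD ran $Demo::Accessors::autoload_calls time(s)\n";
print "targets computed $Demo::Accessors::compute_calls time(s)\n";
```

A cache like this is only safe as long as the underlying data does not change after the first call, or if the cached entry is cleared whenever it does.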
There are other minor improvements:
- only stringifying the objects in verbose mode
- keeping the list of Dataflow targets locally instead of searching for them in the collection
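The verbose-mode change follows a common pattern, sketched below under assumed names (Demo::Node, to_string and debug_msg are hypothetical and do not come from the eHive code base). The message is passed as a code ref, so the costly stringification only runs when the message will actually be shown.

```perl
use strict;
use warnings;

package Demo::Node;   # hypothetical object with a costly string form

our $stringify_calls = 0;

sub new {
    my ($class, $id) = @_;
    return bless { id => $id }, $class;
}

sub to_string {
    my $self = shift;
    $stringify_calls++;
    return 'Node(' . $self->{id} . ')';
}

package main;

my @log;   # messages are collected here rather than printed directly

# Take a code ref so the message is built only when verbose is on.
sub debug_msg {
    my ($verbose, $make_message) = @_;
    return unless $verbose;
    push @log, $make_message->();
    return;
}

my $node = Demo::Node->new(42);
debug_msg(0, sub { 'Storing ' . $node->to_string });   # to_string is skipped
debug_msg(1, sub { 'Storing ' . $node->to_string });   # to_string runs once

print "$_\n" for @log;
print "to_string ran $Demo::Node::stringify_calls time(s)\n";
```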