Performance improvements to init_pipeline.pl/generate_graph.pl

Marek Szuba requested to merge experimental/pipeconfig_load_improvements into master

Created by: muffato

I tested this on Compara's Protein-Tree pipeline. With the proposed changes, loading the PipeConfig file ($pipeconfig_object->add_objects_from_config($pipeline)) takes 0.7 seconds instead of 16 seconds.

The main improvements come from:

  • storing new subs in AUTOLOAD: 16 seconds -> 4 seconds
  • caching DataflowRule::unitargets: 4 seconds -> 0.7 seconds
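The two techniques above can be sketched roughly as follows. This is a minimal illustration, not the code from this branch: the Demo::Rule class, the unitargets_demo method, the _unitargets_cache key and the two counters are hypothetical names made up for the example. It assumes that "storing new subs" means installing the generated accessor in the symbol table so that AUTOLOAD runs only once per method name, and that the cached value does not change after the first call.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

package Demo::Rule;    # hypothetical class, not the real DataflowRule

our $AUTOLOAD;
our $autoload_hits = 0;    # how many times AUTOLOAD itself ran
our $compute_calls = 0;    # how many times the derived value was computed

sub new {
    my ($class, %args) = @_;
    return bless {%args}, $class;
}

sub DESTROY { }            # defined explicitly so it never reaches AUTOLOAD

sub AUTOLOAD {
    my ($name) = $AUTOLOAD =~ /::(\w+)$/;
    $autoload_hits++;

    # Build the accessor once and install it in the symbol table, so later
    # calls to the same method are dispatched directly and skip AUTOLOAD.
    my $accessor = sub {
        my $obj = shift;
        $obj->{$name} = shift if @_;
        return $obj->{$name};
    };
    {
        no strict 'refs';
        *{$AUTOLOAD} = $accessor;
    }
    goto &$accessor;       # serve the current call with the original @_
}

sub unitargets_demo {
    my $self = shift;

    # Compute the derived string once per object and reuse it afterwards.
    # A real implementation would have to reset the cache if targets change.
    $self->{_unitargets_cache} //= do {
        $compute_calls++;
        join ',', sort @{ $self->{targets} || [] };
    };
    return $self->{_unitargets_cache};
}

package main;

my $rule = Demo::Rule->new( targets => [ 'b', 'a' ] );
$rule->label('first');     # goes through AUTOLOAD, which installs label()
$rule->label('second');    # direct call to the installed sub
my $label  = $rule->label;
my $first  = $rule->unitargets_demo;
my $second = $rule->unitargets_demo;

print "label=$label autoload_hits=$Demo::Rule::autoload_hits\n";
print "unitargets=$first compute_calls=$Demo::Rule::compute_calls\n";
```

In this sketch both counters end at 1 even though each method is called more than once. A profile of a large PipeConfig load would be expected to show the same effect.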

There are other minor improvements:

  • stringifying the objects only in verbose mode
  • keeping the list of Dataflow targets locally instead of searching for them in the collection
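As an illustration of the first point, the sketch below skips the string conversion entirely unless verbose output is requested. The describe and report_added functions and the $stringify_calls counter are hypothetical and exist only for this example. The actual code in the branch may be structured differently.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

my $stringify_calls = 0;    # counts how often the costly conversion runs

# Stand-in for an expensive object-to-string conversion.
sub describe {
    my ($obj) = @_;
    $stringify_calls++;
    return join ', ', map { "$_=$obj->{$_}" } sort keys %$obj;
}

# Only build the message when it is actually going to be printed.
sub report_added {
    my ($obj, $verbose) = @_;
    return unless $verbose;
    print 'Added: ', describe($obj), "\n";
    return;
}

my @objects = map { +{ id => $_, name => "analysis_$_" } } 1 .. 3;

report_added( $_, 0 ) for @objects;    # quiet mode: nothing is stringified
report_added( $objects[0], 1 );        # verbose mode: one conversion
```

The saving comes from testing the flag before the string is built, rather than building the string first and then deciding whether to print it.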