Fix the code preventing insertion of duplicate dependent xrefs

Created by: mkszuba

Description

Assuming one has run BaseParser::get_dependent_mappings() for the relevant source, BaseParser::add_dependent_xref() should detect existing dependent xrefs with exactly the same master and dependent xref IDs as the ones being inserted and prevent insertion of a duplicate entry. The problem is that add_dependent_xref() expects the mapping hash to track linkage information as well, which get_dependent_mappings() did not save.

Changed get_dependent_mappings() so that it stores the linkage annotation where add_dependent_xref() looks for it. Tested on Mim2Gene data and confirmed that re-running the parser with the same input file does not add any new rows to dependent_xref.
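For readers unfamiliar with the pattern, the duplicate check described above can be sketched roughly as follows. This is a minimal Python illustration, not the actual Perl code in BaseParser: the simplified four-column table layout, the function signatures, the tuple key and the run_parser() helper are all assumptions made for the example. The point it shows is that the check only works if the mapping stores the linkage annotation as its value, because that is what the insert routine compares against.

```python
import sqlite3

# Hypothetical, simplified stand-in for the dependent_xref table.
SCHEMA = """CREATE TABLE dependent_xref (
    master_xref_id INTEGER, dependent_xref_id INTEGER,
    linkage_annotation TEXT, linkage_source_id INTEGER)"""

def get_dependent_mappings(conn, source_id):
    """Load existing dependent xrefs for one source.

    The value stored per (master, dependent) key is the linkage
    annotation, which is what add_dependent_xref() compares against.
    """
    rows = conn.execute(
        "SELECT master_xref_id, dependent_xref_id, linkage_annotation "
        "FROM dependent_xref WHERE linkage_source_id = ?", (source_id,))
    return {(master, dep): linkage for master, dep, linkage in rows}

def add_dependent_xref(conn, mapping, master_id, dependent_id, linkage, source_id):
    """Insert a row unless the same link with the same linkage is already known."""
    key = (master_id, dependent_id)
    if mapping.get(key) == linkage:
        return False  # duplicate: skip the insert
    conn.execute("INSERT INTO dependent_xref VALUES (?, ?, ?, ?)",
                 (master_id, dependent_id, linkage, source_id))
    mapping[key] = linkage  # remember it for the rest of this run
    return True

def run_parser(conn, records, source_id):
    """Hypothetical parser run: returns the number of rows actually inserted."""
    mapping = get_dependent_mappings(conn, source_id)
    return sum(add_dependent_xref(conn, mapping, m, d, l, source_id)
               for m, d, l in records)
```

In this sketch, had get_dependent_mappings() stored anything other than the linkage annotation as the value, the comparison in add_dependent_xref() would never succeed for rows loaded from the database, and every re-run would insert the same rows again.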

Use case

Running xref parsers using BaseParser::add_dependent_xref() rather than custom code to insert dependent xrefs into the database.

Benefits

All such parsers will become replay-safe.

Possible Drawbacks

Some loss of performance due to having to generate a mapping of existing dependent xrefs. Given that mappings are generated on a per-source basis and that under normal circumstances there should be few (if any) existing dependent xrefs for the given source when the parser inserting such xrefs is run, the impact should be small.

Testing

Have you added/modified unit tests to test the changes? No.

If so, do the tests pass/fail? N/A

Have you run the entire test suite and no regression was detected? N/A. I have run a parser using the relevant bits of code (my reviewed version of Mim2GeneParser); it works and now correctly handles duplicates.