
ENSCORESW-3132: Limited implementation of Intervals with start > end …

Marek Szuba requested to merge bugfix/circular_chromosome_spanning into master

Created by: ens-bwalts

Requirements

  • Filling out the template is required. Any pull request that does not include enough information to be reviewed in a timely manner may be closed at the maintainers' discretion;
  • Review the contributing guidelines for this repository; remember in particular:
    • do not modify code without testing for regression
    • provide simple unit tests to test the changes
    • if you change the schema you must patch the test databases as well, see Updating the schema
    • the PR must not fail unit testing

Description

Changes to Bio::EnsEMBL::Utils::Interval allowing intervals with a start > end (e.g. spanning the origin of a circular chromosome). This is a quick, simple implementation that allows origin-spanning intervals to be created without throwing an exception. These intervals will identify themselves as origin-spanning, and will react sensibly if queried for overlap or relative position.
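To illustrate the idea, here is a minimal sketch of how an interval with start > end can answer overlap queries. It is written in Python for brevity and is not the Perl code in this PR: the class name, the `spans_origin` property and the `overlaps` method are hypothetical, and the sketch assumes closed coordinates and that an origin-spanning interval covers everything from its start up to the end of the chromosome plus everything from the origin to its end.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """Illustrative closed interval; start > end means it wraps past the origin."""

    start: int
    end: int

    @property
    def spans_origin(self) -> bool:
        # Hypothetical helper: an interval is origin-spanning when start > end.
        return self.start > self.end

    def overlaps(self, other: "Interval") -> bool:
        # Two origin-spanning intervals both contain the origin, so they overlap.
        if self.spans_origin and other.spans_origin:
            return True
        # self wraps, other is linear: other must reach into either arm of self.
        if self.spans_origin:
            return other.end >= self.start or other.start <= self.end
        # other wraps, self is linear: reuse the case above with roles swapped.
        if other.spans_origin:
            return other.overlaps(self)
        # Both linear: the usual closed-interval overlap test.
        return self.start <= other.end and other.start <= self.end


wrapped = Interval(900, 100)           # spans the origin
print(wrapped.spans_origin)            # True
print(wrapped.overlaps(Interval(50, 200)))   # True, touches the arm after the origin
print(wrapped.overlaps(Interval(200, 800)))  # False, falls in the gap
```

The sketch deliberately avoids needing the chromosome length, which keeps it simple but means it cannot normalise coordinates; the actual implementation may make different choices.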

Note that the current IntervalTree implementations cannot handle an origin-spanning Interval. As a short-term solution, an exception is thrown when an origin-spanning Interval is inserted into one of these trees. This is sub-optimal, and the IntervalTree implementations should be updated urgently to handle origin-spanning Intervals.
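The guard described above can be sketched as follows. This is a Python illustration under stated assumptions, not the Perl or XS code: `LinearIntervalStore` is a hypothetical stand-in for an interval tree (a plain list with a linear scan), and `OriginSpanningIntervalError` is an invented exception name. The only point is that the container rejects start > end at insertion time rather than returning wrong query results later.

```python
class OriginSpanningIntervalError(ValueError):
    """Hypothetical error raised when a start > end interval reaches the container."""


class LinearIntervalStore:
    """Stand-in for an interval tree that only understands start <= end."""

    def __init__(self):
        self._intervals = []

    def insert(self, start, end):
        # Fail fast: the query logic below assumes start <= end for every entry.
        if start > end:
            raise OriginSpanningIntervalError(
                f"origin-spanning interval ({start}, {end}) is not supported"
            )
        self._intervals.append((start, end))

    def query(self, start, end):
        # Return stored intervals overlapping the closed range [start, end].
        return [(s, e) for (s, e) in self._intervals if s <= end and start <= e]


store = LinearIntervalStore()
store.insert(10, 20)
try:
    store.insert(900, 100)
except OriginSpanningIntervalError as err:
    print("rejected:", err)
```

Rejecting the interval up front keeps the container's invariants intact until it gains real support for origin-spanning intervals.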

Use case

The new Mapper implementation (see PR #332) uses Intervals and IntervalTrees that do not handle origin-spanning intervals. It is not even possible to create such an interval, so several scripts that load origin-spanning features have been breaking. Although this fix is quite basic, features with start > end can at least be loaded and worked with in a limited way.

Benefits

Features with start > end can be handled in a limited way again.

Possible Drawbacks

This is a very quick and simple implementation, and focuses mostly on Intervals, not how they are handled by IntervalTrees. As a result, although origin-spanning intervals can now be created and subjected to basic queries, the IntervalTrees that handle more complex interval operations are still not able to handle these intervals. The short-term solution is to throw exceptions when an origin-spanning interval is an operand in an operation that the current IntervalTree implementations cannot handle. The IntervalTree implementations will need to be improved to handle more operations with origin-spanning Intervals soon.

Note that the C implementation of IntervalTreeMutable (called through XS) will need to be updated in a separate PR.

Testing

Have you added/modified unit tests to test the changes? Yes

If so, do the tests pass/fail? Pass

Have you run the entire test suite and no regression was detected? Yes
