
Bugfix/reactome acc

Marek Szuba requested to merge bugfix/reactome_acc into master

Created by: magaliruffier

Requirements

  • Filling out the template is required. Any pull request that does not include enough information to be reviewed in a timely manner may be closed at the maintainers' discretion;
  • Review the contributing guidelines for this repository; remember in particular:
    • do not modify code without testing for regression
    • provide simple unit tests to test the changes
    • if you change the schema you must patch the test databases as well, see Updating the schema
    • the PR must not fail unit testing

Description

The parser now compares each entry's description field against a regex. Entries whose description does not match are skipped.

Use case

Some Reactome descriptions contain characters that fall outside the ASCII range MySQL accepts. The regex causes these entries to be skipped. One example of a skipped description is 'Biosynthesis of electrophilic ω-3 PUFA oxo-derivatives', which contains the Greek letter omega (ω).
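As a rough illustration of the kind of check described above, here is a minimal Python sketch. It is not the code in this merge request: the pattern (printable ASCII only), the function name `is_loadable`, and the second sample description are assumptions made for the example.

```python
import re

# Hypothetical pattern: accept only printable ASCII characters.
# The regex actually used in this merge request is not reproduced here.
ASCII_DESCRIPTION = re.compile(r"[\x20-\x7E]*")


def is_loadable(description: str) -> bool:
    """Return True if the whole description consists of printable ASCII."""
    return ASCII_DESCRIPTION.fullmatch(description) is not None


descriptions = [
    "Biosynthesis of electrophilic ω-3 PUFA oxo-derivatives",  # contains omega
    "An ASCII-only description",  # illustrative placeholder
]

# Keep only the descriptions that pass the check; the rest are skipped.
kept = [d for d in descriptions if is_loadable(d)]
```

A whitelist of this kind is deliberately strict: anything outside the allowed range is skipped, including characters that might be harmless.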

Benefits

The pipeline no longer fails entirely just because a few characters in the Reactome source are incompatible with MySQL.

Possible Drawbacks

The regex is not comprehensive, so it may also skip descriptions we have not yet accounted for.

Testing

Have you added/modified unit tests to test the changes?

The xref pipeline does not have any tests, so the test suite has not been changed. However, the parser has been tested on the latest Reactome file: 1648 of the 1077438 lines in the file were skipped. These entries correspond to two distinct descriptions, 'Biosynthesis of electrophilic ω-3 PUFA oxo-derivatives' and 'Loss of proteins required for interphase microtubule organization from the centrosome'.
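A tally like the one reported above could be gathered along the following lines. This is only a sketch under stated assumptions: the input is taken to be tab-separated, the position of the description column is passed in by the caller, and `tally_skipped`, the pattern, and the sample lines are all hypothetical.

```python
import re
from collections import Counter

# Hypothetical pattern: printable ASCII only (not the regex from this merge request).
ASCII_ONLY = re.compile(r"[\x20-\x7E]*")


def tally_skipped(lines, description_column):
    """Count skipped entries per distinct description.

    Assumes tab-separated lines; description_column is the zero-based
    index of the description field (an assumption about the file layout).
    """
    skipped = Counter()
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        description = fields[description_column]
        if ASCII_ONLY.fullmatch(description) is None:
            skipped[description] += 1
    return skipped


# Illustrative sample lines, not real data from the Reactome file.
sample = [
    "ID1\tBiosynthesis of electrophilic ω-3 PUFA oxo-derivatives\n",
    "ID2\tBiosynthesis of electrophilic ω-3 PUFA oxo-derivatives\n",
    "ID3\tAn ASCII-only description\n",
]

skipped = tally_skipped(sample, description_column=1)
print(sum(skipped.values()), "entries skipped,", len(skipped), "distinct descriptions")
```

Grouping by description makes it easy to review each distinct skipped description once instead of scanning every skipped line.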
