ensembl-gh-mirror / ensembl / Merge requests / !316

Bugfix/delint Mim2Gene parser

Merged Marek Szuba requested to merge bugfix/delint_Mim2GeneParser into feature/xref_sprint Oct 22, 2018

Created by: mkszuba

Warning: as of 2018-10-30, this PR is expected to fail Travis builds because it depends on #323 and #324.

Description

Fixes bugs observed (so far) in Mim2GeneParser during the xref sprint, implements support for direct MIM xrefs, adds additional checks, and delints the code to facilitate further refactoring. See ENSCORESW-2891.

Use case

Part of the efforts to improve the xref pipeline. Moreover, one of the observed bugs actually prevents current versions of mim2gene data from being parsed at all.

Benefits

  • The parser can now handle recent versions of mim2gene input.
  • Direct xrefs are now produced wherever possible; dependent xrefs are used only for entries that lack an Ensembl ID but have an EntrezGene ID.
  • Dependent xrefs are now inserted into the database using BaseParser methods. Besides avoiding hand-rolled DBI code, this will, once pull request #314 has been approved, prevent Mim2GeneParser from inserting duplicate entries upon re-runs with the same input.
  • Some future-proofing.
  • The code is (hopefully) easier to maintain.
  • Most PerlCritic complaints at levels 3 and 2 have been taken care of.
  • A standardised CSV parser replaces the hand-rolled one, with potential for a performance increase if a compiled rather than pure-Perl version of the parser is used.
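The direct-versus-dependent rule described above can be sketched as follows. This is an illustration in Python, not the parser's actual Perl code. The function name `classify_rows`, the five-column tab-separated layout (MIM number, entry type, EntrezGene ID, gene symbol, Ensembl ID), and the `#` comment prefix are all assumptions made for the example, and the sample IDs used with it are placeholders.

```python
import csv
import io


def classify_rows(text):
    """Yield (mim_id, kind, target_id) for every row that can be linked.

    Assumed layout (hypothetical): tab-separated columns in the order
    MIM number, entry type, EntrezGene ID, gene symbol, Ensembl ID.
    """
    # Drop comment lines before handing the rest to the standard CSV reader.
    lines = (ln for ln in io.StringIO(text) if not ln.startswith("#"))
    for row in csv.reader(lines, delimiter="\t"):
        # Pad short rows so that unpacking five fields never fails.
        row = [field.strip() for field in row] + [""] * 5
        mim_id, _entry_type, entrez_id, _symbol, ensembl_id = row[:5]
        if not mim_id:
            continue  # blank line or row without a MIM number
        if ensembl_id:
            # An Ensembl ID is present: link to it directly.
            yield (mim_id, "direct", ensembl_id)
        elif entrez_id:
            # No Ensembl ID, but an EntrezGene ID: fall back to a dependent xref.
            yield (mim_id, "dependent", entrez_id)
        # Rows with neither ID cannot be linked and are skipped.
```

Using a library CSV reader rather than manual string splitting keeps quoting and line-ending handling out of the parser itself, which is the same motivation the description gives for dropping the hand-rolled parser.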

Possible Drawbacks

Output is less straightforward than it used to be because it now includes both direct (the vast majority) and dependent (around 10 percent as of today) xrefs.
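One way to keep an eye on the mix of xref types in the output is to tally them. The sketch below is a hypothetical Python helper (`xref_type_shares` is not part of the pipeline) that takes a sequence of type labels and returns each label's share of the total.

```python
from collections import Counter


def xref_type_shares(kinds):
    """Return the fraction of xrefs of each type, e.g. 'direct' or 'dependent'."""
    counts = Counter(kinds)
    total = sum(counts.values())
    if total == 0:
        return {}  # no xrefs at all, so there are no shares to report
    return {kind: count / total for kind, count in counts.items()}
```

Comparing these shares between runs would show whether the proportion of dependent xrefs changes as the input data is updated.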

Testing

Have you added/modified unit tests to test the changes? No.

If so, do the tests pass/fail? N/A

Have you run the entire test suite and no regression was detected? N/A. However, I have run the parser itself on both current DBASS data and some intentionally malformed input, and it appears to work correctly.

Source branch: bugfix/delint_Mim2GeneParser