ensembl-gh-mirror / ensembl / Merge requests / !501

Merged
Created Aug 03, 2020 by Marek Szuba (@mks)

Feature/xref standardisation

  • Overview 5
  • Commits 4
  • Changes 8

Created by: magaliruffier

Requirements

  • Filling out the template is required. Any pull request that does not include enough information to be reviewed in a timely manner may be closed at the maintainers' discretion;
  • Review the contributing guidelines for this repository; remember in particular:
    • do not modify code without testing for regression
    • provide simple unit tests to test the changes
    • if you change the schema you must patch the test databases as well, see Updating the schema
    • the PR must not fail unit testing

Description

The list of sources used for gene naming and gene description has been consolidated in a single place and is now used for most vertebrate species. This removes the need to maintain separate lists for different species, which risked drifting out of sync. Additionally, when multiple choices are available, the candidates are now re-ordered to increase the chances of making the same choice when the pipeline is re-run.
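The idea above can be sketched as follows. This is only an illustrative sketch, not the code in this merge request: the source names (`SourceA`, `SourceB`, `SourceC`), the `Xref` tuple and the `pick_xref` helper are all hypothetical, and the actual pipeline may order candidates differently. It shows one shared priority list driving the choice, with a stable tie-break so that the same candidate wins regardless of input order.

```python
from typing import NamedTuple, Optional, Sequence

# Hypothetical shared priority list (highest priority first).
# The real source names and their order are defined in the merge request itself.
SOURCE_PRIORITY = ["SourceA", "SourceB", "SourceC"]


class Xref(NamedTuple):
    source: str
    display_label: str
    description: str


def pick_xref(
    candidates: Sequence[Xref],
    priority: Sequence[str] = SOURCE_PRIORITY,
) -> Optional[Xref]:
    """Pick one xref to supply both the gene name and the description.

    Candidates are ranked by source priority first, then by label and
    description, so ties are broken the same way on every run.
    """
    rank = {name: i for i, name in enumerate(priority)}
    # Ignore candidates whose source is not in the shared list.
    eligible = [x for x in candidates if x.source in rank]
    if not eligible:
        return None
    return min(
        eligible,
        key=lambda x: (rank[x.source], x.display_label, x.description),
    )
```

Because the name and the description are both taken from the single chosen xref, they cannot come from different sources, and because the sort key is total over the eligible candidates, re-running with the same inputs in a different order yields the same result.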

Use case

Describe the problem. Please provide an example representing the motivation behind the need for having these changes in place.

We have genes where the name and the description differ because a different source was used for each. This confuses users. We also have cases where the same gene has a different name from one release to the next, because there are two candidates and one is chosen arbitrarily each time.

Benefits

If applicable, describe the advantages the changes will have.

More consistency in the xref assignment, fewer queries from users, and an easier explanation of the name assignment strategy.

Possible Drawbacks

If applicable, describe any possible undesirable consequence of the changes.

None that I can see.

Testing

Have you added/modified unit tests to test the changes? No test cases were added, but the xref pipeline has been run on a large number of species to check the results.

If so, do the tests pass/fail? The pipeline runs fine and produces consistent results on the species it was tested on.

Have you run the entire test suite and no regression was detected? N/A

Source branch: feature/xref_standardisation