ENSCORESW-2722: add missing gene sources

Marek Szuba requested to merge bugfix/mouse_alts into release/93

Created by: magaliruffier

Requirements

  • Filling out the template is required. Any pull request that does not include enough information to be reviewed in a timely manner may be closed at the maintainers' discretion;
  • Review the contributing guidelines for this repository; remember in particular:
    • do not modify code without testing for regression
    • provide simple unit tests to test the changes
    • if you change the schema you must patch the test databases as well, see Updating the schema
    • the PR must not fail unit testing

Description

One or more sentences describing in detail the proposed changes.

The proposed change updates the list of sources known to be attached to genes, so we are not missing data for some species.

Use case

Describe the problem. Please provide an example representing the motivation behind the need for having these changes in place.

In the xref pipeline, one step copies xrefs from genes on the primary assembly to the corresponding genes on alternative sequences. This step uses a hard-coded list of xref sources that are allowed at gene level. The list did not contain MGI, so gene names were not copied to alt alleles in mouse. The list has now been updated to include all known gene sources across all vertebrate species. The problem was spotted by a user who wondered why mouse alt alleles did not have the same gene symbol as their reference on the primary assembly, unlike in human, where the copy works correctly. An example can be seen here: http://e92.ensembl.org/Mus_musculus/Gene/Alleles?db=core;g=ENSMUSG00000116203;r=CHR_MG3686_PATCH:111254283-111404050
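
To illustrate the failure mode described above, here is a minimal Python sketch of filtering xrefs against an allow-list of gene-level sources. It is not the pipeline's actual code: the function name, the dictionary layout, and every source name other than MGI are assumptions made for illustration only.

```python
# Illustrative sketch only, not the xref pipeline's code.
# "HGNC" and "SomeTranscriptSource" are assumed names; only MGI is named in this PR.
OLD_GENE_SOURCES = {"HGNC"}                     # hypothetical list before the fix
NEW_GENE_SOURCES = OLD_GENE_SOURCES | {"MGI"}   # hypothetical list after the fix


def xrefs_to_copy(primary_gene_xrefs, allowed_sources):
    """Return the xrefs of a primary-assembly gene whose source is allowed at gene level."""
    return [x for x in primary_gene_xrefs if x["source"] in allowed_sources]


# Hypothetical xrefs attached to a mouse gene on the primary assembly.
mouse_gene_xrefs = [
    {"source": "MGI", "display_label": "ExampleGene"},
    {"source": "SomeTranscriptSource", "display_label": "ExampleTranscript"},
]

# With the old list nothing is copied, so the alt allele gets no gene name.
copied_before = xrefs_to_copy(mouse_gene_xrefs, OLD_GENE_SOURCES)
# With MGI in the list, the MGI xref is copied to the alt allele.
copied_after = xrefs_to_copy(mouse_gene_xrefs, NEW_GENE_SOURCES)
```

In this sketch a source missing from the allow-list is dropped silently, which is consistent with the missing mouse names going unnoticed until a user reported them.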

Benefits

If applicable, describe the advantages the changes will have.

Mouse genes on alternative sequences will have the same name as the corresponding gene on the primary assembly.

Possible Drawbacks

If applicable, describe any possible undesirable consequence of the changes.

Some gene names in the mouse annotation will change, namely for genes on alternative sequences.

Testing

Have you added/modified unit tests to test the changes?

There is unfortunately no test suite for the xref code, but the code compiles and was run on sample data.

If so, do the tests pass/fail?

The whole pipeline was run on the latest mouse database and fixed the example reported by the user.

Have you run the entire test suite and no regression was detected?

Yes, although the test suite does not test this particular area of the code.