ENSCORESW-3458: parse and store UniProt accession version

Marek Szuba requested to merge feature/uniprot_version into master

Created by: magaliruffier

Requirements

  • Filling out the template is required. Any pull request that does not include enough information to be reviewed in a timely manner may be closed at the maintainers' discretion;
  • Review the contributing guidelines for this repository; remember in particular:
    • do not modify code without testing for regression
    • provide simple unit tests to test the changes
    • if you change the schema you must patch the test databases as well, see Updating the schema
    • the PR must not fail unit testing

Description

Using one or more sentences, describe in detail the proposed changes.

The proposed change captures the version of the UniProt accession processed when parsing the input file and stores it in the intermediate xref database.
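For readers unfamiliar with the input format, here is a rough sketch of how version information can be read from a UniProtKB flat-file record. It is not the code in this merge request: the Ensembl parser is written in Perl, and the description does not say whether the sequence version or the entry version is stored, so this sketch extracts both from the `DT` lines. The function name `parse_entry`, the returned keys and the sample record are made up for the example.

```python
import re

# Matches DT lines such as "DT   01-JAN-2005, sequence version 2."
DT_VERSION = re.compile(
    r"^DT\s+\d{2}-[A-Z]{3}-\d{4}, (sequence|entry) version (\d+)\."
)


def parse_entry(record: str) -> dict:
    """Return the primary accession and the versions found in one record.

    Hypothetical helper: the real parser may keep only one of the versions.
    """
    accessions = []
    versions = {}
    for line in record.splitlines():
        if line.startswith("AC"):
            # AC lines hold a semicolon-separated list; the first is primary.
            accessions.extend(
                acc.strip() for acc in line[2:].split(";") if acc.strip()
            )
        else:
            match = DT_VERSION.match(line)
            if match:
                versions[match.group(1)] = int(match.group(2))
    return {
        "accession": accessions[0] if accessions else None,
        "sequence_version": versions.get("sequence"),
        "entry_version": versions.get("entry"),
    }


# Made-up record in the flat-file layout, not real UniProt data.
SAMPLE = """\
ID   EXAMPLE_HUMAN           Reviewed;         100 AA.
AC   P00000; Q00000;
DT   01-JAN-2000, integrated into UniProtKB/Swiss-Prot.
DT   01-JAN-2005, sequence version 2.
DT   01-JAN-2020, entry version 150.
//
"""

result = parse_entry(SAMPLE)
```

Whichever version is kept would then be written alongside the accession when the xref row is inserted into the intermediate database; how that is done is outside this sketch.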

Use case

Describe the problem. Please provide an example representing the motivation behind the need for having these changes in place.

In the Ensembl browser, xrefs are linked to the external resource they were extracted from. Because the release cycles differ, the data we processed for a release can be out of sync with the external data by the time the release is published. Providing the version of the data processed makes it easier to identify discrepancies that come from mismatched versions.

Benefits

If applicable, describe the advantages the changes will have.

  • Better reproducibility of the xref mapping and identification of mismatches
  • More accurate information provided on the browser

Possible Drawbacks

If applicable, describe any possible undesirable consequence of the changes.

Highlights discrepancies between data sources due to different release cycles.

Testing

Have you added/modified unit tests to test the changes?

No test cases exist for the UniProtParser. The modified code was used on a full run of the xref pipeline and the results were visualised in a browser sandbox.

If so, do the tests pass/fail?

On the sandbox, links to UniProt in the General Identifiers and External References (e.g. http://www.ensembl.org/Homo_sapiens/Gene/Matches?db=core;g=ENSG00000224383;r=17:63992802-64038237;t=ENST00000425164) contain a versioned UniProt accession.

Have you run the entire test suite and no regression was detected?

NA
