EntrezGene/HPA parsers update
Created by: avullo
Requirements
- Filling out the template is required. Any pull request that does not include enough information to be reviewed in a timely manner may be closed at the maintainers' discretion;
- Review the contributing guidelines for this repository; remember in particular:
  - do not modify code without testing for regression
  - provide simple unit tests to test the changes
  - if you change the schema you must patch the test databases as well, see Updating the schema
  - the PR must not fail unit testing
Description
Refactoring the EntrezGene and HPA parsers to provide:
- consistent error handling
- more compact, clearer code
- NULL fields where applicable, without touching BaseParser for now; e.g. the HPA parser forces a NULL description when adding xrefs
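To illustrate the NULL point, here is a minimal sketch in Python (not the parsers' own code). The `xref` table layout, the `null_if_empty` helper and the sample accessions are all hypothetical; the only thing it is meant to show is that an empty string and NULL are stored and queried differently.

```python
import sqlite3


def null_if_empty(value):
    """Hypothetical helper: map missing or blank strings to None (SQL NULL)."""
    if value is None or value.strip() == "":
        return None
    return value


# In-memory database with a made-up, simplified xref table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE xref (accession TEXT, description TEXT)")

# Made-up rows: the first one has no description.
rows = [("ACC1", ""), ("ACC2", "some description")]
conn.executemany(
    "INSERT INTO xref VALUES (?, ?)",
    [(acc, null_if_empty(desc)) for acc, desc in rows],
)

# Count rows stored as NULL and rows stored as an empty string.
null_count = conn.execute(
    "SELECT COUNT(*) FROM xref WHERE description IS NULL"
).fetchone()[0]
empty_count = conn.execute(
    "SELECT COUNT(*) FROM xref WHERE description = ''"
).fetchone()[0]
```

With the helper applied before insertion, queries using `IS NULL` find the rows without a description, and no empty-string values are left behind.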
Use case
Xref pipeline for species with EntrezGene/HPA sources
Benefits
Code quality improvement
Possible Drawbacks
Does not yet fully comply with the guidelines. BaseParser and the schema still need to change so that info_text is forced to NULL instead of an empty string.
Testing
No unit tests at the moment, I'm afraid. I ran the xref_parser script with the current version and with the proposed update and found no difference, except that the description attribute is now NULL for xrefs from HPA (formerly '').
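The manual comparison described above could be automated along these lines. This is only a sketch under assumptions: it supposes the xrefs from each run have been dumped into a mapping of accession to field values, and the function names, field names and sample accession are hypothetical, not taken from the xref_parser script.

```python
def diff_xrefs(before, after):
    """Return (accession, field, old, new) for every difference between runs."""
    diffs = []
    for acc in sorted(set(before) | set(after)):
        # A row present in only one run is reported as a whole-row difference.
        if acc not in before or acc not in after:
            diffs.append((acc, None, before.get(acc), after.get(acc)))
            continue
        old_row, new_row = before[acc], after[acc]
        for field in sorted(set(old_row) | set(new_row)):
            old, new = old_row.get(field), new_row.get(field)
            if old != new:
                diffs.append((acc, field, old, new))
    return diffs


def is_expected(diff):
    """The only accepted change: description going from '' to NULL (None)."""
    _, field, old, new = diff
    return field == "description" and old == "" and new is None


# Made-up sample data standing in for the two runs.
before = {"HPA-1": {"label": "A", "description": ""}}
after = {"HPA-1": {"label": "A", "description": None}}

diffs = diff_xrefs(before, after)
unexpected = [d for d in diffs if not is_expected(d)]
```

An empty `unexpected` list would correspond to the outcome reported here: the two runs differ only in the description of HPA xrefs.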