sequence/proteome/:species GET endpoint added
Created by: vsitnik
Adds a sequence/proteome/:species GET endpoint that allows bulk (whole-proteome) download.
The core.meta.proteome_download_allowed key is used as the guard for vb_proteome_download.
Requirements
- Filling out the template is required. Any pull request that does not include enough information to be reviewed in a timely manner may be closed at the maintainers' discretion;
- Review the contributing guidelines for this repository; remember in particular:
- do not modify code without testing for regression
- provide simple unit tests to test the changes
- the PR must not fail unit testing
- if you're adding/updating documentation of an endpoint, make sure you add/update the necessary parameters to the (template) configuration files in the ensembl-rest_private repo
Description
The new endpoint downloads all protein sequences for the specified species. It is enabled only for species whose core database has meta.proteome_download_allowed set to 'true'. For all other species the request is forbidden.
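A minimal client-side sketch of how the endpoint might be called. This PR does not state the server base URL, the response content type, or the species name format, so the following are assumptions: `BASE_URL` is a placeholder, the response is assumed to be FASTA text requested with `Accept: text/x-fasta`, and the species is passed as a name such as `anopheles_atroparvus`. The helper names are hypothetical.

```python
from urllib.parse import quote
from urllib.request import Request, urlopen

# Assumption: placeholder host; replace with the REST server exposing the endpoint.
BASE_URL = "https://rest.example.org"


def proteome_url(species, base_url=BASE_URL):
    """Build the sequence/proteome/:species URL for one species."""
    return f"{base_url.rstrip('/')}/sequence/proteome/{quote(species, safe='')}"


def parse_fasta(text):
    """Parse FASTA text into {first word of header: sequence}."""
    records = {}
    header = None
    chunks = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                records[header] = "".join(chunks)
            parts = line[1:].split()
            header = parts[0] if parts else ""
            chunks = []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records[header] = "".join(chunks)
    return records


def fetch_proteome(species, base_url=BASE_URL, timeout=600):
    """Download and parse a whole proteome.

    Assumption: the server answers with FASTA text. For a species without
    meta.proteome_download_allowed the server is expected to return an
    HTTP error, which urlopen raises as urllib.error.HTTPError.
    """
    request = Request(
        proteome_url(species, base_url),
        headers={"Accept": "text/x-fasta"},  # assumed content type
    )
    with urlopen(request, timeout=timeout) as response:
        return parse_fasta(response.read().decode("utf-8"))
```

The generous default timeout reflects the multi-minute download time reported under Benefits. Callers would wrap `fetch_proteome` in a `try`/`except HTTPError` block to handle species for which the download is not allowed.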
Use case
Benefits
Will be used by UniProt to download protein FASTA files from vectorbase.org. For example, all 'canonical' protein sequences for Anopheles atroparvus can be downloaded in 2 minutes 36 seconds, compared with approximately 3 hours using the current approach.
Possible Drawbacks
Still slow, and probably not appropriate for large genomes. Setting meta.proteome_download_allowed should therefore be done with caution.
Testing
Have you added/modified unit tests to test the changes?
If so, do the tests pass/fail?
Have you run the entire test suite and no regression was detected?
Changelog
sequence/proteome/:species allows whole-proteome downloads, but only for a subset of species.