only_protein_coding filter added to SliceAdaptor::fetch_all

Marek Szuba requested to merge github/fork/vsitnik/vb_proteome_download into master

Created by: vsitnik

Requirements

  • Filling out the template is required. Any pull request that does not include enough information to be reviewed in a timely manner may be closed at the maintainers' discretion;
  • Review the contributing guidelines for this repository; remember in particular:
    • do not modify code without testing for regression
    • provide simple unit tests to test the changes
    • if you change the schema you must patch the test databases as well, see Updating the schema
    • the PR must not fail unit testing

Description

Optimization for SliceAdaptor::fetch_all: an optional filter returns only those regions whose 'coding_cnt' attribute is greater than 0. It is used to speed up whole-proteome downloads for the vb_proteome_download project.
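The selection rule described above can be illustrated with a minimal, self-contained sketch. It does not reproduce the actual SliceAdaptor code: the in-memory `@regions` records and the helper `filter_protein_coding` are hypothetical stand-ins, and the sketch assumes that a region without a 'coding_cnt' value is treated the same as one with a count of 0.

```perl
use strict;
use warnings;

# Hypothetical in-memory stand-in for seq regions and their attributes;
# the real adaptor reads these from the core database.
my @regions = (
    { name => 'contig_1', attribs => { coding_cnt => 12 } },
    { name => 'contig_2', attribs => { coding_cnt => 0 } },
    { name => 'contig_3', attribs => {} },    # attribute never set
);

# Keep only regions whose coding_cnt attribute is set and greater than 0.
# Assumption: a missing attribute counts as 0.
sub filter_protein_coding {
    my ($regions) = @_;
    return [ grep { ( $_->{attribs}{coding_cnt} // 0 ) > 0 } @{$regions} ];
}

my $coding = filter_protein_coding( \@regions );
print join( ', ', map { $_->{name} } @{$coding} ), "\n";
```

With this sample data only contig_1 passes the filter, which is the effect the endpoint relies on: regions without protein-coding genes are never fetched.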

Use case

Used to speed up whole-proteome downloads for the vb_proteome_download project, for example in the sequence/proteome/:species REST API endpoint:

    my ($toplevel, $ver, $non_ref, $dup, $lrg, $only_protein_coding) = ('toplevel', undef, undef, undef, undef, 1);
    my $slices = $adaptor->fetch_all($toplevel, $ver, $non_ref, $dup, $lrg, $only_protein_coding);
    Catalyst::Exception->throw("No slice found for the toplevel coord system") unless $slices;

Benefits

Speeds up whole-proteome downloads for genomes with a large number of contigs that contain no genes, or no protein-coding genes (e.g. partially assembled genomes).

Possible Drawbacks

The 'coding_cnt' attribute must be set in 'seq_region_attrib' for the filter to work.
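One way to guard against this drawback is to check, before relying on the filter, which regions carry no 'coding_cnt' value at all. The sketch below is only an illustration under stated assumptions: the region records and the helper `regions_missing_coding_cnt` are hypothetical and work on in-memory data rather than on the database.

```perl
use strict;
use warnings;

# Hypothetical region records; attribs mimics the attributes stored
# per seq region.
my @regions = (
    { name => 'scaffold_1', attribs => { coding_cnt => 3 } },
    { name => 'scaffold_2', attribs => {} },    # coding_cnt never populated
);

# Return the names of regions that carry no coding_cnt attribute at all.
sub regions_missing_coding_cnt {
    my ($regions) = @_;
    return [
        map  { $_->{name} }
        grep { !exists $_->{attribs}{coding_cnt} } @{$regions}
    ];
}

my $missing = regions_missing_coding_cnt( \@regions );
warn "coding_cnt not set for: @{$missing}\n" if @{$missing};
```

If such a check reports missing values for a species, the filtered call may skip regions that actually contain protein-coding genes, so the attribute should be populated first.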

Testing

Tests were added to t/sliceAdaptor.t. All tests pass with no regressions: 'All tests successful. Result: PASS'
