Created by: tgrego
- Filling out the template is required. Any pull request that does not include enough information to be reviewed in a timely manner may be closed at the maintainers' discretion;
- Review the contributing guidelines for this repository; remember in particular:
- do not modify code without testing for regression
- provide simple unit tests to test the changes
- if you change the schema you must patch the test databases as well, see Updating the schema
- the PR must not fail unit testing
Ticket from production:
Hello Core, a few releases back we found that queries like the following take an extremely long time to run on some Collection databases:
select count(distinct(protein_feature_id)) from interpro join protein_feature on (id=hit_name)
Dan Staines found that changing the id column of the interpro table to the binary collation latin1_bin fixed the issue:
alter table interpro change id id varchar(40) CHARACTER SET latin1 COLLATE latin1_bin NOT NULL;
Because this affected the dumps in one of the releases, we manually patched the EG databases and requested a schema patch, but it was never created.
So could you please create a schema patch for this?
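For reviewers who want to reproduce the effect, here is a minimal sketch of how the change could be checked on an affected database. It assumes a MySQL server and that the slowdown comes from the two join columns having different collations. The ticket does not state the cause, so treat that as an assumption rather than a confirmed diagnosis. Only the ALTER statement comes from the ticket. The SHOW and EXPLAIN statements are added here for illustration.

```sql
-- Assumption: the two join columns differ in collation; inspect both.
SHOW FULL COLUMNS FROM interpro LIKE 'id';
SHOW FULL COLUMNS FROM protein_feature LIKE 'hit_name';

-- Inspect the join plan before the change.
EXPLAIN
SELECT COUNT(DISTINCT protein_feature_id)
FROM interpro
JOIN protein_feature ON (id = hit_name);

-- Change quoted in the ticket: give interpro.id a binary latin1 collation.
ALTER TABLE interpro
  CHANGE id id VARCHAR(40) CHARACTER SET latin1 COLLATE latin1_bin NOT NULL;

-- Inspect the plan again and compare it with the earlier output.
EXPLAIN
SELECT COUNT(DISTINCT protein_feature_id)
FROM interpro
JOIN protein_feature ON (id = hit_name);
```

Comparing the two EXPLAIN outputs shows whether the join plan actually changed on a given database.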
Describe the problem. Please provide an example representing the motivation behind the need for having these changes in place.
If applicable, describe the advantages the changes will have.
If applicable, describe any possible undesirable consequence of the changes.
Have you added/modified unit tests to test the changes?
If so, do the tests pass/fail?
Pass, of course.
Have you run the entire test suite and no regression was detected?