
ensembl Utils::IO into ensembl-io

Marek Szuba requested to merge utils_io into master

Created by: tgrego

Bio::EnsEMBL::Utils::IO::* currently lives in the core repo. This code predates ensembl-io, but it is clearly IO-related and should probably live there. This pull request moves that code into the ensembl-io repo, keeping the same namespace. Code that uses it will therefore need no changes, except that ensembl-io now becomes a dependency. The code was developed with the intention of being used internally by the production team, so it should be possible to let them know before release/97 is branched.

Four filetypes were identified as being dealt with by both ensembl-io and utils-io: GTF, GFF, FASTA and BED.

Extra tests were added (t/utils_io/harmony.t) to check that the files produced by utils/io and ensembl-io are similar where they should be. This is complete for the GTF and GFF formats, although only gene objects are tested at the moment; the test cases are easy to extend. BED, however, seems to be written differently by the two systems, so there is only a stub for the test and more investigation is required. FASTA seems to be feature-incomplete in ensembl-io (parser only), so there is again only a stub for the test and more investigation is required (does a FASTA writer need to be implemented?).
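To illustrate what "similar where they should be" can mean for GFF output, here is a minimal sketch in Python. It is not the code in t/utils_io/harmony.t, which is a Perl test. It assumes two writers may emit the same GFF3 feature with the attributes in a different order, and treats such lines as equivalent. The function names and the sample gene ID are hypothetical.

```python
def parse_gff3_line(line):
    """Split one GFF3 feature line into its 8 fixed columns and an attribute dict."""
    columns = line.rstrip("\n").split("\t")
    if len(columns) != 9:
        raise ValueError(f"expected 9 tab-separated columns, got {len(columns)}")
    attributes = {}
    # Column 9 holds key=value pairs separated by semicolons.
    for pair in columns[8].split(";"):
        pair = pair.strip()
        if not pair:
            continue
        key, _, value = pair.partition("=")
        attributes[key] = value
    return tuple(columns[:8]), attributes


def gff3_lines_equivalent(line_a, line_b):
    """True if both lines describe the same feature, ignoring attribute order."""
    return parse_gff3_line(line_a) == parse_gff3_line(line_b)


# Made-up sample lines standing in for the output of the two writers.
from_utils_io = "1\tensembl\tgene\t1000\t2000\t.\t+\t.\tID=gene:G1;biotype=protein_coding\n"
from_ensembl_io = "1\tensembl\tgene\t1000\t2000\t.\t+\t.\tbiotype=protein_coding;ID=gene:G1\n"
same_feature = gff3_lines_equivalent(from_utils_io, from_ensembl_io)
```

Comparing parsed fields rather than raw strings keeps such a check from failing on differences that do not change the meaning of the record, while still catching a changed coordinate or a missing attribute.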

Utils/IO in the core repo can then possibly be deleted once this is confirmed to work. I think the four-release deprecation notice does not apply here, as this is not really a deprecation but a move: the namespace is the same and the code is the same, so all should be fine once the dependencies are updated.

In coordination with this, https://github.com/Ensembl/ensembl/pull/368 has been submitted.
