[ENSCORESW-634]. Memory-efficient version for writing EMBL format files.
The slice sequence is now written in chunks (as specified by the chunk_factor parameter given to the constructor) in method write_embl_seq, which now returns the base counts. Method dump_embl has been modified accordingly.

A template for the sequence header is written first, then the sequence. We then seek backwards to the position of the sequence header and overwrite it with the actual base counts, which were computed while the sequence was being written.

Pros: memory efficiency.
Cons: requires a seekable filehandle, so it cannot dump to a file for which a compressed fh has been obtained, i.e. dump_embl cannot be called in a callback provided to Bio::EnsEMBL::Utils::IO::gz_work_with_file.
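The placeholder-then-seek-back approach described above can be sketched roughly as follows. This is an illustration only, not the code in this commit: the function name write_seq_with_counts, the chunk iterator, and the fixed-width header format are all assumptions made for the example. The key point it tries to show is that the placeholder and the final header must have the same byte length, so that overwriting the header does not clobber the sequence that follows.

```perl
use strict;
use warnings;
use Fcntl qw(SEEK_SET);
use File::Temp qw(tempfile);

# Hypothetical helper (not the Ensembl implementation): writes a placeholder
# SQ header, streams the sequence chunk by chunk while counting bases, then
# seeks back and overwrites the placeholder with the real counts.
# $next_chunk is a code ref returning the next chunk of sequence, or undef.
sub write_seq_with_counts {
    my ($fh, $next_chunk) = @_;

    # Fixed-width fields (an assumption of this sketch) keep the header the
    # same length before and after the counts are known.
    my $header_fmt =
        "SQ   Sequence %10d BP; %10d A; %10d C; %10d G; %10d T; %10d other;\n";

    my $header_pos = tell($fh);
    printf {$fh} $header_fmt, (0) x 6;    # placeholder header

    my %count = (a => 0, c => 0, g => 0, t => 0, total => 0);
    while (defined(my $chunk = $next_chunk->())) {
        # tr/// with an empty replacement list counts without modifying
        $count{a} += ($chunk =~ tr/Aa//);
        $count{c} += ($chunk =~ tr/Cc//);
        $count{g} += ($chunk =~ tr/Gg//);
        $count{t} += ($chunk =~ tr/Tt//);
        $count{total} += length $chunk;
        print {$fh} $chunk, "\n";
    }
    $count{other} = $count{total} - $count{a} - $count{c} - $count{g} - $count{t};
    my $end_pos = tell($fh);

    # This seek is the step that fails on non-seekable (e.g. compressed or
    # piped) handles.
    seek($fh, $header_pos, SEEK_SET)
        or die "Cannot seek back to the sequence header: $!";
    printf {$fh} $header_fmt, @count{qw(total a c g t other)};

    # Return to the end so that later writes append after the sequence.
    seek($fh, $end_pos, SEEK_SET)
        or die "Cannot seek to the end of the sequence: $!";

    return \%count;
}

# Usage: stream a toy sequence in two chunks to a temporary file.
my ($demo_fh) = tempfile(UNLINK => 1);
my @demo_chunks = ('ACGT', 'AANN');
my $demo_counts = write_seq_with_counts($demo_fh, sub { shift @demo_chunks });
seek($demo_fh, 0, SEEK_SET);
print scalar <$demo_fh>;    # the rewritten SQ header line
```

Only one chunk is held in memory at a time, which is where the memory saving comes from. The cost is the backwards seek, which is why a handle that cannot seek is not usable with this approach.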