padded_alignment() produces invalid output
Created by: AlexanderDilthey
Hi,
The padded_alignment() function sometimes produces invalid output, in that the strings that specify the pairwise alignment are not of equal length.
Interestingly, this only happens (on the same alignment) when using CRAM format -- BAM and SAM are fine.
I also get these warnings:
substr outside of string at /home/diltheyat/perl5/lib/perl5/x86_64-linux-thread-multi/Bio/DB/HTS/AlignWrapper.pm line 307. Use of uninitialized value in concatenation (.) or string at /home/diltheyat/perl5/lib/perl5/x86_64-linux-thread-multi/Bio/DB/HTS/AlignWrapper.pm line 307.
Files to reproduce this error:
Reference genome GRCh38_full_plus_hs38d1_analysis_set_minus_alts.fa: https://gembox.cbcb.umd.edu/shared/graph_hackathon/reference/GRCh38_full_plus_hs38d1_analysis_set_minus_alts.fa
CRAM/SAM with one test alignment: https://www.dropbox.com/s/2dbb413ulfrhpma/testCase.zip?dl=0
Code:
use strict; use warnings; use Data::Dumper; use Bio::DB::HTS;
my $referenceFasta = 'GRCh38_full_plus_hs38d1_analysis_set_minus_alts.fa'; my $sam = Bio::DB::HTS->new(-fasta => $referenceFasta, -bam => 'testCase.cram');
my $alignment_iterator = $sam->features(-iterator => 1);
while(my $alignment = $alignment_iterator->next_seq)
{
my (ref,
matches,$query) = $alignment->padded_alignment;
unless(length(ref) == length(
query))
{
warn Dumper("Alignment length mismatch", length(ref), length(
query), $alignment->query->name);
next;
}
}