Skip to content

padded_alignment() produces invalid output

Created by: AlexanderDilthey

Hi,

The padded_alignment() function sometimes produces invalid output, in that the strings that specify the pairwise alignment are not of equal length.

Interestingly, this only happens (on the same alignment) when using CRAM format -- BAM and SAM are fine.

I also get these warnings:

substr outside of string at /home/diltheyat/perl5/lib/perl5/x86_64-linux-thread-multi/Bio/DB/HTS/AlignWrapper.pm line 307. Use of uninitialized value in concatenation (.) or string at /home/diltheyat/perl5/lib/perl5/x86_64-linux-thread-multi/Bio/DB/HTS/AlignWrapper.pm line 307.

Files to reproduce this error:

Reference genome GRCh38_full_plus_hs38d1_analysis_set_minus_alts.fa: https://gembox.cbcb.umd.edu/shared/graph_hackathon/reference/GRCh38_full_plus_hs38d1_analysis_set_minus_alts.fa

CRAM/SAM with one test alignment: https://www.dropbox.com/s/2dbb413ulfrhpma/testCase.zip?dl=0

Code:

use strict; use warnings; use Data::Dumper; use Bio::DB::HTS;

my $referenceFasta = 'GRCh38_full_plus_hs38d1_analysis_set_minus_alts.fa'; my $sam = Bio::DB::HTS->new(-fasta => $referenceFasta, -bam => 'testCase.cram');

my $alignment_iterator = $sam->features(-iterator => 1); while(my $alignment = $alignment_iterator->next_seq) { my (ref,matches,$query) = $alignment->padded_alignment; unless(length(ref) == length(query)) { warn Dumper("Alignment length mismatch", length(ref), length(query), $alignment->query->name); next; } }