With the introduction of lima v2.5.0, it is possible to undo all demultiplexing steps for HiFi data. For this, the bioconda package contains a new
lima movie.hifi_reads.bam demux.consensusreadset.xml --hifi-preset SYMMETRIC --store-unbarcoded lima-undo demux.consensusreadset.xml undo.bam
Let’s unroll what’s happening. In the first line, we explicitly request to store the unbarcoded reads. Without this, we would not be able to recover unbarcoded reads. The
XML contains all the file paths to the
BAM files. The second call is to the new lima-undo binary that takes a
BAM file as input and ouput.
Optionally, you can also provide multiple input
BAM files with one output
lima-undo demux.bam demux.unbarcoded.bam undo.bam
This works also with split BAM files:
lima-undo demux.bc1001-bc1001.bam demux.bc1002-bc1002.bam demux.unbarcoded.bam undo.bam
lima v2.5.0 and later stores everything that got clipped in an internal binary structure in the
ls tag. Multiple demultiplexing rounds are supported. Once lima-undo gets called, for each read the individual demultiplexing steps get reverted until the read is identical to the original HiFi read.
How to check that the result is identical:
samtools sort --no-PG -t "zm" undo.bam -o sorted.undo.bam samtools view --no-PG sorted.undo.bam > undo.sam samtools view --no-PG movie.hifi_reads.bam > original.sam diff original.sam undo.sam