How to run

Notes:

Any existing output files will be overwritten.
Always use --peek-guess to remove spurious barcode hits.

Run on CLR subread data:

$ lima <movie>.subreads.bam <barcodes>.fasta <demux>.bam
$ lima <movie>.subreadset.xml <barcodes>.barcodeset.xml <demux>.subreadset.xml

Run on CCS / HiFi data:

$ lima <movie>.ccs.bam <barcodes>.fasta <demux>.bam
$ lima <movie>.consensusreadset.xml <barcodes>.barcodeset.xml <demux>.consensusreadset.xml

Symmetric or Tailed options

CLR: --same
CCS: --hifi-preset SYMMETRIC

Asymmetric options

CLR: --different
CCS: --hifi-preset ASYMMETRIC

Example execution

$ lima m54317_180718_075644.subreadset.xml Sequel_RSII_384_barcodes_v1.barcodeset.xml \
       m54317_180718_075644.demux.subreadset.xml --different --peek-guess

Workflow

Lima processes input reads grouped by ZMW, unless --per-read is chosen. All barcode regions along the read are processed individually. The final per-ZMW result is a summary over all barcode regions: a pair of barcodes selected from the provided set of candidate barcodes. Subreads from the same ZMW will have the same barcode and barcode quality. For a particular target barcode region, every barcode sequence is aligned as given and as its reverse complement, and the higher-scoring orientation is chosen; the result is a list of scores over all candidate barcodes.
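The per-region scoring step can be illustrated with a minimal Python sketch. It is not lima's implementation: the Smith-Waterman scoring parameters, the function names, and the example barcodes and region are all assumptions made for illustration. It only shows the idea of scoring each candidate barcode in both orientations and keeping the higher score.

```python
# Illustrative sketch only; this is NOT lima's aligner or its scoring scheme.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def local_alignment_score(query, target, match=2, mismatch=-1, gap=-1):
    """Best Smith-Waterman local alignment score (hypothetical parameters)."""
    prev = [0] * (len(target) + 1)
    best = 0
    for q in query:
        curr = [0]
        for j, t in enumerate(target, start=1):
            diag = prev[j - 1] + (match if q == t else mismatch)
            score = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(score)
            best = max(best, score)
        prev = curr
    return best


def score_barcodes(region, barcodes):
    """Score every candidate in both orientations; keep the higher score."""
    scores = {}
    for name, seq in barcodes.items():
        forward = local_alignment_score(seq, region)
        reverse = local_alignment_score(reverse_complement(seq), region)
        scores[name] = max(forward, reverse)
    return scores


# Hypothetical barcodes and barcode region, for illustration only.
barcodes = {"bc1": "ACGTACGTAC", "bc2": "TTGGCCAATT"}
region = "GG" + reverse_complement(barcodes["bc1"]) + "AA"

scores = score_barcodes(region, barcodes)   # one score per candidate barcode
best = max(scores, key=scores.get)          # name of the top-scoring candidate
```

In this toy example the region contains bc1 only in reverse-complement orientation, so bc1 receives its best score from the reverse alignment. How lima combines such per-region scores into the final per-ZMW barcode pair is not covered by this sketch.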

If only same barcode pairs are of interest (symmetric/tailed), use --same to filter out different barcode pairs.

If only different barcode pairs are of interest (asymmetric), use --different to require at least two barcodes to be read and to remove pairs with the same barcode.
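The effect of the two filters on a per-ZMW barcode call can be sketched as follows. This is a hedged illustration of the rules described above, not lima's code: the BarcodeCall record, its field names, and the passes_filter function are hypothetical.

```python
from typing import NamedTuple


class BarcodeCall(NamedTuple):
    """Hypothetical per-ZMW result; field names are assumptions."""
    leading: str        # name of the selected leading barcode
    trailing: str       # name of the selected trailing barcode
    regions_read: int   # number of barcode regions observed for the ZMW


def passes_filter(call, mode):
    """Mimic the described behavior of --same and --different."""
    if mode == "same":
        # Keep only pairs whose two barcodes are identical.
        return call.leading == call.trailing
    if mode == "different":
        # Require at least two barcodes to be read and drop identical pairs.
        return call.regions_read >= 2 and call.leading != call.trailing
    raise ValueError("mode must be 'same' or 'different'")


# Hypothetical calls, for illustration only.
symmetric = BarcodeCall("bc1", "bc1", 2)
asymmetric = BarcodeCall("bc1", "bc2", 2)

kept_same = [c for c in (symmetric, asymmetric) if passes_filter(c, "same")]
kept_diff = [c for c in (symmetric, asymmetric) if passes_filter(c, "different")]
```

Here kept_same retains only the symmetric call and kept_diff retains only the asymmetric call.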