How can I demultiplex IsoSeq data?
Even if you only want to remove IsoSeq primers, lima is the tool of choice.
- Remove all duplicate sequences.
- Annotate sequence names with a 5p or 3p suffix. Example:
    >primer_5p
    AAGCAGTGGTATCAACGCAGAGTACATGGGG
    >sample_brain_3p
    AAGCAGTGGTATCAACGCAGAGTACCACATATCAGAGTGCG
    >sample_liver_3p
    AAGCAGTGGTATCAACGCAGAGTACACACACAGACTGTGAG
- Use the --isoseq mode. Run it in combination with --peek-guess to remove spurious false positives.
- The output will contain only pairs that combine a 5p and a 3p primer:
    demux.primer_5p--sample_brain_3p.bam
    demux.primer_5p--sample_liver_3p.bam
These options are deliberately conservative: they remove spurious and ambiguous calls so that only proper asymmetric (barcoded) primer pairs are used in downstream analyses. In good libraries, more than 75% of CCS reads pass the lima filters.
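The preparation steps above can be sketched in a few lines of Python. This is only an illustration, not part of lima: the helper names (parse_fasta, prepare_primers, lima_isoseq_command) and the file names movie.ccs.bam, primers.fasta and demux.bam are placeholders chosen for this example. The sketch drops records with a duplicate sequence, checks that every name ends in _5p or _3p, and assembles a lima call with the two options named above, assuming the usual input, barcodes, output argument order.

```python
import re

# Primer names are expected to end in _5p or _3p.
SUFFIX = re.compile(r"_(5p|3p)$")

EXAMPLE = """>primer_5p
AAGCAGTGGTATCAACGCAGAGTACATGGGG
>sample_brain_3p
AAGCAGTGGTATCAACGCAGAGTACCACATATCAGAGTGCG
>sample_liver_3p
AAGCAGTGGTATCAACGCAGAGTACACACACAGACTGTGAG
"""


def parse_fasta(text):
    """Yield (name, sequence) pairs from FASTA-formatted text."""
    name, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            header = line[1:].split()
            name, chunks = (header[0] if header else ""), []
        elif name is not None:
            chunks.append(line.upper())
    if name is not None:
        yield name, "".join(chunks)


def prepare_primers(text):
    """Drop duplicate sequences and require a _5p/_3p suffix on every name."""
    seen, kept = set(), []
    for name, seq in parse_fasta(text):
        if not SUFFIX.search(name):
            raise ValueError(f"primer name lacks a _5p/_3p suffix: {name!r}")
        if seq in seen:
            continue  # keep only the first record for each sequence
        seen.add(seq)
        kept.append((name, seq))
    return kept


def to_fasta(records):
    """Serialize (name, sequence) pairs back to FASTA text."""
    return "".join(f">{name}\n{seq}\n" for name, seq in records)


def lima_isoseq_command(ccs_bam, primers_fasta, output_bam):
    """Build the lima argument list (placeholder paths, not executed here)."""
    return ["lima", ccs_bam, primers_fasta, output_bam, "--isoseq", "--peek-guess"]


if __name__ == "__main__":
    print(to_fasta(prepare_primers(EXAMPLE)), end="")
    print(" ".join(lima_isoseq_command("movie.ccs.bam", "primers.fasta", "demux.bam")))
```

The command is built as a list so that it can be passed unchanged to subprocess.run once the cleaned FASTA has been written to disk.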
Demultiplexing cDNA barcoded adapters after SMRTbell adapter-level demultiplexing
Iso-Seq supports pooled cDNA barcoded analysis. If using barcoded cDNA primers after adapter-level demultiplexing, add --overwrite-biosample-names to replace the bio sample names assigned during the first round of demultiplexing.
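As a rough sketch, a second demultiplexing round could be assembled as follows. The function name and all file names are hypothetical. Combining the flag with --isoseq and --peek-guess is an assumption carried over from the primer-removal steps above, as is the input, barcodes, output argument order.

```python
import shlex


def lima_second_round_command(input_bam, primers_fasta, output_bam):
    """Build the argument list for a cDNA-primer pass on adapter-demultiplexed reads."""
    return [
        "lima", input_bam, primers_fasta, output_bam,
        "--isoseq", "--peek-guess",      # assumed: same mode as the primer-removal steps
        "--overwrite-biosample-names",   # replace bio sample names from the first round
    ]


# Placeholder paths; the input is one BAM produced by the adapter-level round.
cmd = lima_second_round_command("adapter_demux.bam", "primers.fasta", "demux.bam")
print(shlex.join(cmd))
```

Printing the command before running it makes it easy to confirm that the input is the output of the first round rather than the original CCS file.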