DESCRIPTION
usage: pbalign [-h] [--verbose] [--version] [--profile] [--debug] [--regionTable REGIONTABLE] [--configFile CONFIGFILE] [--pulseFile PULSEFILE] [--algorithm {blasr,bowtie,gmap}] [--maxHits MAXHITS] [--minAnchorSize MINANCHORSIZE] [--useccs {useccs,useccsall,useccsdenovo}] [--noSplitSubreads] [--concordant] [--nproc NPROC] [--algorithmOptions ALGORITHMOPTIONS] [--maxDivergence MAXDIVERGENCE] [--minAccuracy MINACCURACY] [--minLength MINLENGTH] [--scoreFunction {alignerscore,editdist,blasrscore}] [--scoreCutoff SCORECUTOFF] [--hitPolicy {randombest,allbest,random,all,leftmost}] [--filterAdapterOnly] [--forQuiver] [--loadQVs] [--byread] [--metrics METRICS] [--seed SEED] [--tmpDir TMPDIR] inputFileName referencePath outputFileName
Maps PacBio sequences to references using one of several supported command-line alignment algorithms. Input can be a fasta, pls.h5, bas.h5 or ccs.h5 file or a fofn (file of file names). Output is in either cmp.h5 or sam format.
positional arguments:
- inputFileName
- The input file can be a fasta, plx.h5, bax.h5, ccs.h5 file or a fofn.
- referencePath
- Either a reference fasta file or a reference repository.
- outputFileName
- The output cmp.h5 or sam file.
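As an illustration of how the three positional arguments fit together, the following is a minimal Python sketch that assembles a pbalign command line without running it. The helper name `build_pbalign_command` and the file names `reads.fofn`, `reference.fasta` and `aligned.cmp.h5` are hypothetical placeholders; `--algorithm` and `--nproc` are taken from the usage line, with the defaults this page documents.

```python
import shlex


def build_pbalign_command(input_file, reference, output_file,
                          algorithm="blasr", nproc=8):
    """Assemble a pbalign argument list (hypothetical helper).

    Nothing is executed here; the list could be handed to
    subprocess.run() on a machine where pbalign is installed.
    """
    return [
        "pbalign",
        "--algorithm", algorithm,   # one of blasr, bowtie, gmap
        "--nproc", str(nproc),      # number of threads
        input_file,                 # inputFileName
        reference,                  # referencePath
        output_file,                # outputFileName
    ]


# Placeholder file names, for illustration only.
cmd = build_pbalign_command("reads.fofn", "reference.fasta", "aligned.cmp.h5")
print(shlex.join(cmd))
# prints: pbalign --algorithm blasr --nproc 8 reads.fofn reference.fasta aligned.cmp.h5
```

Keeping the command as a list rather than a single string avoids shell-quoting problems when paths contain spaces.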
optional arguments:
- -h, --help
- show this help message and exit
- --verbose, -v
- Set the verbosity level
- --version
- show program's version number and exit
- --profile
- Print runtime profile at exit
- --debug
- Catch exceptions in debugger (requires ipdb)
- --regionTable REGIONTABLE
- Specify a region table for filtering reads.
- --configFile CONFIGFILE
- Specify a set of user-defined argument values.
- --pulseFile PULSEFILE
- When input reads are in fasta format and output is a cmp.h5 this option can specify pls.h5 or bas.h5 or FOFN files from which pulse metrics can be loaded for Quiver.
- --algorithm {blasr,bowtie,gmap}
- Select an algorithm from ('blasr', 'bowtie', 'gmap'). Default algorithm is blasr.
- --maxHits MAXHITS
- The maximum number of matches of each read to the reference sequence that will be evaluated. Default value is 10.
- --minAnchorSize MINANCHORSIZE
- The minimum anchor size defines the length of the read that must match against the reference sequence. Default value is 12.
- --useccs {useccs,useccsall,useccsdenovo}
- Map the ccsSequence to the genome first, then align subreads to the interval that the CCS reads mapped to.
- useccs: only maps subreads that span the length of the template.
- useccsall: maps all subreads.
- useccsdenovo: maps ccs only.
- --noSplitSubreads
- Do not split reads into subreads even if subread regions are available. Default value is False.
- --concordant
- Map subreads of a ZMW to the same genomic location.
- --nproc NPROC
- Number of threads. Default value is 8.
- --algorithmOptions ALGORITHMOPTIONS
- Pass alignment options through.
- --maxDivergence MAXDIVERGENCE
- The maximum allowed percentage divergence of a read from the reference sequence. Default value is 30.
- --minAccuracy MINACCURACY
- The minimum percentage accuracy of alignments that will be evaluated. Default value is 70.
- --minLength MINLENGTH
- The minimum aligned read length of alignments that will be evaluated. Default value is 50.
- --scoreFunction {alignerscore,editdist,blasrscore}
- Specify a score function for evaluating alignments.
- alignerscore : aligner's score in the SAM tag 'as'.
- editdist : edit distance between read and reference.
- blasrscore : blasr's default score function.
- Default value is alignerscore.
- --scoreCutoff SCORECUTOFF
- The worst score to output an alignment.
- --hitPolicy {randombest,allbest,random,all,leftmost}
- Specify a policy for how to treat multiple hits.
- random : selects a random hit.
- all : selects all hits.
- allbest : selects all the best score hits.
- randombest: selects a random hit from all best score hits.
- leftmost : selects a hit which has the best score and the smallest mapping coordinate in any reference.
- Default value is randombest.
- --filterAdapterOnly
- If specified, do not report adapter-only hits using annotations with the reference entry.
- --forQuiver
- The output cmp.h5 file will be sorted, loaded with pulse QV information, and repacked, so that it can be consumed by quiver directly. This requires the input file to be in PacBio bas/pls.h5 format, and --useccs must be None. Default value is False.
- --loadQVs
- Similar to --forQuiver; the only difference is that --useccs can be specified. Default value is False.
- --byread
- Load pulse information using the -byread option instead of -bymetric. Only works when --forQuiver or --loadQVs is set. Default value is False.
- --metrics METRICS
- Load the specified (comma-delimited list of) metrics instead of the default metrics required by quiver. This option only works when --forQuiver or --loadQVs is set. Default: DeletionQV,DeletionTag,InsertionQV,MergeQV,SubstitutionQV
- --seed SEED
- Initialize the random number generator with a non-zero integer. Zero means that current system time is used. Default value is 1.
- --tmpDir TMPDIR
- Specify a directory for saving temporary files. Default is /scratch.
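
The --hitPolicy choices listed above can be read as a selection rule over the candidate hits of one read. The following Python sketch is an illustration only, not pbalign's implementation. It assumes that a lower score is better (an assumption made here, not a statement from this page), and the function name `select_hits` and the hit dictionaries are hypothetical. The seeded generator only mimics the idea behind --seed.

```python
import random


def select_hits(hits, policy="randombest", rng=None):
    """Pick hits for one read according to a --hitPolicy-style rule.

    hits: list of dicts with 'ref', 'pos' and 'score' keys (hypothetical layout).
    Assumption: a lower score means a better alignment.
    """
    if rng is None:
        rng = random.Random(1)  # fixed seed so repeated runs agree
    if not hits:
        return []
    if policy == "all":
        return list(hits)
    if policy == "random":
        return [rng.choice(hits)]

    # The remaining policies only consider hits that share the best score.
    best_score = min(h["score"] for h in hits)
    best = [h for h in hits if h["score"] == best_score]
    if policy == "allbest":
        return best
    if policy == "randombest":
        return [rng.choice(best)]
    if policy == "leftmost":
        # Smallest mapping coordinate among the best hits, in any reference.
        return [min(best, key=lambda h: h["pos"])]
    raise ValueError(f"unknown hit policy: {policy}")


# Made-up hits for illustration.
example_hits = [
    {"ref": "chr1", "pos": 500, "score": -900},
    {"ref": "chr2", "pos": 100, "score": -900},
    {"ref": "chr1", "pos": 50, "score": -400},
]
print(select_hits(example_hits, "leftmost"))
print(len(select_hits(example_hits, "allbest")))
```

In this sketch, `leftmost` returns the chr2 hit at position 100: it ties for the best score and has the smaller coordinate, while the hit at position 50 is excluded because its score is worse.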