man uchime (1): reads a fasta file and reference file and outputs potentially chimeric sequences

DESCRIPTION

The chimera.uchime command reads a fasta file and reference file and outputs potentially chimeric sequences. The original uchime program was written by Robert C. Edgar.

SYNOPSIS

uchime --input query.fasta [--db db.fasta] [--uchimeout results.uchime] [--uchimealns results.alns]
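As a rough illustration of the synopsis, the sketch below assembles an argument list from the four options shown there. The helper name build_uchime_command is hypothetical, and the sketch assumes the executable is called uchime and is on the PATH; it only builds the list and does not run anything.

```python
def build_uchime_command(query, db=None, uchimeout=None, uchimealns=None):
    """Build an argument list using only the options shown in the synopsis.

    Hypothetical helper; assumes the program is installed as "uchime".
    """
    cmd = ["uchime", "--input", query]
    if db is not None:
        # Omitting --db selects de novo mode, per the --input description.
        cmd += ["--db", db]
    if uchimeout is not None:
        cmd += ["--uchimeout", uchimeout]
    if uchimealns is not None:
        cmd += ["--uchimealns", uchimealns]
    return cmd


print(build_uchime_command("query.fasta", db="db.fasta",
                           uchimeout="results.uchime"))
```

The resulting list could be handed to subprocess.run if the program is available.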

OPTIONS

--input filename

: Query sequences in FASTA format. If the --db option is not specified, uchime uses de novo detection. In de novo mode, relative abundance must be given by a string /ab=xxx/ somewhere in the label, where xxx is a floating-point number, e.g. >F00QGH67HG/ab=1.2/.
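The label convention above can be checked before running de novo detection. The following is a minimal sketch, not uchime's own parser: the function name parse_abundance and the exact regular expression are assumptions, and only plain decimal numbers are recognised.

```python
import re

# Assumed pattern: "/ab=" followed by a plain decimal number and a closing "/".
_AB_PATTERN = re.compile(r"/ab=([0-9]*\.?[0-9]+)/")


def parse_abundance(label):
    """Return the abundance encoded in a FASTA label, or None if absent."""
    match = _AB_PATTERN.search(label)
    return float(match.group(1)) if match else None


print(parse_abundance(">F00QGH67HG/ab=1.2/"))
```

A label for which this returns None would lack the abundance annotation that de novo mode requires.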

--db filename

: Reference database in FASTA format. Optional; if not specified, uchime uses de novo mode.
: ***WARNING*** The database is searched ONLY on the plus strand. You MUST include reverse-complemented sequences in the database if you want both strands to be searched.
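One way to satisfy the plus-strand restriction is to append a reverse-complemented copy of every reference sequence before building the database. The sketch below assumes records are already loaded as (label, sequence) pairs; the "_rc" label suffix is an arbitrary choice for illustration, and IUPAC ambiguity codes are left unchanged.

```python
# Complement table for unambiguous nucleotides only (an assumption of this
# sketch); any other character is passed through untouched.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a nucleotide sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def with_reverse_complements(records, suffix="_rc"):
    """Return each (label, seq) record followed by its reverse complement."""
    out = []
    for label, seq in records:
        out.append((label, seq))
        out.append((label + suffix, reverse_complement(seq)))
    return out


for label, seq in with_reverse_complements([("ref1", "AACG")]):
    print(">" + label)
    print(seq)
```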

--abskew x

: Minimum abundance skew. Default 1.9. De novo mode only. Abundance skew is:
: min [ abund(parent1), abund(parent2) ] / abund(query).
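The skew formula can be written out directly. This is a small sketch under stated assumptions: the function names are hypothetical, and treating the threshold as inclusive (skew >= --abskew) is a guess, since the text only says "minimum".

```python
def abundance_skew(parent1_abund, parent2_abund, query_abund):
    """min[abund(parent1), abund(parent2)] / abund(query)."""
    return min(parent1_abund, parent2_abund) / query_abund


def passes_abskew(parent1_abund, parent2_abund, query_abund, min_skew=1.9):
    # Assumption: the comparison is inclusive; the source does not say.
    return abundance_skew(parent1_abund, parent2_abund, query_abund) >= min_skew


# Parents with abundance 10 and 4, query with abundance 2: skew = 4 / 2.
print(abundance_skew(10.0, 4.0, 2.0))
print(passes_abskew(10.0, 4.0, 2.0))
```

In this example the less abundant parent is twice as abundant as the query, so the pair clears the default skew of 1.9.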

--uchimeout filename

: Output in tabbed format with one record per query sequence. First field is score (h), second field is query label. For details, see manual.
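Only the first two fields of the tabbed output are described here, so a reader script can rely on just those. The sketch below is illustrative: read_uchime_scores is a hypothetical name, the sample lines are invented, and any further fields are ignored rather than interpreted.

```python
def read_uchime_scores(lines):
    """Return (query_label, score) pairs from --uchimeout style lines.

    Uses only the two documented fields: score first, query label second.
    """
    results = []
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        results.append((fields[1], float(fields[0])))
    return results


# Invented sample records, for illustration only.
sample = ["0.4521\tQ1/ab=3.0/\n", "0.0100\tQ2/ab=12.0/\n"]
for label, score in read_uchime_scores(sample):
    print(label, score)
```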

--uchimealns filename

: Multiple alignments of query sequences to parents in human-readable format. Alignments show columns with differences that support or contradict a chimeric model.

--minh h

: Minimum score to report chimera. Default 0.3. Values from 0.1 to 5 might be reasonable. Lower values increase sensitivity but may report more false positives. If you decrease --xn, you may need to increase --minh, and vice versa.

--mindiv div

: Minimum divergence ratio, default 0.5. Div ratio is 100% - %identity between query sequence and the closest candidate for being a parent. If you don't care about very close chimeras, then you could increase --mindiv to, say, 1.0 or 2.0, and also decrease --minh, say to 0.1, to increase sensitivity. How well this works will depend on your data. It is best to tune parameters on a good benchmark.
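The divergence definition amounts to simple arithmetic, sketched below. The function names are hypothetical, identity is assumed to be expressed as a percentage, and the inclusive comparison against --mindiv is an assumption.

```python
def divergence(percent_identity):
    """Div ratio as described above: 100% minus %identity."""
    return 100.0 - percent_identity


def passes_mindiv(percent_identity, mindiv=0.5):
    # Assumption: a query qualifies when its divergence is at least mindiv.
    return divergence(percent_identity) >= mindiv


# A query 99.0% identical to its closest candidate parent has divergence 1.0.
print(divergence(99.0))
print(passes_mindiv(99.0, mindiv=2.0))
```

With --mindiv raised to 2.0, the 99.0% identical query in this example would no longer be considered.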

--xn beta

: Weight of a no vote, also called the beta parameter. Default 8.0. Decreasing this weight to around 3 or 4 may give better performance on denoised data.

--dn n

: Pseudo-count prior on number of no votes. Default 1.4. Probably no good reason to change this unless you can retune to a good benchmark for your data. Reasonable values are probably in the range from 0.2 to 2.

--xa w

: Weight of an abstain vote. Default 1. So far, results do not seem to be very sensitive to this parameter, but if you have a good training set it might be worth tuning. Reasonable values might range from 0.1 to 2.

--chunks n

: Number of chunks to extract from the query sequence when searching for parents. Default 4.

--[no]ovchunks

: [Do not] use overlapping chunks. Default do not.

--minchunk n

: Minimum length of a chunk. Default 64.

--idsmoothwindow w

: Length of id smoothing window. Default 32.

--minsmoothid f

: Minimum fractional identity over smoothed window of candidate parent. Default 0.95.

--maxp n

: Maximum number of candidate parents to consider. Default 2. In tests so far, increasing --maxp gives only a very small improvement in sensitivity but tends to increase the error rate quite a bit.

--[no]skipgaps --[no]skipgaps2

: These options control how gapped columns affect counting of diffs. If --skipgaps is specified, columns containing gaps are not counted as diffs. If --skipgaps2 is specified, a column immediately adjacent to a column containing a gap is not counted as a diff. Default is --skipgaps --skipgaps2.
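To make the two rules concrete, here is a simplified sketch that counts diffs between two aligned rows. It is not uchime's implementation: uchime works on a multiple alignment of the query and its parents, whereas this hypothetical count_diffs function handles only a pairwise case, and it assumes the adjacency rule applies only to columns that contain no gap themselves.

```python
def count_diffs(row_a, row_b, skipgaps=True, skipgaps2=True, gap="-"):
    """Count differing columns between two aligned rows of equal length."""
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have the same length")
    has_gap = [a == gap or b == gap for a, b in zip(row_a, row_b)]
    diffs = 0
    for i, (a, b) in enumerate(zip(row_a, row_b)):
        if a == b:
            continue
        if has_gap[i]:
            # Rule 1: a column containing a gap is not counted as a diff.
            if skipgaps:
                continue
        else:
            # Rule 2: a column next to a gapped column is not counted.
            near_gap = (i > 0 and has_gap[i - 1]) or (
                i + 1 < len(has_gap) and has_gap[i + 1]
            )
            if skipgaps2 and near_gap:
                continue
        diffs += 1
    return diffs


query = "ACGT-ACGT"
parent = "ACGAAACTT"
print(count_diffs(query, parent))                                   # both rules on
print(count_diffs(query, parent, skipgaps=False, skipgaps2=False))  # both rules off
```

In the example, the gapped column and the mismatch next to it are ignored by default, leaving one counted diff; with both rules switched off all three differing columns are counted.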

--minlen L --maxlen L

: Minimum and maximum sequence length. Defaults 10, 10000. Applies to both query and reference sequences.

--ucl

: Use local-X alignments. Default is global-X. On tests so far, global-X is always better; this option is retained because it just might work well on some future type of data.

--queryfract f

: Minimum fraction of the query sequence that must be covered by a local-X alignment. Default 0.5. Applies only when --ucl is specified.

--quiet

: Do not display progress messages on stderr.

--log filename

: Write miscellaneous information to the log file. Mostly of interest to me (the algorithm developer). Use --verbose to get more info.

--self

: In reference database mode, exclude a reference sequence if it has the same label as the query. This is useful for benchmarking by using the ref db as a query to test for false positives.
--abskew <float>: help
--absort <str>: help
--abx <float>: help
--allpairs <str>: help
--alpha <str>: help
--band <uint>: help
--blast6out <str>: help
--[no]blast_termgaps: help
--blastout <str>: help
--bump <uint>: help
--[no]cartoon_orfs: help
--cc <str>: help
--chain_evalue <float>: help
--chain_targetfract <float>: help
--chainhits <str>: help
--chainout <str>: help
--chunks <uint>: help
--clstr2uc <str>: help
--clump <str>: help
--clump2fasta <str>: help
--clumpfasta <str>: help
--clumpout <str>: help
--cluster <str>: help
--compilerinfo: Write info about compiler types and #defines to stdout.
--computekl <str>: help
--db <str>: help
--dbstep <uint>: help
--[no]denovo: help
--derep: help
--diffchar <str>: help
--dn <float>: help
--doug <str>: help
--droppct <uint>: help
--evalue <float>: help
--evalue_g <float>: help
--exact: help
--[no]fastalign: help
--fastapairs <str>: help
--fastq2fasta <str>: help
--findorfs <str>: help
--[no]flushuc: help
--frame <int>: help
--fspenalty <float>: help
--gapext <str>: help
--gapopen <str>: help
--getseqs <str>: help
--global: help
--hash: help
--hashsize <uint>: help
--help: Display command-line options.
--hireout <str>: help
--hspalpha <str>: help
--id <float>: help
--idchar <str>: help
--iddef <uint>: help
--idprefix <uint>: help
--ids <str>: help
--idsmoothwindow <uint>: help
--idsuffix <uint>: help
--indexstats <str>: help
--input <str>: help
--[no]isort: help
--k <uint>: help
--ka_dbsize <float>: help
--ka_gapped_k <float>: help
--ka_gapped_lambda <float>: help
--ka_ungapped_k <float>: help
--ka_ungapped_lambda <float>: help
--[no]label_ab: help
--labels <str>: help
--[no]leftjust: help
--lext <float>: help
--local: help
--log <str>: Log file name.
--[no]log_hothits: help
--[no]log_query: help
--[no]logmemgrows: help
--logopts: Log options.
--[no]logwordstats: help
--lopen <float>: help
--makeindex <str>: help
--match <float>: help
--matrix <str>: help
--max2 <uint>: help
--maxaccepts <uint>: help
--maxclump <uint>: help
--maxlen <uint>: help
--maxovd <uint>: help
--maxp <uint>: help
--maxpoly <uint>: help
--maxqgap <uint>: help
--maxrejects <uint>: help
--maxspan1 <uint>: help
--maxspan2 <uint>: help
--maxtargets <uint>: help
--maxtgap <uint>: help
--mcc <str>: help
--mergeclumps <str>: help
--mergesort <str>: help
--minchunk <uint>: help
--mincodons <uint>: help
--mindiffs <uint>: help
--mindiv <float>: help
--minh <float>: help
--minhsp <uint>: help
--minlen <uint>: help
--minorfcov <uint>: help
--minspanratio1 <float>: help
--minspanratio2 <float>: help
--[no]minus_frames: help
--mismatch <float>: help
--mkctest <str>: help
--[no]nb: help
--optimal: help
--orfstyle <uint>: help
--otusort <str>: help
--output <str>: help
--[no]output_rejects: help
--probmx <str>: help
--query <str>: help
--queryfract <float>: help
--querylen <uint>: help
--quiet: Turn off progress messages.
--randseed <uint>: help
--realign: help
--[no]rev: help
--[no]rightjust: help
--rowlen <uint>: help
--secs <uint>: help
--seeds <str>: help
--seedsout <str>: help
--seedt1 <float>: help
--seedt2 <float>: help
--self: help
--[no]selfid: help
--simcl <str>: help
--[no]skipgaps: help
--[no]skipgaps2: help
--sort <str>: help
--sortuc <str>: help
--sparsedist <str>: help
--sparsedistparams <str>: help
--split <float>: help
--[no]ssort: help
--sspenalty <float>: help
--[no]stable_sort: help
--staralign <str>: help
--stepwords <uint>: help
--strand <str>: help
--targetfract <float>: help
--targetlen <uint>: help
--tmpdir <str>: help
--[no]trace: help
--tracestate <str>: help
--[no]trunclabels: help
--[no]twohit: help
--uc <str>: help
--uc2clstr <str>: help
--uc2fasta <str>: help
--uc2fastax <str>: help
--uchime <str>: help
--uchimealns <str>: help
--uchimeout <str>: help
--[no]ucl: help
--uhire <str>: help
--ungapped: help
--userfields <str>: help
--userout <str>: help
--usersort: help
--uslink <str>: help
--[no]usort: help
--utax <str>: help
--[no]verbose: help
--version: Show version and exit.
--w <uint>: help
--weak_evalue <float>: help
--weak_id <float>: help
--[no]wordcountreject: help
--[no]wordweight: help
--xa <float>: help
--xdrop_g <float>: help
--xdrop_nw <float>: help
--xdrop_u <float>: help
--xdrop_ug <float>: help
--xframe <str>: help
--xlat: help
--xn <float>: help

AUTHOR

Robert C. Edgar
