DESCRIPTION
The chimera.uchime command reads a fasta file and reference file and outputs potentially chimeric sequences. The original uchime program was written by Robert C. Edgar.SYNOPSIS
uchime --input query.fasta [--db db.fasta] [--uchimeout results.uchime]- [--uchimealns results.alns]
OPTIONS
--input filename
- Query sequences in FASTA format. If the --db option is not specificed, uchime uses de novo detection. In de novo mode, relative abundance must be given by a string /ab=xxx/ somewhere in the label, where xxx is a floating-point number, e.g. >F00QGH67HG/ab=1.2/.
--db filename
- Reference database in FASTA format. Optional, if not specified uchime uses de novo mode.
- ***WARNING*** The database is searched ONLY on the plus strand. You MUST include reverse-complemented sequences in the database if you want both strands to be searched.
--abskew x
- Minimum abundance skew. Default 1.9. De novo mode only. Abundance skew is:
- min [ abund(parent1), abund(parent2) ] / abund(query).
--uchimeout filename
- Output in tabbed format with one record per query sequence. First field is score (h), second field is query label. For details, see manual.
--uchimealns filename
- Multiple alignments of query sequences to parents in humanreadable format. Alignments show columns with differences that support or contradict a chimeric model.
--minh h
- Mininum score to report chimera. Default 0.3. Values from 0.1 to 5 might be reasonable. Lower values increase sensitivity but may report more false positives. If you decrease --xn, you may need to increase --minh, and vice versa.
--mindiv div
- Minimum divergence ratio, default 0.5. Div ratio is 100% - %identity between query sequence and the closest candidate for being a parent. If you don't care about very close chimeras, then you could increase --mindiv to, say, 1.0 or 2.0, and also decrease --min h, say to 0.1, to increase sensitivity. How well this works will depend on your data. Best is to tune parameters on a good benchmark.
--xn beta
- Weight of a no vote, also called the beta parameter. Default 8.0. Decreasing this weight to around 3 or 4 may give better performance on denoised data.
--dn n
- Pseudo-count prior on number of no votes. Default 1.4. Probably no good reason to change this unless you can retune to a good benchmark for your data. Reasonable values are probably in the range from 0.2 to 2.
--xa w
- Weight of an abstain vote. Default 1. So far, results do not seem to be very sensitive to this parameter, but if you have a good training set might be worth trying. Reasonable values might range from 0.1 to 2.
--chunks n
- Number of chunks to extract from the query sequence when searching for parents. Default 4.
--[no]ovchunks
- [Do not] use overlapping chunks. Default do not.
--minchunk n
- Minimum length of a chunk. Default 64.
--idsmoothwindow w
- Length of id smoothing window. Default 32.
--minsmoothid f
- Minimum factional identity over smoothed window of candidate parent. Default 0.95.
--maxp n
- Maximum number of candidate parents to consider. Default 2. In tests so far, increasing --maxp gives only a very small improvement in sensivity but tends to increase the error rate quite a bit.
--[no]skipgaps --[no]skipgaps2
- These options control how gapped columns affect counting of diffs. If --skipgaps is specified, columns containing gaps do not found as diffs. If --skipgaps2 is specified, if column is immediately adjacent to a column containing a gap, it is not counted as a diff. Default is --skipgaps --skipgaps2.
--minlen L --maxlen L
- Minimum and maximum sequence length. Defaults 10, 10000. Applies to both query and reference sequences.
--ucl
- Use local-X alignments. Default is global-X. On tests so far, global-X is always better; this option is retained because it just might work well on some future type of data.
--queryfract f
- Minimum fraction of the query sequence that must be covered by a local-X alignment. Default 0.5. Applies only when --ucl is specified.
--quiet
- Do not display progress messages on stderr.
--log filename
- Write miscellaneous information to the log file. Mostly of interest to me (the algorithm developer). Use --verbose to get more info.
--self
- In reference database mode, exclude a reference sequence if it has the same label as the query. This is useful for benchmarking by using the ref db as a query to test for false positives.
- --abskew <float>
- help
- --absort <str>
- help
- --abx <float>
- help
- --allpairs <str>
- help
- --alpha <str>
- help
- --band <uint>
- help
- --blast6out <str>
- help
- --[no]blast_termgaps
- help
- --blastout <str>
- help
- --bump <uint>
- help
- --[no]cartoon_orfs
- help
- --cc <str>
- help
- --chain_evalue <float>
- help
- --chain_targetfract <float>
- help
- --chainhits <str>
- help
- --chainout <str>
- help
- --chunks <uint>
- help
- --clstr2uc <str>
- help
- --clump <str>
- help
- --clump2fasta <str>
- help
- --clumpfasta <str>
- help
- --clumpout <str>
- help
- --cluster <str>
- help
- --compilerinfo
- Write info about compiler types and #defines to stdout.
- --computekl <str>
- help
- --db <str>
- help
- --dbstep <uint>
- help
- --[no]denovo
- help
- --derep
- help
- --diffchar <str>
- help
- --dn <float>
- help
- --doug <str>
- help
- --droppct <uint>
- help
- --evalue <float>
- help
- --evalue_g <float>
- help
- --exact
- help
- --[no]fastalign
- help
- --fastapairs <str>
- help
- --fastq2fasta <str>
- help
- --findorfs <str>
- help
- --[no]flushuc
- help
- --frame <int>
- help
- --fspenalty <float>
- help
- --gapext <str>
- help
- --gapopen <str>
- help
- --getseqs <str>
- help
- --global
- help
- --hash
- help
- --hashsize <uint>
- help
- --help
- Display command-line options.
- --hireout <str>
- help
- --hspalpha <str>
- help
- --id <float>
- help
- --idchar <str>
- help
- --iddef <uint>
- help
- --idprefix <uint>
- help
- --ids <str>
- help
- --idsmoothwindow <uint>
- help
- --idsuffix <uint>
- help
- --indexstats <str>
- help
- --input <str>
- help
- --[no]isort
- help
- --k <uint>
- help
- --ka_dbsize <float>
- help
- --ka_gapped_k <float>
- help
- --ka_gapped_lambda <float>
- help
- --ka_ungapped_k <float>
- help
- --ka_ungapped_lambda <float>
- help
- --[no]label_ab
- help
- --labels <str>
- help
- --[no]leftjust
- help
- --lext <float>
- help
- --local
- help
- --log <str>
- Log file name.
- --[no]log_hothits
- help
- --[no]log_query
- help
- --[no]logmemgrows
- help
- --logopts
- Log options.
- --[no]logwordstats
- help
- --lopen <float>
- help
- --makeindex <str>
- help
- --match <float>
- help
- --matrix <str>
- help
- --max2 <uint>
- help
- --maxaccepts <uint>
- help
- --maxclump <uint>
- help
- --maxlen <uint>
- help
- --maxovd <uint>
- help
- --maxp <uint>
- help
- --maxpoly <uint>
- help
- --maxqgap <uint>
- help
- --maxrejects <uint>
- help
- --maxspan1 <uint>
- help
- --maxspan2 <uint>
- help
- --maxtargets <uint>
- help
- --maxtgap <uint>
- help
- --mcc <str>
- help
- --mergeclumps <str>
- help
- --mergesort <str>
- help
- --minchunk <uint>
- help
- --mincodons <uint>
- help
- --mindiffs <uint>
- help
- --mindiv <float>
- help
- --minh <float>
- help
- --minhsp <uint>
- help
- --minlen <uint>
- help
- --minorfcov <uint>
- help
- --minspanratio1 <float>
- help
- --minspanratio2 <float>
- help
- --[no]minus_frames
- help
- --mismatch <float>
- help
- --mkctest <str>
- help
- --[no]nb
- help
- --optimal
- help
- --orfstyle <uint>
- help
- --otusort <str>
- help
- --output <str>
- help
- --[no]output_rejects
- help
- --probmx <str>
- help
- --query <str>
- help
- --queryfract <float>
- help
- --querylen <uint>
- help
- --quiet
- Turn off progress messages.
- --randseed <uint>
- help
- --realign
- help
- --[no]rev
- help
- --[no]rightjust
- help
- --rowlen <uint>
- help
- --secs <uint>
- help
- --seeds <str>
- help
- --seedsout <str>
- help
- --seedt1 <float>
- help
- --seedt2 <float>
- help
- --self
- help
- --[no]selfid
- help
- --simcl <str>
- help
- --[no]skipgaps
- help
- --[no]skipgaps2
- help
- --sort <str>
- help
- --sortuc <str>
- help
- --sparsedist <str>
- help
- --sparsedistparams <str>
- help
- --split <float>
- help
- --[no]ssort
- help
- --sspenalty <float>
- help
- --[no]stable_sort
- help
- --staralign <str>
- help
- --stepwords <uint>
- help
- --strand <str>
- help
- --targetfract <float>
- help
- --targetlen <uint>
- help
- --tmpdir <str>
- help
- --[no]trace
- help
- --tracestate <str>
- help
- --[no]trunclabels
- help
- --[no]twohit
- help
- --uc <str>
- help
- --uc2clstr <str>
- help
- --uc2fasta <str>
- help
- --uc2fastax <str>
- help
- --uchime <str>
- help
- --uchimealns <str>
- help
- --uchimeout <str>
- help
- --[no]ucl
- help
- --uhire <str>
- help
- --ungapped
- help
- --userfields <str>
- help
- --userout <str>
- help
- --usersort
- help
- --uslink <str>
- help
- --[no]usort
- help
- --utax <str>
- help
- --[no]verbose
- help
- --version
- Show version and exit.
- --w <uint>
- help
- --weak_evalue <float>
- help
- --weak_id <float>
- help
- --[no]wordcountreject
- help
- --[no]wordweight
- help
- --xa <float>
- help
- --xdrop_g <float>
- help
- --xdrop_nw <float>
- help
- --xdrop_u <float>
- help
- --xdrop_ug <float>
- help
- --xframe <str>
- help
- --xlat
- help
- --xn <float>
- help