man hmm2calibrate (1): calibrate HMM search statistics

SYNOPSIS

hmm2calibrate [options] hmmfile

DESCRIPTION

hmm2calibrate reads an HMM file from hmmfile, scores a large number of synthesized random sequences with it, fits an extreme value distribution (EVD) to the histogram of those scores, and re-saves hmmfile with the EVD parameters included.
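As a rough illustration of the fitting step, the Python sketch below draws synthetic scores from a known Gumbel (type I extreme value) distribution and recovers its location and scale parameters by the method of moments. This is not HMMER's code. The function names, the use of Gumbel draws as a stand-in for real HMM scores, and the method-of-moments estimator are all assumptions made for brevity; this page does not say which fitting procedure hmm2calibrate uses.

```python
import math
import random
import statistics

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def sample_gumbel(rng, mu, lam):
    """Draw one value from a Gumbel distribution (location mu, rate lam).

    Hypothetical stand-in for "score one random sequence with the HMM".
    """
    u = rng.random()
    while u == 0.0:  # random() can return 0.0, and log(0) is undefined
        u = rng.random()
    # Inverse of the CDF F(x) = exp(-exp(-lam * (x - mu)))
    return mu - math.log(-math.log(u)) / lam


def fit_gumbel_moments(scores):
    """Estimate (mu, lam) from the sample mean and standard deviation.

    Uses mean = mu + gamma / lam and variance = pi^2 / (6 * lam^2).
    """
    mean = statistics.fmean(scores)
    sd = statistics.stdev(scores)
    lam = math.pi / (sd * math.sqrt(6.0))
    mu = mean - EULER_GAMMA / lam
    return mu, lam


rng = random.Random(42)  # fixed seed so the sketch is repeatable
scores = [sample_gumbel(rng, mu=-20.0, lam=0.7) for _ in range(5000)]
mu_hat, lam_hat = fit_gumbel_moments(scores)
print(f"estimated mu = {mu_hat:.3f}, estimated lambda = {lam_hat:.3f}")
```

The estimates should land close to the true values of -20.0 and 0.7 used to generate the sample; with far fewer samples they scatter more widely.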

hmm2calibrate may take several minutes (or longer) to run. While it is running, a temporary file called hmmfile.xxx is generated in your working directory. If you abort hmm2calibrate prematurely (Ctrl-C, for instance), your original hmmfile is left untouched, and you should delete the hmmfile.xxx temporary file.
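The behavior described above is consistent with a write-then-rename pattern: the new contents go to a temporary file, and the original is replaced only once writing has finished. The Python sketch below shows that pattern under that assumption. The helper name, the demo file name, and the placeholder text are hypothetical, and the written text is not real HMM file syntax.

```python
import os
import tempfile


def resave_with_parameters(path, new_text):
    """Write new_text to path + '.xxx', then swap it in for path.

    If the process dies before os.replace(), the original file is
    unchanged and only the '.xxx' file is left behind.
    """
    tmp_path = path + ".xxx"
    with open(tmp_path, "w") as tmp:
        tmp.write(new_text)
    os.replace(tmp_path, path)  # final step: replace the original


# Demonstration in a scratch directory (hypothetical file name).
workdir = tempfile.mkdtemp()
demo = os.path.join(workdir, "demo.hmm")
with open(demo, "w") as f:
    f.write("original contents\n")

resave_with_parameters(demo, "original contents\nplaceholder for EVD parameters\n")

with open(demo) as f:
    result = f.read()
leftover = os.path.exists(demo + ".xxx")
```

After a successful run no .xxx file remains; a leftover one indicates an interrupted run and can be deleted safely.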

OPTIONS

-h

Print brief help; includes version number and summary of all options, including expert options.

EXPERT OPTIONS

--cpu <n>

Sets the maximum number of CPUs that the program will run on. The default is to use all CPUs in the machine. Overrides the HMMER_NCPU environment variable. Only affects threaded versions of HMMER (the default on most systems).

--fixed <n>

Fix the length of the random sequences to <n>, where <n> is a positive (and reasonably sized) integer. The default is instead to generate sequences with a variety of different lengths, controlled by a Gaussian (normal) distribution.

--histfile <f>

Save a histogram of the scores and the fitted theoretical curve to file <f>.

--mean <x>

Set the mean length of the synthetic sequences to <x>, where <x> is a positive real number. The default is 350.

--num <n>

Set the number of synthetic sequences to <n>, where <n> is a positive integer. If <n> is less than about 1000, the fit to the EVD may fail. Larger values of <n> give better-determined EVD parameters. The default is 5000; it was empirically chosen as a tradeoff between accuracy and computation time.

--pvm

Run on a Parallel Virtual Machine (PVM). The PVM must already be running. The client program hmm2calibrate-pvm must be installed on all the PVM nodes. Optional PVM support must have been compiled into HMMER.

--sd <x>

Set the standard deviation of the synthetic sequence length distribution to <x>, where <x> is a positive real number. The default is 350. Note that the Gaussian is left-truncated so that no sequences have lengths <= 0.
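The interaction of --fixed, --mean, and --sd can be sketched in Python as follows. This is an illustration only: the function name and the rejection loop (redrawing until the length is positive) are assumptions, since this page states only that the Gaussian is left-truncated, not how the truncation is implemented.

```python
import random


def sample_length(rng, mean=350.0, sd=350.0, fixed=None):
    """Return one synthetic sequence length.

    fixed mimics --fixed <n>; mean and sd mimic --mean <x> and --sd <x>.
    """
    if fixed is not None:
        return fixed  # every sequence gets the same length
    while True:
        length = round(rng.gauss(mean, sd))
        if length > 0:  # reject draws that would give a length <= 0
            return length


rng = random.Random(0)
lengths = [sample_length(rng) for _ in range(1000)]
average = sum(lengths) / len(lengths)
print(f"shortest = {min(lengths)}, average = {average:.1f}")
```

Because the draws at or below zero are discarded, the average realized length under this scheme comes out above the nominal mean, which matters most when the standard deviation is as large as the mean.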

--seed <n>

Set the random seed to <n>, where <n> is a positive integer. The default is to use time() to generate a different seed for each run, which means that two different runs of hmm2calibrate on the same HMM will give slightly different results. You can use this option to generate reproducible results for different hmm2calibrate runs on the same HMM.
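The effect of a fixed seed can be illustrated with Python's own generator. This is a sketch of the general principle, not of HMMER's random number generator; the function name and seed values are arbitrary choices for the example.

```python
import random


def simulate_draws(seed, n=5):
    """Return n pseudo-random values from a generator seeded with seed."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


run_a = simulate_draws(seed=12345)
run_b = simulate_draws(seed=12345)  # same seed: identical sequence
run_c = simulate_draws(seed=54321)  # different seed: different sequence

print(run_a == run_b)
print(run_a == run_c)
```

The same reasoning applies to calibration: identical seeds give identical synthetic sequences and therefore identical fitted parameters, while time-based seeds do not.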

COPYRIGHT

Copyright (C) 1992-2003 HHMI/Washington University School of Medicine.
Freely distributed under the GNU General Public License (GPL).

See the file COPYING in your distribution for details on redistribution conditions.

AUTHOR

Sean Eddy
HHMI/Dept. of Genetics
Washington Univ. School of Medicine
4566 Scott Ave.
St Louis, MO 63110 USA
http://www.genetics.wustl.edu/eddy/