pocketsphinx_continuous(1) Run speech recognition in continuous listening mode


pocketsphinx_continuous -hmm hmmdir -dict dictfile [ options ]...


This program opens the audio device and waits for speech. When it detects an utterance, it performs speech recognition on it.

Size of audio file header in bytes (headers are ignored)
Input is raw audio data
Automatic gain control for c0 ('max', 'emax', 'noise', or 'none')
Initial threshold for automatic gain control
Perform phoneme decoding with phonetic lm
Perform phoneme decoding with phonetic lm and context-independent units only
Preemphasis parameter
Argument file giving extra arguments.
Inverse of acoustic model scale for confidence score calculation
Inverse weight applied to acoustic scores.
Print results and backtraces to log file.
Beam width applied to every frame in Viterbi search (smaller values mean wider beam)
Run bestpath (Dijkstra) search over word lattice (3rd pass)
Language model probability weight for bestpath search
Create missing subdirectories in output directory
Input files directory (prefixed to filespecs in control file)
Input files extension (suffixed to filespecs in control file)
Number of components in the input feature vector
Cepstral mean normalization scheme ('current', 'prior', or 'none')
Initial values (comma-separated) for cepstral mean when 'prior' is used
Compute all senone scores in every frame (can be faster when there are many senones)
Control file listing utterances to be processed
Number of utterances to be processed (after skipping -ctloffset entries)
Do every Nth line in the control file
Number of utterances at the beginning of -ctl file to be skipped
Recognition output in CTM file format (may require post-sorting)
Verbosity level for debugging messages
Main pronunciation dictionary (lexicon) input file
Dictionary is case sensitive (NOTE: case insensitivity applies to ASCII characters only)
Add 1/2-bit noise
Use double bandwidth filters (same center freq)
Frame GMM computation downsampling ratio
Noise word pronunciation dictionary input file
Feature stream type, depends on the acoustic model
File containing feature extraction parameters.
Filler word transition probability
Frame rate
Sphinx-format finite state grammar file
Control file listing FSG file to use for each utterance
Base directory for FSG files
File extension for FSG files (including leading dot)
Add alternate pronunciations to FSG
Insert filler words at each state.
Run forward flat-lexicon search over word lattice (2nd pass)
Beam width applied to every frame in second-pass flat search
Minimum number of end frames for a word to be searched in fwdflat search
Language model probability weight for flat lexicon (2nd pass) decoding
Window of frames in lattice to search for successor words in fwdflat search
Beam width applied to word exits in second-pass flat search
Run forward lexicon-tree search (1st pass)
Directory containing acoustic model files.
Recognition output file name
Recognition output with segmentation file name
Endianness of input data, big or little, ignored if NIST or MS Wav
JSGF grammar file
Keyphrase to spot
File with keyphrases to spot, one per line
Delay to wait for best detection score
Phone loop probability for keyword spotting
Threshold for p(hyp)/p(alternatives) ratio
Initial backpointer table size
File containing transformation matrix to be applied to features (single-stream features only)
Dimensionality of output of feature transformation (0 to use entire matrix)
Length of sin-curve for liftering, or 0 for no liftering.
Word trigram language model input file
Specify a set of language models
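The input directory and extension entries above say that the directory is prefixed, and the extension suffixed, to each filespec in the control file. A minimal sketch of that mapping follows. It assumes one filespec per line and a "/" separator between directory and filespec; the function name, directory, extension, and utterance names are hypothetical and not part of this program.

```shell
# Sketch only: expand control-file entries into input file paths.
# Assumes one filespec per line and a "/" between directory and filespec.
resolve_inputs() {
    # $1 = input files directory, $2 = input files extension, $3 = control file
    while IFS= read -r spec; do
        if [ -n "$spec" ]; then
            printf '%s/%s%s\n' "$1" "$spec" "$2"
        fi
    done < "$3"
}

# Hypothetical control file with three utterances.
ctl_file="$(mktemp)"
printf '%s\n' utt001 utt002 sub/utt003 > "$ctl_file"

resolve_inputs /data/audio .raw "$ctl_file"
```

Each printed line is the path that would be read for the corresponding control-file entry under these assumptions.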

The -hmm and -dict arguments are always required. Either -lm or -fsg is required, depending on whether you are using a statistical language model or a finite-state grammar.
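
The rule above can be checked before launching the program. The following is a minimal sketch: the helper name check_required and all paths are hypothetical, and the check only looks for the option names themselves, not for whether the files exist.

```shell
# Sketch only: succeed when -hmm and -dict are present together with
# at least one of -lm or -fsg.
check_required() {
    have_hmm=0
    have_dict=0
    have_model=0
    for arg in "$@"; do
        case "$arg" in
            -hmm)     have_hmm=1 ;;
            -dict)    have_dict=1 ;;
            -lm|-fsg) have_model=1 ;;
        esac
    done
    [ "$have_hmm" -eq 1 ] && [ "$have_dict" -eq 1 ] && [ "$have_model" -eq 1 ]
}

# Hypothetical model, dictionary, and language model paths.
if check_required -hmm /path/to/hmmdir -dict /path/to/words.dict -lm /path/to/words.lm; then
    echo "pocketsphinx_continuous -hmm /path/to/hmmdir -dict /path/to/words.dict -lm /path/to/words.lm"
else
    echo "missing -hmm, -dict, or one of -lm/-fsg" >&2
fi
```

The sketch prints the command line instead of running it, so it can be tried without an audio device or installed models.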


Written by numerous people at CMU from 1994 onwards. This manual page by David Huggins-Daines.


Copyright © 1994-2007 Carnegie Mellon University. See the file COPYING included with this package for more information.