man estimate-ngram (1): estimates n-gram language model

SYNOPSIS

estimate-ngram [Options]

DESCRIPTION

Estimates an n-gram language model by accumulating n-gram count statistics, smoothing the observed counts, and building a backoff n-gram model. Parameters can optionally be tuned to optimize performance on a development set.
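
The first stage, accumulating n-gram counts, can be illustrated with a short Python sketch. This is not the tool's implementation: the function names `count_ngrams` and `ml_prob` are hypothetical, the `<s>` and `</s>` boundary markers are an assumed convention, and only the unsmoothed maximum-likelihood estimate (the `ML` choice for -smoothing) is shown. Smoothing and backoff are left out.

```python
from collections import Counter


def count_ngrams(sentences, order=3):
    """Accumulate counts for all n-grams of length 1..order."""
    counts = Counter()
    for sentence in sentences:
        # Assumed convention: pad each sentence with boundary markers.
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        for n in range(1, order + 1):
            for i in range(len(tokens) - n + 1):
                counts[tuple(tokens[i:i + n])] += 1
    return counts


def ml_prob(counts, ngram):
    """Unsmoothed maximum-likelihood estimate of P(word | history)."""
    history = ngram[:-1]
    if not history:
        raise ValueError("ngram must contain a history and a word")
    if counts[history] == 0:
        return 0.0
    return counts[ngram] / counts[history]


if __name__ == "__main__":
    counts = count_ngrams(["a b a b", "a c"], order=2)
    print(counts[("a", "b")], counts[("a",)])
    print(ml_prob(counts, ("a", "b")))
```

An unseen n-gram gets probability zero under this estimate, which is the problem the smoothing and backoff stages address.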

A filename argument can be an ASCII file, a compressed file (ending in .Z or .gz), or '-' to indicate stdin/stdout.
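
For scripts that prepare input for the tool or read its output, the same filename convention can be mirrored in Python. The sketch below is an assumption about how such a helper might look, not code from the tool: `open_text` is a hypothetical name, and `.Z` files are rejected because the Python standard library has no decoder for that format.

```python
import gzip
import sys


def open_text(path, mode="r"):
    """Open a text stream following the '-', .gz, plain-file convention."""
    if mode not in ("r", "w"):
        raise ValueError("mode must be 'r' or 'w'")
    if path == "-":
        # '-' selects stdin for reading and stdout for writing.
        return sys.stdin if mode == "r" else sys.stdout
    if path.endswith(".gz"):
        return gzip.open(path, mode + "t", encoding="utf-8")
    if path.endswith(".Z"):
        # compress(1) output is not supported by the standard library.
        raise NotImplementedError(".Z files need an external decompressor")
    return open(path, mode, encoding="utf-8")
```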

OPTIONS

-h, -help: Print this message.
-verbose <int>: Set verbosity level.
: Default: 1
-o, -order <int>: Set the n-gram order of the estimated LM.
: Default: 3
-v, -vocab <file>: Restrict the vocabulary to the words in the specified file.
-u, -unk <boolean>: Replace all out-of-vocabulary words with <unk>.
: Default: false
-t, -text <files>: Add counts from text files.
-c, -counts <files>: Add counts from counts files.
-s, -smoothing <ML, FixKN, FixModKN, FixKN#, KN, ModKN, KN#>: Specify smoothing algorithms.
: Default: ModKN
-wf, -weight-features <features-template>: Specify n-gram weighting features.
-p, -params <file>: Set initial model params.
-oa, -opt-alg <Powell, LBFGS, LBFGSB>: Specify optimization algorithm.
: Default: Powell
-op, -opt-perp <file>: Tune params to minimize dev set perplexity.
-ow, -opt-wer <file>: Tune params to minimize lattice word error rate.
-om, -opt-margin <file>: Tune params to minimize lattice margin.
-wb, -write-binary <boolean>: Write LM/counts files in binary format.
: Default: false
-wp, -write-params <file>: Write tuned model params to file.
-wv, -write-vocab <file>: Write LM vocab to file.
-wc, -write-counts <file>: Write n-gram counts to file.
-wec, -write-eff-counts <file>: Write effective n-gram counts to file.
-wlc, -write-left-counts <file>: Write left-branching n-gram counts to file.
-wrc, -write-right-counts <file>: Write right-branching n-gram counts to file.
-wl, -write-lm <file>: Write ARPA backoff LM to file.
-ep, -eval-perp <files>: Compute test set perplexity.
-ew, -eval-wer <files>: Compute test set lattice word error rate.
-em, -eval-margin <files>: Compute test set lattice margin.
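
The options above can be combined into a single invocation. The Python sketch below assembles one such command line from flags listed in this page (-order, -smoothing, -text, -write-lm, -opt-perp). The helper name `build_command` and the file names `train.txt`, `dev.txt`, and `model.lm` are placeholders chosen for illustration, and the sketch only prints the command rather than running it.

```python
import shlex


def build_command(text, lm_out, order=3, smoothing="ModKN", dev=None):
    """Assemble an estimate-ngram argument list from documented flags."""
    cmd = [
        "estimate-ngram",
        "-order", str(order),
        "-smoothing", smoothing,
        "-text", text,
        "-write-lm", lm_out,
    ]
    if dev is not None:
        # Tune parameters to minimize perplexity on the dev set.
        cmd += ["-opt-perp", dev]
    return cmd


if __name__ == "__main__":
    # Placeholder file names; substitute real paths before running.
    cmd = build_command("train.txt", "model.lm", dev="dev.txt")
    print(shlex.join(cmd))
```

To execute the command, the resulting list can be passed to `subprocess.run(cmd, check=True)`, assuming the `estimate-ngram` binary is on the PATH.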
