phonetisaurus-arpa2fst(1) ARPA LM to FST conversion tool


phonetisaurus-arpa2fst --input=arpa.lm --prefix=output_prefix [OPTIONS]



This tool converts an ARPA-format language model into a weighted finite-state transducer (WFST) that can be used with phonetisaurus-g2p.
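
A minimal sketch of a basic conversion, using only the two options shown in the synopsis. The file names arpa.lm and output_prefix are placeholders, and the guard around the call is an illustrative addition, not something the tool requires.

```shell
#!/bin/sh
# Placeholder names (assumptions); substitute your own files.
lm="arpa.lm"            # input ARPA-format language model
prefix="output_prefix"  # prefix for the output file names

# Run the conversion only if the tool is installed and the model is readable.
if command -v phonetisaurus-arpa2fst >/dev/null 2>&1 && [ -r "$lm" ]; then
    phonetisaurus-arpa2fst --input="$lm" --prefix="$prefix"
else
    echo "phonetisaurus-arpa2fst or $lm not available; nothing converted" >&2
fi
```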


--help=<bool> (default: false)
show usage information
--helpshort=<bool> (default: false)
show brief usage information
--tmpdir=<string> (default: "/tmp/")
temporary directory
--v=<int32> (default: 0)
verbose level
--fst_align=<bool> (default: false)
Write FST data aligned where appropriate
--fst_default_cache_gc=<bool> (default: true)
Enable garbage collection of cache
--fst_default_cache_gc_limit=<int64> (default: 1048576)
Cache byte size that triggers garbage collection
--fst_verify_properties=<bool> (default: false)
Verify fst properties queried by TestProperties
--fst_weight_parentheses=<string> (default: "")
Characters enclosing the first weight of a printed composite weight (e.g. pair weight, tuple weight and derived classes) to ensure proper I/O of nested composite weights; must have size 0 (none) or 2 (open and close parenthesis)
--fst_weight_separator=<string> (default: ",")
Character separator between printed composite weights; must be a single character
--save_relabel_ipairs=<string> (default: "")
Save input relabel pairs to file
--save_relabel_opairs=<string> (default: "")
Save output relabel pairs to file
--delim=<string> (default: "}")
Delimiter used to separate input and output tokens.
--eps=<string> (default: "<eps>")
Epsilon symbol.
--input=<string> (default: "")
Input ARPA-format LM.
--null_sep=<string> (default: "_")
Graphemic null symbol.
--phi=<string> (default: "<phi>")
Optional Phi (failure) symbol (not currently in use).
--prefix=<string> (default: "test")
Output filename prefix.
--sb=<string> (default: "<s>")
Sentence begin symbol.
--se=<string> (default: "</s>")
Sentence end symbol.
--split=<string> (default: "|")
Delimiter used to split multi-token symbols.
--start=<string> (default: "<start>")
Start symbol.
--write_syms=<bool> (default: false)
Write the symbol tables to disk.
--fst_compat_symbols=<bool> (default: true)
Require symbol tables to match when appropriate
--fst_field_separator=<string> (default: " ")
Set of characters used as a separator between printed fields
--fst_error_fatal=<bool> (default: true)
FST errors are fatal; otherwise, return objects flagged as bad (e.g., FSTs with the kError property set to true, FST weights that are not a Member())
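
Several of the default symbol values listed above (such as "|", "<eps>", and "}") contain characters that a POSIX shell treats specially, so they need quoting when passed explicitly. The following is a sketch under that assumption: it spells out a few of the documented defaults and enables --write_syms. The file names are placeholders, and the guard is an illustrative addition.

```shell
#!/bin/sh
# Placeholder names (assumptions); substitute your own files.
lm="arpa.lm"
prefix="output_prefix"

# Build the argument list. Single quotes keep |, < and > from being
# interpreted by the shell as a pipe or as redirections.
set -- \
    --input="$lm" \
    --prefix="$prefix" \
    --delim='}' \
    --split='|' \
    --eps='<eps>' \
    --write_syms=true

# Run the conversion only if the tool is installed and the model is readable.
if command -v phonetisaurus-arpa2fst >/dev/null 2>&1 && [ -r "$lm" ]; then
    phonetisaurus-arpa2fst "$@"
else
    echo "phonetisaurus-arpa2fst or $lm not available; nothing converted" >&2
fi
```

Left unquoted, --split=| would start a pipeline and --eps=<eps> would be parsed as redirections, so the tool would never see the intended values.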