SYNOPSIS
phonetisaurus-arpa2fst --input=arpa.lm --prefix=output_prefix [OPTIONS]
DESCRIPTION
phonetisaurus-arpa2fst
This tool converts an ARPA-format language model into a weighted finite-state transducer (WFST) that can be used with phonetisaurus-g2p.
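As an illustration of the synopsis, the following sketch assembles an invocation using only flags documented on this page. The names arpa.lm and output_prefix are the placeholders from the synopsis, and turning on --write_syms is an assumption made for the example, not a requirement. The command runs only if the binary and the input file are present; otherwise the script prints what it would have run.

```shell
# Placeholder names taken from the SYNOPSIS line; substitute your own.
lm="arpa.lm"
prefix="output_prefix"

# Build the command line from documented flags only.
# --write_syms=true is an illustrative choice (default is false).
cmd="phonetisaurus-arpa2fst --input=$lm --prefix=$prefix --write_syms=true"

# Run only when the tool is installed and the input model exists.
# $cmd is left unquoted on purpose so the shell splits it into arguments;
# this is safe here because the placeholder names contain no spaces.
if command -v phonetisaurus-arpa2fst >/dev/null 2>&1 && [ -f "$lm" ]; then
  $cmd
else
  echo "would run: $cmd"
fi
```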
OPTIONS
- --help=<bool> (default: false)
-
- show usage information
- --helpshort=<bool> (default: false)
-
- show brief usage information
- --tmpdir=<string> (default: "/tmp/")
-
- temporary directory
- --v=<int32> (default: 0)
-
- verbosity level
- --fst_align=<bool> (default: false)
-
- Write FST data aligned where appropriate
- --fst_default_cache_gc=<bool> (default: true)
-
- Enable garbage collection of cache
- --fst_default_cache_gc_limit=<int64> (default: 1048576)
-
- Cache byte size that triggers garbage collection
- --fst_verify_properties=<bool> (default: false)
-
- Verify fst properties queried by TestProperties
- --fst_weight_parentheses=<string> (default: "")
-
- Characters enclosing the first weight of a printed composite weight (e.g. pair weight, tuple weight and derived classes) to ensure proper I/O of nested composite weights; must have size 0 (none) or 2 (open and close parenthesis)
- --fst_weight_separator=<string> (default: ",")
-
- Character separator between printed composite weights; must be a single character
- --save_relabel_ipairs=<string> (default: "")
-
- Save input relabel pairs to file
- --save_relabel_opairs=<string> (default: "")
-
- Save output relabel pairs to file
- --delim=<string> (default: "}")
-
- Delimiter used to separate input and output tokens.
- --eps=<string> (default: "<eps>")
-
- Epsilon symbol.
- --input=<string> (default: "")
-
- Input ARPA-format LM.
- --null_sep=<string> (default: "_")
-
- Graphemic null symbol.
- --phi=<string> (default: "<phi>")
-
- Optional Phi (failure) symbol (not currently in use).
- --prefix=<string> (default: "test")
-
- Output filename prefix.
- --sb=<string> (default: "<s>")
-
- Sentence begin symbol.
- --se=<string> (default: "</s>")
-
- Sentence end symbol.
- --split=<string> (default: "|")
-
- Delimiter used to split multi-token symbols.
- --start=<string> (default: "<start>")
-
- Start symbol.
- --write_syms=<bool> (default: false)
-
- Write the symbol tables to disk.
- --fst_compat_symbols=<bool> (default: true)
-
- Require symbol tables to match when appropriate
- --fst_field_separator=<string> (default: " ")
-
- Set of characters used as a separator between printed fields
- --fst_error_fatal=<bool> (default: true)
-
- FST errors are fatal; otherwise, return objects flagged as bad: e.g., FSTs with the kError property set to true, FST weights that are not a Member()
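The roles of --delim and --split can be illustrated with a small sketch that takes one joint symbol apart using the default values listed above. The symbol p|h}f is a hypothetical example chosen for illustration; the symbols in a real ARPA model depend on how the training data was aligned. The script only mimics the splitting described by the option text and does not call the tool.

```shell
# Defaults as documented above.
delim='}'   # --delim: separates input and output tokens
split='|'   # --split: splits multi-token symbols

# Hypothetical joint symbol: input side "p|h", output side "f".
token='p|h}f'

# Everything before the delimiter is the input side,
# everything after it is the output side.
input_side=${token%%"$delim"*}
output_side=${token#*"$delim"}

# Break the multi-token input side into space-separated tokens.
input_tokens=$(printf '%s' "$input_side" | tr "$split" ' ')

echo "input tokens:  $input_tokens"
echo "output tokens: $output_side"
```

If the model was built with different separator characters, pass matching values through --delim and --split so the symbols are interpreted the same way.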