man phonetisaurus-calculateER (1): estimates grapheme-to-phoneme error rate

SYNOPSIS

phonetisaurus-calculateER --hyp "hypseq or file" --ref "refseq or file" --usep "" [OPTIONS]

DESCRIPTION

phonetisaurus-calculateER

This tool evaluates the performance of grapheme-to-phoneme (G2P) tools by comparing hypothesis pronunciations against reference transcriptions.
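
As a rough illustration of the kind of metric such an evaluation reports, the sketch below computes a unit-level error rate as total edit distance divided by total reference length. It is not taken from the tool's source: the function names `edit_distance` and `error_rate` are hypothetical, the sample pronunciations are made up, and the separator is treated as a literal string rather than a regex, mirroring only the `--usep` default of a single space.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences of units."""
    # prev[j] holds the distance between the reference prefix seen so far
    # and the first j units of the hypothesis.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def error_rate(refs, hyps, usep=" "):
    """Total edit distance divided by total number of reference units."""
    total_errors = 0
    total_units = 0
    for ref, hyp in zip(refs, hyps):
        ref_units = ref.split(usep)
        hyp_units = hyp.split(usep)
        total_errors += edit_distance(ref_units, hyp_units)
        total_units += len(ref_units)
    return total_errors / total_units if total_units else 0.0


# Hypothetical sample data: the first hypothesis drops the final phoneme.
refs = ["T EH S T", "K AE T"]
hyps = ["T EH S", "K AE T"]
print(error_rate(refs, hyps))
```

The same routine gives a word-level figure if each whole pronunciation is treated as a single unit; how the tool itself aggregates its scores is not specified here.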

OPTIONS

-h, --help
: Show this help message and exit.
--hyp HYP, -w HYP
: The file/string containing G2P/ASR hypotheses.
--ref REF, -r REF
: The file/string containing G2P/ASR reference transcriptions.
--usep USEP, -u USEP
: Character or regex separating units in a sequence. Defaults to ' '.
--fsep FSEP, -s FSEP
: Character or regex separating fields in a sequence. Defaults to '\t'.
--format FORMAT, -f FORMAT
: Input format. One of 'cmu', 'htk', 'g2p'. Defaults to 'g2p'.
--ignore IGNORE, -i IGNORE
: Ignore the specified characters when encountered in a HYPOTHESIS. Given as a space-separated list.
--regex_ignore REGEX_IGNORE, -n REGEX_IGNORE
: Ignore characters matching the given regular expression when encountered in a HYPOTHESIS.
--ignore_both, -b
: Apply --ignore and --regex_ignore to both the HYPOTHESIS and REFERENCE files. Useful for analysis.
--testfile TESTFILE, -t TESTFILE
: The test file in dictionary format: one word and one pronunciation per line, separated by '\t'.
--prefix PREFIX, -p PREFIX
: Prefix used to generate the wordlist, hypothesis and reference files. Defaults to 'test'.
--modelfile MODELFILE, -m MODELFILE
: Path to the phoneticizer model.
--mbrdecode, -e
: Use the LMBR decoder.
--alpha ALPHA, -a ALPHA
: Alpha for the MBR decoder.
--order ORDER, -o ORDER
: N-gram order for the MBR decoder.
--precision PRECISION, -x PRECISION
: Average N-gram precision factor for the LMBR decoder. (.85)
--ratio RATIO, -y RATIO
: N-gram ratio factor for the LMBR decoder. (.72)
--beam BEAM, -z BEAM
: LMBR/N-best search beam. Larger values are slower but give better results. (1500)
--verbose, -v
: Verbose mode.
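
The dictionary format described under --testfile (one word and one pronunciation per line, separated by a tab) can be split into a word list and a reference list along the following lines. This is a minimal sketch under stated assumptions: `split_test_entries` is a hypothetical helper, the sample entries are invented, and everything stays in memory, since the names of the files the tool writes under --prefix are not documented here.

```python
def split_test_entries(lines, fsep="\t"):
    """Split dictionary-format lines into parallel word and pronunciation lists."""
    words, prons = [], []
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        # Split on the first field separator only, so the pronunciation
        # keeps its own unit separators intact.
        word, pron = line.split(fsep, 1)
        words.append(word)
        prons.append(pron)
    return words, prons


# Hypothetical sample entries in "word<TAB>pronunciation" form.
sample = ["test\tT EH S T\n", "cat\tK AE T\n", "\n"]
words, prons = split_test_entries(sample)
print(words)
print(prons)
```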
