m2m-aligner(1) Parallel corpora aligner.


m2m-aligner [OPTIONS]



This tool aligns parallel corpora. It can be used as a preliminary data preparation step in both machine translation and grapheme-to-phoneme training.
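As a minimal sketch of a typical invocation, the Python snippet below assembles an argument list using only options documented on this page. The helper name build_command and the file names train.news and train.aligned are hypothetical. The resulting list could be passed to subprocess.run, assuming the m2m-aligner binary is on the PATH.

```python
def build_command(input_file, output_file, max_x=2, max_y=2,
                  del_x=False, del_y=False, n_best=1):
    """Assemble an m2m-aligner argument list (hypothetical helper)."""
    cmd = [
        "m2m-aligner",
        "--inputFile", input_file,    # required by the tool
        "--outputFile", output_file,
        "--maxX", str(max_x),
        "--maxY", str(max_y),
        "--nBest", str(n_best),
    ]
    # --delX and --delY are listed without a value, so they are
    # treated here as plain switches.
    if del_x:
        cmd.append("--delX")
    if del_y:
        cmd.append("--delY")
    return cmd


# Hypothetical file names, for illustration only.
cmd = build_command("train.news", "train.aligned", del_x=True)
print(" ".join(cmd))
# To run it for real: subprocess.run(cmd, check=True)
```

Passing the arguments as a list rather than a single string avoids shell quoting problems when file names contain spaces.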


--limit Limit the alignment pairs to those found in the initial mapping file (see --init) (default false)

--errorInFile Keep unaligned items in the output file (default false)

--initProb <long double> Cut-off sum prior probability (default 0.5)

--init <string> Initial mapping (model) filename (default null)

--nBest <int> Generate n-best alignments (default n=1)

--inFormat <l2p|news> Input file format [l2p, news] (default news)

--sepInChar <string> Separator used between characters inside an aligned substring (default :)

--sepChar <string> Separator used between aligned substrings (default |)

--nullChar <string> Null character used (default _)

--pProcess <string> Specify a prefix for the output files

--pScore Report the score of each alignment (default false)

--cutoff <double> Training threshold (default 0.01)

--maxFn <conXY|conYX|joint> Maximization function [conXY, conYX, joint] (default conYX)

--eqMap Allow mapping of |x| == |y| > 1 (default false)

--delY Allow deletion of substring y (default false)

--delX Allow deletion of substring x (default false)

--maxY <int> Maximum length of substring y (default 2)

--maxX <int> Maximum length of substring x (default 2)

--alignerIn <string> Aligner model input filename

--alignerOut <string> Aligner model output filename

-o <string>, --outputFile <string> Output filename

-i <string>, --inputFile <string> (required) Input filename

--, --ignore_rest Ignores the rest of the labeled arguments following this flag.

--version Displays version information and exits.

-h, --help Displays usage information and exits.
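
The interplay of --maxX, --maxY, --delX, --delY and --eqMap can be illustrated by enumerating the substring-length pairs (|x|, |y|) they permit. The sketch below is one reading of the option descriptions above, not code taken from the tool. The function name allowed_shapes is hypothetical, a length of 0 stands for a deleted substring, and a deleted substring is assumed to be allowed up to the corresponding maximum length.

```python
def allowed_shapes(max_x=2, max_y=2, del_x=False, del_y=False, eq_map=False):
    """List the (|x|, |y|) length pairs permitted by the given settings."""
    shapes = []
    for i in range(1, max_x + 1):
        for j in range(1, max_y + 1):
            # Without --eqMap, equal lengths greater than 1 are excluded.
            if i == j and i > 1 and not eq_map:
                continue
            shapes.append((i, j))
    if del_x:
        # --delX: an x substring aligned to nothing.
        shapes.extend((i, 0) for i in range(1, max_x + 1))
    if del_y:
        # --delY: a y substring aligned to nothing.
        shapes.extend((0, j) for j in range(1, max_y + 1))
    return shapes


print(allowed_shapes())
# [(1, 1), (1, 2), (2, 1)]
print(allowed_shapes(del_x=True, eq_map=True))
# [(1, 1), (1, 2), (2, 1), (2, 2), (1, 0), (2, 0)]
```

With the defaults, 2-to-2 mappings are excluded and no deletions are possible. Enabling --eqMap adds (2, 2), and --delX adds the pairs whose second length is 0.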