fsm-lite(1) Frequency-based String Mining


fsm-lite -l <file> -t <file> [options]


A single-core implementation of frequency-based substring mining, used in bioinformatics to extract substrings that discriminate between two (or more) datasets in high-throughput sequencing data.



-l,--list <file>
Text file that lists all input files as whitespace-separated pairs
<data-name> <data-filename>
where <data-name> is a unique identifier (without whitespace) and <data-filename> is the full path to the corresponding input file. The default data file format is FASTA (uncompressed).
-t,--tmp <file>
Store temporary index data
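
The list file described above can be generated with a small script. The following is a minimal sketch, not part of fsm-lite: the helper names write_list_file and build_command, and the sample names and paths in the usage note, are hypothetical. It also assumes one <data-name> <data-filename> pair per line, which the text above does not state explicitly.

```python
from pathlib import Path


def write_list_file(samples, list_path):
    """Write '<data-name> <data-filename>' pairs to list_path.

    samples is an iterable of (data_name, fasta_path) tuples.
    Assumption: one pair per line. Returns the lines written.
    """
    seen = set()
    lines = []
    for name, fasta in samples:
        # The identifier must be non-empty and free of whitespace.
        if not name or any(ch.isspace() for ch in name):
            raise ValueError(f"invalid data-name: {name!r}")
        # The identifier must be unique across the list.
        if name in seen:
            raise ValueError(f"duplicate data-name: {name!r}")
        seen.add(name)
        # The list file expects the full path to each input file.
        lines.append(f"{name} {Path(fasta).resolve()}")
    Path(list_path).write_text("\n".join(lines) + "\n")
    return lines


def build_command(list_path, tmp_path):
    """Build the argument vector for the two required options, -l and -t."""
    return ["fsm-lite", "-l", str(list_path), "-t", str(tmp_path)]
```

For example, write_list_file([("sampleA", "a.fasta"), ("sampleB", "b.fasta")], "input.list") followed by build_command("input.list", "tmp.idx") yields an argument list that could be passed to subprocess.run, assuming fsm-lite is installed and on the PATH.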


-m,--min <int>
Minimum length to report (default 9)
-M,--max <int>
Maximum length to report (default 100)
-f,--freq <int>
Minimum frequency per input file to report (default 1)
-s,--minsupp <int>
Minimum number of input files with support to report (default 2)
-S,--maxsupp <int>
Maximum number of input files with support to report (default inf)
-v,--verbose
Verbose output
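
How the length, frequency and support thresholds combine can be illustrated with a short sketch. This is one reading of the option descriptions above, not code taken from fsm-lite, and it has not been checked against the implementation: it assumes that an input file "supports" a substring when the substring occurs at least --freq times in that file, and that all bounds are inclusive. The function name is_reported is hypothetical.

```python
import math


def is_reported(length, counts, min_len=9, max_len=100,
                freq=1, minsupp=2, maxsupp=math.inf):
    """Decide whether a substring would pass the reporting filters.

    length: length of the substring.
    counts: occurrence count of the substring in each input file.
    Defaults mirror the documented defaults of -m, -M, -f, -s and -S.
    Assumption: all bounds are inclusive.
    """
    # -m / -M: length must fall within [min_len, max_len].
    if not (min_len <= length <= max_len):
        return False
    # -f: a file supports the substring if it occurs at least freq times.
    support = sum(1 for c in counts if c >= freq)
    # -s / -S: the number of supporting files must fall within the bounds.
    return minsupp <= support <= maxsupp
```

Under these assumptions, a substring of length 12 with per-file counts [3, 0, 1] is reported with the defaults (two files support it), whereas the same substring with counts [3, 0, 0] is not.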


This manpage was written by Andreas Tille for the Debian distribution and can be used for any other usage of the program.