fsm-lite(1)
Frequency-based String Mining
SYNOPSIS
fsm-lite -l <file> -t <file> [options]
DESCRIPTION
A single-core implementation of frequency-based substring mining, used in
bioinformatics to extract substrings that discriminate between two (or more)
datasets in high-throughput sequencing data.
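To make the idea of frequency-based substring mining concrete, the following is a naive Python sketch, not the algorithm used by fsm-lite. It assumes that a dataset "supports" a substring when the substring occurs in it at least a given number of times, and that a substring is reported when the number of supporting datasets lies within a given range. The function name mine_substrings and its parameter names are hypothetical; they only mirror the options documented below.

```python
from collections import Counter


def mine_substrings(datasets, min_len=9, max_len=100, freq=1,
                    minsupp=2, maxsupp=None):
    """Naive illustration of frequency-based substring mining.

    datasets maps a data name to a list of sequences (plain strings).
    Returns a dict mapping each reported substring to its support,
    i.e. the number of datasets in which it occurs at least `freq` times.
    """
    support = Counter()
    for name, sequences in datasets.items():
        counts = Counter()
        for seq in sequences:
            for start in range(len(seq)):
                longest = min(max_len, len(seq) - start)
                for length in range(min_len, longest + 1):
                    counts[seq[start:start + length]] += 1
        # Assumption: a dataset supports a substring if it contains
        # at least `freq` (possibly overlapping) occurrences of it.
        for sub, n in counts.items():
            if n >= freq:
                support[sub] += 1
    # With no upper bound given, every dataset may support the substring.
    upper = len(datasets) if maxsupp is None else maxsupp
    return {sub: n for sub, n in support.items() if minsupp <= n <= upper}


# Toy input; real input would be sequencing reads or assemblies.
example = {
    "sampleA": ["ACGTACGT"],
    "sampleB": ["TTACGTAA"],
    "sampleC": ["GGGGCCCC"],
}
result = mine_substrings(example, min_len=4, max_len=5, minsupp=2)
```

This brute-force enumeration is quadratic in the sequence length and only suitable for toy inputs; it is meant to show what the length, frequency and support thresholds select, not how an index-based tool computes the result.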
OPTIONS
mandatory:
- -l,--list <file>
  Text file that lists all input files as whitespace-separated pairs
  <data-name> <data-filename>
  where <data-name> is a unique identifier (without whitespace)
  and <data-filename> is the full path to the input file.
  The default data file format is FASTA (uncompressed).
- -t,--tmp <file>
  Store temporary index data
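As an illustration of the list file described for -l, the following Python sketch writes one "<data-name> <data-filename>" pair per line. Placing one pair per line and deriving the data name from the file's base name are assumptions made for this example, and write_list_file is a hypothetical helper, not part of fsm-lite.

```python
import os
import tempfile


def write_list_file(fasta_paths, list_path):
    """Write '<data-name> <data-filename>' pairs, one per line (assumed layout)."""
    seen = set()
    with open(list_path, "w") as out:
        for path in fasta_paths:
            full = os.path.abspath(path)
            # Assumption: use the base name without extension as the data name.
            name = os.path.splitext(os.path.basename(full))[0]
            if name in seen:
                raise ValueError("duplicate data name: " + name)
            # Whitespace would break the whitespace-separated pair format.
            if any(c.isspace() for c in name + full):
                raise ValueError("whitespace in name or path: " + full)
            seen.add(name)
            out.write(name + " " + full + "\n")


# Demonstration with placeholder paths; the FASTA files need not exist
# for the list file itself to be written.
demo_dir = tempfile.mkdtemp()
demo_list = os.path.join(demo_dir, "input.list")
write_list_file(
    [os.path.join(demo_dir, "sampleA.fasta"),
     os.path.join(demo_dir, "sampleB.fasta")],
    demo_list,
)
```

The helper rejects duplicate names and whitespace up front because the manpage requires unique, whitespace-free identifiers.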
optional:
- -m,--min <int>
  Minimum length to report (default 9)
- -M,--max <int>
  Maximum length to report (default 100)
- -f,--freq <int>
  Minimum frequency per input file to report (default 1)
- -s,--minsupp <int>
  Minimum number of input files with support to report (default 2)
- -S,--maxsupp <int>
  Maximum number of input files with support to report (default inf)
- -v,--verbose
  Verbose output
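The options above can be assembled into a command line programmatically. The following Python sketch only builds the argument list from the flags documented here and does not run the program; build_command and its parameter names are hypothetical, and the file names in the example are placeholders.

```python
def build_command(list_file, tmp_file, min_len=None, max_len=None,
                  freq=None, minsupp=None, maxsupp=None, verbose=False):
    """Build an fsm-lite argument list from the documented options."""
    if min_len is not None and max_len is not None and min_len > max_len:
        raise ValueError("min_len must not exceed max_len")
    if minsupp is not None and maxsupp is not None and minsupp > maxsupp:
        raise ValueError("minsupp must not exceed maxsupp")
    # The two mandatory options come first.
    cmd = ["fsm-lite", "-l", list_file, "-t", tmp_file]
    optional = (
        ("-m", min_len),
        ("-M", max_len),
        ("-f", freq),
        ("-s", minsupp),
        ("-S", maxsupp),
    )
    # Options left as None are omitted so the program's defaults apply.
    for flag, value in optional:
        if value is not None:
            cmd += [flag, str(value)]
    if verbose:
        cmd.append("-v")
    return cmd


cmd = build_command("input.list", "tmp_idx", min_len=20, minsupp=3, verbose=True)
```

Building the command as a list rather than a single string avoids shell quoting issues if it is later passed to something like subprocess.run.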
AUTHOR
This manpage was written by Andreas Tille for the Debian distribution and may be reused for any other use of the program.