tantan(1)
NAME
tantan - low complexity and tandem repeat masker for biosequences
SYNOPSIS
tantan [options] fasta-sequence-file(s)
DESCRIPTION
Find simple repeats in sequences
Options (default settings):
  -p          interpret the sequences as proteins
  -x          letter to use for masking, instead of lowercase
  -c          preserve uppercase/lowercase in non-masked regions
  -m          file for letter pair scores (+1/-1, but -p selects BLOSUM62)
  -r          probability of a repeat starting per position (0.005)
  -e          probability of a repeat ending per position (0.05)
  -w          maximum tandem repeat period to consider (100, but -p selects 50)
  -d          probability decay per period (0.9)
  -a          gap existence cost (0)
  -b          gap extension cost (infinite: no gaps)
  -s          minimum repeat probability for masking (0.5)
  -f          output type: 0=masked sequence, 1=repeat probabilities,
              2=repeat counts, 3=BED (0)
  -h, --help  show help message, then exit
  --version   show version information, then exit
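
To illustrate how the masking options fit together, here is a minimal Python sketch. It is not tantan's implementation: it assumes the per-position repeat probabilities are already known, and only shows how a threshold like -s, a masking letter like -x, and case preservation like -c could combine. The function name mask_sequence and its parameters are hypothetical. It also assumes that unmasked letters are uppercased unless case is preserved, and that a probability equal to the threshold counts as masked.

```python
def mask_sequence(seq, probs, min_prob=0.5, mask_letter=None, preserve_case=False):
    """Mask positions whose repeat probability reaches min_prob.

    Hypothetical helper for illustration only; not part of tantan.
    """
    if len(seq) != len(probs):
        raise ValueError("seq and probs must have the same length")
    out = []
    for letter, p in zip(seq, probs):
        if p >= min_prob:  # assumption: the threshold is inclusive
            # Masked position: lowercase by default, or a fixed letter (like -x).
            out.append(mask_letter if mask_letter else letter.lower())
        else:
            # Unmasked position: uppercased unless case is preserved (like -c).
            out.append(letter if preserve_case else letter.upper())
    return "".join(out)


seq = "acgtATATATggc"
probs = [0.1] * 4 + [0.9] * 6 + [0.2] * 3  # made-up probabilities

print(mask_sequence(seq, probs))                    # ACGTatatatGGC
print(mask_sequence(seq, probs, mask_letter="N"))   # ACGTNNNNNNGGC
print(mask_sequence(seq, probs, mask_letter="N",
                    preserve_case=True))            # acgtNNNNNNggc
```

In this sketch, raising min_prob masks fewer positions and lowering it masks more, which is the trade-off the -s option describes.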