SYNOPSIS
fastBlockSearch [options] seqs.fa fam.prfl
DESCRIPTION
Searches for hits (matches) of the blocks in the profile fam.prfl within the genomic sequences in the file seqs.fa. Hits are sorted by increasing score, so the last displayed hit is the best one found in the region. The output format is similar to that of the blockscore file (which is optionally generated by msa2prfl.pl): it shows the coordinate, strand, mean odds-ratio score, specificity of the score, and the motif. From the output, users can choose regions with matching blocks on which to perform gene prediction with AUGUSTUS-PPX using the same block profile.
OPTIONS
--cutoff=c
- Minimum average log score that the motifs found must reach. Use it to adjust the sensitivity of the block search. The default cutoff is 0.7, which is very sensitive but can produce many false-positive hits for smaller profiles.
EXAMPLE
> fastBlockSearch --cutoff=1.1 chr4.103M.fa PF00225_seed.prfl

Hits found in chr4 103000000 105000000
Score:207.987
Mult. score:4.83391
1081586  unknown_M[5,13]  -  2.32574  5.04633  .....YATRLKNI
1103952  unknown_L        -  4.85363  6.75245  NAKTRIICTITP
1103991  unknown_K        -  8.38065  9.92928  YRDSKLTRILQNSLG
1104375  unknown_J        -  3.96065  6.79408  RSLFILGQVIKKL
1106992  unknown_I        -  9.22487  7.64306  LVDLAGSE
1115567  unknown_H[5,16]  -  2.31869  5.58986  .....ESRHYGETKMN
1116319  unknown_G        -  7.34282  8.29425  EIYNETITDLL
1117092  unknown_F        -  5.10694  6.10274  VIPRAIHDIF
1117146  unknown_E        -  9.43596  9.18891  QTASGKTYTM
1117176  unknown_D[1,8]   -  5.73796  6.31532  .GTIFAYG
1117399  unknown_B[1,7]   -  3.59083  5.03059  .CLDRVF
1119420  unknown_A[0,8]   -  4.64107  6.44285  RVRPLNSR.
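
To choose a region for a later AUGUSTUS-PPX run, the per-block lines of such a report can be read into records and reduced to a coordinate span. The following Python sketch is not part of fastBlockSearch. It assumes that each block line has six whitespace-separated fields in the order seen in the example above (coordinate, what appears to be the block name, strand, score, specificity, motif) and that all other lines are header lines. The names Hit, parse_hits, and hit_span are hypothetical, and the padding value is an arbitrary illustration.

```python
import re
from typing import List, NamedTuple, Tuple


class Hit(NamedTuple):
    """One block line of the report (field order assumed from the example)."""
    coordinate: int
    block: str
    strand: str
    score: float
    specificity: float
    motif: str


# Assumed layout: coordinate, block name, strand, two numbers, motif.
NUMBER = r"-?\d+(?:\.\d+)?"
HIT_LINE = re.compile(
    rf"^(\d+)\s+(\S+)\s+([+-])\s+({NUMBER})\s+({NUMBER})\s+(\S+)$"
)


def parse_hits(report: str) -> List[Hit]:
    """Return the block lines of a report; header lines are skipped."""
    hits = []
    for line in report.splitlines():
        match = HIT_LINE.match(line.strip())
        if match is None:
            continue  # e.g. "Hits found in ...", "Score:...", "Mult. score:..."
        coord, block, strand, score, spec, motif = match.groups()
        hits.append(
            Hit(int(coord), block, strand, float(score), float(spec), motif)
        )
    return hits


def hit_span(hits: List[Hit], padding: int = 0) -> Tuple[int, int]:
    """Smallest and largest reported coordinate, widened by `padding`."""
    coords = [h.coordinate for h in hits]
    return max(0, min(coords) - padding), max(coords) + padding


# Three lines taken from the example report above.
SAMPLE = """\
Hits found in chr4 103000000 105000000
Score:207.987
Mult. score:4.83391
1081586 unknown_M[5,13] - 2.32574 5.04633 .....YATRLKNI
1106992 unknown_I - 9.22487 7.64306 LVDLAGSE
1119420 unknown_A[0,8] - 4.64107 6.44285 RVRPLNSR.
"""

if __name__ == "__main__":
    sample_hits = parse_hits(SAMPLE)
    start, end = hit_span(sample_hits, padding=10000)
    print(f"{len(sample_hits)} block hits, padded span {start}-{end}")
```

The span is given in the same coordinates that the report uses. Whether these are offsets within the searched sequence or absolute chromosome positions depends on the input file and is not decided by this sketch.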
AUTHORS
Oliver Keller