SYNOPSIS
hhalign -i query [-t template] [options]
DESCRIPTION
HHalign version 2.0.16 (January 2013)
Align a query alignment/HMM to a template alignment/HMM by HMM-HMM alignment. If only one alignment/HMM is given, it is compared to itself, and the best off-diagonal alignment plus all further non-overlapping alignments above the significance threshold are shown.
Remmert M, Biegert A, Hauser A, and Soding J. HHblits: Lightning-fast iterative protein sequence searching by HMM-HMM alignment. Nat. Methods 9:173-175 (2011).
(C) Johannes Soeding, Michael Remmert, Andreas Biegert, Andreas Hauser
- -i <file>
- input query alignment (fasta/a2m/a3m) or HMM file (.hhm)
- -t <file>
- input template alignment (fasta/a2m/a3m) or HMM file (.hhm)
Output options:
- -o <file>
- write output alignment to file
- -ofas <file>
- write alignments in FASTA, A2M (-oa2m) or A3M (-oa3m) format
- -Oa3m <file>
- write query alignment in a3m format to file (default=none)
- -Aa3m <file>
- append query alignment in a3m format to file (default=none)
- -atab <file>
- write alignment as a table (with posteriors) to file (default=none)
- -index <file>
- use given alignment to calculate Viterbi score (default=none)
- -v <int>
- verbose mode: 0: no screen output, 1: only warnings, 2: verbose
- -seq [1,inf[
- max. number of query/template sequences displayed (def=1)
- -nocons
- don't show consensus sequence in alignments (default=show)
- -nopred
- don't show predicted 2ndary structure in alignments (default=show)
- -nodssp
- don't show DSSP 2ndary structure in alignments (default=show)
- -ssconf
- show confidences for predicted 2ndary structure in alignments
- -aliw <int>
- number of columns per line in alignment list (def=80)
- -P <float>
- for self-comparison: max p-value of alignments (def=0.001)
- -p <float>
- minimum probability in summary and alignment list (def=0)
- -E <float>
- maximum E-value in summary and alignment list (def=1E+06)
- -Z <int>
- maximum number of lines in summary hit list (def=100)
- -z <int>
- minimum number of lines in summary hit list (def=1)
- -B <int>
- maximum number of alignments in alignment list (def=100)
- -b <int>
- minimum number of alignments in alignment list (def=1)
- -rank <int>
- specify rank of alignment to write with -Oa3m or -Aa3m option (default=1)
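The interplay of -p, -E, -Z and -z above can be sketched in Python. This is only an illustration under stated assumptions: the `Hit` class and `summary_hits` function are hypothetical names, and the order in which hhalign itself applies the thresholds is assumed, not taken from its source. The sketch always keeps the first z hits, then requires both thresholds, and never returns more than Z hits.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    # Hypothetical record for one alignment in the hit list.
    name: str
    prob: float    # probability in percent
    evalue: float  # E-value


def summary_hits(hits, p=0.0, E=1e6, Z=100, z=1):
    """Select hits for the summary list, best probability first.

    Assumed semantics: the first z hits are shown unconditionally,
    further hits must satisfy prob >= p and evalue <= E,
    and at most Z hits are shown in total.
    """
    ranked = sorted(hits, key=lambda h: h.prob, reverse=True)
    shown = []
    for hit in ranked:
        if len(shown) >= Z:
            break
        if len(shown) < z or (hit.prob >= p and hit.evalue <= E):
            shown.append(hit)
    return shown
```

For example, with -p 50 only hits of at least 50% probability are listed, unless -z forces additional lines.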
Filter input alignment (options can be combined):
- -id [0,100]
- maximum pairwise sequence identity (%) (def=90)
- -diff [0,inf[
- filter most diverse set of sequences, keeping at least this many sequences in each block of >50 columns (def=100)
- -cov [0,100]
- minimum coverage with query (%) (def=0)
- -qid [0,100]
- minimum sequence identity with query (%) (def=0)
- -qsc [0,100]
- minimum score per column with query (def=-20.0)
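As a rough illustration of how the -id, -cov and -qid filters above can combine, here is a minimal Python sketch. It assumes equal-length aligned sequences with '-' as the gap character and the query as the first sequence. The function names and the exact definitions of coverage and identity are assumptions made for this example; hhalign's own filter may define them differently (and -diff and -qsc are not covered).

```python
def coverage(query, seq):
    """Percent of query residue columns in which seq also has a residue."""
    cols = [i for i, q in enumerate(query) if q != "-"]
    covered = sum(1 for i in cols if seq[i] != "-")
    return 100.0 * covered / len(cols)


def identity(a, b):
    """Percent identical residues over columns where both have a residue."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    return 100.0 * sum(x == y for x, y in pairs) / len(pairs)


def filter_alignment(seqs, max_id=90.0, cov=0.0, qid=0.0):
    """Keep the query (seqs[0]) plus every sequence that passes all filters.

    A sequence is dropped if its coverage or identity with the query is
    below the thresholds, or if it is more than max_id percent identical
    to a sequence that was already kept.
    """
    query = seqs[0]
    kept = [query]
    for s in seqs[1:]:
        if coverage(query, s) < cov or identity(query, s) < qid:
            continue
        if any(identity(k, s) > max_id for k in kept):
            continue
        kept.append(s)
    return kept
```

Because the options can be combined, a sequence has to pass every active threshold to survive.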
Input alignment format:
- -M a2m
- use A2M/A3M (default): upper case = Match; lower case = Insert; '-' = Delete; '.' = gaps aligned to inserts (may be omitted)
- -M first
- use FASTA: columns with residue in 1st sequence are match states
- -M [0,100]
- use FASTA: columns with fewer than X% gaps are match states
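The two FASTA rules above (-M first and -M [0,100]) can be sketched as follows. This is a simplified Python illustration: `match_columns` is a hypothetical helper, and it counts gaps without sequence weighting, which is an assumption and may differ from what hhalign does internally.

```python
def match_columns(msa, mode="first"):
    """Return the indices of alignment columns treated as match states.

    mode="first": columns with a residue in the first sequence.
    mode=<number>: columns with fewer than that percentage of gaps
                   (unweighted gap count, an assumption of this sketch).
    """
    ncols = len(msa[0])
    if mode == "first":
        return [j for j in range(ncols) if msa[0][j] != "-"]
    threshold = float(mode)
    cols = []
    for j in range(ncols):
        gaps = sum(1 for s in msa if s[j] == "-")
        if 100.0 * gaps / len(msa) < threshold:
            cols.append(j)
    return cols
```

All remaining columns would be treated as insert states relative to the resulting HMM.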
HMM-HMM alignment options:
- -glob/-loc
- global or local alignment mode (def=local)
- -alt <int>
- show up to this number of alternative alignments (def=1)
- -realign
- realign displayed hits with max. accuracy (MAC) algorithm
- -norealign
- do NOT realign displayed hits with MAC algorithm (def=realign)
- -mact [0,1[
- posterior probability threshold for MAC alignment (def=0.350) A threshold value of 0.0 yields global alignments.
- -sto <int>
- use global stochastic sampling algorithm to sample this many alignments
- -excl <range>
- exclude query positions from the alignment, e.g. '1-33,97-168'
- -shift [-1,1]
- score offset (def=-0.030)
- -corr [0,1]
- weight of term for pair correlations (def=0.10)
- -ssm 0-4
- 0: no ss scoring, 1: ss scoring after alignment, 2: ss scoring during alignment (default=2)
- -ssw [0,1]
- weight of ss score (def=0.11)
- -def
- read default options from ./.hhdefaults or <home>/.hhdefaults
Example: hhalign -i T0187.a3m -t d1hz4a_.hhm -png T0187pdb.png
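The range syntax shown for -excl above ('1-33,97-168') can be read as a comma-separated list of start-end pairs. The Python sketch below parses such a string; the helper names are hypothetical, and treating the positions as 1-based and inclusive is an assumption, not something the help text states.

```python
def parse_excl(spec):
    """Parse a range string such as '1-33,97-168' into (start, end) tuples."""
    ranges = []
    for part in spec.split(","):
        start_text, end_text = part.split("-")
        start, end = int(start_text), int(end_text)
        if start > end:
            raise ValueError(f"invalid range: {part}")
        ranges.append((start, end))
    return ranges


def is_excluded(pos, ranges):
    """True if query position pos falls inside any excluded range (inclusive)."""
    return any(start <= pos <= end for start, end in ranges)
```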
Output options:
- -tc <file>
- write a TCoffee library file for the pairwise comparison
- -tct [0,100]
- min. probability of residue pairs for TCoffee (def=5%)
HMM-building options:
- -M a2m
- use A2M/A3M (default): upper case = Match; lower case = Insert; '-' = Delete; '.' = gaps aligned to inserts (may be omitted)
- -M first
- use FASTA: columns with residue in 1st sequence are match states
- -M [0,100]
- use FASTA: columns with fewer than X% gaps are match states
- -tags
- do NOT neutralize His-, C-myc-, FLAG-tags, and trypsin recognition sequence to background distribution
Pseudocount (pc) options:
- -pcm 0-2
- position dependence of pc admixture 'tau' (pc mode, default=2)
- 0: no pseudocounts: tau = 0
- 1: constant: tau = a
- 2: diversity-dependent: tau = a/(1 + ((Neff[i]-1)/b)^c) (Neff[i]: number of effective seqs in local MSA around column i)
- 3: constant diversity pseudocounts
- -pca [0,1]
- overall pseudocount admixture (def=1.0)
- -pcb [1,inf[
- Neff threshold value for -pcm 2 (def=1.5)
- -pcc [0,3]
- extinction exponent c for -pcm 2 (def=1.0)
- -pre_pca [0,1]
- PREFILTER pseudocount admixture (def=0.8)
- -pre_pcb [1,inf[
- PREFILTER threshold for Neff (def=1.8)
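The admixture formulas for -pcm 0, 1 and 2 can be written out directly. The Python sketch below uses the defaults listed above for -pca (a), -pcb (b) and -pcc (c); the function name `pc_admixture` is hypothetical, and mode 3 is left out because the help text gives no formula for it.

```python
def pc_admixture(neff, mode=2, a=1.0, b=1.5, c=1.0):
    """Pseudocount admixture tau for a column with neff effective sequences.

    mode 0: no pseudocounts, tau = 0
    mode 1: constant, tau = a
    mode 2: diversity-dependent, tau = a / (1 + ((neff - 1) / b) ** c)
    """
    if mode == 0:
        return 0.0
    if mode == 1:
        return a
    if mode == 2:
        return a / (1.0 + ((neff - 1.0) / b) ** c)
    raise ValueError("mode 3 is not specified by a formula in the help text")
```

With the defaults, a single-sequence column (Neff = 1) gets tau = 1.0, and tau falls to 0.5 at Neff = 2.5, so more diverse alignments receive a smaller pseudocount admixture.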
Context-specific pseudo-counts:
- -nocontxt
- use substitution-matrix instead of context-specific pseudocounts
- -contxt <file>
- context file for computing context-specific pseudocounts (default=./data/context_data.lib)
- -cslib <file>
- column state file for fast database prefiltering (default=./data/cs219.lib)
Gap cost options:
- -gapb [0,inf[
- Transition pseudocount admixture (def=1.00)
- -gapd [0,inf[
- Transition pseudocount admixture for open gap (default=0.15)
- -gape [0,1.5]
- Transition pseudocount admixture for extend gap (def=1.00)
- -gapf ]0,inf]
- factor to increase/reduce the gap open penalty for deletes (def=0.60)
- -gapg ]0,inf]
- factor to increase/reduce the gap open penalty for inserts (def=0.60)
- -gaph ]0,inf]
- factor to increase/reduce the gap extend penalty for deletes (def=0.60)
- -gapi ]0,inf]
- factor to increase/reduce the gap extend penalty for inserts (def=0.60)
- -egq [0,inf[
- penalty (bits) for end gaps aligned to query residues (def=0.00)
- -egt [0,inf[
- penalty (bits) for end gaps aligned to template residues (def=0.00)
Alignment options:
- -glob/-loc
- global or local alignment mode (def=global)
- -mac
- use Maximum Accuracy (MAC) alignment instead of Viterbi
- -mact [0,1]
- posterior prob threshold for MAC alignment (def=0.350)
- -sto <int>
- use global stochastic sampling algorithm to sample this many alignments
- -sc <int>
- amino acid score (tja: template HMM at column j) (def=1)
- 0: = log2 Sum(tja*qia/pa) (pa: aa background frequencies)
- 1: = log2 Sum(tja*qia/pqa) (pqa = 1/2*(pa+ta))
- 2: = log2 Sum(tja*qia/ta) (ta: av. aa freqs in template)
- 3: = log2 Sum(tja*qia/qa) (qa: av. aa freqs in query)
- -corr [0,1]
- weight of term for pair correlations (def=0.10)
- -shift [-1,1]
- score offset (def=-0.030)
- -r
- repeat identification: multiple hits not treated as independent
- -ssm 0-2
- 0: no ss scoring, 1: ss scoring after alignment, 2: ss scoring during alignment (default=2)
- -ssw [0,1]
- weight of ss score compared to column score (def=0.11)
- -ssa [0,1]
- ss confusion matrix = (1-ssa)*I + ssa*psipred-confusion-matrix (def=1.00)
- -calm 0-3
- empirical score calibration of 0:query 1:template 2:both (def=off)
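The four -sc score variants listed above differ only in the frequencies used as the denominator. A minimal Python sketch, assuming each profile column is given as a dict from amino acid to probability (the function name and data layout are choices made for this example, not hhalign's internals):

```python
import math


def column_score(q, t, mode=1, pa=None, ta=None, qa=None):
    """log2-sum score between query column q (qia) and template column t (tja).

    pa: background frequencies, ta/qa: average frequencies in the
    template/query. Only the tables needed by the chosen mode are read.
    """
    total = 0.0
    for aa, q_prob in q.items():
        if mode == 0:
            denom = pa[aa]
        elif mode == 1:
            denom = 0.5 * (pa[aa] + ta[aa])  # pqa = 1/2*(pa+ta)
        elif mode == 2:
            denom = ta[aa]
        elif mode == 3:
            denom = qa[aa]
        else:
            raise ValueError("mode must be 0, 1, 2 or 3")
        total += t.get(aa, 0.0) * q_prob / denom
    return math.log2(total)
```

On a toy two-letter alphabet with uniform background, two identical fully conserved columns score 1 bit, and two columns equal to the background score 0 bits.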
Default options can be specified in './.hhdefaults' or '~/.hhdefaults'