FreeContact(3) fast protein contact predictor


use Carp qw(confess);
use FreeContact;

# Read the alignment, one row per line.
open(my $fh, '<', '/usr/share/doc/libfreecontact-perl/examples/demo_1000.aln') || confess($!);
my @aln = <$fh>; chomp(@aln); close($fh);

# Predict contacts with the default parameters.
my $contacts = FreeContact::Predictor->new()->run(ali => \@aln);

# Predict contacts with an explicit parameter set.
my $predictor = FreeContact::Predictor->new();
my %parset = FreeContact::get_ps_evfold();
$contacts = $predictor->run(ali => \@aln, %parset, num_threads => 1);

# Compute the sequence weights first, then reuse them for the prediction.
my($aliw, $wtot) = $predictor->get_seq_weights(ali => \@aln, num_threads => 1);
$contacts = $predictor->run_with_seq_weights(ali => \@aln, aliw => $aliw, wtot => $wtot, num_threads => 1);


FreeContact is a protein residue contact predictor optimized for speed. Its input is a multiple sequence alignment. FreeContact can function as an accelerated drop-in replacement for the published contact predictors EVfold-mfDCA of D. S. Marks (2011) and PSICOV of D. Jones (2011). FreeContact is accelerated by a combination of vector instructions, multiple threads, and faster implementations of key parts. Depending on the alignment, 10-fold or higher speedups are possible.

A sufficiently large alignment is required for meaningful results. At a minimum, the effective (after-weighting) sequence count of the alignment should be greater than the length of the query sequence. Alignments with tens of thousands of (effective) sequences are considered good input.
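
To make the notion of an effective sequence count concrete, here is a minimal pure-Perl sketch of similarity-based reweighting: each row is weighted by one over the number of rows at least as similar to it as a threshold, and the weights are summed. This illustrates the general idea only; it is not FreeContact's implementation, and the helper names identity() and effective_count(), the treatment of gaps as ordinary characters, and the toy alignment are all assumptions made for the example.

```perl
use strict;
use warnings;

# Fraction of columns at which two equal-length alignment rows agree.
# (Assumption: gaps are compared like any other character.)
sub identity {
    my ($s, $t) = @_;
    my $len = length $s;
    return 0 unless $len;
    my $same = 0;
    for my $k (0 .. $len - 1) {
        $same++ if substr($s, $k, 1) eq substr($t, $k, 1);
    }
    return $same / $len;
}

# Weight each row by 1 / (number of rows at least $threshold identical to it,
# itself included); the effective count is the sum of these weights.
sub effective_count {
    my ($aln, $threshold) = @_;
    my $total = 0;
    for my $s (@$aln) {
        my $neighbours = grep { identity($s, $_) >= $threshold } @$aln;
        $total += 1 / $neighbours;
    }
    return $total;
}

# Toy alignment (made up): three near-identical rows and one unrelated row.
my @toy = qw(ACDEFGHIKL ACDEFGHIKL ACDEFGHIKV WYTSRQPNMM);
my $meff = effective_count(\@toy, 0.8);
printf "effective count %.2f, query length %d\n", $meff, length $toy[0];
```

In practice the value to compare against the query length would come from the predictor itself; the wtot returned by get_seq_weights() appears to be the natural candidate, though that reading is an assumption.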

Alignments can be generated with, for example, jackhmmer(1) from the hmmer package or hhblits(1) from hhsuite.



Get parameters for EVfold-mfDCA operating mode.
Get parameters for PSICOV 'improved results' operating mode.
Get parameters for PSICOV 'sensible default' operating mode. This is much faster than 'improved results' for a slight loss of precision.

These get_ps_*() functions return a hash of arguments (clustpc => num,...,rho => num) that can be used with get_seq_weights(), run() or run_with_seq_weights(). The arguments correspond to the published parametrization of the respective method.



new( dbg => bool )
Creates a FreeContact::Predictor object.


Defaults for the arguments are obtained with get_ps_evfold().
run(ali => [], clustpc => dbl, density => dbl, gapth => dbl, mincontsep => uint, pseudocnt => dbl, pscnt_weight => dbl, estimate_ivcov => bool, shrink_lambda => dbl, cov20 => bool, apply_gapth => bool, rho => dbl, [veczw => bool], [num_threads => int], [icme_timeout => int], [timing => {}])
Defaults for the arguments are obtained with get_ps_evfold().
ali: Reference to an array holding the alignment rows as strings. The first row must hold the query, with no gaps.
clustpc: BLOSUM-style clustering similarity threshold [0-1].
icme_timeout: Inverse covariance matrix estimation timeout in seconds. Default: 1800.

The estimation sometimes gets stuck. If the timeout is reached, the run() method dies with "Caught FreeContact timeout exception: ...". You can catch this exception and handle it as needed, e.g. by setting a higher rho value.

num_threads: Number of OpenMP threads to use. If unset, all CPUs are used.
timing: If given, this hash reference is filled with wall clock timing results in seconds:

    num_threads =>  NUM,
    seqw =>         NUM,
    pairfreq =>     NUM,
    shrink =>       NUM,
    inv =>          NUM,
    all =>          NUM
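
The timeout exception described for the inverse covariance matrix estimation can be handled with a plain eval block. The sketch below retries a prediction with successively higher rho values; run_with_rho_retry() is a hypothetical helper, not part of FreeContact, and it takes the actual prediction as a code reference so that the retry logic stays independent of the module. Only the exception text quoted above is relied upon.

```perl
use strict;
use warnings;

# Call $run->($rho) for each rho in turn until one attempt completes.
# Returns the result together with the rho value that worked.
# Hypothetical helper: $run is supplied by the caller.
sub run_with_rho_retry {
    my ($run, @rhos) = @_;
    for my $rho (@rhos) {
        my $contacts;
        return ($contacts, $rho) if eval { $contacts = $run->($rho); 1 };
        # Re-throw anything that is not the documented timeout message.
        die $@ unless $@ =~ /FreeContact timeout exception/;
        warn "rho=$rho timed out, trying the next value\n";
    }
    die "all rho values timed out\n";
}

# Usage sketch (needs FreeContact, a predictor, a parameter set and @aln;
# the rho values are placeholders, not recommendations):
#
#   my ($contacts, $rho) = run_with_rho_retry(
#       sub { my %p = (%parset, rho => $_[0]); $predictor->run(ali => \@aln, %p) },
#       0.01, 0.02, 0.05);
```

Merging the override into a copy of the parameter hash before the call keeps the example independent of how run() treats repeated keys.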

run() returns a hash reference of contact prediction results:

    fro => [      # identifier of scoring scheme
        [
            I,    # 0-based index of amino acid i
            J,    # 0-based index of amino acid j
            SCORE # contact score
        ], ...
    ],
    MI => ...,
    l1norm => ...

Use 'fro' scores with EVfold.
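
As an illustration of working with the result, the following sketch ranks the pairs of one scoring scheme and reports residue positions in 1-based numbering. It assumes that each scheme maps to an array of [I, J, SCORE] triples; top_contacts() is a hypothetical helper and the scores in the mock result are made up.

```perl
use strict;
use warnings;

# Return the $n highest-scoring pairs of one scoring scheme as
# [i, j, score] triples, with residue indices converted to 1-based numbering.
sub top_contacts {
    my ($contacts, $scheme, $n) = @_;
    my @sorted = sort { $b->[2] <=> $a->[2] } @{ $contacts->{$scheme} };
    $n = @sorted if $n > @sorted;
    return map { [ $_->[0] + 1, $_->[1] + 1, $_->[2] ] } @sorted[0 .. $n - 1];
}

# Mock result in the assumed shape; a real one would come from run().
my $contacts = { fro => [ [0, 5, 0.12], [1, 9, 0.87], [3, 8, 0.45] ] };
for my $c (top_contacts($contacts, 'fro', 2)) {
    printf "%d %d %.2f\n", @$c;
}
```

Sorting a copy leaves the original result untouched, so the same hash reference can be ranked again under another scheme such as 'MI' or 'l1norm'.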


Laszlo Kajan


Copyright (C) 2013 by Laszlo Kajan

This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself, either Perl version 5.10.1 or, at your option, any later version of Perl 5 you may have available.