Bio::Tree::DistanceFactory(3) Construct a tree using distance-based methods

SYNOPSIS


use Bio::Tree::DistanceFactory;
use Bio::AlignIO;
use Bio::Align::DNAStatistics;
my $tfactory = Bio::Tree::DistanceFactory->new(-method => "NJ");
my $stats = Bio::Align::DNAStatistics->new();
my $alnin = Bio::AlignIO->new(-format => 'clustalw',
                              -file   => 'file.aln');
my $aln = $alnin->next_aln;
# Of course, the matrix can come from elsewhere, such as a
# PHYLIP file; Bio::Matrix::IO should be able to parse
# many formats
my $jcmatrix = $stats->distance(-align  => $aln,
                                -method => 'Jukes-Cantor');
my $tree = $tfactory->make_tree($jcmatrix);

DESCRIPTION

This factory constructs a phylogenetic tree from the pairwise distances among a set of sequences. Two tree construction methods are currently implemented: UPGMA (Sokal and Michener 1958) and neighbor-joining, NJ (Saitou and Nei 1987).

REFERENCES

Durbin R, Eddy SR, Krogh A, Mitchison G, (1998) "Biological Sequence Analysis", Cambridge Univ Press, Cambridge, UK.

Howe K, Bateman A, Durbin R, (2002) "QuickTree: building huge Neighbour-Joining trees of protein sequences." Bioinformatics 18(11):1546-1547.

Saitou N and Nei M, (1987) "The neighbor-joining method: a new method for reconstructing phylogenetic trees." Mol Biol Evol 4(4):406-425.

FEEDBACK

Mailing Lists

User feedback is an integral part of the evolution of this and other Bioperl modules. Send your comments and suggestions preferably to the Bioperl mailing list. Your participation is much appreciated.

  [email protected]                  - General discussion
  http://bioperl.org/wiki/Mailing_lists  - About the mailing lists

Support

Please direct usage questions or support issues to the mailing list:

[email protected]

rather than to the module maintainer directly. Many experienced and responsive experts will be able to look at the problem and quickly address it. Please include a thorough description of the problem with code and data examples if at all possible.

Reporting Bugs

Report bugs to the Bioperl bug tracking system to help us keep track of the bugs and their resolution. Bug reports can be submitted via the web:

  https://github.com/bioperl/bioperl-live/issues

AUTHOR - Jason Stajich

Email jason-at-bioperl.org

APPENDIX

The rest of the documentation details each of the object methods. Internal methods are usually preceded by an underscore (_).

new

 Title   : new
 Usage   : my $obj = Bio::Tree::DistanceFactory->new();
 Function: Builds a new Bio::Tree::DistanceFactory object 
 Returns : an instance of Bio::Tree::DistanceFactory
 Args    : -method => 'NJ' or 'UPGMA'

make_tree

 Title   : make_tree
 Usage   : my $tree = $disttreefact->make_tree($matrix);
 Function: Build a Tree based on a distance matrix
 Returns : L<Bio::Tree::TreeI>
 Args    : L<Bio::Matrix::MatrixI> object

_nj

 Title   : _nj
 Usage   : my $tree = $disttreefact->_nj($matrix);
 Function: Construct a tree from a distance matrix using the
           Neighbor-Joining algorithm (Saitou and Nei, 1987).
           The implementation is based on Kevin Howe's QuickTree
           and uses his tricks (some based on Bill Bruno's work)
           to eliminate negative branch lengths.
 Returns : L<Bio::Tree::TreeI>
 Args    : L<Bio::Matrix::MatrixI> object
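
To make the core step of neighbor-joining concrete, here is a minimal pure-Perl sketch of one iteration: choosing the pair to join and computing its two branch lengths. It is not the module's code. The function name nj_pick_pair and the plain array-of-arrays input are assumptions made for illustration (the module itself takes a Bio::Matrix::MatrixI object), and the sketch does not include the QuickTree adjustments for negative branch lengths mentioned above.

```perl
use strict;
use warnings;
use List::Util qw(sum);

# One neighbor-joining step on a symmetric distance matrix.
# $d is a reference to an array of rows (hypothetical input format).
# Returns (i, j, length_i, length_j) for the pair minimizing
#   Q(i,j) = (n - 2) * d(i,j) - r(i) - r(j),  where r(i) = sum_k d(i,k).
sub nj_pick_pair {
    my ($d) = @_;
    my $n = scalar @$d;
    die "need at least three taxa\n" if $n < 3;

    my @r = map { sum(@$_) } @$d;    # row sums

    my ($best_q, $bi, $bj);
    for my $i (0 .. $n - 2) {
        for my $j ($i + 1 .. $n - 1) {
            my $q = ($n - 2) * $d->[$i][$j] - $r[$i] - $r[$j];
            if (!defined $best_q || $q < $best_q) {
                ($best_q, $bi, $bj) = ($q, $i, $j);
            }
        }
    }

    # Branch lengths from the joined pair to the new internal node.
    # No correction for negative lengths is applied in this sketch.
    my $len_i = $d->[$bi][$bj] / 2 + ($r[$bi] - $r[$bj]) / (2 * ($n - 2));
    my $len_j = $d->[$bi][$bj] - $len_i;
    return ($bi, $bj, $len_i, $len_j);
}
```

A full implementation would then replace the two joined rows with a single row for the new node and repeat until only two or three nodes remain.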

_upgma

 Title   : _upgma
 Usage   : my $tree = $disttreefact->_upgma($matrix);
 Function: Construct a tree from a distance matrix using UPGMA
 Returns : L<Bio::Tree::TreeI>
 Args    : L<Bio::Matrix::MatrixI> object
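
For readers unfamiliar with UPGMA, the following pure-Perl sketch shows the general technique: repeatedly merge the two closest clusters, place the new node at half their distance, and average distances weighted by cluster size. It is an illustration under stated assumptions, not the module's implementation. The function name upgma_newick, the array-based inputs, and the Newick string output are all hypothetical; the module returns a Bio::Tree::TreeI object.

```perl
use strict;
use warnings;

# UPGMA sketch. $names: array ref of leaf labels; $d: array ref of rows of
# a symmetric distance matrix in the same order (hypothetical input format).
# Returns the tree as a Newick string.
sub upgma_newick {
    my ($names, $d) = @_;

    # Each cluster tracks its Newick text, leaf count, and node height.
    my @clusters = map { +{ label => $_, size => 1, height => 0 } } @$names;
    my @dist = map { [@$_] } @$d;    # working copy of the matrix

    while (@clusters > 1) {
        # Find the closest pair of clusters.
        my ($bi, $bj, $best);
        for my $i (0 .. $#clusters - 1) {
            for my $j ($i + 1 .. $#clusters) {
                if (!defined $best || $dist[$i][$j] < $best) {
                    ($bi, $bj, $best) = ($i, $j, $dist[$i][$j]);
                }
            }
        }

        my ($ci, $cj) = @clusters[$bi, $bj];
        my $h = $best / 2;           # height of the new internal node
        my $merged = {
            label  => sprintf('(%s:%g,%s:%g)',
                              $ci->{label}, $h - $ci->{height},
                              $cj->{label}, $h - $cj->{height}),
            size   => $ci->{size} + $cj->{size},
            height => $h,
        };

        # Size-weighted average distance from the merged cluster to the rest.
        my @new_row;
        for my $k (0 .. $#clusters) {
            next if $k == $bi || $k == $bj;
            push @new_row,
                ($ci->{size} * $dist[$bi][$k] + $cj->{size} * $dist[$bj][$k])
                / $merged->{size};
        }

        # Drop the merged rows and columns (higher index first).
        for my $idx ($bj, $bi) {
            splice @clusters, $idx, 1;
            splice @dist, $idx, 1;
            splice @$_, $idx, 1 for @dist;
        }

        # Append the merged cluster as the last row and column.
        push @{ $dist[$_] }, $new_row[$_] for 0 .. $#new_row;
        push @dist, [@new_row, 0];
        push @clusters, $merged;
    }
    return $clusters[0]{label} . ';';
}
```

Because UPGMA assumes a constant rate of change, it yields a rooted tree whose leaves are all equidistant from the root; NJ makes no such assumption.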

method

 Title   : method
 Usage   : $obj->method($newval)
 Function: Get or set the tree construction method
 Example : $obj->method('UPGMA');
 Returns : value of method (a scalar)
 Args    : on set, new value (a scalar or undef, optional)

check_additivity

 Title     : check_additivity
 Usage     : if( $distance->check_additivity($matrix) ) {
             }
 Function  : See if matrix obeys the additivity principle
 Returns   : boolean
 Args      : Bio::Matrix::MatrixI 
 References: Based on a Java implementation by
             Peter Sestoft, [email protected] 1999-12-07 version 0.3
             http://www.dina.kvl.dk/~sestoft/bsa.html
             which in turn is based on algorithms described in 
             R. Durbin, S. Eddy, A. Krogh, G. Mitchison. 
             Biological Sequence Analysis CUP 1998, Chapter 7.
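
One standard way to test additivity is the four-point condition described in the chapter cited above: for every four taxa i, j, k, l, the two largest of the sums d(i,j)+d(k,l), d(i,k)+d(j,l) and d(i,l)+d(j,k) must be equal. The sketch below illustrates that test in plain Perl. Whether check_additivity uses exactly this formulation is an assumption; the function name is_additive, the array-of-arrays input, and the tolerance parameter are hypothetical.

```perl
use strict;
use warnings;

# Four-point condition on a symmetric distance matrix.
# $d is a reference to an array of rows (hypothetical input format);
# $eps is an optional tolerance for floating-point comparison.
# Returns 1 if every quadruple satisfies the condition, 0 otherwise.
sub is_additive {
    my ($d, $eps) = @_;
    $eps = 1e-9 unless defined $eps;
    my $n = scalar @$d;

    for my $i (0 .. $n - 4) {
        for my $j ($i + 1 .. $n - 3) {
            for my $k ($j + 1 .. $n - 2) {
                for my $l ($k + 1 .. $n - 1) {
                    my @sums = sort { $a <=> $b } (
                        $d->[$i][$j] + $d->[$k][$l],
                        $d->[$i][$k] + $d->[$j][$l],
                        $d->[$i][$l] + $d->[$j][$k],
                    );
                    # The two largest sums must agree.
                    return 0 if abs($sums[2] - $sums[1]) > $eps;
                }
            }
        }
    }
    return 1;
}
```

A matrix with fewer than four taxa has no quadruples to check, so this sketch reports it as additive.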