SYNOPSIS
* creating a TFBS::Matrix::ICM object manually:
    my $matrixref = [ [ 2.00, 0.30, 0.00, 0.00, 0.24, 0.00 ],
                      [ 0.00, 0.00, 0.00, 1.45, 0.42, 0.00 ],
                      [ 0.00, 0.89, 2.00, 0.00, 0.00, 0.00 ],
                      [ 0.00, 0.00, 0.00, 0.13, 0.06, 2.00 ]
                    ];
    my $icm = TFBS::Matrix::ICM->new(-matrix => $matrixref,
                                     -name   => "MyProfile",
                                     -ID     => "M0001"
                                    );
    # or
    my $matrixstring = <<ENDMATRIX;
    2.00 0.30 0.00 0.00 0.24 0.00
    0.00 0.00 0.00 1.45 0.42 0.00
    0.00 0.89 2.00 0.00 0.00 0.00
    0.00 0.00 0.00 0.13 0.06 2.00
    ENDMATRIX
    my $icm = TFBS::Matrix::ICM->new(-matrixstring => $matrixstring,
                                     -name         => "MyProfile",
                                     -ID           => "M0001"
                                    );
* retrieving a TFBS::Matrix::ICM object from a database:
  (See the documentation of the individual TFBS::DB::* modules to learn how to connect to different types of pattern databases and retrieve TFBS::Matrix::* objects from them.)
    my $db_obj = TFBS::DB::JASPAR2->new
                    (-connect => ["dbi:mysql:JASPAR2:myhost",
                                  "myusername", "mypassword"]);
    my $icm = $db_obj->get_Matrix_by_ID("M0001", "ICM");
    # or
    my $icm = $db_obj->get_Matrix_by_name("MyProfile", "ICM");
* retrieving a list of individual TFBS::Matrix::ICM objects from a TFBS::MatrixSet object
  (See the documentation of TFBS::MatrixSet to learn how to create objects for storage and manipulation of multiple matrices.)
    my @icm_list = $matrixset->all_patterns(-sort_by=>"name");
* drawing a sequence logo
    $icm->draw_logo(-file        => "logo.png",
                    -full_scale  => 2.25,
                    -xsize       => 500,
                    -ysize       => 250,
                    -graph_title => "C/EBPalpha binding site logo",
                    -x_title     => "position",
                    -y_title     => "bits");
DESCRIPTION
TFBS::Matrix::ICM is a class whose instances are objects representing information content matrices (ICMs). An ICM is normally calculated from a raw position frequency matrix (see TFBS::Matrix::PFM for the explanation of position frequency matrices). For example, given the following position frequency matrix,
    A:[ 12     3     0     0     4     0  ]
    C:[  0     0     0    11     7     0  ]
    G:[  0     9    12     0     0     0  ]
    T:[  0     0     0     1     1    12  ]
the standard computational procedure is applied to convert it into the following information content matrix:
    A:[2.00  0.30  0.00  0.00  0.24  0.00]
    C:[0.00  0.00  0.00  1.45  0.42  0.00]
    G:[0.00  0.89  2.00  0.00  0.00  0.00]
    T:[0.00  0.00  0.00  0.13  0.06  2.00]
which contains the information content, in bits, that each nucleotide contributes at the given position in the pattern.
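The conversion described above can be sketched in plain Perl. This is a minimal illustration, not the TFBS implementation: the helper name pfm_to_icm is hypothetical, and it assumes a uniform background (maximum of 2 bits per column) with no small-sample correction. Under those assumptions each column's information content is 2 minus its Shannon entropy, shared among the four nucleotides in proportion to their frequencies, which appears to reproduce the example matrix to two decimal places.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of TFBS): converts a 4-row count matrix
# (rows A, C, G, T) into per-nucleotide information content in bits.
# Assumes a uniform background and no small-sample correction.
sub pfm_to_icm {
    my ($pfm) = @_;
    my $ncols = scalar @{ $pfm->[0] };
    my @icm   = map { [ (0) x $ncols ] } 0 .. 3;
    for my $col (0 .. $ncols - 1) {
        my $total = 0;
        $total += $pfm->[$_][$col] for 0 .. 3;
        next unless $total > 0;            # leave empty columns at zero
        my $entropy = 0;
        for my $row (0 .. 3) {
            my $p = $pfm->[$row][$col] / $total;
            $entropy -= $p * log($p) / log(2) if $p > 0;
        }
        my $info = 2 - $entropy;           # log2(4) = 2 bits at most
        for my $row (0 .. 3) {
            $icm[$row][$col] = $info * $pfm->[$row][$col] / $total;
        }
    }
    return \@icm;
}

my $pfm = [ [ 12, 3,  0,  0, 4,  0 ],
            [  0, 0,  0, 11, 7,  0 ],
            [  0, 9, 12,  0, 0,  0 ],
            [  0, 0,  0,  1, 1, 12 ] ];
my $icm = pfm_to_icm($pfm);

# Print one row per nucleotide, rounded to two decimals.
for my $row (@$icm) {
    print join(' ', map { sprintf '%.2f', $_ } @$row), "\n";
}
```

In real code the ICM would normally be obtained from a TFBS::Matrix::PFM object rather than computed by hand; the sketch only makes the arithmetic behind the example visible.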
To search nucleotide sequences, convert the ICM with to_PWM: a TFBS::Matrix::PWM object is equipped with methods to search nucleotide sequences and pairwise alignments of nucleotide sequences with the pattern it represents, and to return a set of sites in the nucleotide sequence (a TFBS::SiteSet object for a single-sequence search, and a TFBS::SitePairSet object for an alignment search).
FEEDBACK
Please send bug reports and other comments to the author.
AUTHOR - Boris Lenhard
Boris Lenhard <[email protected]>
APPENDIX
The rest of the documentation details each of the object methods. Internal methods are preceded with an underscore.
new
 Title   : new
 Usage   : my $icm = TFBS::Matrix::ICM->new(%args)
 Function: constructor for the TFBS::Matrix::ICM object
 Returns : a new TFBS::Matrix::ICM object
 Args    : # you must specify exactly one of the following three:
           -matrix,       # reference to an array of arrays of numbers
                          #or
           -matrixstring, # a string containing four lines
                          # of tab- or space-delimited numbers
                          #or
           -matrixfile,   # the name of a file containing four lines
                          # of tab- or space-delimited numbers
           #######
           -name,         # string, OPTIONAL
           -ID,           # string, OPTIONAL
           -class,        # string, OPTIONAL
           -tags          # an array reference, OPTIONAL
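To make the expected -matrixstring layout concrete, the following plain-Perl sketch splits such a string into the array-of-arrays form accepted by -matrix. The helper parse_matrixstring is hypothetical and is not part of TFBS; the constructor does its own parsing, which may differ in detail (for example in how it validates input).

```perl
use strict;
use warnings;

# Hypothetical helper, not part of TFBS: turns four lines of tab- or
# space-delimited numbers into a reference to an array of four rows
# (in the order A, C, G, T used throughout this document).
sub parse_matrixstring {
    my ($string) = @_;
    # Skip blank lines; split ' ' splits on any run of whitespace.
    my @rows = map  { [ split ' ', $_ ] }
               grep { /\S/ } split /\n/, $string;
    die "expected 4 rows, got " . scalar(@rows) . "\n" unless @rows == 4;
    my $width = @{ $rows[0] };
    for my $row (@rows) {
        die "rows differ in length\n" unless @$row == $width;
    }
    return \@rows;
}

my $matrixref = parse_matrixstring(<<'ENDMATRIX');
2.00 0.30 0.00 0.00 0.24 0.00
0.00 0.00 0.00 1.45 0.42 0.00
0.00 0.89 2.00 0.00 0.00 0.00
0.00 0.00 0.00 0.13 0.06 2.00
ENDMATRIX

print scalar(@$matrixref), " rows, ",
      scalar(@{ $matrixref->[0] }), " columns\n";
```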
to_PWM
 Title   : to_PWM
 Usage   : my $pwm = $icm->to_PWM()
 Function: converts an information content matrix
           (a TFBS::Matrix::ICM object) to a position weight matrix.
           At present it assumes a uniform background distribution
           of nucleotide frequencies.
 Returns : a new TFBS::Matrix::PWM object
 Args    : none; in future releases, it should be able to accept
           a user-defined background probability of the four
           nucleotides
draw_logo
 Title   : draw_logo
 Usage   : my $gdImageObj = $icm->draw_logo(%args)
 Function: Draws a "sequence logo", a graphical representation
           of a possibly degenerate fixed-width nucleotide
           sequence pattern, from the information content matrix
 Returns : a GD::Image object;
           if you only need the image file you can ignore it
 Args    : -file,        # the name of the output PNG image file
                         # OPTIONAL: default none
           -xsize        # width of the image in pixels
                         # OPTIONAL: default 600
           -ysize        # height of the image in pixels
                         # OPTIONAL: default 5/8 of -xsize
           -startpos     # start position in the logo for x axis
                         # OPTIONAL: default is 1
           -margin       # size of image margins in pixels
                         # OPTIONAL: default 15% of -ysize
           -full_scale   # the maximum value on the y-axis, in bits
                         # OPTIONAL: default 2.25
           -graph_title, # the graph title
                         # OPTIONAL: default none
           -x_title,     # x-axis title; OPTIONAL: default none
           -y_title      # y-axis title; OPTIONAL: default none
           -error_bars   # reference to an array of S.D. values
                         # for each column; OPTIONAL
           -ps           # if true, produces a postscript string
                         # instead of a GD::Image object
           -pdf          # if true AND the -file argument is used,
                         # produces an output pdf file
_draw_ps_logo
 Title   : _draw_ps_logo
 Usage   : my $postscript_string = $icm->_draw_ps_logo(%args)
           Internal method, should be accessed using draw_logo()
 Function: Draws a "sequence logo", a graphical representation
           of a possibly degenerate fixed-width nucleotide
           sequence pattern, from the information content matrix
 Returns : a postscript string;
           if you only need the image file you can ignore it
 Args    : -file,        # the name of the output PNG image file
                         # OPTIONAL: default none
           -xsize        # width of the image in pixels
                         # OPTIONAL: default 600
           -ysize        # height of the image in pixels
                         # OPTIONAL: default 5/8 of -xsize
           -full_scale   # the maximum value on the y-axis, in bits
                         # OPTIONAL: default 2.25
           -graph_title, # the graph title
                         # OPTIONAL: default none
           -x_title,     # x-axis title; OPTIONAL: default none
           -y_title      # y-axis title; OPTIONAL: default none
_draw_svg_logo
name
ID
class
matrix
length
revcom
rawprint
prettyprint
The above methods are common to all matrix objects. Please consult TFBS::Matrix to find out how to use them.