Bio::Structure::SecStr::DSSP::Res(3) Module for parsing/accessing dssp output

SYNOPSIS


my $dssp_obj = Bio::Structure::SecStr::DSSP::Res->new('-file'=>'filename.dssp');
# or
my $dssp_obj = Bio::Structure::SecStr::DSSP::Res->new('-fh'=>\*STDIN);
# get DSSP defined secondary structure for residue 20
my $sec_str = $dssp_obj->resSecStr( 20 );
# get DSSP defined secondary structure summary for PDB residue 10 of chain A
my $sec_str_sum = $dssp_obj->resSecStrSum( '10:A' );

DESCRIPTION

DSSP::Res wraps the output of the DSSP program in an object. Its methods extract either all of the information in the output file or convenient subsets of it. The principal purpose of DSSP is to determine the secondary structural elements of a given structure.

    ( Kabsch W, Sander C.
      Dictionary of protein secondary structure: pattern recognition
      of hydrogen-bonded and geometrical features.
      Biopolymers. 1983 Dec;22(12):2577-637. )

The DSSP program is available from:
  http://www.cmbi.kun.nl/swift/dssp

Secondary structure information is available on a per residue basis ( see the resSecStr and resSecStrSum methods ) or on a per chain basis ( see the secBounds method ).

resSecStr() & secBounds() return one of the following:
    'H' = alpha helix
    'B' = residue in isolated beta-bridge
    'E' = extended strand, participates in beta ladder
    'G' = 3-helix (3/10 helix)
    'I' = 5 helix (pi helix)
    'T' = hydrogen bonded turn
    'S' = bend
    ' ' = no assignment

A more general classification is returned by the resSecStrSum() method. Its purpose is to give DSSP and STRIDE derived output a method with the same range of return values. Its output is one of the following:

    'H' = helix         ( => 'H', 'G', or 'I' from above )
    'B' = beta          ( => 'B' or 'E' from above )
    'T' = turn          ( => 'T' or 'S' from above )
    ' ' = no assignment ( => ' ' from above )
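
The mapping from the detailed codes to the summary codes can be sketched as a plain lookup table. This is a minimal illustration and not the module's own code: the names %SUMMARY_OF and summarize_ss() are hypothetical, and falling back to ' ' for unknown or empty input is an assumption.

```perl
use strict;
use warnings;

# Illustrative lookup table (an assumption, not the module's internals):
# detailed DSSP code => summary class.
my %SUMMARY_OF = (
    'H' => 'H', 'G' => 'H', 'I' => 'H',    # helices
    'B' => 'B', 'E' => 'B',                # beta bridge / extended strand
    'T' => 'T', 'S' => 'T',                # turn / bend
    ' ' => ' ',                            # no assignment
);

# summarize_ss() is a hypothetical helper name.
# Unknown or empty codes are treated as "no assignment" (an assumption).
sub summarize_ss {
    my ($code) = @_;
    return ' ' unless defined $code && exists $SUMMARY_OF{$code};
    return $SUMMARY_OF{$code};
}

print join( '', map { summarize_ss($_) } qw(H G I B E T S) ), "\n";
```

With real data, the same summary is obtained directly from resSecStrSum(); the table only makes the grouping explicit.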

The methods are roughly divided into 3 sections:

1. Global features of this structure (PDB ID, total surface area,
    etc.).  These methods do not require an argument.

2. Residue specific features ( amino acid, secondary structure,
    solvent exposed surface area, etc. ).  These methods do require an
    argument.  The argument is supposed to uniquely identify a
    residue described within the structure.  It can be of any of the
    following forms:

    ('#A:B') or ( #, 'A', 'B' )
      || |
      || +- Chain ID (blank for single chain)
      |+--- Insertion code for this residue.  Blank for most residues.
      +---- Numeric portion of residue ID.

    (#)
     |
     +--- Numeric portion of residue ID.  If there is only one chain and
          it has no ID AND there is no residue with an insertion code at this
          number, then this can uniquely specify a residue.

    ('#:C') or ( #, 'C' )
      | |
      | +- Chain ID
      +--- Numeric portion of residue ID.

    If a residue is incompletely specified then the first residue that
    fits the arguments is returned.  For example, if 19 is the argument
    and there are three chains, A, B, and C, each with a residue whose
    number is 19, then 19:A will be returned (assuming it is listed first).

    Since neither DSSP nor STRIDE correctly handles alt-loc codes, they
    are not supported by these modules.

3. Value-added methods.  Return values are not verbatim strings
    parsed from DSSP or STRIDE output.
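
The residue identifier forms described in section 2 can be illustrated with a small parser. This is a sketch only: parse_res_id() is a hypothetical helper rather than part of the module, and the regular expression (an optional minus sign, a single letter as insertion code, a single character as chain ID) is an assumption about what the identifiers look like.

```perl
use strict;
use warnings;

# parse_res_id() is a hypothetical helper, not a method of this module.
# It accepts '#A:B', '#:C', '#', or the list forms ( #, 'A', 'B' ) and
# ( #, 'C' ), and returns a hash reference with num, ins and chain.
sub parse_res_id {
    my @args = @_;
    my ( $num, $ins, $chain );

    if ( @args == 1 ) {
        # String form: number, optional insertion code, optional ':chain'.
        ( $num, $ins, $chain ) =
            $args[0] =~ /^\s*(-?\d+)([A-Za-z]?)(?::(.?))?\s*$/
            or die "Unrecognized residue id '$args[0]'\n";
    }
    elsif ( @args == 2 ) {
        ( $num, $chain ) = @args;          # ( #, 'C' )
    }
    elsif ( @args == 3 ) {
        ( $num, $ins, $chain ) = @args;    # ( #, 'A', 'B' )
    }
    else {
        die "Expected 1 to 3 arguments\n";
    }

    # Missing parts are reported as empty strings.
    $ins   = '' unless defined $ins;
    $chain = '' unless defined $chain;
    return { num => $num, ins => $ins, chain => $chain };
}

my $id = parse_res_id('10B:A');
printf "num=%s ins='%s' chain='%s'\n", @{$id}{qw(num ins chain)};
```

The accessors accept these forms directly, so a helper like this is only needed when the parts of an identifier must be inspected separately.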

FEEDBACK

Mailing Lists

User feedback is an integral part of the evolution of this and other Bioperl modules. Send your comments and suggestions preferably to one of the Bioperl mailing lists. Your participation is much appreciated.

  [email protected]                  - General discussion
  http://bioperl.org/wiki/Mailing_lists  - About the mailing lists

Support

Please direct usage questions or support issues to the mailing list:

[email protected]

rather than to the module maintainer directly. Many experienced and responsive experts will be able to look at the problem and quickly address it. Please include a thorough description of the problem with code and data examples if at all possible.

Reporting Bugs

Report bugs to the Bioperl bug tracking system to help us keep track of the bugs and their resolution. Bug reports can be submitted via the web:

  https://github.com/bioperl/bioperl-live/issues

AUTHOR - Ed Green

Email [email protected]

APPENDIX

The rest of the documentation details each method. Internal methods are prefixed with an underscore ( _ ).

CONSTRUCTOR

new

 Title         : new
 Usage         : makes new object of this class
 Function      : Constructor
 Example       : $dssp_obj = Bio::Structure::SecStr::DSSP::Res->new( -file => 'filename.dssp' );
 Returns       : object (ref)
 Args          : -file => filename or -fh => FILEHANDLE
                 ( input must be proper DSSP output )

ACCESSORS

totSurfArea

 Title         : totSurfArea
 Usage         : returns total accessible surface area in square Angstroms
 Function      :
 Example       : $surArea = $dssp_obj->totSurfArea();
 Returns       : scalar
 Args          : none

numResidues

 Title         : numResidues
 Usage         : returns the total number of residues in all chains or
                 just the specified chain if a chain is specified
 Function      :
 Example       : $num_res = $dssp_obj->numResidues();
 Returns       : scalar int
 Args          : chain id ( optional ).  No arg => all chains

pdbID

 Title         : pdbID
 Usage         : returns pdb identifier ( e.g., 1FJM )
 Function      :
 Example       : $pdb_id = $dssp_obj->pdbID();
 Returns       : scalar string
 Args          : none

pdbAuthor

 Title         : pdbAuthor
 Usage         : returns author field
 Function      :
 Example       : $auth = $dssp_obj->pdbAuthor();
 Returns       : scalar string
 Args          : none

pdbCompound

 Title         : pdbCompound
 Usage         : returns pdbCompound given in PDB file
 Function      :
 Example       : $cmpd = $dssp_obj->pdbCompound();
 Returns       : scalar string
 Args          : none

pdbDate

 Title         : pdbDate
 Usage         : returns date given in PDB file
 Function      :
 Example       : $pdb_date = $dssp_obj->pdbDate();
 Returns       : scalar
 Args          : none

pdbHeader

 Title         : pdbHeader
 Usage         : returns header info from PDB file
 Function      :
 Example       : $header = $dssp_obj->pdbHeader();
 Returns       : scalar
 Args          : none

pdbSource

 Title         : pdbSource
 Usage         : returns pdbSource information from PDBSOURCE line
 Function      :
 Example       : $pdbSource = $dssp_obj->pdbSource();
 Returns       : scalar
 Args          : none

resAA

 Title         : resAA
 Usage         : fetches the 1 char amino acid code, given an id
 Function      :
 Example       : $aa = $dssp_obj->resAA( '20:A' ); # pdb id as arg
 Returns       : 1 character scalar string
 Args          : RESIDUE_ID

resPhi

 Title         : resPhi
 Usage         : returns phi angle of a single residue
 Function      : accessor
 Example       : $phi = $dssp_obj->resPhi( RESIDUE_ID )
 Returns       : scalar
 Args          : RESIDUE_ID

resPsi

 Title         : resPsi
 Usage         : returns psi angle of a single residue
 Function      : accessor
 Example       : $psi = $dssp_obj->resPsi( RESIDUE_ID )
 Returns       : scalar
 Args          : RESIDUE_ID

resSolvAcc

 Title         : resSolvAcc
 Usage         : returns solvent exposed area of this residue in
                 square Angstroms
 Function      :
 Example       : $solv_acc = $dssp_obj->resSolvAcc( RESIDUE_ID );
 Returns       : scalar
 Args          : RESIDUE_ID

resSurfArea

 Title         : resSurfArea
 Usage         : returns solvent exposed area of this residue in
                 square Angstroms
 Function      :
 Example       : $solv_acc = $dssp_obj->resSurfArea( RESIDUE_ID );
 Returns       : scalar
 Args          : RESIDUE_ID

resSecStr

 Title         : resSecStr
 Usage         : $ss = $dssp_obj->resSecStr( RESIDUE_ID );
 Function      : returns the DSSP secondary structural designation of this residue
 Example       :
 Returns       : a character ( 'B', 'E', 'G', 'H', 'I', 'S', 'T', or ' ' )
 Args          : RESIDUE_ID
 NOTE          : The range of this method differs from that of the
    resSecStr method in the STRIDE SecStr parser.  That is because of the
    slightly different format for STRIDE and DSSP output.  The resSecStrSum
    method exists to map these different ranges onto an identical range.

resSecStrSum

 Title         : resSecStrSum
 Usage         : $ss = $dssp_obj->resSecStrSum( $id );
 Function      : returns what secondary structure group this residue belongs
                 to.  One of:  'H': helix ( H, G, or I )
                               'B': beta  ( B or E )
                               'T': turn  ( T or S )
                               ' ': none  ( ' ' )
                 This method is similar to resSecStr, but the information
                 it returns is less specific.
 Example       :
 Returns       : a character ( 'H', 'B', 'T', or ' ' )
 Args          : dssp residue number or pdb residue identifier

hBonds

 Title         : hBonds
 Usage         : returns the number of H bonds of each of 14 different types
 Function      :
 Example       : $hb = $dssp_obj->hBonds();
 Returns       : pointer to 14 element array of ints
 Args          : none
 NOTE          : The different types of H-Bonds reported are, in order:
    TYPE O(I)-->H-N(J)
    IN PARALLEL BRIDGES
    IN ANTIPARALLEL BRIDGES
    TYPE O(I)-->H-N(I-5)
    TYPE O(I)-->H-N(I-4)
    TYPE O(I)-->H-N(I-3)
    TYPE O(I)-->H-N(I-2)
    TYPE O(I)-->H-N(I-1)
    TYPE O(I)-->H-N(I+0)
    TYPE O(I)-->H-N(I+1)
    TYPE O(I)-->H-N(I+2)
    TYPE O(I)-->H-N(I+3)
    TYPE O(I)-->H-N(I+4)
    TYPE O(I)-->H-N(I+5)
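
Because the counts come back as a bare 14 element array, it can help to pair each count with its label. The following is a sketch under stated assumptions: label_hbonds() and @HB_LABELS are hypothetical names, the labels simply follow the order listed in the NOTE above, and the zero counts are placeholders rather than values from a real DSSP file.

```perl
use strict;
use warnings;

# Labels in the order given in the NOTE above (hypothetical variable name).
my @HB_LABELS = ( 'O(I)-->H-N(J)', 'IN PARALLEL BRIDGES', 'IN ANTIPARALLEL BRIDGES' );
push @HB_LABELS, sprintf( 'O(I)-->H-N(I%+d)', $_ ) for -5 .. 5;

# label_hbonds() is a hypothetical helper: it pairs each count with its
# label and returns a list of [ label, count ] array references.
sub label_hbonds {
    my ($counts) = @_;
    die "Expected a 14 element array reference\n"
        unless ref($counts) eq 'ARRAY' && @$counts == @HB_LABELS;
    return map { [ $HB_LABELS[$_], $counts->[$_] ] } 0 .. $#HB_LABELS;
}

# Placeholder zeros; with a parsed file the argument would be the
# array reference returned by hBonds().
my @counts = (0) x 14;
printf "%-24s %d\n", @$_ for label_hbonds( \@counts );
```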

numSSBr

 Title         : numSSBr
 Usage         : returns info about number of SS-bridges
 Function      :
 Example       : @SS_br = $dssp_obj->numSSBr();
 Returns       : 3 element scalar int array
 Args          : none

resHB_O_HN

 Title         : resHB_O_HN
 Usage         : returns pointer to a 4 element array
                 consisting of: relative position of binding
                 partner #1, energy of that bond (kcal/mol),
                 relative position of binding partner #2,
                 energy of that bond (kcal/mol).  If the bond
                 is not bifurcated, the second bond is reported
                 as 0, 0.0
 Function      : accessor
 Example       : $oBonds_ptr = $dssp_obj->resHB_O_HN( RESIDUE_ID )
 Returns       : pointer to 4 element array
 Args          : RESIDUE_ID

resHB_NH_O

 Title         : resHB_NH_O
 Usage         : returns pointer to a 4 element array
                 consisting of: relative position of binding
                 partner #1, energy of that bond (kcal/mol),
                 relative position of binding partner #2,
                 energy of that bond (kcal/mol).  If the bond
                 is not bifurcated, the second bond is reported
                 as 0, 0.0
 Function      : accessor
 Example       : $nhBonds_ptr = $dssp_obj->resHB_NH_O( RESIDUE_ID )
 Returns       : pointer to 4 element array
 Args          : RESIDUE_ID
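
The 4 element array returned by resHB_O_HN and resHB_NH_O can be unpacked into a list of bonding partners. This is a minimal sketch: hb_partners() is a hypothetical helper, the sample values are placeholders, and it assumes that the relative positions are offsets in DSSP's sequential residue numbering.

```perl
use strict;
use warnings;

# hb_partners() is a hypothetical helper.  $pos is the position the
# relative offsets are counted from (assumed here to be the DSSP
# sequential number); $bonds is a 4 element array reference laid out
# as ( offset1, energy1, offset2, energy2 ).
sub hb_partners {
    my ( $pos, $bonds ) = @_;
    my @partners;
    for my $i ( 0, 2 ) {
        my ( $offset, $energy ) = @{$bonds}[ $i, $i + 1 ];
        next if $offset == 0;    # an entry of 0, 0.0 marks an absent bond
        push @partners, { position => $pos + $offset, energy => $energy };
    }
    return @partners;
}

# Placeholder values for illustration, not taken from a real DSSP file.
my @found = hb_partners( 20, [ -4, -2.1, 0, 0.0 ] );
printf "partner at %d, %.1f kcal/mol\n", $_->{position}, $_->{energy} for @found;
```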

resTco

 Title         : resTco
 Usage         : returns tco angle around this residue
 Function      : accessor
 Example       : $tco = $dssp_obj->resTco( RESIDUE_ID )
 Returns       : scalar
 Args          : RESIDUE_ID

resKappa

 Title         : resKappa
 Usage         : returns kappa angle around this residue
 Function      : accessor
 Example       : $kappa = $dssp_obj->resKappa( RESIDUE_ID )
 Returns       : scalar
 Args          : RESIDUE_ID ( dssp or PDB )

resAlpha

 Title         : resAlpha
 Usage         : returns alpha angle around this residue
 Function      : accessor
 Example       : $alpha = $dssp_obj->resAlpha( RESIDUE_ID )
 Returns       : scalar
 Args          : RESIDUE_ID ( dssp or PDB )

secBounds

 Title         : secBounds
 Usage         : gets residue ids of boundary residues in each
                 contiguous secondary structural element of specified
                 chain
 Function      : returns pointer to array of 3 element arrays.  First
                 two elements are the PDB IDs of the start and end points,
                 respectively and inclusively.  The last element is the
                 DSSP secondary structural assignment code,
                 i.e. one of : ('B', 'E', 'G', 'H', 'I', 'S', 'T', or ' ')
 Example       : $ss_elements_pts = $dssp_obj->secBounds( 'A' );
 Returns       : pointer to array of arrays
 Args          : chain id ( 'A', for example ).  No arg => no chain id
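
The shape of the secBounds return value can be illustrated by grouping per residue assignments into contiguous runs. This sketch does not reproduce the module's implementation: sec_bounds() is a hypothetical stand-in, the input pairs are placeholders, and it ignores chain breaks for simplicity.

```perl
use strict;
use warnings;

# sec_bounds() is a hypothetical stand-in that mimics the shape of the
# secBounds return value.  Input: ordered [ residue_id, ss_code ] pairs
# for one chain.  Output: reference to an array of [ start, end, code ].
sub sec_bounds {
    my (@residues) = @_;
    my @elements;
    for my $res (@residues) {
        my ( $id, $code ) = @$res;
        if ( @elements && $elements[-1][2] eq $code ) {
            $elements[-1][1] = $id;                 # extend the current run
        }
        else {
            push @elements, [ $id, $id, $code ];    # start a new run
        }
    }
    return \@elements;
}

# Placeholder assignments for illustration only.
my $bounds = sec_bounds(
    [ '1:A', ' ' ], [ '2:A', 'H' ], [ '3:A', 'H' ],
    [ '4:A', 'H' ], [ '5:A', 'T' ],
);
printf "%s to %s: '%s'\n", @$_ for @$bounds;
```

Each inner array holds the start ID, the end ID (inclusive) and the assignment code, which is the same layout the Function entry above describes.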

chains

 Title         : chains
 Usage         : returns pointer to array of chain I.D.s (characters)
 Function      :
 Example       : $chains_pnt = $dssp_obj->chains();
 Returns       : pointer to array of characters, one of which may be ' '
 Args          : none

residues

 Title         : residues
 Usage         : returns array of residue identifiers for all residues in
                 the output file, or in a specific chain
 Function      :
 Example       : @residues_ids = $dssp_obj->residues();
 Returns       : array of residue identifiers
 Args          : if none => returns residue ids of all residues of all
                 chains (in order); if chain id is given, returns just the
                 residue ids of residues in that chain

getSeq

 Title         : getSeq
 Usage         : returns a Bio::PrimarySeq object which represents a good
                 guess at the sequence of the given chain
 Function      : For most chains of most entries, the sequence returned by
                 this method will be very good.  However, it is inherently
                 unsafe to rely on DSSP to extract sequence information about
                 a PDB entry.  More reliable information can be obtained from
                 the PDB entry itself.
 Example       : $pso = $dssp_obj->getSeq( 'A' );
 Returns       : (pointer to) a PrimarySeq object
 Args          : Chain identifier.  If none given, ' ' is assumed.  If no ' '
                 chain, the first chain is used.

INTERNAL METHODS

_pdbChain

 Title         : _pdbChain
 Usage         : returns the pdb chain id of given residue
 Function      :
 Example       : $chain_id = $dssp_obj->_pdbChain( DSSP_KEY );
 Returns       : scalar
 Args          : DSSP_KEY ( dssp or pdb )

_resAA

 Title         : _resAA
 Usage         : fetches the 1 char amino acid code, given a dssp id
 Function      :
 Example       : $aa = $dssp_obj->_resAA( dssp_id );
 Returns       : 1 character scalar string
 Args          : dssp_id

_pdbNum

 Title        : _pdbNum
 Usage        : fetches the numeric portion of the identifier for a given
                residue as reported by the pdb entry.  Note, this DOES NOT
                uniquely specify a residue.  There may be an insertion code
                and/or chain identifier differences.
 Function     :
 Example      : $pdbNum = $self->_pdbNum( DSSP_ID );
 Returns      : a scalar
 Args         : DSSP_ID

_pdbInsCo

 Title        : _pdbInsCo
 Usage        : fetches the Insertion Code for this residue, if it has one.
 Function     :
 Example      : $pdbInsCo = $self->_pdbInsCo( DSSP_ID );
 Returns      : a scalar
 Args         : DSSP_ID

_toPdbId

 Title        : _toPdbId
 Usage        : Takes a dssp key and builds the corresponding
                PDB identifier string
 Function     :
 Example      : $pdbId = $self->_toPdbId( DSSP_ID );
 Returns      : scalar
 Args         : DSSP_ID

_contSegs

 Title         : _contSegs
 Usage         : find the endpoints of continuous regions of this structure
 Function      : returns pointer to array of 3 element arrays.
                 Elements are the dssp keys of the start and end points of each
                 continuous element and its PDB chain id (may be blank).
                 Note that it is common to have several
                 continuous elements with the same chain id.  This occurs
                 when an internal region is disordered and no structural
                 information is available.
 Example       : $cont_seg_ptr = $dssp_obj->_contSegs();
 Returns       : pointer to array of arrays
 Args          : none

_numResLines

 Title         : _numResLines
 Usage         : returns the total number of residue lines in this
                 dssp file.
                 This number is DIFFERENT from the number of residues in
                 the pdb file because dssp has chain termination and chain
                 discontinuity 'residues'.
 Function      :
 Example       : $num_res = $dssp_obj->_numResLines();
 Returns       : scalar int
 Args          : none

_toDsspKey

 Title         : _toDsspKey
 Usage         : returns the unique dssp integer key given a pdb residue id.
                 All accessor methods require (internally)
                 the dssp key.   This method is very useful in converting
                 pdb keys to dssp keys so the accessors can accept pdb keys
                 as argument.  PDB Residue IDs are inherently
                 problematic since they have multiple parts of
                 overlapping function and ill-defined or observed
                 convention in form.  Input can be in any of the formats
                 described in the DESCRIPTION section above.
 Function      :
 Example       : $dssp_id = $dssp_obj->_toDsspKey( '10B:A' )
 Returns       : scalar int
 Args          : pdb residue identifier: num[insertion code]:[chain]

_parse

 Title         : _parse
 Usage         : parses dssp output
 Function      :
 Example       : used by the constructor
 Returns       :
 Args          : input source ( handled by Bio::Root::IO )

_parseResLine

 Title         : _parseResLine
 Usage         : parses a single residue line
 Function      :
 Example       : used internally
 Returns       :
 Args          : residue line ( string )