SYNOPSIS
    # Do not use this module directly. Use it via the L<Bio::AlignIO> class.
    use Bio::AlignIO;
    use strict;

    my $in = Bio::AlignIO->new(-format => 'stockholm',
                               -file   => 't/data/testaln.stockholm');
    while ( my $aln = $in->next_aln ) {
        # work with $aln (a Bio::Align::AlignI object) here
    }
DESCRIPTION
This object can transform Bio::Align::AlignI objects to and from Stockholm flat file databases. It has been completely refactored from the original Stockholm parser to handle annotation data, and it now includes a write_aln() method for (almost) complete Stockholm format output.
Stockholm alignment records normally contain additional sequence-based and alignment-based annotation:
    GF Lines (alignment feature/annotation):
    #=GF <featurename> <Generic per-file annotation, free text>
    Placed above the alignment

    GC Lines (Alignment consensus)
    #=GC <featurename> <Generic per-column annotation, exactly 1
         character per column>
    Placed below the alignment

    GS Lines (Sequence annotations)
    #=GS <seqname> <featurename> <Generic per-sequence annotation,
         free text>

    GR Lines (Sequence meta data)
    #=GR <seqname> <featurename> <Generic per-sequence AND per-column
         mark up, exactly 1 character per column>
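The four markup line types listed above can be told apart by their prefix and field count. The following is a minimal pure-Perl sketch of that idea. The helper name classify_stockholm_line and the sample lines are hypothetical, used only for illustration; this is not how Bio::AlignIO::stockholm is implemented internally.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Bio::AlignIO::stockholm): classify one
# line of a Stockholm record by the markup types described above.
sub classify_stockholm_line {
    my ($line) = @_;
    chomp $line;

    return { type => 'header' } if $line =~ /^#\s*STOCKHOLM\s+\S+/;
    return { type => 'end' }    if $line =~ m{^//};
    return { type => 'blank' }  if $line =~ /^\s*$/;

    # Per-file (GF) and per-column (GC) markup: feature name, then text
    if ( $line =~ /^#=(GF|GC)\s+(\S+)\s+(.*)$/ ) {
        return { type => $1, feature => $2, text => $3 };
    }

    # Per-sequence (GS) and per-residue (GR) markup: sequence name,
    # feature name, then text
    if ( $line =~ /^#=(GS|GR)\s+(\S+)\s+(\S+)\s+(.*)$/ ) {
        return { type => $1, seqname => $2, feature => $3, text => $4 };
    }

    # Any other line starting with '#' is treated as a comment here
    return { type => 'comment' } if $line =~ /^#/;

    # "<seqname> <aligned residues>" is a sequence line
    if ( $line =~ /^(\S+)\s+(\S+)\s*$/ ) {
        return { type => 'sequence', seqname => $1, text => $2 };
    }

    return { type => 'unknown' };
}

# Made-up sample lines, for illustration only
my @sample = (
    '# STOCKHOLM 1.0',
    '#=GF ID myID12345',
    '#=GS seq1/1-4 AC X00000',
    'seq1/1-4 ACGU',
    '#=GR seq1/1-4 SS HHHH',
    '#=GC SS_cons HHHH',
    '//',
);

for my $line (@sample) {
    my $rec = classify_stockholm_line($line);
    printf "%-8s %s\n", $rec->{type}, $rec->{feature} // '-';
}
```

A real parser also has to join multi-block alignments and repeated GF lines; the sketch only shows how the line types differ.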
Currently, sequence annotations (those designated with GS tags) are parsed only for accession numbers and descriptions. It is intended that full parsing will be added at some point in the near future along with a builder option for optionally parsing alignment annotation and meta data.
The following methods/tags are currently used for storing and writing the alignment annotation data.
    Tag        SimpleAlign
                 Method
    ----------------------------------------------------------------------
     AC        accession
     ID        id
     DE        description
    ----------------------------------------------------------------------

    Tag        Bio::Annotation   TagName                    Parameters
               Class
    ----------------------------------------------------------------------
     AU        SimpleValue       record_authors             value
     SE        SimpleValue       seed_source                value
     GA        SimpleValue       gathering_threshold        value
     NC        SimpleValue       noise_cutoff               value
     TC        SimpleValue       trusted_cutoff             value
     TP        SimpleValue       entry_type                 value
     SQ        SimpleValue       num_sequences              value
     PI        SimpleValue       previous_ids               value
     DC        Comment           database_comment           comment
     CC        Comment           alignment_comment          comment
     DR        Target            dblink                     database
                                                            primary_id
                                                            comment
     AM        SimpleValue       build_method               value
     NE        SimpleValue       pfam_family_accession      value
     NL        SimpleValue       sequence_start_stop        value
     SS        SimpleValue       sec_structure_source       value
     BM        SimpleValue       build_model                value
     RN        Reference         reference                  *
     RC        Reference         reference                  comment
     RM        Reference         reference                  pubmed
     RT        Reference         reference                  title
     RA        Reference         reference                  authors
     RL        Reference         reference                  location
    ----------------------------------------------------------------------
    * RN is generated based on the number of Bio::Annotation::Reference objects
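As a rough illustration of the mapping above, the sketch below reads a small record and retrieves a few of the mapped values. The record itself is made up, and reading it through an in-memory filehandle passed as -fh is an assumption chosen to keep the example self-contained. It relies on the SimpleAlign accessors id, accession and description, on get_Annotations from the annotation collection, and on the value and text accessors of the SimpleValue and Comment classes. Treat it as a sketch, not as a definitive usage pattern.

```perl
use strict;
use warnings;
use Bio::AlignIO;

# A made-up Stockholm record, for illustration only
my $record = <<'END';
# STOCKHOLM 1.0
#=GF ID   EXAMPLE
#=GF AC   RF99999
#=GF DE   Made-up example family
#=GF AU   Doe J
#=GF CC   This comment is illustrative only.
seq1/1-8  ACGUACGU
seq2/1-7  ACGUAC-U
//
END

# Read from an in-memory filehandle instead of a file on disk
open my $fh, '<', \$record or die "Cannot open in-memory record: $!";
my $in = Bio::AlignIO->new(-format   => 'stockholm',
                           -fh       => $fh,
                           -alphabet => 'rna');

while ( my $aln = $in->next_aln ) {
    # Tags in the first table map to SimpleAlign methods
    print "ID: ", ( $aln->id          // '' ), "\n";
    print "AC: ", ( $aln->accession   // '' ), "\n";
    print "DE: ", ( $aln->description // '' ), "\n";

    # Tags in the second table are stored under their TagName
    my $coll = $aln->annotation;
    for my $author ( $coll->get_Annotations('record_authors') ) {
        print "AU: ", $author->value, "\n";
    }
    for my $comment ( $coll->get_Annotations('alignment_comment') ) {
        print "CC: ", $comment->text, "\n";
    }
}
```

Looking annotations up by TagName rather than by the two-letter Stockholm tag keeps calling code independent of the flat-file syntax.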
Custom annotation
Some users may want to add custom annotation beyond that mapped above. Currently there are two ways to do so; however, the methods used for adding such annotation may change in the future, particularly if alignment Writer classes are introduced. In particular, do not rely on changing the global variables @WRITEORDER or %WRITEMAP, as these may be made private at some point.
1) Use (and abuse) the 'custom' tag. The tagname for the object can differ from the tagname used to store the object in the AnnotationCollection.
    use Bio::Annotation::AnnotationFactory;

    # AnnotationCollection from the SimpleAlign object
    my $coll = $aln->annotation;
    my $factory = Bio::Annotation::AnnotationFactory->new(
        -type => 'Bio::Annotation::SimpleValue');
    # $str holds the annotation text to store
    my $rfann = $factory->create_object(-value   => $str,
                                        -tagname => 'mytag');
    $coll->add_Annotation('custom', $rfann);
    $rfann = $factory->create_object(-value   => 'foo',
                                     -tagname => 'bar');
    $coll->add_Annotation('custom', $rfann);
OUTPUT:
    # STOCKHOLM 1.0
    #=GF ID myID12345
    #=GF mytag katnayygqelggvnhdyddlakfyfgaglealdffnnkeaaakiinwvaEDTTRGKIQDLV??
    #=GF mytag TPtd~????LDPETQALLV???????????????????????NAIYFKGRWE?????????~??
    #=GF mytag ??HEF?A?EMDTKPY??DFQH?TNen?????GRI??????V???KVAM??MF?????????N??
    #=GF mytag ???DD?VFGYAEL????DE???????L??D??????A??TALELAY??????????????????
    #=GF mytag ?????????????KG??????Sa???TSMLILLP???????????????D??????????????
    #=GF mytag ???????????EGTr?????AGLGKLLQ??QL????????SREef??DLNK??L???AH????R
    #=GF mytag ????????????L????????????????????????????????????????R?????????R
    #=GF mytag ??QQ???????V???????AVRLPKFSFefefdlkeplknlgmhqafdpnsdvfklmdqavlvi
    #=GF mytag gdlqhayafkvd????????????????????????????????????????????????????
    #=GF mytag ????????????????????????????????????????????????????????????????
    #=GF mytag ????????????????????????????????????????????????????????????????
    #=GF mytag ????????????????????????????????????????????????????????????????
    #=GF mytag ?????????????INVDEAG?TEAAAATAAKFVPLSLppkt??????????????????PIEFV
    #=GF mytag ADRPFAFAIR??????E?PAT?G????SILFIGHVEDPTP?msv?
    #=GF bar foo
    ...
2) Modify the global @WRITEORDER and %WRITEMAP.
    # AnnotationCollection from the SimpleAlign object
    my $coll = $aln->annotation;

    # add to WRITEORDER
    my @order = @Bio::AlignIO::stockholm::WRITEORDER;
    push @order, 'my_stuff';
    @Bio::AlignIO::stockholm::WRITEORDER = @order;

    # make sure new tag maps to something
    $Bio::AlignIO::stockholm::WRITEMAP{my_stuff} = 'Hobbit/SimpleValue';

    # $factory is a Bio::Annotation::AnnotationFactory, created as in 1)
    my $rfann = $factory->create_object(-value   => 'Frodo',
                                        -tagname => 'Hobbit');
    $coll->add_Annotation('my_stuff', $rfann);
    $rfann = $factory->create_object(-value   => 'Bilbo',
                                     -tagname => 'Hobbit');
    $coll->add_Annotation('my_stuff', $rfann);
OUTPUT:
    # STOCKHOLM 1.0
    #=GF ID myID12345
    #=GF Hobbit Frodo
    #=GF Hobbit Bilbo
    ....
FEEDBACK
Support
Please direct usage questions or support issues to the mailing list rather than to the module maintainer directly. Many experienced and responsive experts will be able to look at the problem and quickly address it. Please include a thorough description of the problem with code and data examples if at all possible.
Reporting Bugs
Report bugs to the Bioperl bug tracking system to help us keep track of the bugs and their resolution. Bug reports can be submitted via the web:
https://github.com/bioperl/bioperl-live/issues
AUTHORS - Chris Fields, Peter Schattner
Email: cjfields-at-uiuc-dot-edu, [email protected]
CONTRIBUTORS
Andreas Kahari, ak-at-ebi.ac.uk
Jason Stajich, jason-at-bioperl.org
APPENDIX
The rest of the documentation details each of the object methods. Internal methods are usually preceded with a _
new
    Title   : new
    Usage   : my $alignio = Bio::AlignIO->new(-format => 'stockholm',
                                              -file   => '>file');
    Function: Initialize a new L<Bio::AlignIO::stockholm> reader or writer
    Returns : L<Bio::AlignIO> object
    Args    : -line_length :  length of the line for the alignment block
              -alphabet    :  symbol alphabet to set the sequences to. If not
                              set, the parser will try to guess based on the
                              alignment accession (if present), defaulting
                              to 'dna'.
              -spaces      :  (optional, def = 1) boolean to add a space in
                              between the "# STOCKHOLM 1.0" header and the
                              annotation and the annotation and the alignment.
next_aln
    Title   : next_aln
    Usage   : $aln = $stream->next_aln()
    Function: returns the next alignment in the stream.
    Returns : L<Bio::Align::AlignI> object
    Args    : NONE
write_aln
    Title   : write_aln
    Usage   : $stream->write_aln(@aln)
    Function: writes the $aln object into the stream in stockholm format
    Returns : 1 for success and 0 for error
    Args    : L<Bio::Align::AlignI> object
line_length
    Title   : line_length
    Usage   : $obj->line_length($newval)
    Function: Set the alignment output line length
    Returns : value of line_length
    Args    : newvalue (optional)
spaces
    Title   : spaces
    Usage   : $obj->spaces(1)
    Function: Set the 'spaces' flag, which prints extra newlines between the
              header and the annotation and the annotation and the alignment
    Returns : value of the 'spaces' flag
    Args    : newvalue (optional)
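The write_aln, line_length and spaces methods described above can be combined to re-emit a record with a different block width. The following is a minimal sketch under stated assumptions. The helper name rewrap_stockholm and the sample record are hypothetical, and the in-memory filehandles passed as -fh are used only to keep the example self-contained. The exact layout of the output depends on the installed BioPerl version, so no expected output is shown.

```perl
use strict;
use warnings;
use Bio::AlignIO;

# Hypothetical helper: parse Stockholm text and write it back out with the
# alignment wrapped at $line_length columns and without the extra newlines
# controlled by the 'spaces' flag.
sub rewrap_stockholm {
    my ( $stockholm_text, $line_length ) = @_;

    open my $in_fh, '<', \$stockholm_text
        or die "Cannot read from string: $!";
    my $output = '';
    open my $out_fh, '>', \$output
        or die "Cannot write to string: $!";

    my $in  = Bio::AlignIO->new(-format   => 'stockholm',
                                -fh       => $in_fh,
                                -alphabet => 'rna');
    my $out = Bio::AlignIO->new(-format => 'stockholm',
                                -fh     => $out_fh);

    # Accessors documented above
    $out->line_length($line_length);
    $out->spaces(0);

    while ( my $aln = $in->next_aln ) {
        $out->write_aln($aln) or die "write_aln reported an error";
    }
    $out->close;

    return $output;
}

# Made-up record, for illustration only
my $record = join "\n",
    '# STOCKHOLM 1.0',
    '#=GF ID EXAMPLE',
    'seq1/1-8 ACGUACGU',
    'seq2/1-7 ACGUAC-U',
    '//',
    '';

print rewrap_stockholm( $record, 4 );
```

Setting the options through the accessors rather than the constructor makes it possible to change them between write_aln calls on the same stream.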
alignhandler
    Title   : alignhandler
    Usage   : $stream->alignhandler($handler)
    Function: Get/Set the Bio::HandlerBaseI object
    Returns : Bio::HandlerBaseI
    Args    : Bio::HandlerBaseI