SYNOPSIS
# Do not use this module directly. Use it via the Bio::SeqIO class.
use Bio::SeqIO;

# read a SeqXML file
my $seqio = Bio::SeqIO->new(-format => 'seqxml',
                            -file   => 'my_seqs.xml');

while (my $seq_object = $seqio->next_seq) {
    print join("\t",
               $seq_object->display_id,
               $seq_object->description,
               $seq_object->seq,
              ), "\n";
}

# write a SeqXML file
#
# Note that you can (optionally) specify the source
# (usually a database) and source version.
my $seqwriter = Bio::SeqIO->new(-format        => 'seqxml',
                                -file          => ">outfile.xml",
                                -source        => 'Ensembl',
                                -sourceVersion => '56');

# $seq_object is any sequence object, e.g. one returned by next_seq
$seqwriter->write_seq($seq_object);

# once you've written all of your seqs, you may want to do
# an explicit close to get the closing </seqXML> tag
$seqwriter->close;
DESCRIPTION
This object can transform Bio::Seq objects to and from SeqXML format. For more information on the SeqXML standard, visit <http://www.seqxml.org>. In short, SeqXML is a lightweight sequence format that takes advantage of the validation capabilities of XML while not overburdening you with a strict and complicated schema.
This module is based in part (particularly the XML-parsing part) on Bio::TreeIO::phyloxml by Mira Han.
FEEDBACK
Mailing Lists
User feedback is an integral part of the evolution of this and other Bioperl modules. Send your comments and suggestions preferably to one of the Bioperl mailing lists. Your participation is much appreciated.
[email protected]                  - General discussion
http://bioperl.org/wiki/Mailing_lists  - About the mailing lists
Support
Please direct usage questions or support issues to the mailing list rather than to the module maintainer directly. Many experienced and responsive experts will be able to look at the problem and quickly address it. Please include a thorough description of the problem with code and data examples if at all possible.
Reporting Bugs
Report bugs to the Bioperl bug tracking system to help us keep track of the bugs and their resolution. Bug reports can be submitted via the web:
https://github.com/bioperl/bioperl-live/issues
AUTHORS - Dave Messina
Email: [email protected]
CONTRIBUTORS
APPENDIX
The rest of the documentation details each of the object methods. Internal methods are usually preceded with a _
_initialize
 Title   : _initialize
 Usage   : $self->_initialize(@args)
 Function: constructor (for internal use only).

           Besides the usual SeqIO arguments (-file, -fh, etc.),
           Bio::SeqIO::seqxml accepts three arguments which are used
           when writing out a seqxml file. They are all optional.
 Returns : none
 Args    : -source        => source string (usually a database name)
           -sourceVersion => source version. The version number of the source
           -seqXMLversion => the version of seqXML that will be used
 Throws  : Exception if XML::LibXML::Reader or XML::Writer
           is not initialized
next_seq
 Title   : next_seq
 Usage   : $seq = $stream->next_seq()
 Function: returns the next sequence in the stream
 Returns : L<Bio::Seq> object, or nothing if no more available
 Args    : none
write_seq
 Title   : write_seq
 Usage   : $stream->write_seq(@seq)
 Function: writes the sequence objects in @seq to the stream
 Returns : 1 for success and 0 for error
 Args    : Array of 1 or more L<Bio::PrimarySeqI> objects
_initialize_seqxml_node_methods
 Title   : _initialize_seqxml_node_methods
 Usage   : $self->_initialize_seqxml_node_methods
 Function: sets up code ref mapping of each seqXML node type
           to a method for processing that node type
 Returns : none
 Args    : none
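The code-ref mapping described above is a common Perl dispatch-table pattern. The following is a minimal sketch of that pattern only, not the module's actual internals: the package name NodeDispatcher, the process_start method, and the hash layout are all hypothetical, and only the handler names are borrowed from this document.

```perl
use strict;
use warnings;

# Hypothetical stand-in class; NOT the real Bio::SeqIO::seqxml internals.
package NodeDispatcher;

sub new {
    my ($class) = @_;
    my $self = bless { seen => [] }, $class;

    # Map each element name to a code ref for the method that handles it.
    $self->{start_handler} = {
        seqXML => \&element_seqXML,
        entry  => \&element_entry,
        DNAseq => \&element_DNAseq,
    };
    return $self;
}

# Each handler just records that it ran; real handlers would build objects.
sub element_seqXML { my ($self) = @_; push @{ $self->{seen} }, 'seqXML'; return }
sub element_entry  { my ($self) = @_; push @{ $self->{seen} }, 'entry';  return }
sub element_DNAseq { my ($self) = @_; push @{ $self->{seen} }, 'DNAseq'; return }

# Look up the handler by element name and invoke it as a method.
# Unknown names produce a warning rather than an exception.
sub process_start {
    my ($self, $name) = @_;
    my $handler = $self->{start_handler}{$name};
    if (!$handler) {
        warn "unexpected element name: $name\n";
        return 0;
    }
    $self->$handler();
    return 1;
}

package main;

my $dispatcher = NodeDispatcher->new;
$dispatcher->process_start($_) for qw(seqXML entry DNAseq);
print join(',', @{ $dispatcher->{seen} }), "\n";   # prints "seqXML,entry,DNAseq"
```

Keeping the mapping in one hash means supporting a new element is a one-line change, and the lookup doubles as the check for unexpected element names.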
schemaLocation
 Title   : schemaLocation
 Usage   : $self->schemaLocation
 Function: gets/sets the schema location in the <seqXML> header
 Returns : the schema location string
 Args    : To set the schemaLocation, call with a schemaLocation
           as the argument.
source
 Title   : source
 Usage   : $self->source
 Function: gets/sets the data source in the <seqXML> header
 Returns : the data source string
 Args    : To set the source, call with a source string
           as the argument.
sourceVersion
 Title   : sourceVersion
 Usage   : $self->sourceVersion
 Function: gets/sets the data source version in the <seqXML> header
 Returns : the data source version string
 Args    : To set the source version, call with a source version
           string as the argument.
seqXMLversion
 Title   : seqXMLversion
 Usage   : $self->seqXMLversion
 Function: gets/sets the seqXML version in the <seqXML> header
 Returns : the seqXML version string
 Args    : To set the seqXML version, call with a seqXML version
           string as the argument.
Methods for parsing the XML document
processXMLNode
 Title   : processXMLNode
 Usage   : $seqio->processXMLNode
 Function: reads the XML node and processes according to the node type
 Returns : none
 Args    : none
 Throws  : Exception on unexpected XML node type,
           warnings on unexpected XML element names.
processAttribute
 Title   : processAttribute
 Usage   : $seqio->processAttribute(\%hash_for_attribute);
 Function: reads the attributes of the current element into a hash
 Returns : none
 Args    : hash reference where the attributes will be stored.
parseHeader
 Title   : parseHeader
 Usage   : $self->parseHeader();
 Function: reads the opening <seqXML> block and grabs the metadata
           from it, namely the source, sourceVersion, and seqXMLversion.
 Returns : none
 Args    : none
 Throws  : Exception if it hits an <entry> tag, because that means
           it's missed the <seqXML> tag and read too far into the file.
element_seqXML
 Title   : element_seqXML
 Usage   : $self->element_seqXML
 Function: processes the opening <seqXML> node
 Returns : none
 Args    : none
element_entry
 Title   : element_entry
 Usage   : $self->element_entry
 Function: processes a sequence <entry> node
 Returns : none
 Args    : none
 Throws  : Exception if sequence ID is not present in <entry> element
element_species
 Title   : element_species
 Usage   : $self->element_species
 Function: processes a <species> node, creating a Bio::Species object
 Returns : none
 Args    : none
 Throws  : Exception if <species> tag exists but is empty,
           or if the attributes 'name' or 'ncbiTaxID' are undefined
element_description
 Title   : element_description
 Usage   : $self->element_description
 Function: processes a sequence <description> node;
           a no-op -- description text is read by
           processXMLNode
 Returns : none
 Args    : none
element_RNAseq
 Title   : element_RNAseq
 Usage   : $self->element_RNAseq
 Function: processes a sequence <RNAseq> node
 Returns : none
 Args    : none
element_DNAseq
 Title   : element_DNAseq
 Usage   : $self->element_DNAseq
 Function: processes a sequence <DNAseq> node
 Returns : none
 Args    : none
element_AAseq
 Title   : element_AAseq
 Usage   : $self->element_AAseq
 Function: processes a sequence <AAseq> node
 Returns : none
 Args    : none
element_DBRef
 Title   : element_DBRef
 Usage   : $self->element_DBRef
 Function: processes a sequence <DBRef> node,
           creating a Bio::Annotation::DBLink object
 Returns : none
 Args    : none
element_property
 Title   : element_property
 Usage   : $self->element_property
 Function: processes a sequence <property> node,
           creating a Bio::Annotation::SimpleValue object
 Returns : none
 Args    : none
end_element_RNAseq
 Title   : end_element_RNAseq
 Usage   : $self->end_element_RNAseq
 Function: processes the closing </RNAseq> node
 Returns : none
 Args    : none
end_element_DNAseq
 Title   : end_element_DNAseq
 Usage   : $self->end_element_DNAseq
 Function: processes the closing </DNAseq> node
 Returns : none
 Args    : none
end_element_AAseq
 Title   : end_element_AAseq
 Usage   : $self->end_element_AAseq
 Function: processes the closing </AAseq> node
 Returns : none
 Args    : none
end_element_entry
 Title   : end_element_entry
 Usage   : $self->end_element_entry
 Function: processes the closing </entry> node, creating the Seq object
 Returns : a Bio::Seq object
 Args    : none
 Throws  : Exception if sequence, sequence ID, or alphabet are missing
end_element_default
 Title   : end_element_default
 Usage   : $self->end_element_default
 Function: processes all other closing tags;
           a no-op.
 Returns : none
 Args    : none
DESTROY
 Title   : DESTROY
 Usage   : called automatically by Perl just before object
           goes out of scope
 Function: performs a write flush
 Returns : none
 Args    : none
close
 Title   : close
 Usage   : $seqio_obj->close()
 Function: writes closing </seqXML> tag.

           close() will be called automatically by Perl when your
           program exits, but if you want to use the seqXML file
           you've written before then, you'll need to do an explicit
           close first to get the final </seqXML> tag.
 Returns : none
 Args    : none
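To illustrate why an explicit close matters, here is a minimal sketch of the close/DESTROY pattern using a toy writer. It is an assumption-laden illustration rather than the module's implementation: the TinyXMLWriter package, its write_entry method, and the in-memory filehandle are hypothetical and exist only for this example.

```perl
use strict;
use warnings;
use IO::Handle;

# Hypothetical toy writer; NOT Bio::SeqIO::seqxml itself.
package TinyXMLWriter;

sub new {
    my ($class, $fh) = @_;
    my $self = bless { fh => $fh, closed => 0 }, $class;
    print {$fh} "<seqXML>\n";              # opening tag is written up front
    return $self;
}

sub write_entry {
    my ($self, $id) = @_;
    print { $self->{fh} } qq{  <entry id="$id"/>\n};
    return 1;
}

# Writes the closing tag exactly once, however many times it is called.
sub close {
    my ($self) = @_;
    return if $self->{closed};
    print { $self->{fh} } "</seqXML>\n";
    $self->{closed} = 1;
    return;
}

# Safety net: close when the object is destroyed.
sub DESTROY { $_[0]->close }

package main;

my $buffer = '';
open my $fh, '>', \$buffer or die "cannot open in-memory handle: $!";
$fh->autoflush(1);

my $writer = TinyXMLWriter->new($fh);
$writer->write_entry('seq1');
my $before = $buffer;                      # snapshot taken before close
$writer->close;
my $after = $buffer;                       # snapshot taken after close

print "closing tag before close? ", ($before =~ m{</seqXML>} ? "yes" : "no"), "\n";
print "closing tag after close?  ", ($after  =~ m{</seqXML>} ? "yes" : "no"), "\n";
```

Until close runs, the output lacks its closing tag and is not well-formed XML, so any code that reads the file back in the same program should call close first.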