Extracts Dbxrefs from GFF3 lines that have Target attributes


% gff_file_name > output_file


For GFF3 lines of the form:

 chr1 CDNA  cDNA_match  69388   69593  0  -  .  Dbxref=Sorghum_CDNA:Contig_448;Target=Contig_448 75 295 +

that is, lines that have both Target and Dbxref attributes, this script extracts the Dbxref value and prints a list of its database and accession parts. This depends on a standard format for the Dbxref value: the database name precedes the accession, and the two are separated by a colon.
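
The extraction described above can be sketched roughly as follows. This is an illustrative Python sketch, not the script's actual code. The function names (extract_dbxrefs, print_dbxrefs) are hypothetical, and the tab-separated output layout is an assumption, since the exact output format is not stated here. The sketch also assumes that multiple Dbxref values on one line are comma-separated, as the GFF3 specification allows.

```python
import sys


def extract_dbxrefs(lines):
    """Yield (database, accession) pairs from GFF3 lines that carry
    both a Target and a Dbxref attribute. Hypothetical helper."""
    for line in lines:
        line = line.rstrip("\n")
        # Skip blank lines and GFF3 comments/directives.
        if not line.strip() or line.startswith("#"):
            continue
        # GFF3 columns are tab-separated; fall back to whitespace
        # splitting for space-aligned lines like the example above.
        columns = line.split("\t") if "\t" in line else line.split(None, 8)
        if len(columns) < 9:
            continue
        # Column 9 holds semicolon-separated key=value attributes.
        attributes = {}
        for field in columns[8].split(";"):
            key, sep, value = field.partition("=")
            if sep:
                attributes[key.strip()] = value.strip()
        if "Target" not in attributes or "Dbxref" not in attributes:
            continue
        for dbxref in attributes["Dbxref"].split(","):
            # The database name precedes the accession, split at the first colon.
            database, sep, accession = dbxref.partition(":")
            if sep and database and accession:
                yield database, accession


def print_dbxrefs(path, out=sys.stdout):
    """Print one database/accession pair per line (assumed tab-separated)."""
    with open(path) as handle:
        for database, accession in extract_dbxrefs(handle):
            print(database, accession, sep="\t", file=out)
```

For the example line above, extract_dbxrefs would yield the pair ("Sorghum_CDNA", "Contig_448"). Lines without a Target attribute are skipped, and no attempt is made here to remove duplicate pairs.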


Another script takes a list of databases and accessions (like the one this script produces) and a directory of FASTA files, and builds a GFF3 file that corresponds to those targets. These files are meant to be loaded into Chado before the computational analysis results are loaded, to ensure that the database has a complete picture of the analysis performed.




Scott Cain <[email protected]>

Copyright (c) 2007

This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.