File::MMagic(3) Guess file type

SYNOPSIS


use File::MMagic;
use FileHandle;
$mm = new File::MMagic; # use internal magic file
# $mm = File::MMagic->new('/etc/magic'); # use external magic file
# $mm = File::MMagic->new('/usr/share/etc/magic'); # if you use Debian
$res = $mm->checktype_filename("/somewhere/unknown/file");
$fh = new FileHandle "< /somewhere/unknown/file2";
$res = $mm->checktype_filehandle($fh);
$fh->read($data, 0x8564);
$res = $mm->checktype_contents($data);

ABSTRACT

This perl library uses perl5 objects to guess file type from filename and/or filehandle.

DESCRIPTION

checktype_filename(), checktype_filehandle() and checktype_contents returns string contains file type with MIME mediatype format.

METHODS

File::MMagic->new()
File::MMagic->new( $filename )
Initializes the module. If no filename is given, the magic numbers stored in File::MMagic are used.
$mm->addSpecials
If a filetype cannot be determined by magic numbers, extra checks are done based on extra regular expressions which can be defined here. The first argument should be the filetype, the remaining arguments should be one or more regular expressions.

By default, checks are done for message/news, message/rfc822, text/html, text/x-roff.

$mm->removeSpecials
Removes special regular expressions. Specify one or more filetypes. If no filetypes are specified, all special regexps are removed.

Returns a hash containing the removed entries.

$mm->addFileExts
If a filetype cannot be determined by magic numbers, extra checks can be done based on the file extension (actually, a regexp). Two arguments should be geiven: the filename pattern and the corresponding filetype.

By default, checks are done for application/x-compress, application/x-bzip2, application/x-gzip, text/html, text/plain

$mm->removeFileExts
Remove filename pattern checks. Specify one or more patterns. If no pattern is specified, all are removed.

Returns a hash containing the removed entries.

$mm->addMagicEntry
Add a new magic entry in the object. The format is same as magic(5) file.

  Ex.
  # Add a entry
  $mm->addMagicEntry("0\tstring\tabc\ttext/abc");
  # Add a entry with a sub entry
  $mm->addMagicEntry("0\tstring\tdef\t");
  $mm->addMagicEntry(">10\tstring\tghi\ttext/ghi");
$mm->readMagicHandle
$mm->checktype_filename
$mm->checktype_magic
$mm->checktype_contents

COPYRIGHT

This program is originated from file.kulp that is a production of The Unix Reconstruction Projct.
   <http://language.perl.com/ppt/index.html> Copyright (c) 1999 NOKUBI Takatsugu <[email protected]>.

There is no warranty for the program.

This product includes software developed by the Apache Group for use in the Apache HTTP server project (http://www.apache.org/).

License for the program is followed the original software. The license is below.

This program is free and open software. You may use, copy, modify, distribute and sell this program (and any modified variants) in any way you wish, provided you do not restrict others to do the same, except for the following consideration.

I read some of Ian F. Darwin's BSD C implementation, to try to determine how some of this was done since the specification is a little vague. I don't believe that this perl version could be construed as an ``altered version'', but I did grab the tokens for identifying the hard-coded file types in names.h and copied some of the man page.

Here's his notice:

 * Copyright (c) Ian F. Darwin, 1987.
 * Written by Ian F. Darwin.
 *
 * This software is not subject to any license of the American Telephone
 * and Telegraph Company or of the Regents of the University of California.
 *
 * Permission is granted to anyone to use this software for any purpose on
 * any computer system, and to alter it and redistribute it freely, subject
 * to the following restrictions:
 *
 * 1. The author is not responsible for the consequences of use of this
 *    software, no matter how awful, even if they arise from flaws in it.
 *
 * 2. The origin of this software must not be misrepresented, either by
 *    explicit claim or by omission.  Since few users ever read sources,
 *    credits must appear in the documentation.
 *
 * 3. Altered versions must be plainly marked as such, and must not be
 *    misrepresented as being the original software.  Since few users
 *    ever read sources, credits must appear in the documentation.
 *
 * 4. This notice may not be removed or altered.

The following is the Apache License. This program contains the magic file that derived from the Apache HTTP Server.

 * Copyright (c) 1995-1999 The Apache Group.  All rights reserved.
 *
 * Redistribution and use in source and binary forms, with or without
 * modification, are permitted provided that the following conditions
 * are met:
 *
 * 1. Redistributions of source code must retain the above copyright
 *    notice, this list of conditions and the following disclaimer.
 *
 * 2. Redistributions in binary form must reproduce the above copyright
 *    notice, this list of conditions and the following disclaimer in
 *    the documentation and/or other materials provided with the
 *    distribution.
 *
 * 3. All advertising materials mentioning features or use of this
 *    software must display the following acknowledgment:
 *    "This product includes software developed by the Apache Group
 *    for use in the Apache HTTP server project (http://www.apache.org/)."
 *
 * 4. The names "Apache Server" and "Apache Group" must not be used to
 *    endorse or promote products derived from this software without
 *    prior written permission. For written permission, please contact
 *    [email protected].
 *
 * 5. Products derived from this software may not be called "Apache"
 *    nor may "Apache" appear in their names without prior written
 *    permission of the Apache Group.
 *
 * 6. Redistributions of any form whatsoever must retain the following
 *    acknowledgment:
 *    "This product includes software developed by the Apache Group
 *    for use in the Apache HTTP server project (http://www.apache.org/)."
 *
 * THIS SOFTWARE IS PROVIDED BY THE APACHE GROUP ``AS IS'' AND ANY
 * EXPRESSED OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
 * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
 * PURPOSE ARE DISCLAIMED.  IN NO EVENT SHALL THE APACHE GROUP OR
 * ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
 * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
 * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
 * LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
 * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT,
 * STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
 * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED
 * OF THE POSSIBILITY OF SUCH DAMAGE.