CGI::Uploader(3) Manage CGI uploads using SQL database

Synopsis

 use CGI::Uploader;
 use CGI::Uploader::Transform::ImageMagick 'gen_thumb';
 my $u = CGI::Uploader->new(
    spec       => {
        # Upload one image from the form field 'img_1'
        # and create one thumbnail for it.
        img_1 => {
            gen_files => {
                'img_1_thmb_1' => gen_thumb({ w => 100, h => 100 }),
            },
        },
    },
    updir_url  => 'http://localhost/uploads',
    updir_path => '/home/user/www/uploads',
    temp_dir   => '/home/user/www/uploads',
    dbh        => $dbh,
    query      => $q, # defaults to CGI->new()
 );
 # ... now do something with $u

Description

This module is designed to help with the task of managing files uploaded through a CGI application. The files are stored on the file system, and the file attributes stored in a SQL database.

Introduction and Recipes

The CGI::Uploader::Cookbook provides a slightly more in-depth introduction and recipes for a basic BREAD (Browse, Read, Edit, Add, Delete) web application.

Constructor

new()

 my $u = CGI::Uploader->new(
    spec       => {
        # The first image has 2 different sized thumbnails
        img_1 => {
            gen_files => {
                'img_1_thmb_1' => gen_thumb({ w => 100, h => 100 }),
                'img_1_thmb_2' => gen_thumb({ w => 50, h => 50 }),
            },
        },
        # Just upload it
        img_2 => {},
        # Downsize the large image to these maximum dimensions if it's larger
        img_3 => {
            # Besides generating dependent files
            # We can also transform the file itself
            # Here, we shrink the image to be no wider than 380
            transform_method => \&gen_thumb,
            # demonstrating the old-style param passing
            params => [{ w => 380 }],
        },
    },
    updir_url  => 'http://localhost/uploads',
    updir_path => '/home/user/www/uploads',
    dbh        => $dbh,
    query      => $q, # defaults to CGI->new()
    up_table   => 'uploads', # defaults to "uploads"
    up_seq     => 'upload_id_seq',  # Required for Postgres
 );
spec [required]
The upload specification, as shown in the examples above. The keys correspond to form field names for upload fields.

The values are hash references. The simplest case is an empty hash reference, which means to just upload the image and apply no transformations.

Each key in the hash corresponds to a file upload field. The values are hash references used to provide options for how to transform the file, and possibly generate additional files based on it.

Valid keys here are:

transform_method
This is a subroutine reference. This routine can be used to transform the upload before it is stored. The first argument given to the routine will be the CGI::Uploader object. The second will be the full path to a file containing the upload.

Additional arguments can be passed to the subroutine using "params", as in the example above. But don't do that, it's ugly. If you need a custom transform method, write a little closure for it like this:

  sub my_transformer {
      my %args = @_;
      return sub {
          my ($self, $file) = @_;
          # do something with $file and %args here...
          return $path_to_new_file_i_made;
      };
  }

The closure must return a full path to a transformed file. Then in the spec you can put:

 transform_method => my_transformer(%args),
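For a complete, runnable illustration of the closure approach, here is a minimal sketch. The make_case_transformer name and its "case" argument are made up for this example, and undef stands in for the CGI::Uploader object, which the closure does not use:

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical transformer factory: returns a closure that writes an
# upper-cased (or lower-cased) copy of the uploaded file.
sub make_case_transformer {
    my %args = @_;
    my $case = $args{case} // 'upper';
    return sub {
        my ($self, $file) = @_;    # uploader object, full path to the upload
        open my $in, '<', $file or die "Cannot read $file: $!";
        my ($out, $new_file) = tempfile(UNLINK => 1);
        while (my $line = <$in>) {
            print {$out} $case eq 'lower' ? lc($line) : uc($line);
        }
        close $in;
        close $out or die "Cannot write $new_file: $!";
        return $new_file;          # full path to the transformed file
    };
}

# Stand-alone demonstration with a temporary file as the "upload".
my ($src_fh, $src_file) = tempfile(UNLINK => 1);
print {$src_fh} "hello upload\n";
close $src_fh or die "Cannot write $src_file: $!";

my $transform = make_case_transformer(case => 'upper');
my $result    = $transform->(undef, $src_file);

open my $check, '<', $result or die "Cannot read $result: $!";
my $content = do { local $/; <$check> };
close $check;
print $content;
```

In a spec, the same factory would be used as: transform_method => make_case_transformer(case => 'upper').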

params (DEPRECATED)
NOTE: Using a closure-based interface provides a cleaner alternative to using params. See CGI::Uploader::Transform::ImageMagick for an example.

Used to pass additional arguments to "transform_method". See above.

Each method used may have additional documentation about parameters that can be passed to it.

gen_files
A hash reference to describe files generated from a particular upload. The keys are unique identifiers for the generated files. The values are code references (usually closures) that provide a transformation for the file. See CGI::Uploader::Transform::ImageMagick for an example.

An older interface for "gen_files" is deprecated. For that, the values are hashrefs, containing keys named "transform_method" and "params", which work as described above to generate a transformed version of the file.

updir_url [required]
URL to upload storage directory. Should not include a trailing slash.
updir_path [required]
File system path to upload storage directory. Should not include a trailing slash.
temp_dir
Optional file system path to temporary directory. Default is File::Spec->tmpdir(). This temporary directory will also be used by gen_files during image transforms.
dbh [required]
DBI database handle. Required.
query
A CGI.pm-compatible object, used for the "param" and "upload" functions. Defaults to CGI->new() if omitted.
up_table
Name of the SQL table where uploads are stored. See example syntax above or one of the creation scripts included in the distribution. Defaults to "uploads" if omitted.
up_table_map
A hash reference which defines a mapping between the column names used in your SQL table, and those that CGI::Uploader uses. The keys are the CGI::Uploader default names. Values are the names that are actually used in your table.

This is not required. It simply allows you to use custom column names.

  upload_id       => 'upload_id',
  mime_type       => 'mime_type',
  extension       => 'extension',
  width           => 'width',
  height          => 'height',
  gen_from_id     => 'gen_from_id',
  file_name       => 'file_name',

You may also define additional column names with a value of 'undef'. This feature is only useful if you override the "extract_meta()" method or pass in $shared_meta to store_uploads(). Values for these additional columns will then be stored by "store_meta()" and retrieved with "fk_meta()".

up_seq
For Postgres only, the name of a sequence used to generate the upload_ids. Defaults to "upload_id_seq" if omitted.
file_scheme
 file_scheme => 'md5',

"file_scheme" controls how files are stored on the file system. The default is "simple", which stores all the files in the same directory with names like "123.jpg". Depending on your environment, this may be sufficient to store 10,000 or more files.

As an alternative, you can specify "md5", which will create three levels of directories based on the first three letters of the ID's md5 sum. The result may look like this:

 2/0/2/123.jpg

This should scale well to millions of files. If you want even more control, consider overriding the "build_loc()" method, which is used to return the stored file path.

Note that specifying the file storage scheme for the file system is not related to the "file_name" stored in the database, which is always the original uploaded file name.
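
The "md5" layout described above can be sketched in a few lines. This is an illustration only, not the module's own "build_loc()": the md5_loc name is hypothetical, and the extension is assumed to include its leading dot, as in the "extract_meta()" example output.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);

# Hypothetical helper: build a relative path with three directory levels
# taken from the first three characters of the ID's MD5 hex digest.
sub md5_loc {
    my ($id, $ext) = @_;
    my @dirs = split //, substr(md5_hex($id), 0, 3);
    return join('/', @dirs, "$id$ext");
}

print md5_loc(123, '.jpg'), "\n";    # prints 2/0/2/123.jpg
```

The MD5 hex digest of "123" begins with "202", which gives the same path as the example above.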

Basic Methods

These basic methods are all you need to know to make effective use of this module.

store_uploads()

  my $entity = $u->store_uploads($form_data);

Stores uploaded files based on the definition given in "spec".

Specifically, it does the following:

  • possibly transforms the original file according to "transform_method"
  • possibly generates additional files based on those uploaded, according to "gen_files".
  • stores all the files on the file system
  • inserts upload details into the database, including upload_id, mime_type and extension. The columns 'width' and 'height' will be populated if that meta data is available.

As input, a hash reference of form data is expected. The simplest way to get this is like this:

 use CGI;
 my $q = CGI->new;
 my $form_data = $q->Vars;

However, I recommend that you validate your data with a module such as Data::FormValidator, and use a hash reference of validated data, instead of directly using the CGI form data.

CGI::Uploader is designed to handle uploads that are included as a part of an add/edit form for an entity stored in a database. So, $form_data is expected to contain additional fields for this entity as well as the file upload fields.

For this reason, the "store_uploads" method returns a hash reference of the valid data with some transformations. File upload fields will be removed from the hash, and corresponding "_id" fields will be added.

So for a file upload field named 'img_field', the 'img_field' key will be removed from the hash and 'img_field_id' will be added, with the appropriate upload ID as the value.
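
The key rewriting described above can be pictured with plain hashes. This is only a sketch of the shape of the returned data, not the module's code; swap_upload_fields and the sample values are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the key rewriting done by store_uploads().
# $upload_ids maps each upload field name to the ID assigned to its upload.
sub swap_upload_fields {
    my ($form_data, $upload_ids) = @_;
    my %entity = %$form_data;              # work on a copy
    for my $field (keys %$upload_ids) {
        delete $entity{$field};            # drop the file upload field
        $entity{"${field}_id"} = $upload_ids->{$field};
    }
    return \%entity;
}

my $entity = swap_upload_fields(
    { title => 'My news item', img_field => 'happy.gif' },
    { img_field => 523 },
);
print join(', ', map { "$_=$entity->{$_}" } sort keys %$entity), "\n";
```

The resulting hash reference holds only columns of the entity table, so it can be passed on to an insert or update for that table.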

store_uploads takes an optional second argument as well:

  my $entity = $u->store_uploads($form_data,$shared_meta);

This is a hash reference of additional meta data that you want to store for all of the images you are storing. For example, you may wish to store an "uploaded_user_id".

The keys should be column names that exist in your "uploads" table. The values should be appropriate data for the column. Only the key names defined by the "up_table_map" in "new()" will be used. Other values in the hash will be ignored.

delete_checked_uploads()

 my @fk_col_names = $u->delete_checked_uploads;

This method deletes all uploads and any generated files based on form input. Both files and meta data are removed.

It looks through all the field names defined in "spec". For an upload named img_1, a field named img_1_delete is checked to see if it has a true value.

A list of the field names is returned, with '_id' appended, such as:

 img_1_id

The expectation is that you have foreign keys with these names defined in another table. Having the names in this format allows you to easily set these fields to NULL in a database update:

 $entity->{$_} = undef for @fk_col_names;

NOTE: This method can not currently be used to delete a generated file by itself.
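
As a rough sketch of how the returned names might feed a database update: the "news" table, the "item_id" key and the sample values below are assumptions made for this example, and the field list is hard-coded where real code would call delete_checked_uploads().

```perl
use strict;
use warnings;

# Example data; in real use the names come from delete_checked_uploads().
my @fk_col_names = ('img_1_id', 'img_2_id');
my $entity = {
    title    => 'My news item',
    img_1_id => 12,
    img_2_id => 13,
    img_3_id => 14,
};

# Set each returned foreign key to undef, which DBI binds as NULL.
$entity->{$_} = undef for @fk_col_names;

# Build a parameterized UPDATE by hand (table and key names are made up).
my @cols = sort keys %$entity;
my $sql  = 'UPDATE news SET '
         . join(', ', map { "$_ = ?" } @cols)
         . ' WHERE item_id = ?';
my @bind = (@{$entity}{@cols}, 23);

print "$sql\n";
# With a live handle: $dbh->do($sql, undef, @bind);
```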

fk_meta()

 my $href = $u->fk_meta(
    table    => $table,
    where    => \%where,
    prefixes => \@prefixes,
 );

Returns a hash reference of information about the file, useful for passing to a templating system. Here's an example of what the contents of $href might look like:

 {
     file_1_id     => 523,
     file_1_url    => 'http://localhost/images/uploads/523.pdf',
 }

If the files happen to be images and have their width and height defined in the database row, template variables will be made for these as well.

For example, given the table "news", a where clause of { item_id => 23 } and the prefix "file_1", this fetches the file information from the uploads table using the row where news.item_id = 23 AND news.file_1_id = uploads.upload_id.

The %where hash mentioned here is a SQL::Abstract where clause. The complete SQL that is used to fetch the data will be built like this:

 SELECT upload_id as id,width,height,extension
    FROM uploads, $table
    WHERE (upload_id = ${prefix}_id AND (%where_clause_expanded here));

Class Methods

These are some handy class methods that you can use without the need to first create an object using "new()".

upload()

 # As a class method
 ($tmp_filename,$uploaded_mt,$file_name) =
    CGI::Uploader->upload('file_field',$q);
 # As an object method
 ($tmp_filename,$uploaded_mt,$file_name) =
    $u->upload('file_field');

The function is responsible for actually uploading the file.

It can be called as a class method or an object method. As a class method, it's necessary to provide a query object as the second argument. As an object method, the query object given to the constructor is used.

Input:
 - file field name

Output:
 - temporary file name
 - Uploaded MIME Type
 - Name of uploaded file (The value of the file form field)

Currently CGI.pm, CGI::Simple and Apache::Request are supported.

Upload Methods

These are high-level methods to manage the file and meta data parts of an upload, as well as its generated files. If you are doing something more complex or customized, you may want to call or override one of the methods below.

store_upload()

 my %entity_upload_extra = $u->store_upload(
    file_field    => $file_field,
    src_file      => $tmp_filename,
    uploaded_mt   => $uploaded_mt,
    file_name     => $file_name,
    shared_meta   => $shared_meta,  # optional
    id_to_update  => $id_to_update, # optional
 );

Does all the processing for a single upload, after it has been uploaded to a temp file already.

It returns a hash of key/value pairs as described in "store_uploads()".

create_store_gen_files()

 my %gen_file_ids = $u->create_store_gen_files(
    file_field  => $file_field,
    meta        => $meta_href,
    src_file    => $tmp_filename,
    gen_from_id => $gen_from_id,
 );

This method is responsible for creating and storing any needed thumbnails.

Input:
 - file_field: file field name
 - meta: a hash ref of meta data, as "extract_meta" would produce
 - src_file: path to temporary file of the file upload
 - gen_from_id: ID of upload that generated files will be made from

delete_upload()

  $u->delete_upload($upload_id);

This method is used to delete the meta data and file associated with an upload. Usually it's more convenient to use "delete_checked_uploads" than to call this method directly.

This method does not delete generated files for this upload.

delete_gen_files()

 $self->delete_gen_files($id);

Deletes the generated files for a given file ID, from the file system and the database.

Meta-data Methods

extract_meta()

 $meta = $self->extract_meta($tmp_filename,$file_name,$uploaded_mt);

This method extracts and returns the meta data about a file.

Input:

 - Path to file to extract meta data from
 - the name of the file (as sent through the file upload field)
 - The mime-type of the file, as supplied by the browser

Returns: a hash reference of meta data, following this example:

 {
         mime_type => 'image/gif',
         extension => '.gif',
         bytes     => 60234,
         file_name => 'happy.txt',
         # only for images
         width     => 50,
         height    => 50,
 }

store_meta()

 my $id = $self->store_meta($file_field,$meta);

This function is used to store the meta data of a file upload.

Input:

 - file field name
 - A hashref of key/value pairs to be stored. Only the key names defined by the
   "up_table_map" in "new()" will be used. Other values in the hash will be
   ignored.
 - Optionally, an upload ID can be passed, causing an 'Update' to happen instead of an 'Insert'

Output:
  - The id of the file stored. The id is generated by store_meta().

delete_meta()

 my $dbi_rv = $self->delete_meta($id);

Deletes the meta data for a file and returns the DBI return value for this operation.

transform_meta()

 my %meta_to_display = $u->transform_meta(
    meta                    => $meta_from_db,
    prefix                  => 'my_field',
    prevent_browser_caching => 0,
    fields                  => [qw/id url width height/],
 );

Prepares meta data from the database for display.

Input:
 - meta:   A hashref, as might be returned from "SELECT * FROM uploads WHERE upload_id = ?"
 - prefix: the resulting hashref keys will be prefixed with this,
   adding an underscore as well.
 - prevent_browser_caching: If set to true, a random query string
   will be added, preventing browsers from caching the image. This is very
   useful when displaying an image on an 'update' page. Defaults to true.
 - fields: An arrayref of fields to format. The values here must be
   keys in the "up_table_map". Two field names are special. "id" is
   used to denote the upload_id. "url" combines several fields into
   a URL to link to the upload.

Output:
 - A formatted hash.

See "fk_meta()" for example output.
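
The prefixing can be pictured with a small re-implementation. This is a hypothetical sketch, not the module's code: it assumes the "simple" file scheme, an extension stored with its leading dot, and it takes updir_url as an argument where the real method would use the value given to "new()". Cache prevention is left out.

```perl
use strict;
use warnings;

# Hypothetical sketch of the key prefixing done by transform_meta().
sub sketch_transform_meta {
    my %p = @_;
    my ($meta, $prefix) = @p{qw(meta prefix)};
    my %out;
    for my $field (@{ $p{fields} }) {
        if ($field eq 'id') {
            # 'id' denotes the upload_id
            $out{"${prefix}_id"} = $meta->{upload_id};
        }
        elsif ($field eq 'url') {
            # 'url' combines the base URL, the ID and the extension
            $out{"${prefix}_url"} =
                "$p{updir_url}/$meta->{upload_id}$meta->{extension}";
        }
        elsif (defined $meta->{$field}) {
            $out{"${prefix}_$field"} = $meta->{$field};
        }
    }
    return %out;
}

my %display = sketch_transform_meta(
    meta      => { upload_id => 523, extension => '.pdf' },
    prefix    => 'file_1',
    updir_url => 'http://localhost/images/uploads',
    fields    => [qw/id url width height/],
);
print "$_ => $display{$_}\n" for sort keys %display;
```

Because the sample row has no width or height, only the file_1_id and file_1_url keys are produced.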

get_meta()

 my $meta_href = $self->get_meta($id);

Returns a hashref of data stored in the uploads database table for the requested file id.

File Methods

store_file()

 $self->store_file($file_field,$tmp_file,$id,$ext);

Stores an upload file or dies if there is an error.

Input:
  - file field name
  - path to tmp file for uploaded image
  - file id, as generated by "store_meta()"
  - file extension, as discovered by extract_meta()

Output: none

delete_file()

 $self->delete_file($id);

Called from within "delete_upload", this routine deletes the actual file. Don't delete the meta data first; you may need it to build the path name of the file to delete.

Utility Methods

build_loc()

 my $up_loc = $self->build_loc($id,$ext);

Builds a path to access a single upload, relative to "updir_path". This is used for both file-system and URL access. Also see the "file_scheme" option to "new()", which affects its behavior.

upload_field_names()

 # As a class method
 (@file_field_names) = CGI::Uploader->upload_field_names($q);
 # As an object method
 (@file_field_names) = $u->upload_field_names();

Returns the names of all form fields which contain file uploads. Empty file upload fields may be excluded.

This can be useful for auto-generating a "spec".

Input:
 - A query object is required as input only when called as a class method.

Output:
 - an array of the file upload field names.
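
As a sketch of auto-generating a "spec" from such a list, the field names below are hard-coded sample values; real code would take them from upload_field_names().

```perl
use strict;
use warnings;

# Example field names; in real use they come from upload_field_names().
my @file_field_names = qw(img_1 img_2);

# An empty hash reference per field means "just upload it".
my %spec = map { ($_ => {}) } @file_field_names;

print join(', ', sort keys %spec), "\n";
```

The resulting hash can be passed to "new()" as spec => \%spec.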

spec_names()

 $spec_names = $u->spec_names('file_field');

With no arguments, returns an array of all the upload names defined in the spec, including any generated file names.

A single argument, a file field from the spec, can also be provided. It then returns that name as well as the names of any related generated files.

Contributing

Patches, questions and feedback are welcome. I maintain CGI::Uploader using git. The public repo is here: https://github.com/markstos/CGI---Uploader

Author

Mark Stosberg <[email protected]>

Thanks

A special thanks to David Manura for his detailed and persistent feedback in the early days, when the documentation was wild and rough.

Barbie, for the first patch.

License

This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.