man WWW::Curl (3): Perl extension interface for libcurl

SYNOPSIS

use WWW::Curl;
print $WWW::Curl::VERSION;

DESCRIPTION

WWW::Curl is a Perl extension interface for libcurl.

DOCUMENTATION

This module provides a Perl interface to libcurl. It is not intended to be a standalone module and because of this, the main libcurl documentation should be consulted for API details at <http://curl.haxx.se>. The documentation you're reading right now only contains the Perl specific details, some sample code and the differences between the C API and the Perl one.

WWW::Curl::Easy

The name might be confusing, it originates from libcurl. This is not an ::Easy module in the sense normally used on CPAN.

Here is a small snippet of making a request with WWW::Curl::Easy.

        use strict;
        use warnings;
        use WWW::Curl::Easy;
        my $curl = WWW::Curl::Easy->new;
        
        $curl->setopt(CURLOPT_HEADER,1);
        $curl->setopt(CURLOPT_URL, 'http://example.com');
        # A filehandle, reference to a scalar or reference to a typeglob can be used here.
        my $response_body;
        $curl->setopt(CURLOPT_WRITEDATA,\$response_body);
        # Starts the actual request
        my $retcode = $curl->perform;
        # Looking at the results...
        if ($retcode == 0) {
                print("Transfer went ok\n");
                my $response_code = $curl->getinfo(CURLINFO_HTTP_CODE);
                # judge result and next action based on $response_code
                print("Received response: $response_body\n");
        } else {
                # Error code, type of error, error message
                print("An error happened: $retcode ".$curl->strerror($retcode)." ".$curl->errbuf."\n");
        }

See curl_easy_setopt(3) for details of "setopt()".

WWW::Curl::Multi

        use strict;
        use warnings;
        use WWW::Curl::Easy;
        use WWW::Curl::Multi;
        my %easy;
        my $curl = WWW::Curl::Easy->new;
        my $curl_id = '13'; # This should be a handle unique id.
        $easy{$curl_id} = $curl;
        my $active_handles = 0;
        $curl->setopt(CURLOPT_PRIVATE,$curl_id);
        # do the usual configuration on the handle
        ...
        my $curlm = WWW::Curl::Multi->new;
        
        # Add some easy handles
        $curlm->add_handle($curl);
        $active_handles++;
        while ($active_handles) {
                my $active_transfers = $curlm->perform;
                if ($active_transfers != $active_handles) {
                        while (my ($id,$return_value) = $curlm->info_read) {
                                if ($id) {
                                        $active_handles--;
                                        my $actual_easy_handle = $easy{$id};
                                        # do the usual result/error checking routine here
                                        ...
                                        # letting the curl handle get garbage collected, or we leak memory.
                                        delete $easy{$id};
                                }
                        }
                }
        }

This interface is different than what the C API does. $curlm->perform is non-blocking and performs requests in parallel. The method does a little work and then returns control, therefor it has to be called periodically to get the job done. It's return value is the number of unfinished requests.

When the number of unfinished requests changes compared to the number of active handles, $curlm->info_read should be checked for finished requests. It returns one handle and it's return value at a time, or an empty list if there are no more finished requests. $curlm->info_read calls remove_handle on the given easy handle automatically, internally. The easy handle will still remain available until it goes out of scope, this action just detaches it from multi.

Please make sure that the easy handle does not get garbage collected until after the multi handle finishes processing it, or bad things happen.

The multi handle does not need to be cleaned up, when it goes out of scope it calls the required cleanup methods automatically.

It is possible to use $curlm->add_handle to add further requests to be processed after $curlm->perform has been called. WWW::Curl::Multi doesn't care about the order. It is possible to process all requests for a multi handle and then add a new batch of easy handles for processing.

WWW::Curl::Share

        use WWW::Curl::Share;
        my $curlsh = new WWW::Curl::Share;
        $curlsh->setopt(CURLSHOPT_SHARE, CURL_LOCK_DATA_COOKIE);
        $curlsh->setopt(CURLSHOPT_SHARE, CURL_LOCK_DATA_DNS);
        $curl->setopt(CURLOPT_SHARE, $curlsh);
        $curlsh->setopt(CURLSHOPT_UNSHARE, CURL_LOCK_DATA_COOKIE);
        $curlsh->setopt(CURLSHOPT_UNSHARE, CURL_LOCK_DATA_DNS);

WWW::Curl::Share is an extension to WWW::Curl::Easy which makes it possible to use a single cookies/dns cache for several Easy handles.

It's usable methods are:

        $curlsh = new WWW::Curl::Share
                This method constructs a new WWW::Curl::Share object.
        $curlsh->setopt(CURLSHOPT_SHARE, $value );
                Enables share for:
                        CURL_LOCK_DATA_COOKIE   use single cookies database
                        CURL_LOCK_DATA_DNS      use single DNS cache
        $curlsh->setopt(CURLSHOPT_UNSHARE, $value );
                Disable share for given $value (see CURLSHOPT_SHARE)
        $curlsh->strerror( ErrNo )
                This method returns a string describing the CURLSHcode error 
                code passed in the argument errornum.

This is how you enable sharing for a specific WWW::Curl::Easy handle:

        $curl->setopt(CURLOPT_SHARE, $curlsh)
                Attach share object to WWW::Curl::Easy instance

WWW::Curl::Form

    use WWW::Curl::Form;
    my $curlf = WWW::Curl::Form->new;
    $curlf->formaddfile($filename, 'attachment', "multipart/form-data");
    $curlf->formadd("FIELDNAME", "VALUE");
    $curl->setopt(CURLOPT_HTTPPOST, $curlf);

Its usable methods are:

    $curlf = new WWW::Curl::Form
        This method constructs a new WWW::Curl::Form object.
    $curlf->formadd(FIELDNAME, VALUE)
        This method adds a field with a given value, to the form that is being submitted.
    $curlf->formaddfile(FILENAME, DESCRIPTION, TYPE)
        This method will add a file to the form. The description is the name of the field
        that you form expects the data to be submitted in.

COMPATIBILITY

curl_easy_setopt: Most of the options should work, however some might not. Please send reports, tests and patches to fix those.
curl_easy_escape: Not implemented. Since equivalent Perl code is easily produced, this method will only made available for interface completeness, if ever.
curl_easy_init: Used only internally. The standard Perl way of initializing an object should be used,
"my $curl = WWW::Curl::Easy->new;".
curl_easy_cleanup: Used only internally. Curl object cleanup happens when the handle goes out of scope.
curl_easy_duphandle: Should be working for most cases, however do not change the value of options which accept a list/arrayref value on a duped handle, otherwise memory leaks or crashes will happen. This behaviour will be fixed in the future.
curl_easy_pause: Not implemented.
curl_easy_reset: Not implemented.
curl_easy_unescape: Not implemented. Trivial Perl replacements are available.
curl_escape: Not implemented and won't be as this method is considered deprecated.
curl_formadd: Seems to be working.
curl_formaddfile: Seems to be working.
curl_formfree: Used internally. Not exposed through the public API, as this call has no relevance to Perl code.
curl_free: Used internally. Not exposed through the public API, as this call has no relevance to Perl code.
curl_getdate: Not implemented. This function is easily replaced by Perl code and as such, most likely it won't be implemented.
curl_global_cleanup: Only used internally, not exposed through the public API.
curl_global_init: Only used internally, not exposed through the public API.
curl_global_init_mem: Not implemented.
curl_global_cleanup: Only used internally and called automatically upon exit.
curl_slist_append: Only used internally, not exposed through the public API.
curl_slist_free_all: Only used internally, not exposed through the public API.
curl_unescape: Not implemented and won't be, as this method is considered deprecated.
curl_version: Seems to work.
curl_version_info: Not yet implemented.
curl_multi_*: Most methods are either not exposed through the WWW::Curl::Multi API or they behave differently than it's C counterpart. Please see the section about WWW::Curl::Multi above.
curl_multi_fdset: This method returns three arrayrefs: the read, write and exception fds libcurl knows about. In the case of no file descriptors in the given set, an empty array is returned.

NUANCES

Header output for redirects

It might be surprising that if "CURLOPT_FOLLOWLOCATION" is set and header output was enabled, headers show up for all http responses. The reasoning behind that and possible code adjustments are outlined here: <https://rt.cpan.org/Ticket/Display.html?id=61569>.

CURLOPT_PRIVATE

Despite what the libcurl manual says, in Perl land, only string values are suitable for this option.

ADDITIONAL METHODS

On WWW::Curl::Easy objects

pushopt: Like "setopt" but instead of overriding any previously set values it adds it to the end. Can be used with "CURLOPT_HTTPHEADER", "CURLOPT_QUOTE" and "CURLOPT_POSTQUOTE".

USAGE CASES

WWW::Curl is a thin binding on top of libcurl, to make using libcurl possible from Perl land. Because of this, the module is less like Perl and more like C in coding style.

There is a new module, <http://search.cpan.org/perldoc?WWW::Curl::Simple>, which wraps this module into a more Perlish and userfriendly package.

The standard Perl WWW module, LWP should probably be used in most cases to work with HTTP or FTP from Perl. However, there are some cases where LWP doesn't perform well. One is speed and the other is parallelism. WWW::Curl is much faster, uses much less CPU cycles and it's capable of non-blocking parallel requests.

In some cases, for example when building a web crawler, cpu usage and parallel downloads are important considerations. It can be desirable to use WWW::Curl to do the heavy-lifting of a large number of downloads and wrap the resulting data into a Perl-friendly structure by HTTP::Response or use WWW::Curl::Simple to do that for you.

CHANGES

Version 4.01 - 4.07 adds several bugfixes and extends functionality coverage. See Changes file.

Version 4.00 added new documentation, the build system changed to Module::Install, the test suite was rewritten to use Test::More, a new calling syntax for WWW::Curl::Multi was added, memory leak and other bugfixes added, Perl 5.6 and libcurl 7.10.8 as minimum requirements for this module were set.

Version 3.12 is a bugfix for a missing Share.pm.in file in the release.

Version 3.11 added WWW::Curl::Share.

Version 3.10 adds the WWW::Curl::Share interface by Anton Federov and large file options after a contribution from Mark Hindley.

Version 3.02 adds some backwards compatibility for scripts still using 'WWW::Curl::easy' names.

Version 3.01 added some support for pre-multi versions of libcurl.

Version 3.00 adds WWW::Curl::Multi interface, and new module names following perl conventions (WWW::Curl::Easy rather than WWW::Curl::easy), by Sebastian Riedel <sri at cpan.org>.

Version 2.00 of WWW::Curl::easy is a renaming of the previous version (named Curl::easy), to follow CPAN naming guidelines, by Cris Bailiff.

Versions 1.30, a (hopefully) threadable, object-oriented, multiple-callback compatible version of Curl::easy was substantially reworked from the previous Curl::easy release (1.21) by Cris Bailiff.

AUTHORS

Currently maintained by Cris Bailiff <c.bailiff+curl at devsecure.com> and Balint Szilakszi <szbalint at cpan.org>.

Original Author Georg Horn <[email protected]>, with additional callback, pod and test work by Cris Bailiff <[email protected]> and Forrest Cahoon <[email protected]>. Sebastian Riedel added ::Multi and Anton Fedorov (datacompboy <at> mail.ru) added ::Share. Balint Szilakszi repackaged the module into a more modern form.

COPYRIGHT

You may opt to use, copy, modify, merge, publish, distribute and/or sell copies of the Software, and permit persons to whom the Software is furnished to do so, under the terms of the MIT license.