Mail::Field::Received(3) mostly RFC822-compliant parser of Received headers

SYNOPSIS


  use Mail::Field;
  my $received = Mail::Field->new('Received', $header);
  my $results = $received->parse_tree();
  my $parsed_ok = $received->parsed_ok();
  my $diagnostics = $received->diagnostics();

DESCRIPTION

Don't use this class directly! Instead ask Mail::Field for new instances based on the field name!

Mail::Field::Received provides subroutines for parsing Received headers from e-mails. It mostly complies with RFC822, but deviates to accommodate a number of broken MTAs which are in common use. It also attempts to extract useful information which MTAs often embed within the "(comments)".

It is a subclass derived from the Mail::Field and Mail::Field::Generic classes.
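
To make the comment handling mentioned above more concrete, here is a minimal sketch of how a host name and IP address might be pulled out of a comment such as "(mediacons.tecc.co.uk [193.128.6.132])". The function name host_and_ip_from_comment and its regular expression are assumptions made for illustration; they are not taken from the module, whose own parser may well work differently and recognises more formats.

```perl
use strict;
use warnings;

# Hypothetical illustration only: this is NOT the module's own code.
# Extract a host name and an IPv4-looking address from a comment of
# the form "(name [a.b.c.d])". Returns an empty list for anything else.
sub host_and_ip_from_comment {
    my ($comment) = @_;
    if ($comment =~ /^\(\s*([\w.-]+)\s+\[(\d{1,3}(?:\.\d{1,3}){3})\]\s*\)$/) {
        return (domain => $1, address => $2);
    }
    return;    # comment is in some other format
}

my %info = host_and_ip_from_comment('(mediacons.tecc.co.uk [193.128.6.132])');
print "$info{domain} at $info{address}\n";
```

The key names domain and address were chosen to mirror those that appear in the parse tree, so the result reads like a fragment of it.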

ROUTINES

  • debug

    Returns the current debugging level, which governs the output obtained via the "diagnostics" method. If a parameter is given, the debugging level is set to that value. The default level is 3.

  • diagnose

      $received->diagnose("foo", "\n");
    

    Appends the given strings to the parser's diagnostics buffer.

  • diagnostics

      my $diagnostics = $received->diagnostics();
    

    Returns the contents of the parser's diagnostics buffer.

  • parse

    The actual parser. Returns the object itself, because Mail::Field fails otherwise.

  • parsed_ok

      if ($received->parsed_ok()) {
        ...
      }
    

    Returns true if the parse succeeded, or if it failed but was permitted to fail for some reason, such as encountering evidence of a known broken (non-RFC822-compliant) format mid-parse.

  • parse_tree

      my $parse_tree = $received->parse_tree();
    

    Returns the actual parse tree, which is where you get all the useful information. It is returned as a hashref whose keys are strings like `from', `by', `with', `id', `via' etc., corresponding to the components of Received headers as defined by RFC822:

      received    =  "Received"    ":"            ; one per relay
                        ["from" domain]           ; sending host
                        ["by"   domain]           ; receiving host
                        ["via"  atom]             ; physical path
                       *("with" atom)             ; link/mail protocol
                        ["id"   msg-id]           ; receiver msg id
                        ["for"  addr-spec]        ; initial form
                         ";"    date-time         ; time received
    

    The corresponding values are more hashrefs which are mini-parse-trees for these individual components. A typical parse tree looks something like:

      {
       'by' => {
                'domain' => 'host5.hostingcheck.com',
                'whole' => 'by host5.hostingcheck.com',
                'comments' => [
                               '(8.9.3/8.9.3)'
                              ],
               },
       'date_time' => {
                       'year' => 2000,
                       'week_day' => 'Tue',
                       'minute' => 57,
                       'day_of_year' => '1 Feb',
                       'month_day' => ' 1',
                       'zone' => '-0500',
                       'second' => 18,
                       'hms' => '21:57:18',
                       'date_time' => 'Tue, 1 Feb 2000 21:57:18 -0500',
                       'hour' => 21,
                       'month' => 'Feb',
                       'rest' => '2000 21:57:18 -0500',
                       'whole' => 'Tue, 1 Feb 2000 21:57:18 -0500'
                      },
       'with' => {
                  'with' => 'ESMTP',
                  'whole' => 'with ESMTP'
                 },
       'from' => {
                  'domain' => 'mediacons.tecc.co.uk',
                  'HELO' => 'tr909.mediaconsult.com',
                  'from' => 'tr909.mediaconsult.com',
                  'address' => '193.128.6.132',
                  'comments' => [
                                 '(mediacons.tecc.co.uk [193.128.6.132])',
                                ],
                  'whole' => 'from tr909.mediaconsult.com (mediacons.tecc.co.uk [193.128.6.132])
    '  
                 },
       'id' => {
                'id' => 'VAA24164',
                'whole' => 'id VAA24164'
               },
       'comments' => [
                      '(mediacons.tecc.co.uk [193.128.6.132])',
                      '(8.9.3/8.9.3)'
                     ],
       'for' => {
                 'for' => '<[email protected]>',
                 'whole' => 'for <[email protected]>'
                },
       'whole' => 'from tr909.mediaconsult.com (mediacons.tecc.co.uk [193.128.6.132]) by host5.hostingcheck.com (8.9.3/8.9.3) with ESMTP id VAA24164 for <[email protected]>; Tue, 1 Feb 2000 21:57:18 -0500'
      }
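
Because every component in the grammar above is optional, code that reads the parse tree should not assume that any particular key is present. The following is a rough sketch of one defensive way to walk a tree of the shape shown. The helper summarize_hop and the keys of the hash it returns are hypothetical names invented for this example, and the sample data is an abridged copy of the tree above rather than real parser output.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Mail::Field::Received): collect a
# few commonly wanted values from a parse tree, tolerating components
# that are missing from the header.
sub summarize_hop {
    my ($tree) = @_;
    my $from = $tree->{from}      || {};
    my $by   = $tree->{by}        || {};
    my $date = $tree->{date_time} || {};
    return {
        helo      => $from->{HELO},
        from_host => $from->{domain},
        from_ip   => $from->{address},
        by_host   => $by->{domain},
        received  => $date->{date_time},
    };
}

# Sample data abridged from the example tree above.
my $tree = {
    from => {
        HELO    => 'tr909.mediaconsult.com',
        domain  => 'mediacons.tecc.co.uk',
        address => '193.128.6.132',
    },
    by        => { domain    => 'host5.hostingcheck.com' },
    date_time => { date_time => 'Tue, 1 Feb 2000 21:57:18 -0500' },
};

my $hop = summarize_hop($tree);
for my $key (sort keys %$hop) {
    # Missing components show up as undef; print a placeholder instead.
    printf "%-10s %s\n", $key, $hop->{$key} // '(none)';
}
```

In real use the tree would come from $received->parse_tree(), ideally after checking $received->parsed_ok().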
    

BUGS

Doesn't use Parse::RecDescent, which it maybe should.

Doesn't offer a `strict RFC822' parsing mode. To implement that would be a royal pain in the arse, unless we move to Parse::RecDescent.

AUTHOR

Adam Spiers <[email protected]>

LICENSE

All rights reserved. This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.