VERSION
Version 0.2
SYNOPSIS
    use Convert::YText qw(encode_ytext decode_ytext);

    $encoded = encode_ytext($string);
    $decoded = decode_ytext($encoded);
    ($decoded eq $string) || die "this should never happen!";
DESCRIPTION
Convert::YText converts strings to and from "YText", a format inspired by xtext (defined in RFC 1894) and by the MIME base64 and quoted-printable types (RFC 1394). The main goal is to encode a UTF-8 string into something safe for use as the local part of an internet email address (RFC 2822).
By default, spaces are replaced with "+" and "/" with "~"; the characters "A-Za-z0-9_.-" encode as themselves; and everything else is written "=USTR=", where USTR is the base-64 encoding (using "A-Za-z0-9_." as digits) of the Unicode character code. The encoding is configurable (see below).
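The default rules above can be sketched in a few lines of plain Perl. This is only an illustration of the description, not the module's own code: the helper names "sketch_digits" and "sketch_encode" are hypothetical, and writing the code point with the most significant digit first is an assumption that the text does not confirm.

```perl
use strict;
use warnings;

# The 64 digits named in the description: A-Z, a-z, 0-9, '_' and '.'.
my @DIGITS = ('A' .. 'Z', 'a' .. 'z', '0' .. '9', '_', '.');

# Hypothetical helper: write a character code in base 64, most significant
# digit first. The digit order is an assumption, not taken from the module.
sub sketch_digits {
    my ($num) = @_;
    my $str = '';
    do {
        $str = $DIGITS[$num % 64] . $str;
        $num = int($num / 64);
    } while ($num > 0);
    return $str;
}

# Hypothetical helper: apply the default rules one character at a time.
sub sketch_encode {
    my ($text) = @_;
    my $out = '';
    for my $ch (split //, $text) {
        if    ($ch eq ' ')                  { $out .= '+' }
        elsif ($ch eq '/')                  { $out .= '~' }
        elsif ($ch =~ /\A[A-Za-z0-9_.-]\z/) { $out .= $ch }
        else { $out .= '=' . sketch_digits(ord $ch) . '=' }
    }
    return $out;
}

print sketch_encode('my file/name'), "\n";    # my+file~name
```

Only the space, slash and pass-through cases are fixed by the description; the exact digits produced for an escaped character depend on the assumed digit order.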
PROCEDURAL INTERFACE
The module can export "encode_ytext", which converts an arbitrary Unicode string into a "safe" form, and "decode_ytext", which recovers the original text. "validate_ytext" is a heuristic that returns 0 for bad input.
OBJECT ORIENTED INTERFACE
For more control, you will need to use the OO interface.
new
Create a new encoding object.
Arguments
Arguments are by name (i.e. a hash).
- DIGIT_STRING ("A-Za-z0-9_.") Must be 64 characters long
- ESCAPE_CHAR ('=') Must not be in digit string.
- SPACE_CHAR ('+') Non-digit to replace space. Can be the empty string.
- SLASH_CHAR ('~') Non-digit to replace slash. Can be the empty string.
- EXTRA_CHARS ('._\-') Other characters to leave unencoded.
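The constraints in this list can be checked by hand. The sketch below assumes that DIGIT_STRING is read as a set of character ranges, so that "A-Za-z0-9_." stands for 64 individual digits; that reading, and the helper name "expand_ranges", are assumptions made for illustration rather than something taken from the module.

```perl
use strict;
use warnings;

# Hypothetical helper: expand "A-Z"-style ranges into individual characters.
sub expand_ranges {
    my ($spec) = @_;
    my @chars;
    while ($spec =~ /\G(?:(.)-(.)|(.))/gs) {
        if (defined $3) { push @chars, $3 }
        else            { push @chars, map { chr($_) } ord($1) .. ord($2) }
    }
    return @chars;
}

my @digits = expand_ranges('A-Za-z0-9_.');
my %is_digit = map { $_ => 1 } @digits;

print scalar(@digits), " digits\n";    # 64 digits

# The escape, space and slash replacements must not collide with a digit.
for my $special ('=', '+', '~') {
    print "'$special' is ", ($is_digit{$special} ? "a digit" : "not a digit"), "\n";
}
```

Under this reading the default digit string expands to exactly 64 characters (26 + 26 + 10 + 2), and none of the default special characters is a digit.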
encode
Arguments
a string to encode.
Returns
encoded string
decode
Arguments
a string to decode.
Returns
decoded string
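As a hedged illustration of the object interface, the sketch below builds an object with the documented defaults spelled out, then encodes and decodes a sample string. It assumes "new" accepts the named arguments as a flat key/value list, and it loads the module at run time so the script still runs where Convert::YText is not installed. The variable names and the sample text are illustrative.

```perl
use strict;
use warnings;

# Load at run time so a missing module is reported instead of aborting.
my $available = eval { require Convert::YText; 1 };

my $string = 'some text/with spaces';
my ($encoded, $decoded);

if ($available) {
    # Assumption: named arguments are passed as a flat key/value list.
    # The values are the documented defaults, spelled out for clarity.
    my $codec = Convert::YText->new(
        ESCAPE_CHAR => '=',
        SPACE_CHAR  => '+',
        SLASH_CHAR  => '~',
    );

    $encoded = $codec->encode($string);
    $decoded = $codec->decode($encoded);

    print "encoded: $encoded\n";
    print "round trip ", ($decoded eq $string ? "ok" : "FAILED"), "\n";
}
else {
    print "Convert::YText is not installed; nothing to demonstrate\n";
}
```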
valid
Simple necessary but not sufficient test for validity.
DISCUSSION
According to RFC 2822, the following non-alphanumerics are OK for the local part of an address: "!#$%&'*+-/=?^_`{|}~". On the other hand, it seems common in practice to block addresses having "%!/|`#&?" in the local part. The idea is to restrict ourselves to basic ASCII alphanumerics, plus a small set of printable ASCII, namely "=_+-~.".
The characters '+' and '-' are pretty widely used to attach suffixes (although usually only one works on a given mail host). It seems OK to use '+-', since the first one marks the beginning of a suffix and after that each is a regular character. The character '.' also seems mostly permissible.
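The restricted alphabet described here is easy to test for. The following sketch checks whether a candidate local part stays inside that set; the helper name "in_restricted_set" and the sample strings are hypothetical, and the check says nothing about whether a string is well-formed YText.

```perl
use strict;
use warnings;

# Hypothetical helper: true when the local part uses only ASCII
# alphanumerics plus the six characters = _ + - ~ .
sub in_restricted_set {
    my ($local_part) = @_;
    return $local_part =~ /\A[A-Za-z0-9=_+\-~.]+\z/ ? 1 : 0;
}

for my $candidate ('user+tag', 'first.last', 'bad%name', 'pipe|name') {
    printf "%-12s %s\n", $candidate,
        in_restricted_set($candidate) ? 'ok' : 'rejected';
}
```

The first two samples pass, while the ones containing '%' or '|' are rejected, matching the characters the discussion says are commonly blocked.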
AUTHOR
David Bremner, <[email protected]>
COPYRIGHT
Copyright (C) 2011 David Bremner. All Rights Reserved.
This module is free software; you can redistribute it and/or modify it under the same terms as Perl itself.