SYNOPSIS
 use Jcode;
 # 
 # traditional
 Jcode::convert(\$str, $ocode, $icode, "z");
 # or OOP!
 print Jcode->new($str)->h2z->tr($from, $to)->utf8;
DESCRIPTION
A Japanese version of this document is now available as Jcode::Nihongo.
Jcode.pm supports both an object-oriented and a traditional approach. With the object approach, you can write:
 $iso_2022_jp = Jcode->new($str)->h2z->jis;
Which is more elegant than:
 $iso_2022_jp = $str;
 &jcode::convert(\$iso_2022_jp, 'jis', &jcode::getcode(\$str), "z");
For those unfamiliar with objects, Jcode.pm still supports "getcode()" and "convert()".
If the perl version is 5.8.1 or later, Jcode acts as a wrapper to Encode, the standard charset handler module for Perl 5.8 or later.
Methods
Methods mentioned here all return the Jcode object unless otherwise mentioned.
Constructors
- $j = Jcode->new($str [, $icode])
- 
Creates Jcode object $j from $str.  Input code is automatically checked 
unless you explicitly set $icode. For available charset, see getcode
below.
For perl 5.8.1 or better, $icode can be any encoding name that Encode understands.
 $j = Jcode->new($european, 'iso-latin1');
When the object is stringified, it returns the EUC-converted string, so you can write "print $j" instead of "print $j->euc".
- $j->set($str [, $icode])
- 
Sets $j's internal string to $str.  Handy when you reuse one Jcode object repeatedly
(saves the time and memory needed to create a new object each time).
 # converts mailbox to SJIS format
 my $jconv = new Jcode;
 $/ = 00;
 while (<>) {
     print $jconv->set(\$_)->mime_decode->sjis;
 }
- $j->append($str [, $icode]);
- Appends $str to $j's internal string.
- $j = jcode($str [, $icode]);
- 
Shortcut for Jcode->new(), so you can call methods directly on the result, as in "jcode($str)->sjis".
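The constructors above can be sketched as follows. This is a minimal illustration that assumes Jcode is installed; it uses plain ASCII input so that the result does not depend on the encoding of the source file, and the variable names are made up for the example.

```perl
use strict;
use warnings;
use Jcode;    # exports jcode() and getcode() by default

# Build an object; the input encoding is detected automatically.
my $j = Jcode->new('Hello');

# append() extends the object's internal string.
$j->append(', world');
my $euc = $j->euc;

# Stringification is documented to yield the EUC-JP form.
my $stringified = "$j";

# Reuse the same object for several inputs via set().
my @sjis = map { $j->set($_)->sjis } ('first', 'second');

# jcode() is the shortcut for Jcode->new().
my $utf8 = jcode('shortcut')->utf8;

print "$euc\n$stringified\n@sjis\n$utf8\n";
```

Because the input is pure ASCII, every encoding returns the same bytes here; with Japanese text the three output methods would differ.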
 
Encoded Strings
In general, you can retrieve the encoded string as $j->encoded, where "encoded" is the name of the target encoding.
- $sjis = jcode($str)->sjis
- $euc = $j->euc
- $jis = $j->jis
- $sjis = $j->sjis
- $ucs2 = $j->ucs2
- $utf8 = $j->utf8
- 
What you code is what you get :)
- $iso_2022_jp = $j->iso_2022_jp
- 
Same as "$j->h2z->jis".
Hankaku Kanas are forcibly converted to Zenkaku.
For perl 5.8.1 and better, you can also use any encoding names and aliases that Encode supports. For example:
 $european = $j->iso_latin1; # replace '-' with '_' for names.
FYI: Encode::Encoder uses a similar trick.
- $j->fallback($fallback)
- 
For perl 5.8.1 or better, Jcode stores the internal string in
UTF-8.  Any character that does not map to ->encoding is
replaced with a '?', which is the Encode standard.
 my $unistr = "\x{262f}"; # YIN YANG
 my $j = jcode($unistr);  # $j->euc is '?'
You can change this behavior by specifying a fallback, as in Encode. The values are the same as Encode's. "Jcode::FB_PERLQQ", "Jcode::FB_XMLCREF", and "Jcode::FB_HTMLCREF" are aliased to those of Encode for convenience.
 print $j->fallback(Jcode::FB_PERLQQ)->euc;   # '\x{262f}'
 print $j->fallback(Jcode::FB_XMLCREF)->euc;  # '&#x262f;'
 print $j->fallback(Jcode::FB_HTMLCREF)->euc; # '&#9775;'
The global variable $Jcode::FALLBACK stores the default fallback, so you can override it by assigning a value.
 $Jcode::FALLBACK = Jcode::FB_PERLQQ; # set default fallback scheme
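Since the fallback values are aliases of Encode's, the effect of each scheme can also be observed with Encode alone. The sketch below is not Jcode's implementation; it only assumes a perl with the core Encode module and shows what the three check values do to a character that EUC-JP cannot represent.

```perl
use strict;
use warnings;
use Encode qw(encode);

my $unistr = "\x{262f}";    # YIN YANG, not representable in EUC-JP

# Default behaviour: the unmappable character is replaced by a substitution character.
my $plain = encode('euc-jp', $unistr);

# The three schemes that the Jcode::FB_* constants are aliased to.
my $perlqq   = encode('euc-jp', $unistr, Encode::FB_PERLQQ);     # Perl escape
my $xmlcref  = encode('euc-jp', $unistr, Encode::FB_XMLCREF);    # hexadecimal character reference
my $htmlcref = encode('euc-jp', $unistr, Encode::FB_HTMLCREF);   # decimal character reference

print "$_\n" for $plain, $perlqq, $xmlcref, $htmlcref;
```

The three FB_* values leave the source string untouched, which is why $unistr can be reused for each call.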
 
 
- [@lines =] $jcode->jfold([$width, $newline_str, $kref])
- Folds lines in the jcode string every $width (default: 72), where $width is the number of "halfwidth" characters. Fullwidth characters are counted as two.
Lines are folded with the newline string specified by $newline_str (default: "\n"). Rudimentary kinsoku support is now available for Perl 5.8.1 and better.
- $length = $jcode->jlength();
- Returns the character length properly, rather than the byte length.
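A small sketch of jlength() and jfold(), assuming Jcode is installed. The Japanese text is built from code points and passed in as UTF-8 bytes so the example does not depend on the source file's encoding; the sample strings and variable names are illustrative only.

```perl
use strict;
use warnings;
use Encode qw(encode);
use Jcode;

# Three fullwidth characters (U+65E5 U+672C U+8A9E) as UTF-8 bytes.
my $bytes = encode('UTF-8', "\x{65e5}\x{672c}\x{8a9e}");
my $j     = Jcode->new($bytes, 'utf8');

my $chars     = $j->jlength;       # number of characters
my $euc_bytes = length $j->euc;    # number of bytes in the EUC-JP form

# Fold a long ASCII string at 40 halfwidth columns; list context returns the lines.
my @lines = Jcode->new('a' x 100)->jfold(40);

print "chars=$chars bytes=$euc_bytes lines=", scalar(@lines), "\n";
```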
 
Methods that use MIME::Base64
To use the methods below, you need MIME::Base64. To install it, simply run:
   perl -MCPAN -e 'CPAN::Shell->install("MIME::Base64")'
If your perl is 5.8 or better, there is no need, since MIME::Base64 is bundled.
- $mime_header = $j->mime_encode([$lf, $bpl])
- Converts $str to a MIME-Header as documented in RFC1522. When $lf is specified, it uses $lf to fold lines (default: \n). When $bpl is specified, it uses $bpl as the number of bytes per line (default: 76; this number must be 76 or smaller).
For Perl 5.8.1 or better, you can also encode a MIME header as:
 $mime_header = $j->MIME_Header;
In which case the resulting $mime_header is MIME-B-encoded UTF-8, whereas "$j->mime_encode()" returns MIME-B-encoded ISO-2022-JP. Most modern MUAs support both.
- $j->mime_decode;
- 
Decodes the MIME-Header in the Jcode object.  For perl 5.8.1 or better, you
can also do the same as:
 Jcode->new($str, 'MIME-Header')
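The two MIME methods can be exercised as a round trip. This is only a sketch under the assumption that Jcode and MIME::Base64 are installed; the subject text and variable names are invented for the example.

```perl
use strict;
use warnings;
use Encode qw(encode);
use Jcode;

# A short Japanese subject, supplied as UTF-8 bytes.
my $subject = encode('UTF-8', "\x{65e5}\x{672c}\x{8a9e}");

# Encode to a MIME header (documented above as MIME-B-encoded ISO-2022-JP).
my $header = Jcode->new($subject, 'utf8')->mime_encode;

# Decode the header again and ask for UTF-8 bytes back.
my $decoded = Jcode->new($header)->mime_decode->utf8;

print "$header\n";
print $decoded eq $subject ? "round trip ok\n" : "round trip differs\n";
```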
 
Hankaku vs. Zenkaku
- $j->h2z([$keep_dakuten])
- 
Converts X201 kana (Hankaku) to X208 kana (Zenkaku).
When $keep_dakuten is set, it leaves dakuten as is
(that is, "ka + dakuten" is left as is instead of
being converted to "ga").
You can retrieve the number of matches via $j->nmatch.
- $j->z2h
- 
Converts X208 kana (Zenkaku) to X201 kana (Hankaku).
You can retrieve the number of matches via $j->nmatch.
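The effect of $keep_dakuten can be sketched as follows, assuming Jcode is installed. The input is halfwidth KA followed by a halfwidth dakuten, built from code points so the example is independent of the source file's encoding; the variable names are illustrative.

```perl
use strict;
use warnings;
use Encode qw(encode decode);
use Jcode;

# Halfwidth KA (U+FF76) followed by halfwidth dakuten (U+FF9E), as UTF-8 bytes.
my $hankaku = encode('UTF-8', "\x{ff76}\x{ff9e}");

# Default: "ka + dakuten" is merged into a single fullwidth "ga".
my $j      = Jcode->new($hankaku, 'utf8')->h2z;
my $merged = decode('UTF-8', $j->utf8);
my $nmatch = $j->nmatch;

# With $keep_dakuten set, the dakuten stays a separate character.
my $kept = decode('UTF-8', Jcode->new($hankaku, 'utf8')->h2z(1)->utf8);

printf "merged: %d char(s), kept: %d char(s), matches: %s\n",
    length $merged, length $kept, $nmatch;
```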
 
Regexp emulators
To use "->m()" and "->s()", you need perl 5.8.1 or better.
- $j->tr($from, $to, $opt);
- Applies "tr/$from/$to/" on the Jcode object, where $from and $to are EUC-JP strings. On perl 5.8.1 or better, $from and $to can also be flagged UTF-8 strings.
If $opt is set, "tr/$from/$to/$opt" is applied. $opt must be 'c', 'd' or a combination thereof. You can retrieve the number of matches via $j->nmatch.
The following methods are available only for perl 5.8.1 or better.
- $j->s($pattern, $replace, $opt);
- Applies "s/$pattern/$replace/$opt". $pattern and $replace must be in EUC-JP or flagged UTF-8. $opt takes the same values as regexp options. See perlre for regexp options.
Like "$j->tr()", "$j->s()" returns the object itself, so you can chain the operations as follows:
 $j->tr("a-z", "A-Z")->s("foo", "bar");
- [@match = ] $j->m($pattern, $opt);
- Applies "m/$pattern/$opt". Note that this method DOES NOT RETURN AN OBJECT, so you can't chain it the way you can "$j->s()".
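The difference between the chainable methods and "->m()" can be sketched as below. This assumes Jcode on perl 5.8.1 or better and uses ASCII-only sample data (valid as EUC-JP) chosen purely for illustration.

```perl
use strict;
use warnings;
use Jcode;

# tr() and s() return the object, so they can be chained.
my $out = jcode('foo=1, baz=2')
    ->tr('a-z', 'A-Z')     # upper-case every letter
    ->s('FOO', 'bar')      # then replace the first FOO
    ->utf8;

# m() returns the captured matches instead of the object.
my @match = jcode('foo=1, baz=2')->m('(\w+)=(\d+)');

print "$out\n";
print "key=$match[0] value=$match[1]\n";
```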
 
Instance Variables
If you need to access the instance variables of a Jcode object, use the access methods below instead of accessing them directly (that's what OOP is all about).
FYI, Jcode uses a ref to an array instead of a ref to a hash (the common way) to optimize speed. You don't actually have to know this as long as you use the access methods; once again, that's OOP.
Subroutines
- ($code, [$nmatch]) = getcode($str)
- 
Returns the char code of $str. Return codes are as follows:
 ascii   Ascii (Contains no Japanese Code)
 binary  Binary (Not Text File)
 euc     EUC-JP
 sjis    SHIFT_JIS
 jis     JIS (ISO-2022-JP)
 ucs2    UCS2 (Raw Unicode)
 utf8    UTF8
When array context is used instead of scalar, it also returns how many character codes are found. As mentioned above, $str can be \$str instead.
jcode.pl Users: This function is 100% upward-compatible with jcode::getcode(), well, almost:
 * When its return value is an array, the order is the opposite; jcode::getcode() returns $nmatch first.
 * jcode::getcode() returns 'undef' when the number of EUC characters is equal to that of SJIS. Jcode::getcode() returns EUC. For Jcode.pm there are no in-betweens.
- Jcode::convert($str, [$ocode, $icode, $opt])
- Converts $str to the char code specified by $ocode. When $icode is also specified, it assumes $icode for the input string instead of the one detected by getcode(). As mentioned above, $str can be \$str instead.
jcode.pl Users: This function is 100% upward-compatible with jcode::convert()!
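The traditional interface can be compared with the object interface as in the sketch below, which assumes Jcode is installed. It passes a reference to a copy of the string so that the in-place conversion is explicit; the sample text and variable names are illustrative.

```perl
use strict;
use warnings;
use Encode qw(encode);
use Jcode;

# Three Japanese characters as EUC-JP bytes.
my $euc = encode('euc-jp', "\x{65e5}\x{672c}\x{8a9e}");

# In list context getcode() returns the detected code and the match count.
my ($code, $nmatch) = getcode($euc);

# Object interface: EUC-JP in, Shift_JIS out.
my $sjis_oop = Jcode->new($euc, 'euc')->sjis;

# Traditional interface: convert a copy in place through a reference,
# naming the input code explicitly instead of relying on detection.
my $copy = $euc;
Jcode::convert(\$copy, 'sjis', 'euc');

print "detected=$code\n";
print $copy eq $sjis_oop ? "same result\n" : "results differ\n";
```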
BUGS
For perl 5.8.1 or later, Jcode acts as a wrapper to Encode, meaning Jcode is subject to the bugs therein.
ACKNOWLEDGEMENTS
This package owes a lot in motivation, design, and code, to the jcode.pl for Perl4 by Kazumasa Utashiro <[email protected]>.
Hiroki Ohzaki <[email protected]> has helped me polish regexp from the very first stage of development.
JEncode by [email protected] has inspired me to integrate Encode to Jcode. He has also contributed Japanese POD.
And folks at Jcode Mailing list <[email protected]>. Without them, I couldn't have coded this far.

