YOU Hyun Jo > Encode-Escape-0.14 > Encode::Escape::Unicode

Module Version: 0.13

NAME

Encode::Escape::Unicode - Perl extension for encoding of Unicode escape sequences

SYNOPSIS

  use Encode::Escape::Unicode;

  $escaped = "What is \\x{D384}? It's Perl!";
  $string = decode 'unicode-escape', $escaped;

  # Now, $string is equivalent to "What is \x{D384}? It's Perl!"

  Encode::Escape::Unicode->demode('python');

  $python_unicode_escape = "And \\u041f\\u0435\\u0440\\u043b? It's Perl, too.";
  $string = decode 'unicode-escape', $python_unicode_escape;

  # Now, $string eq "And \x{041F}\x{0435}\x{0440}\x{043B}? It's Perl, too."

Suppose you have a text data file 'unicode-escape.txt' that contains these lines:

  What is \x{D384}? It's Perl!\n
  And \x{041F}\x{0435}\x{0440}\x{043B}? It's Perl, too.\n

To use each line as if it were a normal double-quoted string in source code, try this:

  use Encode::Escape::Unicode;

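  # Open the data file for reading and stop if it cannot be opened.
  open(my $fh, '<', 'unicode-escape.txt')
    or die "Cannot open unicode-escape.txt: $!";

  while (<$fh>) {
    chomp;
    print encode 'utf8', decode 'unicode-escape', $_;
  }
  close($fh);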

DESCRIPTION

The Encode::Escape::Unicode module implements an encoding for Unicode escape sequences.

Simply put, it decodes escape sequences into Perl's internal string representation (\x{0000} -- \x{ffff}) and encodes Perl strings into escape sequences.
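To make the two directions concrete, here is a minimal sketch in plain Perl. It does not use or reproduce the module's code: sketch_decode and sketch_encode are hypothetical helpers that handle only the \x{HHHH} form, and the choice to escape every non-ASCII character on output is an assumption made for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Encode::Escape::Unicode:
# turn each literal \x{HHHH} sequence into the character it names.
sub sketch_decode {
    my ($escaped) = @_;
    (my $string = $escaped) =~ s/\\x\{([0-9A-Fa-f]{1,4})\}/chr(hex($1))/ge;
    return $string;
}

# Hypothetical inverse (assumption: only non-ASCII characters are escaped).
sub sketch_encode {
    my ($string) = @_;
    (my $escaped = $string) =~ s/([^\x00-\x7F])/sprintf('\\x{%04X}', ord($1))/ge;
    return $escaped;
}

my $escaped = 'What is \x{D384}? It\'s Perl!';   # single quotes keep the backslash literal
my $string  = sketch_decode($escaped);           # now holds one character for U+D384

# Encoding the decoded string gives back the original escaped text.
print sketch_encode($string) eq $escaped ? "round trip ok\n" : "round trip differs\n";
```

With the module itself, the same round trip would go through decode 'unicode-escape' and encode 'unicode-escape', as shown in the SYNOPSIS.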

MODES AND SUPPORTED ESCAPE SEQUENCES

default or perl mode

 Escape Sequences     Description
 ----------------     --------------------------
 \a                   Alarm (beep)
 \b                   Backspace
 \e                   Escape
 \f                   Formfeed
 \n                   Newline
 \r                   Carriage return
 \t                   Tab
 \000     - \377      octal character value. \0, \00, and \000 are equivalent.
 \x00     - \xff      hexadecimal character value. \x0 and \x00 are equivalent.
 \x{0000} - \x{ffff}  hexadecimal Unicode code point. \x{0}, \x{00}, \x{000}, and \x{0000} are equivalent.

 \\                   Backslash
 \$                   Dollar sign
 \@                   At sign
 \"                   Double quote
 \                    Escape next character if known; otherwise print it

This is the default mode. You don't need to select it unless you have previously selected another mode.
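The table above can be read as a set of rewrite rules. The following is a rough sketch of those rules in plain Perl, not the module's implementation: sketch_decode_perl and %SIMPLE are hypothetical names, and dropping the backslash in front of an unknown character is an assumption about what "otherwise print" means.

```perl
use strict;
use warnings;

# Hypothetical lookup table for the single-letter escapes in the table.
my %SIMPLE = (
    a => "\a", b => "\b", e => "\e", f => "\f",
    n => "\n", r => "\r", t => "\t",
);

# Hypothetical helper, not the module's code.
sub sketch_decode_perl {
    my ($text) = @_;
    $text =~ s{
        \\ (?:
            x \{ ([0-9A-Fa-f]{1,4}) \}   # braced hex, up to four digits
          | x ([0-9A-Fa-f]{1,2})         # bare hex, one or two digits
          | ([0-3]?[0-7]{1,2})           # octal, one to three digits
          | (.)                          # any other escaped character
        )
    }{
          defined($1)          ? chr(hex($1))
        : defined($2)          ? chr(hex($2))
        : defined($3)          ? chr(oct($3))
        : exists($SIMPLE{$4})  ? $SIMPLE{$4}
        :                        $4      # assumption: unknown escapes keep the character
    }gsex;
    return $text;
}

my $decoded = sketch_decode_perl('Tab:\t At: \@ Hex: \x41 Octal: \101 Wide: \x{0041}');
print $decoded eq "Tab:\t At: \@ Hex: A Octal: A Wide: A" ? "ok\n" : "not ok\n";
```

The alternation order matters in this sketch: the braced form is tried before the bare \xHH form so that \x{0041} is not misread as \x followed by literal text.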

python or java mode

Python, Java, and C# use the \uxxxx escape sequence for Unicode characters.

 Escape Sequences     Description
 ----------------     --------------------------
 \a                   Alarm (beep)
 \b                   Backspace
 \e                   Escape
 \f                   Formfeed
 \n                   Newline
 \r                   Carriage return
 \t                   Tab
 \000   - \377        octal character value. \0, \00, and \000 are equivalent.
 \x00   - \xff        hexadecimal character value. \x0 and \x00 are equivalent.
 \u0000 - \uffff      hexadecimal Unicode code point.

 \\                   Backslash
 \$                   Dollar sign
 \@                   At sign
 \"                   Double quote
 \                    Escape next character if known; otherwise print it

If you have data that contains \uxxxx escape sequences, this will translate them to UTF-8 encoded characters:

 use Encode::Escape;

 Encode::Escape::demode 'unicode-escape', 'python';

 while(<>) {
        chomp;
        print encode 'utf8', decode 'unicode-escape', $_;
 }
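For readers who want to see what this translation amounts to, here is a small stand-alone sketch using only core Perl and the core Encode module. It is not how Encode::Escape does it: sketch_decode_u is a hypothetical helper that handles only \uXXXX with exactly four hex digits and ignores the other escapes in the table.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical helper: replace each \uXXXX (four hex digits) with its character.
sub sketch_decode_u {
    my ($text) = @_;
    $text =~ s/\\u([0-9A-Fa-f]{4})/chr(hex($1))/ge;
    return $text;
}

my $line  = 'And \u041f\u0435\u0440\u043b? It\'s Perl, too.';
my $chars = sketch_decode_u($line);     # Perl character string
my $bytes = encode('UTF-8', $chars);    # UTF-8 byte string, ready for output

# Each Cyrillic letter is one character but two bytes in UTF-8,
# so the byte count exceeds the character count.
printf "%d characters, %d bytes\n", length($chars), length($bytes);
```

Printing $bytes rather than $chars avoids Perl's "Wide character" warning on a filehandle with no encoding layer.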

And this will translate \uxxxx to \x{xxxx}:

 use Encode::Escape;

 Encode::Escape::enmode 'unicode-escape', 'perl';
 Encode::Escape::demode 'unicode-escape', 'python';

 while(<>) {
        chomp;
        print encode 'unicode-escape', decode 'unicode-escape', $_;
 }
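The net effect on \uxxxx sequences can be approximated by a purely textual rewrite. The sketch below is an illustration only, assuming the input contains no escaped backslashes directly before a "u"; sketch_u_to_x is a hypothetical name, and unlike the module it never decodes the text, so any other escapes pass through untouched.

```perl
use strict;
use warnings;

# Hypothetical helper: rewrite \uXXXX as \x{XXXX} without decoding anything.
sub sketch_u_to_x {
    my ($text) = @_;
    $text =~ s/\\u([0-9A-Fa-f]{4})/'\\x{' . $1 . '}'/ge;
    return $text;
}

my $python_style = 'And \u041f\u0435\u0440\u043b? It\'s Perl, too.';
print sketch_u_to_x($python_style), "\n";
```

Going through decode and encode, as in the code above, is the more robust route because every escape is interpreted according to the selected modes rather than pattern-matched.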

SEE ALSO

See Encode::Escape.

AUTHOR

you, <you at cpan dot org>

COPYRIGHT AND LICENSE

Copyright (C) 2007 by you

This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself, either Perl version 5.8.8 or, at your option, any later version of Perl 5 you may have available.
