The Perl Toolchain Summit needs more sponsors. If your company depends on Perl, please support this very important event.
NAME

    utf8::all - turn on Unicode - all of it

VERSION

    version 0.024

SYNOPSIS

        use utf8::all;                      # Turn on UTF-8, all of it.
    
        open my $in, '<', 'contains-utf8';  # UTF-8 already turned on here
        print length 'føø bār';             # 7 UTF-8 characters
        my $utf8_arg = shift @ARGV;         # @ARGV is UTF-8 too (only for main)

DESCRIPTION

    The use utf8 pragma tells the Perl parser to allow UTF-8 in the program
    text in the current lexical scope. This also means that you can now use
    literal Unicode characters as part of strings, variable names, and
    regular expressions.

    utf8::all goes further:

      * charnames are imported so \N{...} sequences can be used to compile
      Unicode characters based on names.

      * On Perl v5.11.0 or higher, the use feature 'unicode_strings' is
      enabled.

      * use feature fc and use feature unicode_eval are enabled on Perl
      5.16.0 and higher.

      * Filehandles are opened with UTF-8 encoding turned on by default
      (including STDIN, STDOUT, and STDERR when utf8::all is used from the
      main package). Meaning that they automatically convert UTF-8 octets
      to characters and vice versa. If you don't want UTF-8 for a
      particular filehandle, you'll have to set binmode $filehandle.

      * @ARGV gets converted from UTF-8 octets to Unicode characters (when
      utf8::all is used from the main package). This is similar to the
      behaviour of the -CA perl command-line switch (see perlrun).

      * readdir, readlink, readpipe (including the qx// and backtick
      operators), and glob (including the <> operator) now all work with
      and return Unicode characters instead of (UTF-8) octets (again only
      when utf8::all is used from the main package).

 Lexical Scope

    The pragma is lexically-scoped, so you can do the following if you had
    some reason to:

        {
            use utf8::all;
            open my $out, '>', 'outfile';
            my $utf8_str = 'føø bār';
            print length $utf8_str, "\n"; # 7
            print $out $utf8_str;         # out as utf8
        }
        open my $in, '<', 'outfile';      # in as raw
        my $text = do { local $/; <$in>};
        print length $text, "\n";         # 10, not 7!

    Instead of lexical scoping, you can also use no utf8::all to turn off
    the effects.

    Note that the effect on @ARGV and the STDIN, STDOUT, and STDERR file
    handles is always global and can not be undone!

 Enabling/Disabling Global Features

    As described above, the default behaviour of utf8::all is to convert
    @ARGV and to open the STDIN, STDOUT, and STDERR file handles with UTF-8
    encoding, and override the readlink and readdir functions and glob
    operators when utf8::all is used from the main package.

    If you want to disable these features even when utf8::all is used from
    the main package, add the option NO-GLOBAL (or LEXICAL-ONLY) to the use
    line. E.g.:

        use utf8::all 'NO-GLOBAL';

    If on the other hand you want to enable these global effects even when
    utf8::all was used from another package than main, use the option
    GLOBAL on the use line:

        use utf8::all 'GLOBAL';

 UTF-8 Errors

    utf8::all will handle invalid code points (i.e., utf-8 that does not
    map to a valid unicode "character"), as a fatal error.

    For glob, readdir, and readlink, one can change this behaviour by
    setting the attribute "$utf8::all::UTF8_CHECK".

ATTRIBUTES

 $utf8::all::UTF8_CHECK

    By default utf8::all marks decoding errors as fatal (default value for
    this setting is Encode::FB_CROAK). If you want, you can change this by
    setting $utf8::all::UTF8_CHECK. The value Encode::FB_WARN reports the
    encoding errors as warnings, and Encode::FB_DEFAULT will completely
    ignore them. Please see Encode for details. Note: Encode::LEAVE_SRC is
    always enforced.

    Important: Only controls the handling of decoding errors in glob,
    readdir, and readlink.

INTERACTION WITH AUTODIE

    If you use autodie, which is a great idea, you need to use at least
    version 2.12, released on June 26, 2012
    <https://metacpan.org/source/PJF/autodie-2.12/Changes#L3>. Otherwise,
    autodie obliterates the IO layers set by the open pragma. See RT #54777
    <https://rt.cpan.org/Ticket/Display.html?id=54777> and GH #7
    <https://github.com/doherty/utf8-all/issues/7>.

BUGS

    Please report any bugs or feature requests on the bugtracker website
    <https://github.com/doherty/utf8-all/issues>.

    When submitting a bug or request, please include a test-file or a patch
    to an existing test-file that illustrates the bug or desired feature.

COMPATIBILITY

    The filesystems of Dos, Windows, and OS/2 do not (fully) support UTF-8.
    The readlink and readdir functions and glob operators will therefore
    not be replaced on these systems.

SEE ALSO

      * File::Find::utf8 for fully utf-8 aware File::Find functions.

      * Cwd::utf8 for fully utf-8 aware Cwd functions.

AUTHORS

      * Michael Schwern <mschwern@cpan.org>

      * Mike Doherty <doherty@cpan.org>

      * Hayo Baan <info@hayobaan.com>

COPYRIGHT AND LICENSE

    This software is copyright (c) 2009 by Michael Schwern
    <mschwern@cpan.org>; he originated it.

    This is free software; you can redistribute it and/or modify it under
    the same terms as the Perl 5 programming language system itself.