Module Version: 1.0001

NAME

XML::Entities - Decode strings with XML entities

SYNOPSIS

 use XML::Entities;

 $a = "Tom &amp; Jerry &copy; Warner Bros.";
 $b = XML::Entities::decode('all', $a);
 $c = XML::Entities::numify('all', $a);
 # now $b is "Tom & Jerry © Warner Bros."
 # and $c is "Tom &#38; Jerry &#169; Warner Bros."

 # void context modifies the arguments
 XML::Entities::numify('all', $a);
 XML::Entities::decode('all', $a, $c);
 # Now $a, $b and $c all contain the decoded string

DESCRIPTION

Based upon the HTML::Entities module by Gisle Aas.

This module deals with decoding of strings with XML character entities. The module provides two functions:

decode( $entity_set, $string, ... )

This routine replaces XML entities from $entity_set found in the $string with the corresponding Unicode character. Unrecognized entities are left alone.

The $entity_set can be the name of an entity set (XML::Entities::Data::names() returns the available names), "all" for the union of all sets, or a hashref that maps entity names (without the leading &) to the corresponding Unicode characters (or strings).

If multiple strings are passed as arguments, each one is decoded separately and the same number of strings is returned.

If called in void context, the arguments are decoded in place.

Note: If your version of HTML::Parser was built without Unicode support, then XML::Entities uses a regular expression to do the decoding, which is slower.
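As an illustration of the hashref form of $entity_set, here is a minimal sketch. The %custom mapping and the entity name "company" are made up for this example; they are not part of the module. The call is made in list context, so the input strings are left untouched and decoded copies are returned.

```perl
use strict;
use warnings;
use XML::Entities;

# Hypothetical entity set: names (without the leading '&') map to replacement strings.
my %custom = (
    company => 'Example Corp.',
);

my @input = (
    'Made by &company;',
    'Unknown &nosuch; stays as it is',
);

# List context: one decoded string is returned per input string.
my ($first, $second) = XML::Entities::decode(\%custom, @input);

print "$first\n";
print "$second\n";
```

The second string contains an entity that is not in the set, so it should come back unchanged, in line with the note above that unrecognized entities are left alone.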

numify( $entity_set, $string, ... )

This function converts named XML entities to numeric XML entities. It is less robust than the decode function in that it does not catch improperly terminated entities. It handles its parameters and return values the same way decode does.
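A short sketch of numify with several strings in list context. The input strings are taken from the SYNOPSIS; the expected numeric forms mentioned after the code are also the ones the SYNOPSIS shows, not something verified independently here.

```perl
use strict;
use warnings;
use XML::Entities;

my @titles = ('Tom &amp; Jerry', '&copy; Warner Bros.');

# List context: @titles is left as it is and converted copies are returned.
my @numeric = XML::Entities::numify('all', @titles);

print "$_\n" for @numeric;
```

Going by the SYNOPSIS, &amp; should become &#38; and &copy; should become &#169;.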

XML::Entities::Data

The list of entities is defined in the XML::Entities::Data module. The list can be generated from the w3.org definition (or any other). Check perldoc XML::Entities::Data for more details.
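To see which entity set names are available, something along these lines should work. It assumes that XML::Entities::Data::names() returns a plain list of names; check perldoc XML::Entities::Data if your version behaves differently.

```perl
use strict;
use warnings;
use XML::Entities;
use XML::Entities::Data;

# Assumption: names() returns a plain list of entity set names.
my @sets = XML::Entities::Data::names();
print "Available entity sets: @sets\n";

# Any of these names can be passed to decode in place of 'all'.
my $untouched = XML::Entities::decode($sets[0], 'plain text');
print "$untouched\n";
```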

Encoding entities

The HTML::Entities module provides a function for encoding entities. You just have to assign the right mapping to the %HTML::Entities::char2entity hash. So, to encode everything that XML::Entities knows about, you'd say:

 use XML::Entities;
 use HTML::Entities;
 %HTML::Entities::char2entity = %{
    XML::Entities::Data::char2entity('all');
 };
 my $encoded = encode_entities('tom&jerry');
 # now $encoded is 'tom&amp;jerry'
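The assignment above replaces the HTML::Entities mapping for the rest of the program. If that is not wanted, one possible variation is to wrap the call in a helper that uses local, so that the original mapping is restored when the helper returns. The name encode_xml_entities is made up for this sketch and is not provided by either module.

```perl
use strict;
use warnings;
use XML::Entities;
use HTML::Entities;

# Hypothetical helper: encode with the XML::Entities mapping for one call only.
sub encode_xml_entities {
    my ($text) = @_;
    # local restores the original HTML::Entities mapping when this sub returns.
    local %HTML::Entities::char2entity =
        %{ XML::Entities::Data::char2entity('all') };
    return encode_entities($text);
}

my $encoded = encode_xml_entities('tom&jerry');
print "$encoded\n";
```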

SEE ALSO

HTML::Entities, XML::Entities::Data

COPYRIGHT

Copyright 2012 Jan Oldrich Kruza <sixtease@cpan.org>. All rights reserved.

This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.
