XML::MyXML::II - A simple-to-use XML module, for parsing and creating XML documents
version 0.1002
    use XML::MyXML::II qw(tidy_xml xml_to_object);
    use XML::MyXML::II qw(:all);

    my $xml = "<item><name>Table</name><price><usd>10.00</usd><eur>8.50</eur></price></item>";
    print tidy_xml($xml);

    my $obj = xml_to_object($xml);
    print "Price in Euros = " . $obj->path('price/eur')->value;

    $obj->simplify;
    # is hashref { item => { name => 'Table', price => { usd => '10.00', eur => '8.50' } } }

    $obj->simplify({ internal => 1 });
    # is hashref { name => 'Table', price => { usd => '10.00', eur => '8.50' } }
tidy_xml, xml_to_object, object_to_xml, simple_to_xml, xml_to_simple, check_xml
XML::MyXML::II is similar to XML::MyXML, but introduced some changes in v0.100 that broke backwards compatibility with programs that use XML::MyXML. For this reason, XML::MyXML has been kept in its place, but will not be maintained anymore. XML::MyXML::II is the module you should use (and will be maintained). Its differences from the older XML::MyXML are the following:
Better handling of unicode: Only XML documents (both parameters and returned values) are in bytes/octets. All other strings contain characters rather than bytes/octets (see section "FEATURES & LIMITATIONS"). Removed the utf8 option from the functions & methods that used it.
Objects created by this module will now be automatically destroyed once out of scope (because I replaced cycles with weakened refs)
Removed the pretty useless soft option from the functions and methods that used it. If you want XML::MyXML's soft behaviour, you will need to eval {} from your own program.
$obj->tag no longer strips the namespace from the returned tagname by default
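The eval {} approach that replaces the removed soft option can be sketched as follows. This is a minimal illustration, assuming the parser reports malformed input by dying; the sample document is made up for the example.

```perl
use strict;
use warnings;
use XML::MyXML::II qw(xml_to_object);

# Deliberately malformed: <name> is never closed.
my $broken_xml = "<item><name>Table</item>";

# eval {} traps the exception, so $obj is undef when parsing fails.
my $obj = eval { xml_to_object($broken_xml) };

if (!defined $obj) {
    print "Could not parse the document\n";
}
```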
This module can parse XML comments, CDATA sections, XML entities (the standard five and numeric ones) and simple non-recursive <!ENTITY>s
It will ignore (won't parse) <!DOCTYPE...>, <?...?> and other <!...> special markup
XML documents passed as parameters to this module's functions must be strings containing bytes/octets rather than characters. They must also be UTF-8 encoded unless an encoding is declared in the initial XML declaration <?xml ... ?> of the document. All XML documents produced by this module will be UTF-8 encoded (bytes/octets). However, all other strings output by this module's functions and methods (those that are not XML documents) will contain characters rather than bytes/octets.
XML documents to be parsed may not contain the > character unencoded in attribute values
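The bytes-versus-characters rule can be illustrated with a short sketch. It assumes the core Encode module is used to produce the UTF-8 octets; the element name and text are made up for the example.

```perl
use strict;
use warnings;
use Encode qw(encode_utf8);
use XML::MyXML::II qw(xml_to_object);

# A character string holding five Greek letters inside a <city> element.
my $xml_chars = "<city>\x{391}\x{3b8}\x{3ae}\x{3bd}\x{3b1}</city>";

# XML documents must be passed in as UTF-8 encoded bytes/octets.
my $xml_bytes = encode_utf8($xml_chars);

my $obj  = xml_to_object($xml_bytes);
my $name = $obj->value;    # not an XML document, so it holds characters

binmode STDOUT, ':encoding(UTF-8)';
print "$name has ", length($name), " characters\n";
```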
Some functions and methods in this module accept optional flags, listed under each function in the documentation. They are optional, default to zero unless stated otherwise, and can be used as follows: function_name( $param1, { flag1 => 1, flag2 => 1 } ). This is what each flag does:
strip : the function will strip initial and ending whitespace from all text values returned
file : the function will expect the path to a file containing an XML document to parse, instead of an XML string
complete : the function's XML output will include an XML declaration (<?xml ... ?>) in the beginning
internal : the function will only return the contents of an element in a hashref instead of the element itself (see "SYNOPSIS" for example)
tidy : the function will return tidy XML
indentstring : when producing tidy XML, this denotes the string with which child elements will be indented (Default is the 'tab' character)
save : the function (apart from doing what it's supposed to do) will also save its XML output in a file whose path is denoted by this flag
strip_ns : strip the namespaces (characters up to and including ':') from the tags
xslt : will add a <?xml-stylesheet?> link in the XML that's being output, of type 'text/xsl', pointing to the filename or URL denoted by this flag
arrayref : the function will create a simple arrayref instead of a simple hashref (unlike a hashref, the arrayref preserves order and elements with duplicate tags)
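As a rough illustration of how flags are passed, the sketch below combines a few of them in the trailing hashref. The sample XML and the two-space indent are arbitrary choices for the example.

```perl
use strict;
use warnings;
use XML::MyXML::II qw(tidy_xml xml_to_simple);

my $xml = "<item><name> Table </name></item>";

# 'complete' adds the XML declaration; 'indentstring' replaces the default tab.
my $tidy = tidy_xml($xml, { complete => 1, indentstring => '  ' });
print $tidy, "\n";

# 'strip' trims leading and trailing whitespace from returned text values.
my $simple = xml_to_simple($xml, { strip => 1 });
print "[", $simple->{item}{name}, "]\n";
```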
Returns the same string, but with the <, >, &, " and ' characters replaced by their XML entities (e.g. & becomes &amp;).
tidy_xml( $raw_xml ) : Returns the XML string in a tidy format (with tabs and newlines)
Optional flags: file, complete, indentstring, save
xml_to_object( $raw_xml ) : Creates an 'XML::MyXML::II::Object' object from the raw XML provided
Optional flags: file
object_to_xml( $obj ) : Creates an XML string from the 'XML::MyXML::II::Object' object provided
Optional flags: complete, tidy, indentstring, save
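A minimal round-trip sketch of the two functions above: parse a document into an object, then serialize it back with some of the listed flags. The sample XML and indent string are illustrative only.

```perl
use strict;
use warnings;
use XML::MyXML::II qw(xml_to_object object_to_xml);

my $obj = xml_to_object("<item><name>Table</name></item>");

# Serialize back to XML: with a declaration, tidied, two-space indentation.
my $xml = object_to_xml($obj, { complete => 1, tidy => 1, indentstring => '  ' });
print $xml, "\n";
```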
simple_to_xml( $simple_array_ref ) : Produces a raw XML string from either an array reference, a hash reference or a mixed structure, such as these examples:

    { thing => { name => 'John', location => { city => 'New York', country => 'U.S.A.' } } }
    [ thing => [ name => 'John', location => [ city => 'New York', country => 'U.S.A.' ] ] ]
    { thing => { name => 'John', location => [ city => 'New York', city => 'Boston', country => 'U.S.A.' ] } }
All the strings in $simple_array_ref need to contain characters, rather than bytes/octets. The XML output of this function however will be a UTF-8 encoded string (i.e. will contain bytes/octets).
Optional flags: complete, tidy, indentstring, save, xslt
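For instance, an array reference keeps duplicate tags, which a hash reference cannot. The sketch below feeds such a structure to simple_to_xml; the data mirrors the examples above.

```perl
use strict;
use warnings;
use XML::MyXML::II qw(simple_to_xml);

# Arrayrefs allow the same tag (city) to appear twice, in order.
my $simple = [
    thing => [
        name     => 'John',
        location => [ city => 'New York', city => 'Boston', country => 'U.S.A.' ],
    ],
];

my $xml = simple_to_xml($simple, { tidy => 1, indentstring => '  ' });
print $xml, "\n";
```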
xml_to_simple( $raw_xml ) : Produces a very simple hash object from the raw XML string provided. An example of a hash object created this way: { thing => { name => 'John', location => { city => 'New York', country => 'U.S.A.' } } }
Since the object created is a hashref, duplicate keys will be discarded. WARNING: This function only works on very simple XML strings, i.e. children of an element may not consist of both text and elements (child elements will be discarded in that case)
All strings contained in the output simple structure will contain characters rather than octets/bytes.
Optional flags: internal, strip, file, strip_ns, arrayref
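The effect of the internal flag can be seen by running the same document through xml_to_simple twice. This sketch reuses the XML from the synopsis.

```perl
use strict;
use warnings;
use XML::MyXML::II qw(xml_to_simple);

my $xml = "<item><name>Table</name><price><usd>10.00</usd><eur>8.50</eur></price></item>";

# Default: the top element (item) is the single key of the hashref.
my $full = xml_to_simple($xml);
print $full->{item}{price}{eur}, "\n";

# internal => 1: only the contents of the top element are returned.
my $inner = xml_to_simple($xml, { internal => 1 });
print $inner->{price}{eur}, "\n";
```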
check_xml( $raw_xml ) : Returns true if the $raw_xml string is valid XML (valid enough to be used by this module), and false otherwise.
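A short sketch of using check_xml as a guard before parsing. Both sample strings are made up; the second has mismatched tags.

```perl
use strict;
use warnings;
use XML::MyXML::II qw(check_xml);

my @documents = ("<a><b>ok</b></a>", "<a><b>broken</a>");

# Collect one verdict per document so the results can be inspected later.
my @verdicts = map { check_xml($_) ? 'valid' : 'invalid' } @documents;

print "$documents[$_] is $verdicts[$_]\n" for 0 .. $#documents;
```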
$obj->path( $path ) : Returns the element specified by the path as an XML::MyXML::II::Object object. When more than one tag with the specified name matches the last step of the path, it returns all of them as a list; in scalar context it returns only the first one. CSS3-style attribute selectors are allowed in the path next to the tagnames. For example, p[class=big] will only return <p> elements that have an attribute called "class" with a value of "big", whereas p[class] will return <p> elements that have a "class" attribute with any value.
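The context-dependent return value and the attribute selectors can be sketched like this. The document and class names are invented for the example.

```perl
use strict;
use warnings;
use XML::MyXML::II qw(xml_to_object);

my $xml = '<doc><p class="big">One</p><p class="small">Two</p><p>Three</p></doc>';
my $obj = xml_to_object($xml);

my @all     = $obj->path('p');             # list context: every <p>
my $first   = $obj->path('p');             # scalar context: first <p> only
my @classed = $obj->path('p[class]');      # <p> elements with any class
my ($small) = $obj->path('p[class=small]');

print scalar(@all), " paragraphs, ", scalar(@classed), " with a class\n";
print "first: ", $first->value, ", small: ", $small->value, "\n";
```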
$obj->value : When the element represented by $obj has only text contents, returns those contents as a string. If the element has no contents, returns an empty string.
Optional flags: strip
Gets/Sets the value of the 'attrname' attribute of the top element. Returns undef if the attribute does not exist. If called without the 'attrname' parameter, returns a hash with all attribute => value pairs. If set with an attrvalue of undef, removes that attribute entirely.
Input parameters and output are all in character strings, rather than octets/bytes.
Optional flags: none
Returns the tag of the $obj element. E.g. if $obj represents an <rss:item> element, $obj->tag will return the string 'rss:item'. Returns undef if $obj doesn't represent a tag.
Optional flags: strip_ns
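A small sketch of the tag method with and without strip_ns. The rss prefix follows the example in the description; the prefix is left undeclared on the assumption that the parser does not validate namespaces.

```perl
use strict;
use warnings;
use XML::MyXML::II qw(xml_to_object);

my $obj = xml_to_object('<rss:item><rss:title>Hi</rss:title></rss:item>');

my $full_tag  = $obj->tag;                     # namespace prefix kept by default
my $local_tag = $obj->tag({ strip_ns => 1 });  # characters up to and including ':' removed

print "$full_tag / $local_tag\n";
```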
$obj->simplify : Returns a very simple hashref, like the one returned by &XML::MyXML::II::xml_to_simple. The same restrictions and warnings apply.
Optional flags: internal, strip, strip_ns, arrayref
Returns the XML string of the object, just like calling object_to_xml( $obj )
Returns the XML string of the object in tidy form, just like calling tidy_xml( object_to_xml( $obj ) )
Optional flags: complete, indentstring, save
If you have a GitHub account, report your issues at https://github.com/akarelas/xml-myxml/issues. I will be notified, and you will then automatically be notified of progress on your bug as I make changes.
Alexander Karelas <karjala@cpan.org>
This software is copyright (c) 2013 by Alexander Karelas.
This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.