NAME

WebService::ReutersConnect::XMLDocument - A decoration of XML::LibXML::Document with extra gizmos

SYNOPSIS

This basically acts as an XML::LibXML::Document, except that it has the following extra attributes and methods:

xml_namespaces

Returns an array reference of all the XML::LibXML::Namespace nodes included in this document. This is mainly for internal use.

Usage:

 foreach my $ns_node ( @{ $this->xml_namespaces() } ){
    ## Print some stuff.
 }
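How this module gathers its namespace list is not documented here. As a rough sketch of one way such a list could be built with plain XML::LibXML, the example below walks every element and collects its declared namespaces once each. The sample document, the example.com URIs and the variable names are assumptions made for illustration; this is not the module's actual implementation.

```perl
use strict;
use warnings;
use XML::LibXML;

# Hypothetical sample document with three namespace declarations.
my $doc = XML::LibXML->load_xml( string => <<'XML' );
<root xmlns="http://example.com/default" xmlns:a="http://example.com/a">
  <a:item xmlns:b="http://example.com/b"><b:leaf/></a:item>
</root>
XML

my %seen;
my @namespaces;
foreach my $element ( $doc->findnodes('//*') ) {
    # getNamespaces() returns only the namespaces declared on this element.
    foreach my $ns ( $element->getNamespaces() ) {
        my $prefix = $ns->declaredPrefix();
        $prefix = '' unless defined $prefix;    # default namespace has no prefix
        my $key = $prefix . '|' . $ns->declaredURI();
        push @namespaces, $ns unless $seen{$key}++;
    }
}

# Iterate the collected nodes, as one would with xml_namespaces().
foreach my $ns_node (@namespaces) {
    my $prefix = $ns_node->declaredPrefix();
    $prefix = '(default)' unless defined $prefix && length $prefix;
    printf "%s => %s\n", $prefix, $ns_node->declaredURI();
}
```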

xml_xpath

A ready-to-use instance of XML::LibXML::XPathContext with the document's namespaces preregistered.

NOTE: The document's default namespace is registered under the prefix 'rcx' (rEUTERS cONNECT xML).

Usage:

  print( $this->xml_xpath->findvalue('//rcx:headline') );
  print( $this->xml_xpath->findvalue('//rcx:description') );
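To see why a preregistered prefix matters, here is a minimal sketch using XML::LibXML directly. Elements that live in a default namespace cannot be matched by an unprefixed XPath step, so a prefix has to be bound to that namespace first. The namespace URI and the sample document are placeholders invented for this example, not the real Reuters Connect ones.

```perl
use strict;
use warnings;
use XML::LibXML;
use XML::LibXML::XPathContext;

# Hypothetical namespace URI and document, for illustration only.
my $ns_uri = 'http://example.com/rcx';
my $doc    = XML::LibXML->load_xml( string => <<"XML" );
<results xmlns="$ns_uri">
  <result>
    <headline>Example headline</headline>
    <description>Example description</description>
  </result>
</results>
XML

# Bind the 'rcx' prefix to the document's default namespace.
my $xpc = XML::LibXML::XPathContext->new($doc);
$xpc->registerNs( rcx => $ns_uri );

# The prefixed query finds the element; the unprefixed one matches nothing
# because 'headline' alone means "headline in no namespace".
my $headline   = $xpc->findvalue('//rcx:headline');
my $unprefixed = $xpc->findvalue('//headline');

print "prefixed:   '$headline'\n";
print "unprefixed: '$unprefixed'\n";
```

With xml_xpath the registration step is presumably already done for you, so only the query itself is needed.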

get_subjects

Returns a list of WebService::ReutersConnect::DB::Result::Concept objects representing the subjects of this Reuters news document.

Usage:

  my @subjects = $this->get_subjects();
  foreach my $subject ( @subjects ){
    print $subject->name_main()."\n";
    ...
  }

get_html_body

Returns the XML::LibXML::Element that is the HTML body of this rich document.

In list context, returns the non-blank children of the body directly as a list. This is useful for displaying the body content without outputting the 'body' element itself again.

Usage:

   if( my $body = $this->get_html_body() ){
     print $body->toString(1);
   }

   if( my @body_parts = $this->get_html_body() ){
     print join("\n" , map{ $_->toString(1) } @body_parts );
   }
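The context-dependent return value described above can be sketched with Perl's wantarray. The helper html_body_of and the sample document below are hypothetical and only illustrate the pattern; the module's own implementation may differ.

```perl
use strict;
use warnings;
use XML::LibXML;

# Hypothetical helper, not the module's actual code: returns the body
# element in scalar context and its non-blank children in list context.
sub html_body_of {
    my ($doc) = @_;
    my ($body) = $doc->findnodes('//body');
    return unless $body;    # undef in scalar context, empty list otherwise
    return wantarray ? $body->nonBlankChildNodes() : $body;
}

my $doc = XML::LibXML->load_xml( string => <<'XML' );
<html>
  <body>
    <p>First paragraph.</p>
    <p>Second paragraph.</p>
  </body>
</html>
XML

my $body  = html_body_of($doc);    # scalar context: the element itself
my @parts = html_body_of($doc);    # list context: its non-blank children

print $body->nodeName(), "\n";
print join( "\n", map { $_->toString() } @parts ), "\n";
```

Returning an empty list when no body is found keeps both `if( my $body = ... )` and `if( my @body_parts = ... )` false in that case.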