Jerome Eteve > WebService-ReutersConnect > WebService::ReutersConnect::XMLDocument

NAME

WebService::ReutersConnect::XMLDocument - A decoration of XML::LibXML::Document with extra gizmos

SYNOPSIS

This basically acts as an XML::LibXML::Document, except that it has the following extra attributes:

xml_namespaces

Returns an array ref of all the XML::LibXML::Namespace nodes included in this document. This is mainly for internal use.

Usage:

 foreach my $ns_node ( @{ $this->xml_namespaces() } ){
    ## Print some stuff.
 }
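
As an illustration of what such a list contains, the following is a minimal sketch that collects the distinct namespace declarations of a plain XML::LibXML document. It is not this module's implementation: the sample XML, the `urn:example:*` URIs and the helper name `collect_namespaces` are all hypothetical.

```perl
use strict;
use warnings;
use XML::LibXML;

# Hypothetical helper (not part of WebService::ReutersConnect):
# gather every namespace declared on any element, keeping one
# XML::LibXML::Namespace node per distinct prefix/URI pair.
sub collect_namespaces {
    my ($doc) = @_;
    my %seen;
    my @namespaces =
        grep { !$seen{ ( $_->declaredPrefix // '' ) . '|' . $_->declaredURI }++ }
        map  { $_->getNamespaces } $doc->findnodes('//*');
    return \@namespaces;
}

# Made-up sample document with a default and a prefixed namespace.
my $doc = XML::LibXML->load_xml( string => <<'XML' );
<root xmlns="urn:example:default" xmlns:ex="urn:example:extra">
  <ex:child/>
</root>
XML

foreach my $ns_node ( @{ collect_namespaces($doc) } ) {
    printf "%s => %s\n", ( $ns_node->declaredPrefix // '(default)' ), $ns_node->declaredURI;
}
```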

xml_xpath

A ready-to-use instance of XML::LibXML::XPathContext with the document's namespaces preregistered.

NOTE: The default namespace is registered under the prefix 'rcx' (Reuters Connect XML).

Usage:

  print( $this->xml_xpath->findvalue('//rcx:headline') );
  print( $this->xml_xpath->findvalue('//rcx:description') );
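
To show why a preregistered prefix is convenient, here is a minimal sketch of doing the same thing by hand with XML::LibXML::XPathContext. The namespace URI and the sample document are placeholders chosen for illustration; the real Reuters Connect namespace URI is not given here.

```perl
use strict;
use warnings;
use XML::LibXML;
use XML::LibXML::XPathContext;

# Placeholder namespace URI, assumed for this example only.
my $ns_uri = 'urn:example:reuters-connect';

# Made-up document whose elements live in a default namespace.
my $doc = XML::LibXML->load_xml( string => <<"XML" );
<newsItem xmlns="$ns_uri">
  <headline>Sample headline</headline>
  <description>Sample description</description>
</newsItem>
XML

# XPath 1.0 has no notion of a default namespace, so the default
# namespace must be bound to an explicit prefix before querying.
my $xpc = XML::LibXML::XPathContext->new($doc);
$xpc->registerNs( 'rcx', $ns_uri );

my $headline    = $xpc->findvalue('//rcx:headline');
my $description = $xpc->findvalue('//rcx:description');

print "$headline\n$description\n";
```

Without the registerNs call, an unprefixed query such as '//headline' would match nothing in this document, because the elements are namespaced.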

get_subjects

Returns an array of WebService::ReutersConnect::DB::Result::Concept objects representing the subjects of this Reuters news document.

Usage:

  my @subjects = $this->get_subjects();
  foreach my $subject ( @subjects ){
    print $subject->name_main()."\n";
    ...
  }

get_html_body

In scalar context, returns the XML::LibXML::Element that is the HTML body of this rich document.

In list context, directly returns the non-blank children of the body as an array. This is useful for displaying the body content without outputting the 'body' element again.

Usage:

   if( my $body = $this->get_html_body() ){
     print $body->toString(1);
   }

   if( my @body_parts = $this->get_html_body() ){
     print join("\n" , map{ $_->toString(1) } @body_parts );
   }
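
The context-dependent return value described above can be sketched with Perl's `wantarray`. This is only an illustration under assumptions: the helper name `html_body_of`, the `//body` lookup and the sample markup are hypothetical, and the module's real implementation may differ.

```perl
use strict;
use warnings;
use XML::LibXML;

# Hypothetical stand-in for get_html_body, operating on a plain
# XML::LibXML document instead of a ReutersConnect one.
sub html_body_of {
    my ($doc) = @_;
    my ($body) = $doc->findnodes('//body');
    return unless $body;              # undef in scalar, empty list in list context
    return $body unless wantarray;    # scalar context: the body element itself
    return $body->nonBlankChildNodes; # list context: children, skipping whitespace-only text
}

# Made-up sample markup.
my $doc = XML::LibXML->load_xml(
    string => "<html><body>\n<p>One</p>\n<p>Two</p>\n</body></html>" );

my $body_element = html_body_of($doc);
my @body_parts   = html_body_of($doc);

print $body_element->nodeName, "\n";
print join( "\n", map { $_->toString } @body_parts ), "\n";
```

Returning a bare `return` when no body is found keeps both `if( my $body = ... )` and `if( my @parts = ... )` false, which matches the usage patterns shown above.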