Alvis::NLPPlatform::Annotation - Perl extension for managing XML annotation of documents in the Alvis format
use Alvis::NLPPlatform::Annotation;
Alvis::NLPPlatform::Annotation::load_xml($doc_xml);
Alvis::NLPPlatform::Annotation::render_xml($doc_xml, \*STDOUT);
This module provides two main methods (load_xml and render_xml) for loading and dumping XML annotated documents that conform to the Alvis DTD (see http://www.alvis.info ).
load_xml
render_xml
Documents are read from the standard input and loaded into a hash table. Annotated documents are written to a file through the descriptor given as a parameter. Note that input documents may be annotated, unannotated, or only partially annotated.
read_key_id($element_id);
This method returns the number in the id ($element_id) of a token or word XML element (10 in the element id 'token10').
$element_id
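The extraction described above can be sketched in a few lines. This is a minimal illustration, not the module's actual code: the name read_key_id_sketch is hypothetical, and it assumes the number is the run of digits at the end of the id.

```perl
use strict;
use warnings;

# Hypothetical stand-in for read_key_id (assumption: the number is the
# trailing run of digits in the element id).
sub read_key_id_sketch {
    my ($element_id) = @_;
    # Capture the digits at the end of the id; return undef if there are none.
    return $element_id =~ /(\d+)\z/ ? $1 + 0 : undef;
}

print read_key_id_sketch('token10'), "\n";   # prints 10
print read_key_id_sketch('word3'), "\n";     # prints 3
```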
sort_keys($element_id1, $element_id2);
This method sorts two XML element ids ($element_id1 and $element_id2) after removing the string referring to the type of the XML element ("token", "word", etc.).
$element_id1
$element_id2
sort($ref_hashtable)
This method sorts elements of the hash table ($ref_hashtable) according to the number in the id ($element_id) of the XML elements (10 in the element id 'token10').
$ref_hashtable
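As an illustration of why the ids are compared by number rather than as strings, the sketch below orders the keys of a small hash table. The comparator name by_id_number and the sample data are hypothetical; the module's own implementation may differ.

```perl
use strict;
use warnings;

# Hypothetical comparator: ignores the type prefix ("token", "word", ...)
# and compares the trailing numbers of the two ids.
sub by_id_number {
    my ($id1, $id2) = @_;
    my ($n1) = $id1 =~ /(\d+)\z/;
    my ($n2) = $id2 =~ /(\d+)\z/;
    return $n1 <=> $n2;
}

# Sample hash table keyed by element id (made-up data for the example).
my %tokens = (token10 => 'c', token2 => 'b', token1 => 'a');

my @ordered = sort { by_id_number($a, $b) } keys %tokens;
print "@ordered\n";   # prints: token1 token2 token10
```

A plain string sort would place 'token10' before 'token2', which is why the numeric part is extracted first.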
render($doc_hash, $descriptor);
Writes the XML document annotation to the specified descriptor ($descriptor). The document is passed as a hashtable ($doc_hash) loaded by the method load_xml. This hashtable can be modified by NLP Wrappers (Alvis::NLPPlatform::NLPWrappers).
$descriptor
$doc_hash
Alvis::NLPPlatform::NLPWrappers
The method returns 0 on success.
render($doc_hash, $descriptor, $printCollectionHeaderFooter);
Main method used for generating XML document annotations. $descriptor is the descriptor of the file where the document will be stored. $doc_hash is the hashtable containing the annotated document. $printCollectionHeaderFooter indicates whether the documentCollection header and footer have to be printed. $hash_config is the reference to the hashtable containing the variables defined in the configuration file.
$printCollectionHeaderFooter
documentCollection
$hash_config
load_xml($doc_xml);
Reads an input XML annotated document ($doc_xml) from STDIN. The loaded annotations are stored in a hashtable. This hashtable can be modified by NLP Wrappers (Alvis::NLPPlatform::NLPWrappers).
$doc_xml
print_Annotation($descriptor, $string);
This method prints annotations to the descriptor and ensures their conformance (to UTF-8, for instance).
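One way such a printing step could guarantee UTF-8 output is sketched below with the core Encode module. This is an assumption about the general technique, not the module's actual code: print_utf8_sketch is a hypothetical name, and the real method may perform other conformance checks.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical stand-in for print_Annotation: encode the string as UTF-8
# bytes, then print it to the given descriptor.
sub print_utf8_sketch {
    my ($descriptor, $string) = @_;
    print {$descriptor} encode('UTF-8', $string);
    return;
}

# Write to an in-memory file handle so the bytes can be inspected.
my $buffer = '';
open(my $fh, '>', \$buffer) or die "Cannot open in-memory handle: $!";
print_utf8_sketch($fh, "<content>caf\x{e9}</content>\n");
close($fh);
```

After the call, $buffer holds the UTF-8 byte sequence, in which the character U+00E9 is encoded as the two bytes 0xC3 0xA9.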
Alvis::NLPPlatform
Alvis web site: http://www.alvis.info
Thierry Hamon <thierry.hamon@lipn.univ-paris13.fr> and Julien Deriviere <julien.deriviere@lipn.univ-paris13.fr>
Copyright (C) 2005 by Thierry Hamon and Julien Deriviere
This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself, either Perl version 5.8.6 or, at your option, any later version of Perl 5 you may have available.