NAME

Alvis::NLPPlatform::Annotation - Perl extension for managing XML annotation of documents in the Alvis format

SYNOPSIS

use Alvis::NLPPlatform::Annotation;

Alvis::NLPPlatform::Annotation::load_xml($doc_xml);

Alvis::NLPPlatform::Annotation::render_xml($doc_xml, \*STDOUT);

DESCRIPTION

This module provides two main methods, load_xml and render_xml, for loading and dumping annotated XML documents that conform to the Alvis DTD (see http://www.alvis.info ).

Documents are read from standard input and loaded into a hash table. Annotated documents are written to a file through the descriptor given as a parameter. Note that input documents may be annotated, unannotated, or only partially annotated.

METHODS

read_key_id()

    read_key_id($element_id);

This method returns the number in the id ($element_id) of a token or word XML element (10 in the element id 'token10').
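The extraction described above can be illustrated with a minimal sketch. The helper name key_id_sketch is hypothetical and is not part of the module; it only assumes that the number sits at the end of the element id, as in 'token10'.

```perl
use strict;
use warnings;

# Hypothetical helper, not the module's own code: return the numeric
# part of an element id such as 'token10', or undef if there is none.
sub key_id_sketch {
    my ($element_id) = @_;
    return $element_id =~ /(\d+)$/ ? $1 + 0 : undef;
}

my $id = key_id_sketch('token10');
print "$id\n";    # prints 10
```

The same pattern applies to any element type prefix ("token", "word", etc.), since only the trailing digits are kept.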

sort_keys()

    sort_keys($element_id1, $element_id2);

This method sorts two XML element ids ($element_id1 and $element_id2) after removing the string that refers to the type of the XML element ("token", "word", etc.).

sort()

    sort($ref_hashtable)

This method sorts elements of the hash table ($ref_hashtable) according to the number in the id ($element_id) of the XML elements (10 in the element id 'token10').
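As an illustration of why the numeric part of the id matters, here is a minimal sketch of ordering hash keys by that number. The comparator by_key_id_sketch and the sample hash are hypothetical and do not reproduce the module's implementation; the sketch assumes every id is a non-digit prefix followed by digits.

```perl
use strict;
use warnings;

# Hypothetical comparator, not the module's own code: strip the
# non-digit prefix of each id and compare the remainders numerically.
sub by_key_id_sketch {
    my ($id1, $id2) = @_;
    (my $n1 = $id1) =~ s/^\D+//;
    (my $n2 = $id2) =~ s/^\D+//;
    return $n1 <=> $n2;
}

# Hypothetical data: element ids mapped to token text.
my %tokens = (token10 => 'platform', token2 => 'Alvis', token1 => 'The');

my @ordered = sort { by_key_id_sketch($a, $b) } keys %tokens;
print join(' ', @ordered), "\n";    # prints: token1 token2 token10
```

A plain string sort would place 'token10' before 'token2', which is why the type prefix is removed before comparing.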

render()

    render($doc_hash, $descriptor);

Writes the XML document annotation to the specified descriptor ($descriptor). The document is passed as a hashtable ($doc_hash) loaded by the method load_xml. This hashtable can be modified by NLP wrappers (Alvis::NLPPlatform::NLPWrappers).

The method returns 0 on success.

render_xml()

    render_xml($doc_hash, $descriptor, $printCollectionHeaderFooter);

Main method used for generating XML document annotations. $descriptor is the descriptor of the file where the document will be stored. $doc_hash is the hashtable containing the annotated document. $printCollectionHeaderFooter indicates whether the documentCollection header and footer have to be printed. $hash_config is a reference to the hashtable containing the variables defined in the configuration file.

The method returns 0 on success.

load_xml()

    load_xml($doc_xml);

Reads an input annotated XML document ($doc_xml) from STDIN. The loaded annotations are stored in a hashtable. This hashtable can be modified by NLP wrappers (Alvis::NLPPlatform::NLPWrappers).

The method returns 0 on success.

print_Annotation()

    print_Annotation($descriptor, $string);

This method prints annotations to the descriptor and ensures conformance (to UTF-8, for instance).
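The idea of printing to a descriptor while guaranteeing UTF-8 output can be sketched as follows. This is not the code of print_Annotation: the function print_utf8_sketch and the sample string are hypothetical, and the sketch assumes the input is a Perl character string that is encoded with the core Encode module just before writing.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical stand-in, not the module's own code: encode a character
# string as UTF-8 bytes and write it to the given descriptor.
sub print_utf8_sketch {
    my ($descriptor, $string) = @_;
    print {$descriptor} encode('UTF-8', $string);
    return 0;
}

# Write to an in-memory file handle so that the bytes can be inspected.
my $buffer = '';
open(my $fh, '>', \$buffer) or die "Cannot open in-memory handle: $!";
print_utf8_sketch($fh, "<form>caf\x{e9}</form>\n");
close($fh);
```

Any file handle can serve as the descriptor, including \*STDOUT as in the SYNOPSIS.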

SEE ALSO

Alvis::NLPPlatform

Alvis web site: http://www.alvis.info

AUTHORS

Thierry Hamon <thierry.hamon@lipn.univ-paris13.fr> and Julien Deriviere <julien.deriviere@lipn.univ-paris13.fr>

LICENSE

Copyright (C) 2005 by Thierry Hamon and Julien Deriviere

This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself, either Perl version 5.8.6 or, at your option, any later version of Perl 5 you may have available.