Lingua::Align::Corpus - reading corpus data
Read corpus data in various formats. The default format is plain text with one sentence per line. For other types (parsed corpora, etc.), use the -type flag.
    use Lingua::Align::Corpus;

    # Plain text (default format): one sentence per line
    my $corpus = new Lingua::Align::Corpus(-file => $corpusfile);

    my @words = ();
    while ($corpus->next_sentence(\@words)) {
        print "\n", $corpus->current_id, "> ";
        print join(' ', @words);    # print the words of the current sentence
    }

    # Parsed corpus: select the input format with -type
    my $treebank = new Lingua::Align::Corpus(-file => $corpusfile,
                                             -type => 'TigerXML');

    my %tree = ();
    while ($treebank->next_sentence(\%tree)) {
        print $treebank->print_sentence(\%tree);
        print "\n";
    }
Joerg Tiedemann, <jorg.tiedemann@lingfil.uu.se>
Copyright (C) 2009 by Joerg Tiedemann
This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself, either Perl version 5.8.8 or, at your option, any later version of Perl 5 you may have available.