Changes for version v2.0.10_006 - 2015-09-25

  • renamed distribution to 'Moot' for CPAN-friendliness

Changes for version v2.0.10_005 - 2014-04-16

  • added moot-scan.perl : moot TokenReader debugging
  • updated to 2.0.10_005 to jibe with C++ package

Changes for version v2.0.10_004 - 2013-12-18

  • TokenReader.pm,TokenWriter.pm: moot/perl fixes
  • updated to 2.0.10_004 to stay in sync with trunk

Changes for version v2.0.10_003 - 2013-12-06

  • waste: improved handling of negative mode selectors (e.g. -N)
  • updated perl bindings
    • added Moot::Waste::Annotator class
    • updated Moot::TokPP to use Moot::Waste::Annotator

Changes for version v2.0.10_002 - 2013-12-02

  • fixed wasteScanner choking on long utf-8-encoded characters (e.g. U+1D1A3 : MUSICAL SYMBOL ORNAMENT STROKE-9 : \xf0\x9d\x86\xa3 in bach_versuch02_1762)
    • wasteScanner should now handle even non-utf8 more or less gracefully
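
The scanner fix above concerns 4-byte UTF-8 sequences and malformed input. The following is only an illustrative Python sketch of that general idea, not wasteScanner's actual code: the helper names `utf8_seq_len` and `scan_chars` are hypothetical, and the policy of passing invalid bytes through one at a time is an assumption about what "more or less gracefully" could mean.

```python
def utf8_seq_len(lead: int) -> int:
    """Expected UTF-8 sequence length for a lead byte, or 0 if it cannot start one."""
    if lead < 0x80:
        return 1
    if 0xC2 <= lead <= 0xDF:
        return 2
    if 0xE0 <= lead <= 0xEF:
        return 3
    if 0xF0 <= lead <= 0xF4:
        return 4          # "long" characters such as U+1D1A3
    return 0              # continuation byte or byte never valid in UTF-8


def scan_chars(data: bytes):
    """Split data into (chunk, is_valid) pairs, one pair per character.

    Bytes that do not form a valid UTF-8 character are emitted singly
    with is_valid=False instead of aborting the scan (assumed policy).
    """
    out = []
    i = 0
    while i < len(data):
        n = utf8_seq_len(data[i])
        chunk = data[i:i + n]
        valid = n > 0 and len(chunk) == n
        if valid:
            try:
                chunk.decode("utf-8")   # strict check: continuation bytes, overlongs, surrogates
            except UnicodeDecodeError:
                valid = False
        if valid:
            out.append((chunk, True))
            i += n
        else:
            out.append((data[i:i + 1], False))
            i += 1
    return out
```

For example, `scan_chars(b"a\xf0\x9d\x86\xa3b")` keeps the four bytes of U+1D1A3 together as one character, while a stray `\xff` is reported as a single invalid byte and scanning continues.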

Changes for version v2.0.10_001 - 2013-11-28

  • v2.0.10-1: workaround for probability underflow error propagation in mootHMM::tag_stream()
    • once underflowed, no more differentiation was made, since no nodes qualified as flushable until EOF
    • workaround flushes nodes whenever 'unsafe' probabilities (<-1e37) are encountered
  • encoding tweaks for Moot::TokPP::analyze_buffer()
  • tokpp improvements / fixes
  • fixed to jibe with kmw's wasteLexer changes
  • wasteTrainWriter: basically working, but links are being dropped (scanner bug)
  • waste training prototype in testme.perl
  • added Moot::TokPP, moot-tokpp.perl : drop-in replacement for dwds_tomasotath tokenizer-supplied pseudo-morphology
  • documented Waste::Lexer::dehyphenate()
  • make distcheck fixes
  • got Moot::Waste::Decoder working, including buffer-level access
  • added Waste::Decoder to perl
  • Waste::Lexer seems to be working
    • including get/set on underlying scanner, using lexer->tr_data to hold an SV
  • removed WasteLexerPerl class
    • was WIP for simultaneous support of both standalone and embedded wasteLexicon objects, now abandoned
  • Waste::Lexicon : now only accessible via Waste::Lexer
    • avoids ref-counting madness for embedded objects
  • added TokenReader, TokenWriter hierarchy wrappers
    • WIP on wasteLexer, wasteLexicon
  • wrapped wasteTokenScanner as Moot::Waste::Scanner
  • added scanner,lexer type constants (why? they're not actually _used_ ... we should probably remove them again)
  • wrapper uses PerlIO layer
  • TokenReader bugfixes (check for null tr_istream in from_filename())
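
The underflow workaround listed first for this release (flushing whenever log-probabilities below -1e37 are encountered) can be illustrated with a toy streaming Viterbi tagger. This is a minimal Python sketch under stated assumptions, not mootHMM::tag_stream() itself: the function name `viterbi_stream`, the dictionary-based model, and the choice to restart from an empty state after each flush are all hypothetical. Only the -1e37 threshold comes from the entry above.

```python
UNSAFE_LOGP = -1e37  # threshold quoted in the changelog entry


def viterbi_stream(emissions, trans, unsafe=UNSAFE_LOGP):
    """Toy first-order Viterbi over a token stream with forced flushing.

    emissions: iterable of {tag: log_prob} dicts, one per token.
    trans:     {(prev_tag, tag): log_prob}; missing pairs count as 0.0.
    Yields one list of tags per flushed segment.
    """
    scores = {}  # tag -> best log-prob of any path ending in tag
    paths = {}   # tag -> tag sequence of that best path
    for emit in emissions:
        new_scores, new_paths = {}, {}
        for tag, e in emit.items():
            if not scores:  # first token of a segment
                new_scores[tag] = e
                new_paths[tag] = [tag]
            else:
                prev = max(scores, key=lambda p: scores[p] + trans.get((p, tag), 0.0))
                new_scores[tag] = scores[prev] + trans.get((prev, tag), 0.0) + e
                new_paths[tag] = paths[prev] + [tag]
        scores, paths = new_scores, new_paths
        # Workaround: once any score is 'unsafe', emit the best path so far
        # and restart, rather than buffering undifferentiated nodes until EOF.
        if any(s < unsafe for s in scores.values()):
            best = max(scores, key=scores.get)
            yield paths[best]
            scores, paths = {}, {}
    if scores:  # final flush at end of stream
        best = max(scores, key=scores.get)
        yield paths[best]
```

In this sketch a flush discards transition context across the segment boundary, which trades some accuracy at that point for bounded buffering. Whether the real implementation makes the same trade is not stated in the changelog.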

Changes for version v2.0.9_002 - 2013-10-22

  • added re2c_ucl.py (re2c char-class generator)
  • added wasteScannerScan.* templates for waste generation
  • added moot(lookup|merge)-(lex|123).perl to MANIFEST
  • added mootlookup-lex.perl
  • fixes for weird DynaLoader bug on perl v5.14 / 32-bit i686 / debian wheezy if CCFLAGS is set in Makefile.PL
    • strangely, x86_64 machine was unaffected
    • bad: Linux plato 3.2.0-4-686-pae #1 SMP Debian 3.2.41-2 i686 GNU/Linux
  • added command-line utils mootmerge*.perl
  • updated version for 2.0.9-2

Changes for version v2.0.9_001 - 2012-03-19

  • updated perl wrappers

Modules

Perl interface to the libmoot part-of-speech tagging library
libmoot : constants
libmoot : HMM
libmoot : HMM : dynamic : lexical probabilities via Boltzmann distribution
libmoot : HMM : dynamic
libmoot : HMM : dynamic : lexical probabilities
libmoot : lexical frequencies
libmoot : n-gram frequencies
libmoot : heuristic token analyzer (pseudo-morphology, wrapper for Moot::Waste::Annotator)
libmoot : Token I/O
libmoot : Token I/O : reader
libmoot : Token I/O : reader : native 1 word/line format
libmoot : Token I/O : reader : built-in XML format
libmoot : Token I/O : writer
libmoot : Token I/O : writer : native 1 word/line format
libmoot : Token I/O : writer : built-in XML format
libmoot : WASTE tokenization system
libmoot : WASTE tokenizer : pattern-based token annotator
libmoot : WASTE tokenizer : post-Viterbi decoder
libmoot : WASTE tokenizer : mid-level lexer
libmoot : WASTE tokenizer : simple word-list lexicon
libmoot : WASTE tokenizer : low-level scanner