Module Version: 0.04

NAME

OpenOffice::Wordlist - Read/write OpenOffice.org wordlists

SYNOPSIS

This module allows reading and writing of OpenOffice.org wordlists (dictionaries).

For example:

  use OpenOffice::Wordlist;

  my $dict = OpenOffice::Wordlist->new;
  $dict->read(".openoffice.org/3/user/wordlist/standard.dic");

  # Print all words.
  foreach my $word ( @{ $dict->words } ) {
      print $word, "\n";
  }

  # Add some words.
  $dict->append( "openoffice", "great" );

  # Write a new dictionary.
  $dict->write("new.dic");

When used as a program, this module reads all dictionaries given on the command line and writes the resulting list of words to standard output. For example:

  $ perl OpenOffice/Wordlist.pm standard.dic

METHODS

$dict = new( [ type => 'WBSWG6', language => 2057, neg => 0 ] )

Creates a new dictionary object.

Optional arguments:

type => 'WBSWG6' or 'WBSWG2' or 'WBSWG5'.

'WBSWG6' (default) indicates a UTF-8 encoded dictionary; the others indicate an ISO-8859-1 encoded dictionary.

language => code

The code for the language. I assume there's an extensive list of these codes somewhere. Some values determined experimentally:

    255   All
   1031   German (Germany)
   1033   English USA
   1036   French (France)
   1043   Dutch (Netherlands)
   2057   English UK

neg => 0 or 1

Whether the dictionary contains exceptions (neg = 1) or regular words (neg = 0).

If language and neg are not specified they are taken from the first file read, if any.
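As a minimal sketch of how the constructor options fit together, the following builds an argument list from a small lookup table. The hash %language_code, its locale-style keys, and the helper wordlist_args are illustrative names that are not part of OpenOffice::Wordlist; only the numeric codes and the option names come from the description above.

```perl
use strict;
use warnings;

# A few of the language codes listed above. The locale-style keys are
# an assumption made for this sketch, not something the module defines.
my %language_code = (
    'all'   => 255,
    'de-DE' => 1031,
    'fr-FR' => 1036,
    'nl-NL' => 1043,
);

# Build the key/value list for OpenOffice::Wordlist->new.
# wordlist_args is a hypothetical helper.
sub wordlist_args {
    my ( $locale, %opt ) = @_;
    my $code = $language_code{$locale}
        // die "Unknown locale '$locale'\n";
    return (
        type     => $opt{type} // 'WBSWG6',    # UTF-8 type by default
        language => $code,
        neg      => $opt{neg} ? 1 : 0,         # 1 = exceptions dictionary
    );
}

my %args = wordlist_args( 'nl-NL', neg => 1 );
print join( ", ", map { "$_ => $args{$_}" } sort keys %args ), "\n";
# prints: language => 1043, neg => 1, type => WBSWG6
```

The resulting list could then be passed straight to the constructor, for example OpenOffice::Wordlist->new( wordlist_args('nl-NL') ).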

$dict->read( $file )

Reads the contents of the indicated file.

$dict->append( @words )

Append a list of words to the dictionary. To avoid unpleasant surprises, the words must be encoded in Perl's internal encoding.

The arguments may be constant strings or references to lists of strings.
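The encoding requirement can be illustrated with the core Encode module. This sketch assumes the words arrive as raw UTF-8 bytes, for instance from a file opened without a decoding layer, and turns them into Perl character strings first; the sample words are made up for the illustration.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Two words as raw UTF-8 bytes, as they might arrive from a file or
# from the command line without any decoding layer.
my @raw = ( "caf\xC3\xA9", "na\xC3\xAFve" );

# Decode the bytes into Perl character strings.
my @words = map { decode( 'UTF-8', $_ ) } @raw;

# Before decoding, length() counts bytes; afterwards it counts characters.
printf "%d bytes -> %d characters\n", length( $raw[0] ), length( $words[0] );
# prints: 5 bytes -> 4 characters
```

The decoded list can then be handed over together with plain strings, for example $dict->append( "openoffice", \@words ), since both forms are accepted.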

$dict->words

Returns a reference to the list of words in the dictionary.

The words are encoded in Perl's internal encoding.
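Because the words come back as character strings, they need an encoding layer when printed. The sketch below sorts and de-duplicates a list reference and prints it as UTF-8; unique_sorted is a hypothetical helper, and the literal list stands in for the reference that $dict->words would return.

```perl
use strict;
use warnings;

# Return the distinct words of a list reference in sorted order.
# unique_sorted is a hypothetical helper, not part of the module.
sub unique_sorted {
    my ($words) = @_;
    my %seen;
    my @unique = grep { !$seen{$_}++ } @$words;
    return sort @unique;
}

# Stand-in for the reference returned by $dict->words.
my $words = [ "zebra", "apple", "zebra", "mango" ];

# Encode characters as UTF-8 on output.
binmode( STDOUT, ':encoding(UTF-8)' );
print "$_\n" for unique_sorted($words);
```

With a real dictionary the call would be unique_sorted( $dict->words ).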

$dict->write( $file [ , $type ] )

Writes the contents of the object to a new dictionary.

Arguments: the name of the file to be written and, optionally, the type of the file to be written (one of 'WBSWG6', 'WBSWG5', 'WBSWG2'). The type argument overrides the type of the dictionary as established at create time.
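Since 'WBSWG5' and 'WBSWG2' denote ISO-8859-1 encoded files, it may be worth checking that the words fit that character set before overriding the type. How the module treats characters outside ISO-8859-1 is not stated here, so the following is only a precaution sketched under that assumption; fits_latin1 is a hypothetical helper.

```perl
use strict;
use warnings;

# True if every character of $word lies in the ISO-8859-1 range
# (U+0000..U+00FF). fits_latin1 is a hypothetical helper, not part of
# OpenOffice::Wordlist.
sub fits_latin1 {
    my ($word) = @_;
    return $word !~ /[^\x{00}-\x{FF}]/ ? 1 : 0;
}

# "cafe" with e-acute fits; the second word contains U+0142 and U+017A,
# which have no ISO-8859-1 representation.
my @candidates = ( "caf\x{E9}", "\x{142}\x{F3}d\x{17A}" );
my @safe = grep { fits_latin1($_) } @candidates;

printf "%d of %d words fit ISO-8859-1\n", scalar @safe, scalar @candidates;
# prints: 1 of 2 words fit ISO-8859-1
```

If every word passes, a call such as $dict->write( "legacy.dic", 'WBSWG5' ) should be able to represent all of them.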

EXAMPLE

This example reads all dictionaries that are supplied on the command line, merges them, and writes a new dictionary.

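  use OpenOffice::Wordlist;

  # The first file supplies the language and exception settings.
  my $dict = OpenOffice::Wordlist->new( type => 'WBSWG6' );
  $dict->read( shift );

  # Append the words of every remaining file.
  foreach ( @ARGV ) {
    my $extra = OpenOffice::Wordlist->new->read($_);
    $dict->append( $extra->words );
  }

  $dict->write("new.dic");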

Settings like the language and exceptions are copied from the file that is initially read.

AUTHOR

Johan Vromans, <jv at cpan.org>

BUGS

There is currently no checking of dictionary type arguments.

Please report any bugs or feature requests to bug-openoffice-wordlist at rt.cpan.org, or through the web interface at http://rt.cpan.org/NoAuth/ReportBug.html?Queue=OpenOffice-Wordlist. I will be notified, and then you'll automatically be notified of progress on your bug as I make changes.

SUPPORT

You can find documentation for this module with the perldoc command.

    perldoc OpenOffice::Wordlist

You can also look for information at:

ACKNOWLEDGEMENTS

COPYRIGHT & LICENSE

Copyright 2010 Johan Vromans, all rights reserved.

This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.
