jspell-dist - RFC for jspell dictionary packages
The Lingua::Jspell binary format (also known as the hash file) is architecture dependent: it differs between 32-bit and 64-bit architectures, and between little-endian and big-endian ones. This makes releasing a binary for each dictionary on each architecture unmanageable.
Also, given that some language dictionaries (namely, the Portuguese dictionary) require development tools (bison, flex and gcc), distributing the bootstrap files would also be very complicated.
Therefore, this RFC defines a middle-ground structure, for which only a full Lingua::Jspell installation is needed (together with Perl and some default Lingua::Jspell dependencies).
The files that are usually installed for each dictionary are: the affix file, the hash file, the irregular file (if it exists) and the meta (yaml) file. All of these except the hash file are text documents, meaning they are architecture independent.
Regarding the hash file, it is language dependent but can be built with jbuild, which is delivered with Lingua::Jspell.
jbuild requires the affix file (already mentioned above) and the dictionary file.
The dictionary file is also a textual document. Although jbuild works with just one dictionary file, some dictionaries are split into several files, which makes the dictionary easier to manage.
Instead of requiring a single dictionary file, jspell dictionary packages can contain more than one dictionary file; these are concatenated together during the build phase.
The suggested structure for a jspell dictionary package is:
A MANIFEST file, just like the manifest files of Perl modules. It will be used to check for package completeness.
A meta-data file, which is a yaml file. The name can be anything, provided that it has the yaml file extension. Note that there should be only one yaml file in the distribution package.
This file should include the list of names for the language. The first element of the list will be the official dictionary name, used when renaming the package files. If the list has more elements, the system will try to link the other language names during installation.
The affix file, which should have the .aff extension. Again, there should be only one affix file in the distribution package.
Some languages might include irregular verbs (or other irregular forms). They normally result in one or more files with the .irr extension. They will be concatenated together into a single .irr file, with filenames sorted alphabetically.
All the files with the .dic extension are assumed to be dictionary files. They will be concatenated together in alphabetical order of their filenames. This means that if there are any macros that must be declared early, they should be placed in a file whose name sorts before every other file.
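The alphabetical concatenation described above can be sketched in Python as follows. This is only an illustration, not the code Lingua::Jspell uses: the function name `concatenate_sorted` and its parameters are hypothetical, and Python's default sort compares code points, so uppercase names sort before lowercase ones.

```python
from pathlib import Path


def concatenate_sorted(package_dir, extension, target):
    """Concatenate every file ending in `extension` into `target`.

    Hypothetical helper: files are taken in sorted filename order,
    so a part that must come first needs a name that sorts first.
    Returns the filenames in the order they were written.
    """
    root = Path(package_dir)
    target = Path(target)
    # Collect the parts before opening the target, and skip the target
    # itself in case it lives in the same directory.
    parts = sorted(
        (p for p in root.glob(f"*{extension}") if p.resolve() != target.resolve()),
        key=lambda p: p.name,
    )
    with target.open("wb") as out:
        for part in parts:
            data = part.read_bytes()
            out.write(data)
            # Keep the last line of one part from fusing with the
            # first line of the next one.
            if data and not data.endswith(b"\n"):
                out.write(b"\n")
    return [p.name for p in parts]
```

With this scheme, naming the parts with numeric prefixes (for instance `00-macros.dic` and `10-words.dic`, both made-up names) is one simple way to control the order.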
The package installation process will follow these steps:
1. The MANIFEST file is read and the package content files are tested. If any file is missing, the installation process will fail.
2. The yaml file is read. If no yaml file is present, the system will issue a warning, but will continue, trying to use the name of the affix file as the language name.
3. The dictionary files (.dic) are sorted, and the files are concatenated together. The result will be placed in a file with the language name, followed by the .dic extension.
4. The same is done for the .irr files: sorting the files, concatenating them, and putting the result in a file with the language name and the .irr extension.
5. jbuild is run with the .aff file and the .dic file created in step 3.
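The checks made before the build can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the actual installer: `missing_from_manifest` and `plan_installation` are hypothetical names, the list of language names is passed in as `names` rather than parsed from the yaml file, and the jbuild invocation itself is left out because its command line is not described here. The manifest is assumed to follow the usual Perl layout, with one filename per line optionally followed by whitespace and a comment.

```python
import warnings
from pathlib import Path


def missing_from_manifest(package_dir):
    """Return the MANIFEST entries that do not exist in the package."""
    root = Path(package_dir)
    missing = []
    for line in (root / "MANIFEST").read_text(encoding="utf-8").splitlines():
        fields = line.split()
        # Skip blank lines and comment lines; only the first field
        # is taken as the filename.
        if not fields or fields[0].startswith("#"):
            continue
        if not (root / fields[0]).exists():
            missing.append(fields[0])
    return missing


def plan_installation(package_dir, names=None):
    """Decide the language name and the files to build (hypothetical)."""
    root = Path(package_dir)
    missing = missing_from_manifest(root)
    if missing:
        raise FileNotFoundError("missing package files: " + ", ".join(missing))

    affixes = sorted(p.name for p in root.glob("*.aff"))
    if len(affixes) != 1:
        raise ValueError(f"expected exactly one .aff file, found {len(affixes)}")

    if names:
        # The first name is the official one; the rest become links.
        language, aliases = names[0], list(names[1:])
    else:
        warnings.warn("no language names given; using the affix file name")
        language, aliases = Path(affixes[0]).stem, []

    return {
        "language": language,
        "aliases": aliases,
        "affix": affixes[0],
        "dic_parts": sorted(p.name for p in root.glob("*.dic")),
        "dic_target": f"{language}.dic",
        "irr_parts": sorted(p.name for p in root.glob("*.irr")),
        "irr_target": f"{language}.irr",
    }
```

Returning a plan instead of acting on it keeps the sketch free of side effects; a real installer would go on to concatenate the parts into the target files and then call jbuild.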
Alberto Manuel Brandão Simões, <email@example.com>
Copyright (C) 2010 by Alberto Manuel Brandão Simões