Convert::Transcribe - Perl extension for transcribing natural languages
    use Convert::Transcribe;

    $t = Convert::Transcribe->new();
    $t->fromfile('filename');
    # or
    $t = Convert::Transcribe->new();
    $t->fromstring("transcription def. containing newlines");
    # or
    $t = Convert::Transcribe->new('filename');
    # or
    $t = Convert::Transcribe->new("transcription def. containing newlines");

    $t->transcribe("text");

    $t->generated_code(); # for debugging
Transcriptions are transformations of a text from one alphabet into another in a way which feels natural to humans.
This module allows you to specify transcriptions in a notation which hopefully feels more natural than using Perl regexps.
Transcription files look as follows:
    # a comment
    a b > a        # 'a' -> 'b' if followed by 'a'
    a c > ! b      # 'a' -> 'c' if not followed by 'b'
    a d < b        # 'a' -> 'd' if text transcribed ends in 'b'
    a e < ! b      # 'a' -> 'e' if text transcribed doesn't end in 'b'
    a f < $ > $    # 'a' -> 'f' if followed by a word boundary and the
                   # text transcribed ends in a word boundary
    a g            # 'a' -> 'g' otherwise
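As a rough illustration of how these rules read, here is a minimal sketch in plain Perl that applies the six example rules by hand. It does not use Convert::Transcribe and is not the module's implementation. It assumes that rules are tried in the order written and the first match wins, that characters without a rule are copied unchanged, and that `$` stands for the start or end of the text or a non-word character. The names `@rules` and `transcribe_sketch` are hypothetical.

```perl
use strict;
use warnings;

# Each rule: [ source, target, test on the output so far, test on the remaining input ].
# An undef test means "no condition". All of this is an assumed reading of the notation.
my @rules = (
    [ 'a', 'b', undef,           qr/\Aa/         ],  # a b > a
    [ 'a', 'c', undef,           qr/\A(?!b)/     ],  # a c > ! b
    [ 'a', 'd', qr/b\z/,         undef           ],  # a d < b
    [ 'a', 'e', qr/(?<!b)\z/,    undef           ],  # a e < ! b
    [ 'a', 'f', qr/(?:\A|\W)\z/, qr/\A(?:\W|\z)/ ],  # a f < $ > $
    [ 'a', 'g', undef,           undef           ],  # a g
);

sub transcribe_sketch {
    my ($text) = @_;
    my $out = '';
    my $pos = 0;
    while ($pos < length $text) {
        my $rest    = substr $text, $pos;
        my $matched = 0;
        for my $rule (@rules) {
            my ($src, $dst, $left, $right) = @$rule;
            next unless index($rest, $src) == 0;        # source must start here
            my $after = substr $rest, length $src;      # input following the source
            next if defined $left  && $out   !~ $left;  # '<' condition on the output
            next if defined $right && $after !~ $right; # '>' condition on the input
            $out .= $dst;
            $pos += length $src;
            $matched = 1;
            last;                                       # first matching rule wins
        }
        next if $matched;
        $out .= substr $text, $pos, 1;                  # no rule applies: copy as is
        $pos++;
    }
    return $out;
}

print transcribe_sketch('bab'), "\n";  # prints "bdb"
```

Under the first-match assumption, the `< b` and `< ! b` rules between them cover every case, so the last two rules never fire for this particular rule set; the real module may order or combine rules differently.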
Transcription definitions can be loaded from text strings or from files.
The module converts your transcription definition into Perl code, which is then eval'ed when you call transcribe(). To inspect the generated code, call generated_code().
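The general technique of turning a rule table into Perl source and then eval'ing it can be sketched as follows. This is not the code that Convert::Transcribe generates; it is a simplified illustration that handles only unconditional rules, and the rule table, `generate_code`, and `$transcriber` are hypothetical names.

```perl
use strict;
use warnings;

# Hypothetical unconditional rules: [ source, target ].
my @rules = ( [ 'sch', 'sh' ], [ 'w', 'v' ] );

# Build the source text of an anonymous sub that applies the rules in order.
sub generate_code {
    my @rules = @_;
    my $code = <<'HEAD';
sub {
    my ($text) = @_;
    my $out = '';
    while (length $text) {
HEAD
    for my $rule (@rules) {
        # quotemeta keeps rule text from being read as regex or string syntax
        my ($src, $dst) = map { quotemeta($_) } @$rule;
        $code .= "        if (\$text =~ s/\\A$src//) { \$out .= \"$dst\"; next }\n";
    }
    $code .= <<'TAIL';
        $out .= substr($text, 0, 1, '');   # no rule matched: copy one character
    }
    return $out;
}
TAIL
    return $code;
}

my $source      = generate_code(@rules);
my $transcriber = eval $source;            # compile the generated source
die "generated code failed to compile: $@" unless ref $transcriber eq 'CODE';

print $source;                             # inspect what was generated
print $transcriber->('schwarz'), "\n";     # prints "shvarz"
```

Generating code once and reusing the compiled sub avoids re-parsing the rule table on every call, and printing the generated source, as generated_code() allows for the real module, makes it easier to see why a rule did or did not fire.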
For transliteration (i.e., one-to-one mapping) you might prefer Convert::Translit by Genji Schmeder (on CPAN).
There are probably a good number of bugs left. Please report them!
It would be nice to supply a good number of real-life transcription definitions with the module. Please contribute!
Thomas M. Widmann, <firstname.lastname@example.org>
Copyright 2002, 2003 by Thomas M. Widmann
This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.