NAME

Lingua::LO::NLP::Romanize - Romanize Lao syllables

FUNCTION

This is a factory class for Lingua::LO::NLP::Romanize::*. The following romanization modules are currently available:

Lingua::LO::NLP::Romanize::PCGN: for the standard set by the Permanent Committee on Geographical Names for British Official Use
Lingua::LO::NLP::Romanize::IPA: for the International Phonetic Alphabet

SYNOPSIS

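    use Lingua::LO::NLP::Romanize;

    my $o = Lingua::LO::NLP::Romanize->new(
        variant => 'PCGN',
        hyphen  => 1,
    );
    my $romanized = $o->romanize( $text );    # $text: a string containing Lao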

METHODS

new

The constructor takes any number of hash-style named arguments. The following ones are always recognized:

variant: Standard according to which to romanize; this determines the Lingua::LO::NLP::Romanize subclass to actually instantiate. This argument is mandatory.
hyphen: Separator placed between the syllables in a run of Lao syllables. Set this to the character you would like to use: usually this will be the ASCII "hyphen-minus" (U+002D), but it can be the unambiguous Unicode hyphen ("‐", U+2010), a slash or anything else you like, except for the special-cased '0' and '1' (which you wouldn't want between your syllables anyway). As a special case, you can pass a 1 to use the ASCII version. If this argument is missing, undef or 0, blanks are used. Syllables duplicated using "ໆ" are always joined with a hyphen: either the one you specify or the ASCII one.
normalize: Run text through tone mark order normalization; see "normalize_tone_marks" in Lingua::LO::NLP::Data. If your text looks fine but syllables are not recognized, you may need this.

Subclasses may accept additional arguments, such as the tone argument of the IPA subclass, which controls the rendering of IPA diacritics for tonal languages.
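The rules for the hyphen argument can be summarized in a few lines of Perl. The following is only a sketch of the documented behaviour, not the module's actual code; resolve_separator and resolve_repeat_joiner are hypothetical helper names introduced for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Lingua::LO::NLP: map the documented
# values of the "hyphen" argument to the separator placed between syllables.
sub resolve_separator {
    my ($hyphen) = @_;
    return ' ' if !defined($hyphen) || $hyphen eq '0';    # missing, undef or 0: blanks
    return '-' if $hyphen eq '1';                          # special case: ASCII hyphen-minus
    return $hyphen;                                        # anything else is used as given
}

# Hypothetical helper: syllables repeated with the Lao repetition sign are
# always hyphenated, using the caller's hyphen or the ASCII one as fallback.
sub resolve_repeat_joiner {
    my ($hyphen) = @_;
    return '-' if !defined($hyphen) || $hyphen eq '0' || $hyphen eq '1';
    return $hyphen;
}

print join( resolve_separator(1), qw(sa bai di) ), "\n";    # prints "sa-bai-di"
```

Treating '0' and '1' as reserved values is what makes it impossible to use those two characters as separators.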

romanize

    romanize( $text )

Return the romanization of $text according to the standard passed to the constructor. Text is split up by "get_fragments" in Lingua::LO::NLP::Syllabify. Lao syllables are romanized; everything else is passed through unchanged, save for possible conversion of combining characters to a canonically equivalent form by "NFC" in Unicode::Normalize.
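The pass-through behaviour can be illustrated without the module itself. The sketch below is a simplified stand-in: it assumes that splitting on runs of Lao-script characters is close enough for illustration, whereas the real "get_fragments" works on syllables. map_lao_runs is a hypothetical name, and the callback takes the place of the actual romanization.

```perl
use strict;
use warnings;
use Unicode::Normalize qw(NFC);

# Hypothetical helper, not part of Lingua::LO::NLP: apply $callback to each
# run of Lao-script characters and pass everything else through NFC.
sub map_lao_runs {
    my ( $text, $callback ) = @_;
    my $out = '';
    # The capturing group makes split() return the Lao runs as well.
    for my $fragment ( split /(\p{Lao}+)/, $text ) {
        next if $fragment eq '';    # split() yields an empty lead when text starts with Lao
        $out .= $fragment =~ /\A\p{Lao}+\z/
            ? $callback->($fragment)
            : NFC($fragment);
    }
    return $out;
}

# Mark the Lao parts of a mixed string; the Latin parts are only normalized.
my $marked = map_lao_runs( "Vientiane: \x{0EA5}\x{0EB2}\x{0EA7}", sub { "[$_[0]]" } );
```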

romanize_syllable

    romanize_syllable( $syllable | $analysis )

Return the romanization of a single $syllable according to the standard passed to the constructor. This method accepts either a plain string or an analysis result from Lingua::LO::NLP::Analyze. The latter helps avoid redundant parsing if you need both an analysis and a romanization.
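One common way to accept either a plain string or a pre-computed analysis is to check whether the argument is already an object of the expected class. The following sketch assumes such a check; My::Analysis and to_analysis are hypothetical names standing in for Lingua::LO::NLP::Analyze and for whatever the module does internally.

```perl
use strict;
use warnings;
use Scalar::Util qw(blessed);

# Hypothetical stand-in for an analysis class; the real one parses the syllable.
package My::Analysis {
    sub new {
        my ( $class, $syllable ) = @_;
        return bless { syllable => $syllable }, $class;
    }
    sub syllable { $_[0]{syllable} }
}

# Hypothetical dispatcher: reuse an existing analysis object, otherwise
# analyze the plain string first.
sub to_analysis {
    my ($arg) = @_;
    return $arg if blessed($arg) && $arg->isa('My::Analysis');
    return My::Analysis->new($arg);
}

my $analysis = My::Analysis->new('x');
my $same     = to_analysis($analysis);    # returned as is, no second parse
my $fresh    = to_analysis('y');          # plain string gets analyzed
```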

_romanize_syllable

    _romanize_syllable( $analysis )

Return the romanization of a syllable passed in as a 'Lingua::LO::NLP::Analyze' result, according to the standard passed to the constructor. This is a virtual method that must be implemented by subclasses.
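The interplay between the factory constructor and the virtual method can be sketched as follows. This is an illustration under assumptions, not the module's source: the My::Romanize packages are hypothetical, the subclass lives in the same file instead of being loaded on demand, and the "romanization" is just an uppercase conversion.

```perl
use strict;
use warnings;

package My::Romanize {
    use Carp qw(croak);

    # Factory constructor: "variant" selects the subclass to instantiate.
    sub new {
        my ( $class, %args ) = @_;
        my $variant = delete $args{variant}
            or croak "'variant' argument is mandatory";
        my $subclass = "${class}::$variant";
        croak "unknown variant '$variant'"
            unless $subclass->can('_romanize_syllable');
        return bless {%args}, $subclass;
    }

    # Public entry point; delegates to the subclass implementation.
    sub romanize_syllable {
        my ( $self, $analysis ) = @_;
        return $self->_romanize_syllable($analysis);
    }

    # "Virtual" method: subclasses are expected to override this.
    sub _romanize_syllable {
        croak ref(shift) . ' must implement _romanize_syllable';
    }
}

package My::Romanize::Upper {
    use parent -norequire, 'My::Romanize';

    # Toy implementation standing in for a real romanization standard.
    sub _romanize_syllable {
        my ( $self, $analysis ) = @_;
        return uc $analysis;
    }
}

my $r = My::Romanize->new( variant => 'Upper', hyphen => 1 );
print ref($r), "\n";    # prints "My::Romanize::Upper"
```

Callers only ever name the base class; which subclass they get back is decided by the variant argument.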

hyphen

  my $hyphen = $o->hyphen;
  $o->hyphen( '-' );    # Use ASCII hyphen
  $o->hyphen( 1 );      # Ditto
  $o->hyphen( 0 );      # No hyphenation, separate syllables with spaces
  $o->hyphen( '‐' );    # Unicode hyphen U+2010

Accessor for the hyphen attribute; see "new".

normalize

  my $normalization = $o->normalize;
  $o->normalize( $bool );

Accessor for the normalize attribute; see "new".
