
NAME

Lingua::Phonology::Syllable

SYNOPSIS

        use Lingua::Phonology;
        use Lingua::Phonology::Syllable;

        # Create the parent Lingua::Phonology object and a new Syllable object
        $phono = Lingua::Phonology->new;
        $syll = Lingua::Phonology::Syllable->new;

        # Create an input word
        @word = $phono->symbol->segment('t','a','k','r','o','t');

        # Allow onset clusters and simple codas
        $syll->set_complex_onset;
        $syll->set_coda;

        # Syllabify the word
        $syll->syllabify(@word);

        # @word now has features set to indicate a syllabification of
        # <ta><krot>

DESCRIPTION

Syllabifies an input word of Lingua::Phonology::Segment objects according to a set of parameters. The parameters used are well-known linguistic parameters, so most kinds of syllabification can be handled in just a few lines of code by setting the appropriate values.

This module uses a special set of features to indicate syllabification. These features are added to the feature set of the input segments. The features added are arranged in a hierarchy as follows:

        SYLL           scalar     Non-zero if the segment has been syllabified
         |-onset       privative  True if the segment is part of the onset
         |-Rime        privative  True if the segment is part of the Rime (i.e. nucleus or coda)
            |-nucleus  privative  True if the segment is the nucleus
            |-coda     privative  True if the segment is part of the coda
        SON            scalar     An integer indicating the calculated sonority of the segment

The module will set these features so that subsequent processing by Lingua::Phonology::Rules will correctly split the word up into domains or tiers on these features.

The algorithm and parameters used to syllabify an input word are described in the "ALGORITHM" and "PARAMETERS" sections.

METHODS

This section lists the methods not associated with any particular parameter. The items in the "PARAMETERS" section also have methods associated with them.

new

    $syll = Lingua::Phonology::Syllable->new();

Returns a new Lingua::Phonology::Syllable object. Takes no arguments.

syllabify

    $syll->syllabify(@word);

Syllabifies an input word. The arguments to syllabify() should be a list of Lingua::Phonology::Segment objects. Those segments will be set to have the feature values named above (SYLL, Rime, onset, nucleus, coda), according to the current syllabification parameters.

Note that if you're using this method as part of a Lingua::Phonology::Rules rule, then the following is almost certainly wrong:

        # Assume that we have a Rules object $rules and Syllable object $syll already
        $rules->add_rule(
                Syllabify => {
                        do => sub { $syll->syllabify(@_) }
                }
        );

The preceding rule will needlessly resyllabify the word once for every segment in the input word. This can be avoided with a simple addition.

        $rules->add_rule(
                Syllabify => {
                        direction => 'rightward',
                        where => sub { $_[-1]->BOUNDARY },
                        do => sub { $syll->syllabify(@_) }
                }
        );

This rule checks whether the current segment is the first segment in the word (that is, whether the segment before it is a boundary), and syllabifies only in that case. Syllabification therefore happens only once each time you apply the rule.

count_syll

    $sylls = $syll->count_syll;

This is a simple data-collection method that takes no arguments. It returns the number of syllables created in the most recent call to syllabify.

count_unparsed

    $unparsed = $syll->count_unparsed;

This is another data-collection method that takes no arguments. It returns the number of segments that were left unparsed in the most recent call to syllabify.
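
As a rough picture of what these two counts mean, the sketch below tallies them over a plain list of roles. The list and the counting logic are assumptions made for illustration (one syllable per nucleus, one unparsed segment per missing role); they are not the module's internals.

```perl
use strict;
use warnings;

# Hypothetical result for /takrot/ with complex onsets but no codas:
# <ta><kro>t, where the final /t/ stays unparsed (undef role).
my @roles = ('onset', 'nucleus', 'onset', 'onset', 'nucleus', undef);

# Assumed model: one syllable per nucleus, and a segment is unparsed
# when it has no role at all. grep in scalar context returns a count.
my $syllables = grep { defined($_) && $_ eq 'nucleus' } @roles;
my $unparsed  = grep { !defined($_) } @roles;

print "syllables: $syllables, unparsed: $unparsed\n";
```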

sonority

    $sonority = $syll->sonority($segment);

Takes a single Lingua::Phonology::Segment object as its argument, and returns an integer indicating the current calculated sonority of the segment. The integer returned depends on the current value of the sonorous property. See "sonorous" for more information.

ALGORITHM

Syllabification algorithms are well-established in linguistic literature; this module merely implements the general view. Syllabification proceeds in several steps, the maximum expression of which is given below. Lingua::Phonology::Syllable may optimize away some of these steps if the current parameter settings warrant.

Clearing and calculating sonority

At the beginning of any syllabification, the existing syllabification for a segment is cleared if that segment meets the conditions in the clear_seg parameter. The sonority for all segments is also calculated according to the properties of the sonorous parameter.

Core syllabification

In this step, basic CV syllables are formed. Nuclei are assigned to segments that are of equal or greater sonority than both adjacent segments, and which are at least as sonorous as the minimum nucleus sonority (min_nucl_son). The segments to the left of nuclei are assigned as onsets if onsets are allowed (defined by onset), they are not more sonorous than the maximum edge sonority (max_edge_son), and they have not already been assigned as nuclei.
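
The nucleus test can be sketched over a bare list of sonority values. This is only an illustration under stated assumptions: find_nuclei is a hypothetical helper, the sonority values for /takrot/ are taken from the default classes listed under "sonorous", and the sketch ignores direction and any existing syllabification.

```perl
use strict;
use warnings;

# Sonority values for /takrot/ under the default classes:
# stops 0, liquids 2, non-high vocoids 4.
my @son = (0, 4, 0, 2, 4, 0);
my $min_nucl_son = 3;    # default value of min_nucl_son

# Hypothetical helper: indices of segments that are at least as sonorous
# as both neighbours and that reach the minimum nucleus sonority.
sub find_nuclei {
    my ($min, @s) = @_;
    my @nuclei;
    for my $i (0 .. $#s) {
        next if $s[$i] < $min;
        next if $i > 0   && $s[$i - 1] > $s[$i];
        next if $i < $#s && $s[$i + 1] > $s[$i];
        push @nuclei, $i;
    }
    return @nuclei;
}

my @nuclei = find_nuclei($min_nucl_son, @son);
print "nuclei at positions: @nuclei\n";
```

Under these assumptions only /a/ and /o/ qualify, since every other segment falls below the minimum nucleus sonority.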

Complex onset formation

Complex onsets are formed if they are allowed (defined by complex_onset). As many segments as possible are taken into the onset of the existing syllables, provided that they do not violate the minimum sonority distance in the onset (onset_son_dist) and do not exceed the maximum edge sonority.
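
A minimal sketch of this step follows, using plain hashes in place of Segment objects. The extend_onsets helper and the role key are hypothetical, and the sketch assumes that "sonority distance" means sonority must rise toward the nucleus by at least the given amount.

```perl
use strict;
use warnings;

# /takrot/ after core syllabification: nuclei and single-segment onsets are
# assigned; /k/ and the final /t/ are still unparsed (no role).
my @word = (
    { sym => 't', son => 0, role => 'onset'   },
    { sym => 'a', son => 4, role => 'nucleus' },
    { sym => 'k', son => 0 },
    { sym => 'r', son => 2, role => 'onset'   },
    { sym => 'o', son => 4, role => 'nucleus' },
    { sym => 't', son => 0 },
);

# Hypothetical helper: extend every onset leftward while the next segment is
# unparsed, within max_edge_son, and at least $dist less sonorous than the
# current left edge of the onset. Walking right to left makes it greedy,
# because a newly added onset segment is visited on a later iteration.
sub extend_onsets {
    my ($dist, $max_edge, @w) = @_;
    for my $i (reverse 1 .. $#w) {
        next unless ($w[$i]{role} // '') eq 'onset';
        my $prev = $w[$i - 1];
        next if $prev->{role};
        next if $prev->{son} > $max_edge;
        next if $w[$i]{son} - $prev->{son} < $dist;
        $prev->{role} = 'onset';
    }
    return @w;
}

# With a distance of 3, /kr/ (0 then 2) is too flat to form an onset.
extend_onsets(3, 100, @word);
my $with_dist_3 = $word[2]{role} // 'unparsed';

# With the default distance of 1, /k/ joins the onset of the second syllable.
extend_onsets(1, 100, @word);
my $with_dist_1 = $word[2]{role} // 'unparsed';

print "distance 3: k is $with_dist_3; distance 1: k is $with_dist_1\n";
```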

Coda formation

Codas are formed if they are allowed (defined by coda). A segment to the right of a nucleus will be assigned to a coda if it has not already been syllabified as an onset, is less sonorous than the maximum edge sonority, and is at least as sonorous as the minimum coda sonority (min_coda_son).
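
The coda conditions can be read as a single predicate on the segment that immediately follows a nucleus. The sketch below is an assumption-laden illustration: can_be_coda is a hypothetical helper over plain hashes, and it treats the edge-sonority condition as "not more sonorous than max_edge_son".

```perl
use strict;
use warnings;

# Hypothetical predicate for the coda test: the candidate follows a nucleus,
# is still unparsed, stays within max_edge_son, and reaches min_coda_son.
sub can_be_coda {
    my ($prev, $cand, %p) = @_;
    return 0 unless ($prev->{role} // '') eq 'nucleus';
    return 0 if $cand->{role};
    return 0 if $cand->{son} > $p{max_edge_son};
    return 0 if $cand->{son} < $p{min_coda_son};
    return 1;
}

my $o = { sym => 'o', son => 4, role => 'nucleus' };
my $t = { sym => 't', son => 0 };    # stop
my $r = { sym => 'r', son => 2 };    # liquid

# Default min_coda_son of 0: anything may be a coda.
my $any    = can_be_coda($o, $t, max_edge_son => 100, min_coda_son => 0);
# min_coda_son of 2: stops are rejected, liquids are accepted.
my $stop   = can_be_coda($o, $t, max_edge_son => 100, min_coda_son => 2);
my $liquid = can_be_coda($o, $r, max_edge_son => 100, min_coda_son => 2);

print "default: $any, stop with min 2: $stop, liquid with min 2: $liquid\n";
```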

Complex coda formation

Complex codas are formed if they are allowed (defined by complex_coda). As many segments as possible are taken into the coda, so long as they do not violate the minimum sonority distance and meet the same conditions imposed on regular codas.

Beginning adjunction

Segments at the very beginning of a word may be added to the initial syllable if special conditions apply. As many segments as possible will be added to the onset of the initial syllable if there are no syllabified segments between them and the left edge of the word, and if they meet the conditions imposed by the begin_adjoin parameter.

End adjunction

Segments at the very end of a word may be added to the coda of a final syllable under similar conditions. As many segments as possible will be added to the final syllable if for each of them there are no syllabified segments between them and the right edge of the word, and if they meet the conditions imposed in the end_adjoin parameter.
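
End adjunction can be pictured as a walk inward from the right edge of the word. The sketch below is a simplified model, not the module's code: adjoin_at_end is a hypothetical helper, the segments are plain hashes, and the condition receives a single segment rather than the argument list a Lingua::Phonology::Rules condition would see.

```perl
use strict;
use warnings;

# /taks/ with simple codas only: the final /s/ is left unparsed.
my @word = (
    { sym => 't', role => 'onset'   },
    { sym => 'a', role => 'nucleus' },
    { sym => 'k', role => 'coda'    },
    { sym => 's' },
);

# Hypothetical helper: walk in from the right edge, adjoining unparsed
# segments that satisfy the condition, and stop at the first segment that
# is already syllabified or that fails the condition.
sub adjoin_at_end {
    my ($condition, @w) = @_;
    my $adjoined = 0;
    for my $seg (reverse @w) {
        last if $seg->{role};
        last unless $condition->($seg);
        $seg->{role} = 'coda';
        $adjoined++;
    }
    return $adjoined;
}

# A condition that is never true, like the default sub {0}.
my $none = adjoin_at_end(sub { 0 }, @word);
# A condition that admits only /s/.
my $one  = adjoin_at_end(sub { $_[0]{sym} eq 's' }, @word);

print "adjoined with default: $none; adjoined with /s/ condition: $one\n";
```

Beginning adjunction would mirror this from the left edge, adding to the onset of the initial syllable instead.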

PARAMETERS

These parameters are used to determine the behavior of the syllabification algorithm. They are all accessible with a variety of get/set methods. The significance of the parameters and the methods used to access them are described below.

onset

Boolean, default true.

    # Return the current setting
    $syll->onset;

    # Allow onsets
    $syll->onset(1);
    $syll->set_onset;

    # Disallow onsets
    $syll->onset(0);
    $syll->no_onset;

If this parameter is true, onsets are allowed. When nuclei are formed, the segment preceding the nucleus will be taken as the onset of the syllable if other parameters allow. Note that pretty much all languages allow onsets.

complex_onset

Boolean, default false.

        # Return the current setting
        $syll->complex_onset;

        # Allow complex onsets
        $syll->complex_onset(1);
        $syll->set_complex_onset;

        # Disallow complex onsets
        $syll->complex_onset(0);
        $syll->no_complex_onset;

If this parameter is true, then complex onsets are allowed. The syllabification algorithm will greedily take as many segments as possible into the onset, provided that minimum sonority distance and maximum edge sonority are respected.

coda

Boolean, default false.

        # Return the current setting
        $syll->coda;

        # Allow codas
        $syll->coda(1);
        $syll->set_coda;

        # Disallow codas
        $syll->coda(0);
        $syll->no_coda;

If this parameter is true, then a single coda consonant is allowed.

complex_coda

Boolean, default false.

        # Return the current setting
        $syll->complex_coda;

        # Allow complex codas
        $syll->complex_coda(1);
        $syll->set_complex_coda;

        # Disallow complex codas
        $syll->complex_coda(0);
        $syll->no_complex_coda;

If this parameter is true, then complex codas are allowed. Setting this parameter has no effect unless coda is also set. The algorithm will greedily take as many consonants as possible into the coda, provided that minimum sonority distance, maximum edge sonority, and minimum coda sonority are respected.

min_son_dist

Integer, default 1.

        # Return the current value
        $syll->min_son_dist;

        # Set the value
        $syll->min_son_dist(2);

This determines the minimum sonority distance between members of a coda or onset. Within a coda or onset, adjacent segments must differ in sonority by at least this amount. Setting this value sets both coda_son_dist and onset_son_dist (see below). This has no effect unless complex_onset or complex_coda is set to true. The default value is 1, which means that stop + nasal sequences like /kn/ will be valid onsets (if complex_onset is true).

coda_son_dist

Integer, default 1.

    # Return the current value
    $syll->coda_son_dist;

    # Set the value
    $syll->coda_son_dist(2);

This parameter allows you finer control over the minimum sonority distance by allowing you to set the minimum sonority distance in codas separately from onsets. This sets the minimum sonority difference between adjacent segments in codas only.

onset_son_dist

Integer, default 1.

    # Return the current value
    $syll->onset_son_dist;

    # Set the value
    $syll->onset_son_dist(2);

This parameter allows you finer control over the minimum sonority distance by allowing you to set the minimum sonority distance in onsets separately from codas. This sets the minimum sonority difference between adjacent segments in onsets only.

min_coda_son

Integer, default 0.

        # Return the current value
        $syll->min_coda_son;

        # Set the value
        $syll->min_coda_son(2);

This determines the minimum coda sonority. Coda consonants must be at least this sonorous in order to be made codas. This is an easy way to, for example, allow only liquids and glides in codas. The default value is for anything to be allowed in a coda if codas are allowed at all.

max_edge_son

Integer, default 100.

        # Return the current value
        $syll->max_edge_son;

        # Set the value
        $syll->max_edge_son(2);

This determines the maximum edge sonority. Segments that are more sonorous than this value are required to be nuclei, no matter what other factors might intervene. This is an easy way to, for example, prevent high vowels from being made into glides. The default value (100) is simply set to a very high number to imply no particular restriction on what may be an onset or coda.

min_nucl_son

Integer, default 3.

        # Return the current value
        $syll->min_nucl_son;

        # Set the value
        $syll->min_nucl_son(2);

This determines the minimum nucleus sonority. Segments that are less sonorous than this cannot be nuclei, no matter what other factors intervene. This is useful to rule out syllabic nasals and liquids. The default value (3) is set so that only vocoids can be nuclei. If you change which features count towards sonority, this will of course change the significance of the sonority value 3. Therefore, if you change sonorous(), you should consider whether you need to change this value.

direction

String, default 'rightward'.

        # Return the current value
        $syll->direction;

        # Set the value
        $syll->direction('leftward');

This determines the direction in which core syllabification proceeds: L->R or R->L. Since syllable lines are not redrawn after the core syllabification, this can have important consequences for which segments are nuclei and which are onsets and codas if there is some ambiguity. This chart gives some examples:

        Outcomes for various scenarios, based on direction
          Input word          rightward          leftward
        
        No complex onsets or codas
          /duin/               <du><i>n          d<wi>n
        
        Codas, no complex onsets
          /duin/               <duj>n            d<win>
        
        Complex onsets and complex codas
          /duin/               <dujn>            <dwin>
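
The first row of the chart can be reproduced with a small model of core syllabification. Everything here is a sketch under assumptions: core_syllabify, make_word and render are hypothetical helpers over plain hashes, max_edge_son is left at its unrestrictive default and so is not checked, and the sketch assumes a segment already taken as an onset is skipped when nuclei are sought. The onset /u/ is printed as u, where the chart writes it as the glide w.

```perl
use strict;
use warnings;

my $min_nucl_son = 3;    # default value of min_nucl_son

# Build fresh segment hashes from [symbol, sonority] pairs.
sub make_word { return map { +{ sym => $_->[0], son => $_->[1] } } @_ }

# Hypothetical model: visit segments in the given direction, make each free
# local sonority peak a nucleus, and take the free segment before it as onset.
sub core_syllabify {
    my ($direction, @word) = @_;
    my @order = $direction eq 'leftward' ? reverse(0 .. $#word) : (0 .. $#word);
    for my $i (@order) {
        my $seg = $word[$i];
        next if $seg->{role};                  # already taken as an onset
        next if $seg->{son} < $min_nucl_son;
        next if $i > 0      && $word[$i - 1]{son} > $seg->{son};
        next if $i < $#word && $word[$i + 1]{son} > $seg->{son};
        $seg->{role} = 'nucleus';
        if ($i > 0 && !$word[$i - 1]{role}) {
            $word[$i - 1]{role} = 'onset';
        }
    }
    return @word;
}

# Print syllables in angle brackets; a syllable here is (onset +) nucleus.
sub render {
    my @word = @_;
    my $out = '';
    for my $i (0 .. $#word) {
        my $role = $word[$i]{role} // '';
        my $prev = $i > 0 ? ($word[$i - 1]{role} // '') : '';
        $out .= '<' if $role eq 'onset' || ($role eq 'nucleus' && $prev ne 'onset');
        $out .= $word[$i]{sym};
        $out .= '>' if $role eq 'nucleus';
    }
    return $out;
}

# /duin/: stop 0, high vocoids 3, nasal 1.
my @duin  = ([d => 0], [u => 3], [i => 3], [n => 1]);
my $right = render(core_syllabify('rightward', make_word(@duin)));
my $left  = render(core_syllabify('leftward',  make_word(@duin)));

print "rightward: $right\n";
print "leftward:  $left\n";
```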

sonorous

Hash reference, default:

        {
                sonorant => 1,
                approximant => 1,
                vocoid => 1,
                aperture => 1
        }

This is used to calculate the sonority of segments in the word. The value returned or passed into this method is a hash reference. The keys of this reference are the names of features, and the values are the amounts by which sonority is to be increased or decreased if the segment tests true for those features.

This method returns a hash reference containing all of the current key => value pairs. If you pass it a hash reference as an argument, that hash reference will replace the current one. I often find that for modifying the existing hash reference, it's easiest to use syntax like $syll->sonorous->{feature} to retrieve the value for a single key, or $syll->sonorous->{feature} = $val to set a single value.

Note that the sonority() method only tests to see whether the feature values given as keys are true. There is no way to test for a particular scalar value. If you want a segment to be more sonorous when a particular feature is false, give that feature a negative value such as -1, so that sonority is lowered whenever the feature is true. E.g. if you were using the feature [consonantal] in place of [vocoid], you would want to say $syll->sonorous->{consonantal} = -1.

The default settings for sonorous(), together with the default feature set defined in Lingua::Phonology::Features, define the following sonority classes and values:

        0: Stops and fricatives
        1: Nasals
        2: Liquids
        3: High Vocoids
        4: Non-high vocoids
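
As a rough illustration of how these classes arise, the sketch below models sonority as the sum of the sonorous weights for every feature a segment tests true for. It uses plain hashes rather than Lingua::Phonology::Segment objects, sonority_of is a hypothetical stand-in for sonority(), and the feature values given for each class are assumptions chosen to match the list above, not values read from Lingua::Phonology::Features.

```perl
use strict;
use warnings;

# Default weights from the sonorous parameter.
my %sonorous = (sonorant => 1, approximant => 1, vocoid => 1, aperture => 1);

# Hypothetical stand-in for sonority(): add the weight of every feature
# that is true for the segment (a plain hash here, not a Segment object).
sub sonority_of {
    my ($weights, $seg) = @_;
    my $total = 0;
    for my $feature (keys %$weights) {
        $total += $weights->{$feature} if $seg->{$feature};
    }
    return $total;
}

# Assumed feature values for one segment of each class.
my %class = (
    stop   => {},
    nasal  => { sonorant => 1 },
    liquid => { sonorant => 1, approximant => 1 },
    high   => { sonorant => 1, approximant => 1, vocoid => 1 },
    low    => { sonorant => 1, approximant => 1, vocoid => 1, aperture => 1 },
);

my %son = map { $_ => sonority_of(\%sonorous, $class{$_}) } keys %class;
printf "%-6s %d\n", $_, $son{$_} for qw(stop nasal liquid high low);
```

Because only true features contribute, changing a weight in the hash shifts every class that carries that feature, which is why min_nucl_son may need revisiting after such a change.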

clear_seg

Code, default clears all segs.

        # Return the current value
        $syll->clear_seg;

        # Set the value
        $syll->clear_seg(\&foo);

This sets the conditions under which a segment should have its syllabification values cleared and should be re-syllabified from scratch. The default value is for every segment to be cleared every time. The code reference passed to clear_seg should follow the same rules as one for the where property for a rule in Lingua::Phonology::Rules.

end_adjoin

Code, default sub {0}.

        # Return the current value
        $syll->end_adjoin;

        # Set the value
        $syll->end_adjoin(\&foo);

This sets the conditions under which a segment may be adjoined to the end of a word. The default is for no end-adjunction at all. The code reference passed to end_adjoin() should follow the same rules as one for the where property of a rule in Lingua::Phonology::Rules. Note that additional constraints other than the ones present in the code reference here must be met in order for end-adjunction to happen, as described in the "ALGORITHM" section.

begin_adjoin

Code, default sub {0}.

        # Return the current value
        $syll->begin_adjoin;

        # Set the value
        $syll->begin_adjoin(\&foo);

This sets the conditions under which a segment may be adjoined to the beginning of a word. The default is for no beginning-adjunction at all. The code reference passed to begin_adjoin() should follow the same rules as one for the where property of a rule in Lingua::Phonology::Rules. Note that additional constraints other than the ones present in the code reference here must be met in order for beginning-adjunction to happen, as described in the "ALGORITHM" section.

AUTHOR

Jesse S. Bangs <jaspax@cpan.org>

LICENSE

This module is free software. You can distribute and/or modify it under the same terms as Perl itself.