NAME

Lingua::YALI::Identifier - Module for language identification with custom models.

VERSION

version 0.012

SYNOPSIS

This module identifies languages using models provided by the user. If you want to use pretrained models, use Lingua::YALI::LanguageIdentifier.

Models trained on texts from a specific domain outperform general ones on that domain.

    use Lingua::YALI::Builder;
    use Lingua::YALI::Identifier;

    # create models
    my $builder_a = Lingua::YALI::Builder->new(ngrams=>[2]);
    $builder_a->train_string("aaaaa aaaa aaa aaa aaa aaaaa aa");
    $builder_a->store("model_a.2_all.gz", 2);

    my $builder_b = Lingua::YALI::Builder->new(ngrams=>[2]);
    $builder_b->train_string("bbbbbb bbbb bbbb bbb bbbb bbbb bbb");
    $builder_b->store("model_b.2_all.gz", 2);

    # create identifier and load models
    my $identifier = Lingua::YALI::Identifier->new();
    $identifier->add_class("a", "model_a.2_all.gz");
    $identifier->add_class("b", "model_b.2_all.gz");

    # identify strings
    my $result1 = $identifier->identify_string("aaaaaaaaaaaaaaaaaaa");
    print $result1->[0]->[0] . "\t" . $result1->[0]->[1] . "\n";
    # prints out a 1

    my $result2 = $identifier->identify_string("bbbbbbbbbbbbbbbbbbb");
    print $result2->[0]->[0] . "\t" . $result2->[0]->[1] . "\n";
    # prints out b 1

More examples are presented in Lingua::YALI::Examples.

METHODS

BUILD

Initializes internal variables.

    # create identifier
    my $identifier = Lingua::YALI::Identifier->new();

add_class

    my $added = $identifier->add_class($class, $model);

Adds the model stored in file $model under class $class and returns whether it was added.

    print $identifier->add_class("a", "model.a1.gz") . "\n";
    # prints out 1
    print $identifier->add_class("a", "model.a2.gz") . "\n";
    # prints out 0 - class a was already added

remove_class

    my $removed = $identifier->remove_class($class);

Removes the model for class $class and returns whether it was removed.

    $identifier->add_class("a", "model.a1.gz");
    print $identifier->remove_class("a") . "\n";
    # prints out 1
    print $identifier->remove_class("a") . "\n";
    # prints out 0 - class a was already removed

get_classes

    my $classes = $identifier->get_classes();

Returns a reference to an array of all registered classes.
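
A minimal sketch of listing the registered classes. It assumes that the model files from the SYNOPSIS already exist on disk and that get_classes returns an array reference; neither is guaranteed by the text above.

    use strict;
    use warnings;
    use Lingua::YALI::Identifier;

    # Assumption: model_a.2_all.gz and model_b.2_all.gz were created
    # beforehand, as in the SYNOPSIS.
    my $identifier = Lingua::YALI::Identifier->new();
    $identifier->add_class("a", "model_a.2_all.gz");
    $identifier->add_class("b", "model_b.2_all.gz");

    # Assumption: get_classes returns an array reference.
    my $classes = $identifier->get_classes();

    # Sort the names so the output order does not depend on internal storage.
    print join(", ", sort @$classes), "\n";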

identify_file

    my $result = $identifier->identify_file($file);

Identifies the class for the file $file.
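
The following sketch shows one way identify_file might be called on a plain-text file. The temporary file and its contents are illustrative only, and the code assumes the model files from the SYNOPSIS exist and that the result has the form used there.

    use strict;
    use warnings;
    use File::Temp qw(tempfile);
    use Lingua::YALI::Identifier;

    # Assumption: the model files were created beforehand, as in the SYNOPSIS.
    my $identifier = Lingua::YALI::Identifier->new();
    $identifier->add_class("a", "model_a.2_all.gz");
    $identifier->add_class("b", "model_b.2_all.gz");

    # Write a small sample text to a temporary file (removed at exit).
    my ($out, $filename) = tempfile(UNLINK => 1);
    print {$out} "aaaa aaa aaaaa aaaa\n";
    close($out) or die "Cannot close $filename: $!";

    my $result = $identifier->identify_file($filename);

    # Print the first entry only when a non-empty result came back.
    if ($result && @$result) {
        print $result->[0]->[0] . "\t" . $result->[0]->[1] . "\n";
    }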

identify_string

    my $result = $identifier->identify_string($string);

Identifies the class for the string $string.
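
The SYNOPSIS reads only the first entry of the result. As a hedged sketch, the whole result can be walked as below, assuming that every entry is a two-element array reference like the one accessed there and that the model files from the SYNOPSIS exist.

    use strict;
    use warnings;
    use Lingua::YALI::Identifier;

    # Assumption: the model files were created beforehand, as in the SYNOPSIS.
    my $identifier = Lingua::YALI::Identifier->new();
    $identifier->add_class("a", "model_a.2_all.gz");
    $identifier->add_class("b", "model_b.2_all.gz");

    my $result = $identifier->identify_string("aaaa bbb aaaa aaa");

    # Assumption: each entry is a [class, value] pair, as in the SYNOPSIS.
    # The "|| []" guards against an undefined result.
    for my $entry (@{ $result || [] }) {
        my ($class, $value) = @$entry;
        print "$class\t$value\n";
    }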

identify_handle

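    my $result = $identifier->identify_handle($fh);

Identifies the class for the file handle $fh and returns a result in the same form as in the SYNOPSIS, where $result->[0]->[0] is the class and $result->[0]->[1] is the value reported for it.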
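
A small sketch of calling identify_handle with an in-memory file handle. It assumes the handle is read with ordinary line reads, so that a handle opened on a scalar behaves like one opened on a file; if that does not hold, open a real file instead. The model files from the SYNOPSIS are assumed to exist.

    use strict;
    use warnings;
    use Lingua::YALI::Identifier;

    # Assumption: the model files were created beforehand, as in the SYNOPSIS.
    my $identifier = Lingua::YALI::Identifier->new();
    $identifier->add_class("a", "model_a.2_all.gz");
    $identifier->add_class("b", "model_b.2_all.gz");

    # Open a read handle on a scalar instead of a file on disk.
    my $text = "bbbb bbb bbbbb bbbb\n";
    open(my $fh, "<", \$text) or die "Cannot open in-memory handle: $!";

    my $result = $identifier->identify_handle($fh);
    close($fh);

    # Print the first entry only when a non-empty result came back.
    if ($result && @$result) {
        print $result->[0]->[0] . "\t" . $result->[0]->[1] . "\n";
    }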

SEE ALSO

AUTHOR

Martin Majlis <martin@majlis.cz>

COPYRIGHT AND LICENSE

This software is Copyright (c) 2012 by Martin Majlis.

This is free software, licensed under:

  The (three-clause) BSD License