AI::Classifier::Text::Analyzer - computing feature vectors from documents
    use AI::Classifier::Text::Analyzer;

    my $analyzer = AI::Classifier::Text::Analyzer->new();
    my $features = $analyzer->analyze( 'aaaa http://www.example.com/bbb?xx=yy&bb=cc;dd=ff' );
Computes feature vectors of text using some heuristics and adds word counts (using Text::WordCounter by default).
The object is immutable, but some methods take a second parameter that serves as an accumulator for the features found in the given text.
It uses some specific values and methods that work for our case but are not guaranteed to give good results universally; see the source for details.
word_counter
Object with a word_count method that calculates the frequency of words in a text document. Defaults to Text::WordCounter.
global_feature_weight
The weight assigned to the computed features of the text document. Defaults to 2.
new(word_counter => $foo, global_feature_weight => 3)
Creates a new AI::Classifier::Text::Analyzer object. Both arguments are optional.
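Any object with a word_count method can serve as the word counter. The following is a minimal sketch of such an object, not the actual Text::WordCounter implementation. The package name My::WordCounter is hypothetical, and the word_count($text, $features) signature, with an optional hash reference as accumulator, is an assumption based on the accumulator convention described above.

```perl
use strict;
use warnings;

# Hypothetical stand-in for Text::WordCounter: any object that offers a
# word_count method could play this role.
package My::WordCounter;

sub new {
    my ($class) = @_;
    return bless {}, $class;
}

# Assumed signature: word_count($text, $features). The optional second
# argument is a hash reference used as an accumulator for the counts.
sub word_count {
    my ($self, $text, $features) = @_;
    $features ||= {};
    # Count every run of word characters, ignoring case.
    $features->{ lc $1 }++ while $text =~ /(\w+)/g;
    return $features;
}

package main;

my $counter  = My::WordCounter->new;
my $features = $counter->word_count('aaaa bbb aaaa');

# A second call adds to the same hash instead of starting a fresh one.
$counter->word_count('bbb ccc', $features);

print "$_ => $features->{$_}\n" for sort keys %$features;
```

An object like this would be handed to the constructor as new(word_counter => My::WordCounter->new). Whether the analyzer calls word_count with exactly these arguments should be checked against the source.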
Computes the feature vector of the given document and adds the initial vector of
Computes a vector of special URL-related features of a given text - currently there are used
Removes HTML-related parts from the text.
Zbigniew Lukasiak <email@example.com>, Tadeusz Sośnierz <firstname.lastname@example.org>
This software is copyright (c) 2012 by Opera Software ASA.
This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.