AI::Classifier::Text::Analyzer - computing feature vectors from documents
version 0.03
    use AI::Classifier::Text::Analyzer;

    my $analyzer = AI::Classifier::Text::Analyzer->new();
    my $features = $analyzer->analyze( 'aaaa http://www.example.com/bbb?xx=yy&bb=cc;dd=ff' );
Computes feature vectors of text using some heuristics and adds word counts (using Text::WordCounter by default).
The object is immutable, but some methods take a second parameter that serves as an accumulator for the features found in the given text.
It uses some specific values and methods that work for our case but are not guaranteed to give good results universally. See the source for details.
word_counter
An object with a word_count method that calculates the frequency of words in a text document. Defaults to Text::WordCounter.
global_feature_weight
The weight assigned to the computed features of the text document. Defaults to 2.
new(word_counter => $foo, global_feature_weight => 3)
Creates a new AI::Classifier::Text::Analyzer object. Both arguments are optional.
analyze($document, $features)
Computes the feature vector of the given document and adds it to the initial vector passed in $features. The $features argument is optional.
analyze_urls($document, $features)
Computes a vector of special URL-related features of the given text. Currently the NO_URLS, MANY_URLS and REPEATED_URLS features are used.
NO_URLS
MANY_URLS
REPEATED_URLS
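The accumulator pattern and the three URL features named above can be illustrated with a minimal, self-contained sketch. This is not the module's implementation: the helper name sketch_url_features, the naive URL regex, the "more than 3 URLs" threshold for MANY_URLS, the "any URL seen twice" rule for REPEATED_URLS, and the use of a hash reference for the feature vector are all assumptions made for illustration. Only the feature names and the default weight of 2 come from this documentation; see the module source for the real heuristics.

```perl
use strict;
use warnings;

# Hypothetical helper, NOT part of AI::Classifier::Text::Analyzer.
# $features is an optional hash reference used as an accumulator;
# $weight stands in for global_feature_weight (documented default: 2).
sub sketch_url_features {
    my ( $text, $features, $weight ) = @_;
    $features ||= {};
    $weight   //= 2;

    # Naive URL matcher (assumption): anything starting with http:// or https://
    my @urls = $text =~ m{(https?://[^\s<>"']+)}g;

    # Count how many distinct URLs occur more than once.
    my %seen;
    $seen{$_}++ for @urls;
    my $repeated = grep { $_ > 1 } values %seen;

    $features->{NO_URLS}       = $weight if !@urls;
    $features->{MANY_URLS}     = $weight if @urls > 3;    # assumed threshold
    $features->{REPEATED_URLS} = $weight if $repeated;    # assumed rule

    return $features;
}

# Entries already present in the accumulator are preserved.
my $features = { hello => 1 };
sketch_url_features( 'hello http://example.com/a and http://example.com/a', $features );
print join( ', ', map { "$_=$features->{$_}" } sort keys %$features ), "\n";
```

Passing the same hash reference to several such helpers in turn lets each one add its own keys, which is how the accumulator argument described above is meant to be used.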
filter($document)
Removes HTML-related parts from the text.
AI::NaiveBayes(3), AI::Classifier::Text(3)
Zbigniew Lukasiak <zlukasiak@opera.com>, Tadeusz Sośnierz <tsosnierz@opera.com>
This software is copyright (c) 2012 by Opera Software ASA.
This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.