Text::Query::Simple - Match text against simple query expression and return relevance value for ranking
    use Text::Query::Simple;

    # Constructor
    $query = Text::Query::Simple->new([QSTRING] [OPTIONS]);

    # Methods
    $query->prepare(QSTRING [OPTIONS]);
    $query->match([TARGET]);
    $query->matchscalar([TARGET]);
This module provides an object that tests a string or list of strings against a query expression similar to an AltaVista "simple query" and returns a "relevance value." Elements of the query expression may be regular expressions or literal text, and may be assigned weights.
Query expressions are compiled into an internal form when a new object is created or the prepare method is called; they are not recompiled on each match.
Query expressions consist of words (sequences of non-whitespace), regexps or phrases (quoted strings) separated by whitespace. Words or phrases prefixed with a + must be present for the expression to match; words or phrases prefixed with a - must be absent for the expression to match.
A successful match returns a count of the number of times any of the words (except ones prefixed with -) appeared in the text. This type of result is useful for ranking documents according to relevance.
Words or phrases may optionally be followed by a number in parentheses (no whitespace is allowed between the word or phrase and the parenthesized number). This number specifies the weight given to the word or phrase; it will be added to the count each time the word or phrase appears in the text. If a weight is not given, a weight of 1 is assumed.
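The scoring rules described above can be illustrated with a small standalone sketch. It does not use Text::Query::Simple and is not the module's implementation: `score_text` is a hypothetical helper, the terms are assumed to be already parsed into prefix, literal text and weight, and matching is assumed to be case-insensitive on literal text.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Text::Query::Simple: score $text against
# a list of already-parsed terms. Each term is [PREFIX, LITERAL, WEIGHT],
# where PREFIX is '+', '-' or ''.
sub score_text {
    my ($text, @terms) = @_;
    my $score = 0;
    for my $term (@terms) {
        my ($prefix, $literal, $weight) = @$term;
        # Count case-insensitive occurrences of the literal text.
        my $hits = () = $text =~ /\Q$literal\E/gi;
        if ($prefix eq '-') {
            return 0 if $hits;    # excluded term present: no match
            next;                 # excluded terms never add to the score
        }
        return 0 if $prefix eq '+' && !$hits;    # required term missing
        $score += $hits * $weight;               # weight added per occurrence
    }
    return $score;
}

# Roughly corresponds to the query: +information(2) retrieval -spam
my @terms = (['+', 'information', 2], ['', 'retrieval', 1], ['-', 'spam', 1]);
print score_text('Information retrieval finds information.', @terms), "\n";    # prints 5
```

Here "information" appears twice with weight 2 and "retrieval" once with the default weight 1, giving 5. A text containing "spam", or lacking "information", scores 0 in this sketch.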
    use Text::Query::Simple;
    my $q = new Text::Query::Simple('+hello world');
    die "bad query expression" if not defined $q;
    $count = $q->match;
    ...
    # -litspace requires a single space between the two ta's;
    # -case makes the match case-sensitive
    $q->prepare('goodbye adios -"ta ta"', -litspace=>1, -case=>1);
    if ($q->match($line)) {    # doesn't match "Goodbye"
    ...
    }
    $q->prepare('\\bintegrate\\b', -regexp=>1);    # won't match "disintegrated"
    ...
    $q->prepare('information(2) retrieval');    # information has twice the weight of retrieval
new is the constructor for a Text::Query::Simple object. If a QSTRING is given, it is compiled to internal form.
OPTIONS are passed in a hash-like fashion, using key and value pairs. Possible options are:
-case - If true, do case-sensitive match.
-litspace - If true, match spaces (except between operators) in QSTRING literally. If false, match spaces as \s+.
-regexp - If true, treat patterns in QSTRING as regular expressions rather than literal text.
-whole - If true, match whole words only, not substrings of words.
The constructor will return undef if a QSTRING was supplied and had illegal syntax.
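As a rough illustration of what the four options mean for a single query term, the sketch below builds a regular expression from a term and an option list. `term_to_regex` is a hypothetical helper written for this example; it is an assumption about how such options could be applied, not the module's internal code.

```perl
use strict;
use warnings;

# Hypothetical helper, not the module's internal code: build a regex for one
# query term under the documented options.
sub term_to_regex {
    my ($term, %opt) = @_;
    # -litspace: keep the term whole so its spaces match literally;
    # otherwise split on whitespace and rejoin the pieces with \s+.
    my @parts = $opt{-litspace} ? ($term) : split ' ', $term;
    # -regexp: use the pieces as patterns; otherwise escape metacharacters.
    @parts = map { quotemeta($_) } @parts unless $opt{-regexp};
    my $pat = join '\s+', @parts;
    # -whole: anchor on word boundaries so substrings of words do not match.
    $pat = "\\b(?:$pat)\\b" if $opt{-whole};
    # -case: case-sensitive when true, case-insensitive otherwise.
    return $opt{-case} ? qr/$pat/ : qr/$pat/i;
}

# "ta  ta" (two spaces) matches by default but not with -litspace.
for my $opts ([], [-litspace => 1]) {
    my $re = term_to_regex('ta ta', @$opts);
    print "ta  ta" =~ $re ? "match\n" : "no match\n";
}
```

The loop prints "match" and then "no match". In the same sketch, `term_to_regex('integrate', -whole => 1)` does not match "disintegrated", and `term_to_regex('goodbye', -case => 1)` does not match "Goodbye".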
prepare compiles the query expression in QSTRING to internal form and sets any options (same as in the constructor). It may be used to change the query expression and options for an existing query object. If OPTIONS are omitted, any options set by a previous call to the constructor or prepare remain in effect.
This method returns a reference to the query object if the syntax of the expression was legal, or undef if not.
If TARGET is a scalar, match returns the number of words in the string specified by TARGET that match the query object's query expression. If TARGET is not given, the match is made against $_.
If TARGET is an array, match returns a list of references to anonymous arrays consisting of each element followed by its match count. The list is sorted in descending order by match count. If the elements of TARGET were anonymous arrays, the match count is appended to each element. This allows arbitrary information (such as a filename) to be associated with each element.
If TARGET is a reference to an array, match returns a reference to a sorted list of matching items, with counts, for all elements.
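The list behaviour described above can be sketched without the module. `rank_items` and the `$count_perl` callback are hypothetical names introduced for this example. The sketch assumes that the text to match is the first element of each anonymous array and that items with a zero count are dropped; neither detail is confirmed by the description above.

```perl
use strict;
use warnings;

# Hypothetical helper: rank items by a scoring callback. Each item is either
# a plain string or an array ref whose first element is the text to score
# (an assumption). The count is appended as the last element of each row.
sub rank_items {
    my ($score, @items) = @_;
    my @ranked;
    for my $item (@items) {
        my @fields = ref($item) eq 'ARRAY' ? @$item : ($item);
        my $count  = $score->($fields[0]);
        # Dropping zero-count items is an assumption of this sketch.
        push @ranked, [@fields, $count] if $count;
    }
    # Descending order by count.
    return sort { $b->[-1] <=> $a->[-1] } @ranked;
}

# Stand-in scorer: number of case-insensitive occurrences of "perl".
my $count_perl = sub { my $n = () = $_[0] =~ /perl/gi; return $n };

my @docs = (
    ['Perl regex tips for Perl users', 'a.txt'],
    ['Python notes',                   'b.txt'],
    ['Learning Perl',                  'c.txt'],
);
for my $row (rank_items($count_perl, @docs)) {
    print join("\t", @$row), "\n";
}
```

Carrying the filename as a second element shows how extra information stays attached to each item while the count is appended at the end of the row.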
matchscalar behaves just like match when TARGET is a scalar or is not given, and is slightly faster than match under these circumstances.
This module requires Perl 5.005 or higher due to the use of evaluated expressions in regexes.
Eric Bohlman (ebohlman@netcom.com)
The parse_tokens routine was adapted from the parse_line routine in Text::ParseWords.
Copyright (c) 1998 Eric Bohlman. All rights reserved. This program is free software; you can redistribute and/or modify it under the same terms as Perl itself.