Module Version: 1.04

NAME

OpenInteract2::FullTextRules - Rules for automatically indexing SPOPS objects

SYNOPSIS

 # In the object's spops.ini file, tell OI2 you want your objects to be
 # indexed. With this, every 'save()' call on the object triggers
 # indexing of the object's 'description' and 'title' fields.
 
 [myobj]
 is_searchable = yes
 fulltext_field = description
 fulltext_field = title

METHODS

SPOPS Ruleset

ruleset_add( $class, \%ruleset_table )

Adds the necessary rules to the $class that has this class in its @ISA. Currently, these rules consist of:

Internal

_indexable_object_text()

Gets the text out of the object to index. Currently, we treat all text from the object as one big field.

Note that if you have defined 'fulltext_pre_index_method' as a configuration item in your class, that method is called before indexing. This is useful if you have a method that fetches external data into your object.
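As a rough illustration of the behavior described above, here is a minimal sketch that calls a pre-index method and then joins the configured fields into one block of text. It is not the module's actual code: the My::Doc class, its load_body() method, the %config hash and indexable_text_sketch() are all hypothetical names, and the sketch assumes a plain hash-based object rather than a real SPOPS object.

```perl
use strict;
use warnings;

package My::Doc;    # hypothetical class, for illustration only

sub new {
    my ( $class, %fields ) = @_;
    return bless { %fields }, $class;
}

# Hypothetical pre-index hook: pulls "external" data into the object
sub load_body {
    my ( $self ) = @_;
    $self->{description} = 'Fetched body text';
    return;
}

package main;

# Hypothetical configuration mirroring the configuration keys named above
my %config = (
    fulltext_field            => [ 'description', 'title' ],
    fulltext_pre_index_method => 'load_body',
);

# Run the pre-index hook (if configured), then treat all configured
# fields as one big piece of text.
sub indexable_text_sketch {
    my ( $object, $config ) = @_;
    my $hook = $config->{fulltext_pre_index_method};
    if ( $hook && $object->can( $hook ) ) {
        $object->$hook();
    }
    my @values = map { $object->{ $_ } } @{ $config->{fulltext_field} };
    return join ' ', grep { defined( $_ ) && length( $_ ) } @values;
}

my $doc = My::Doc->new( title => 'Release notes' );
print indexable_text_sketch( $doc, \%config ), "\n";
```

In this sketch the 'description' field is empty until the hook runs, which is the kind of situation the pre-index method is meant for.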

_tokenize( $text )

Breaks text down into tokens. The process is simple: first we break the text into words, then we lowercase each word, then we 'stem' each word. Here is a brief description of stemming:

 Truncation - Also referred to as "root/suffix management" or
 "Stemming" or "Word Stemming", truncation allows some search engines
 to recognize and shorten long words such as "plants" or "boating" to
 their root words (or word stems) "plant" and "boat." This makes
 searching for such words much easier because it is not necessary to
 consider every permutation of that word when trying to find it. In a
 search, the ability to enter the first part of a keyword, insert a
 symbol (usually *), and accept any variant spellings or word endings,
 from the occurrence of the symbol forward (e.g., femini* retrieves
 feminine, feminism, etc.). See also word variants, plurals
 and singulars.

(From: http://ollie.dcccd.edu/library/Module2/Books/concepts.htm)

We use the Lingua::Stem module for this, which implements the Porter algorithm for stemming, as do most implementations, apparently. (This class treats stemming as a black box :)
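The split, lowercase and stem steps can be sketched roughly as follows. This is an illustration only, not the module's code: tokenize_sketch() and naive_stem() are hypothetical names, and naive_stem() is a crude suffix stripper standing in for Lingua::Stem so the example runs with core Perl alone. It is not the Porter algorithm.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a real stemmer: strips a few common English
# suffixes. This is NOT the Porter algorithm.
sub naive_stem {
    my ( $word ) = @_;
    for my $suffix ( qw( ing es s ) ) {
        # Only strip when at least three characters would remain
        if ( $word =~ /^(.{3,})\Q$suffix\E$/ ) {
            return $1;
        }
    }
    return $word;
}

# Break the text into words, lowercase each word, then stem each word.
sub tokenize_sketch {
    my ( $text ) = @_;
    my @words = $text =~ /(\w+)/g;
    return map { naive_stem( lc $_ ) } @words;
}

my @tokens = tokenize_sketch( 'Plants and boating: the plant boats' );
print join( ' ', @tokens ), "\n";
```

With a real stemmer in place of naive_stem(), "plants" and "plant" would likewise collapse to the same token, which is what lets a search for one match documents containing the other.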

Parameters:

SEE ALSO

OpenInteract2::FullTextIndexer in the 'full_text' package

COPYRIGHT

Copyright (c) 2004-2005 Chris Winters. All rights reserved.

This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

AUTHORS

Chris Winters <chris@cwinters.com>