Shlomo Yona > SuffixTree-0.02 > SuffixTree

Download:
SuffixTree-0.02.tar.gz

Dependencies

Annotate this POD

CPAN RT

Open  0
View/Report Bugs
Module Version: 0.02   Source   Latest Release: SuffixTree-0.07

NAME ^

SuffixTree - Efficient string manipulation data structure interface for Perl.

SYNOPSIS ^

        use SuffixTree;

        my $str = "mississippi";
        my $tree=ST_CreateTree($str, length($str));
        ST_PrintTree($tree);
        my $position = ST_FindSubstring($tree, "ssis", 4);

        printf("\nPosition of ssis in mississippi is %ld.\n\n", $position);

        ST_DeleteTree($tree);

DESCRIPTION ^

The intention of this project is to provide an open-source implementation for an efficient data structure for strings manipulation - the Suffix Tree.

The code was written with as much consistency with the theoretical algorithm as possible (see references). It provides a set of interface functions for creating and searching the tree.

The suffix tree is implemented in ANSI-C. The code is written based on the original algorithm by E.Ukkonen. This is the Perl interface for the underlying ANSI-C implementation.

FUNCTIONS ^

All functions are exported by default. Please note that all these interface functions were automatically extracted from the ANSI-C header file, so they might not behave as Perlish as you'd expect them to. This is something we will definitly address in the future.

$tree = ST_CreateTree($string, length($string))

Allocates memory for the tree and starts Ukkonen's construction algorithm. Parameters: A string, length of the string. Returns a reference to the tree.

$position = ST_FindSubstring($tree, $substring, length($substring))

Searches for a string in the tree. It traverses the tree down starting its root like in a regular trie. Parameters: the tree to search in, a substring to look for, length of substring. Returns the position it was found in the source string or ST_ERROR if string is not in the tree. NOTE: We did not make sure how ST_ERROR 'looks like' in Perl. This should be further explained in future releases.

ST_PrintTree($tree)

Prints the tree. Parameters: the tree to print.

ST_DeleteTree($tree)

Deletes a suffix tree. Parameters: the tree to delete. Returns : void.

ST_SelfTest($tree)

Tests a suffix tree - search for all substrings of the main string (see Testing Guide in the README file). Parameters: the tree to test. Returns : 1 for success, 0 for failure.

BUGS ^

This Perl interface was mostly build automatically (using SWIG). Little to no attention was given to testing. In future relases of this Perl Module (along with its underlying ANSI-C implementation) we hope to fix all problems that might currenly interfere with successful usage of this module. Please send bug reports to the author(s) of this module.

FUTURE WORK ^

[1] A better Perl-ish interface

[2] Building tests for this module (for the `make test` part of the installation)

[3] Object Oriented like usage

SEE ALSO ^

        http://cs.haifa.ac.il/~shlomo/suffix_tree/

AUTHOR ^

Shlomo Yona <shlomo@cs.haifa.ac.il>

THANKS TO ^

Offer Kaye (http://okaye.esmartweb.com/) for useful ideas and support.

COPYRIGHT ^

Copyright (c) 2002, 2003 Shlomo Yona. All rights reserved.

This library is free software. You can redistribute it and/or modify it under the same terms as Perl itself.

syntax highlighting: