Module Version: 0.1

NAME

Text::JaroWinkler - An implementation of the Jaro-Winkler distance

SYNOPSIS

  use Text::JaroWinkler qw( strcmp95 );

  print strcmp95("it is a dog","i am a dog.",11);
  # prints "0.865619834710744"

DESCRIPTION

This module implements the Jaro-Winkler distance, a measure of similarity between two strings. It is a variant of the Jaro distance metric and is mainly used in record linkage (duplicate detection). Despite the name "distance", a higher value means the two strings are more similar: the score is normalized so that 0 means no similarity and 1 means an exact match. The metric was designed for, and is best suited to, short strings such as personal names. More information can be found at <http://en.wikipedia.org/wiki/Jaro-Winkler>.

The module is an XS wrapper around the original C implementation by the author of the algorithm, <http://www.census.gov/geo/msb/stand/strcmp.c>, with some minor modifications to accept variable-length input.
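
For orientation, the textbook form of the metric can be sketched in pure Perl as shown here. This is an illustrative sketch only, not the strcmp95 routine that this module wraps; the C code may apply further adjustments, so its scores need not agree with the ones computed below. The function names jaro and jaro_winkler, the default scaling factor of 0.1, and the prefix cap of 4 are assumptions taken from the common description of the metric and are not part of this module's API.

```perl
use strict;
use warnings;
use List::Util qw(max min);

# Textbook Jaro similarity (hypothetical helper, not part of Text::JaroWinkler).
sub jaro {
    my ($s, $t) = @_;
    my @s = split //, $s;
    my @t = split //, $t;
    my ($ls, $lt) = (scalar @s, scalar @t);
    return 1 if !$ls && !$lt;
    return 0 if !$ls || !$lt;

    # Two equal characters count as a match only within this distance.
    my $window = max(0, int(max($ls, $lt) / 2) - 1);
    my @s_hit = (0) x $ls;
    my @t_hit = (0) x $lt;
    my $matches = 0;

    for my $i (0 .. $#s) {
        my $lo = max(0, $i - $window);
        my $hi = min($#t, $i + $window);
        for my $j ($lo .. $hi) {
            next if $t_hit[$j] || $s[$i] ne $t[$j];
            $s_hit[$i] = $t_hit[$j] = 1;
            $matches++;
            last;
        }
    }
    return 0 unless $matches;

    # Matched characters that line up in a different order, counted in pairs.
    my @ms = @s[ grep { $s_hit[$_] } 0 .. $#s ];
    my @mt = @t[ grep { $t_hit[$_] } 0 .. $#t ];
    my $out_of_order   = grep { $ms[$_] ne $mt[$_] } 0 .. $#ms;
    my $transpositions = int($out_of_order / 2);

    return ( $matches / $ls
           + $matches / $lt
           + ($matches - $transpositions) / $matches ) / 3;
}

# Winkler's modification: boost the Jaro score for a shared prefix.
# $p (scaling factor, assumed default 0.1) and the cap of 4 are assumptions.
sub jaro_winkler {
    my ($s, $t, $p) = @_;
    $p = 0.1 unless defined $p;
    my $j      = jaro($s, $t);
    my $limit  = min(4, length($s), length($t));
    my $prefix = 0;
    $prefix++ while $prefix < $limit
        && substr($s, $prefix, 1) eq substr($t, $prefix, 1);
    return $j + $prefix * $p * (1 - $j);
}

printf "%.4f\n", jaro_winkler("MARTHA", "MARHTA");   # 0.9611
printf "%.4f\n", jaro_winkler("dog", "dog");         # 1.0000
```

In this sketch the two strings share all six characters with one transposed pair, giving a Jaro score of about 0.9444; the three-character common prefix "MAR" then raises it to about 0.9611.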

EXPORT

None by default.

AUTHOR

Shu-Chun Weng <scw@csie.org>

SEE ALSO

perl, Text::Levenshtein, Text::LevenshteinXS, Text::WagnerFischer
