Changes for version 0.007 - 2012-12-31

  • Stanislaw Pusep <creaktive@gmail.com>
    • updated benchmark results (yes, it is that much faster, even at a lower clock speed!)
    • fewer indirections
    • micro-optimizations for latin1
    • macro rampage
    • macros cleanup
    • .gitignore update
    • no more warnings under Clang
    • added murmur to benchmark for comparison
    • replaced sprintf() with a custom itoa()

Documentation

compute cosine similarity between two documents
uses MinHash & SpeedyFx to compare large text data
efficiently count unique tokens from a file

Modules

tokenize/hash large numbers of strings efficiently

Examples