RCS file: RCS/Similars.pm,v
Working file: Similars.pm
head: 1.27
branch:
locks: strict
access list:
symbolic names:
keyword substitution: kv
total revisions: 27;	selected revisions: 27
description:
Possibly identical/similar file locator
----------------------------
revision 1.27
date: 2008/10/31 16:07:34;  author: tong;  state: Exp;  lines: +3 -3
- cosmetic amend to module identification
----------------------------
revision 1.26
date: 2008/10/31 15:59:05;  author: tong;  state: Exp;  lines: +4 -3
- cosmetic amend: show explicitly how the test is invoked.
----------------------------
revision 1.25
date: 2008/10/29 16:59:40;  author: tong;  state: Exp;  lines: +4 -4
- update contact email
----------------------------
revision 1.24
date: 2008/10/29 16:47:22;  author: tong;  state: Exp;  lines: +86 -37
- update test & output, therefore update pod as well.
----------------------------
revision 1.23
date: 2008/10/28 23:05:39;  author: tong;  state: Exp;  lines: +5 -2
- version deprecated. Future versions are released as File::Find::Similars.
----------------------------
revision 1.22
date: 2006/12/25 23:59:33;  author: tong;  state: Exp;  lines: +25 -3
- add process_stdin
- tested fine

 find . \( -type f -o -type l \) -follow -printf "%p\t%s\n" | fileSimilars.pl -
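 The find command above emits one "path<TAB>size" record per line. A minimal
 sketch of how such records could be read, assuming a hypothetical helper
 named parse_entries (this is not the module's actual process_stdin, whose
 code is not shown in this log):

```perl
use strict;
use warnings;

# Hypothetical helper: read "path<TAB>size" lines, as produced by
#   find ... -printf "%p\t%s\n"
# and return a list of { path => ..., size => ... } hash refs.
sub parse_entries {
    my ($fh) = @_;
    my @entries;
    while (my $line = <$fh>) {
        chomp $line;
        next unless length $line;    # skip blank lines
        # Greedy (.*) splits on the LAST tab, so a path that itself
        # contains a tab is kept whole. Malformed lines are skipped.
        my ($path, $size) = $line =~ /^(.*)\t(\d+)$/ or next;
        push @entries, { path => $path, size => $size };
    }
    return @entries;
}
```

 Calling parse_entries(\*STDIN) would consume the piped find output.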
----------------------------
revision 1.21
date: 2006/12/25 22:29:19;  author: tong;  state: Exp;  lines: +4 -4
- bug fix, $fc_level=0 to compare files between dirs now works
----------------------------
revision 1.20
date: 2006/12/25 22:17:29;  author: tong;  state: Exp;  lines: +6 -6
- s/process_files/process_entries/g
----------------------------
revision 1.19
date: 2006/12/25 22:08:29;  author: tong;  state: Exp;  lines: +19 -20
- move sub process_files, nothing else
----------------------------
revision 1.18
date: 2003/08/13 04:30:46;  author: tong;  state: Exp;  lines: +57 -20
- more accurate similarity check.
 - fine tune config variables
 - bug fix for negative fdsize
 - bug fix for empty soundex array
- add 3rd array wash function, for language specific washing
- add Chinese support
----------------------------
revision 1.17
date: 2003/08/10 16:32:32;  author: tong;  state: Exp;  lines: +6 -6
- document enhancement
- for public release v1.3
----------------------------
revision 1.16
date: 2003/08/10 16:25:51;  author: tong;  state: Exp;  lines: +12 -2
- document enhancement
- for public release v1.3
----------------------------
revision 1.15
date: 2003/08/10 16:05:25;  author: tong;  state: Exp;  lines: +100 -57
- quote the dir names
- new algorithm for similarity check
  p * soundex + (1-p) * fSize
- introduced hash variable %config
- move the pod to fileSimilars. Rewrite its own pod.
- fix bug, able to invoke again, w/o duplicate file entries
- add LICENSE
- for public release v1.3
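 The weighted formula in this entry can be illustrated with a small sketch.
 The function names, the [0, 1] score range, and the min/max size ratio are
 assumptions made for illustration only; the log does not say how the module
 derives its soundex or file-size terms.

```perl
use strict;
use warnings;
use List::Util qw(min max);

# Hypothetical: similarity of two file sizes as a ratio in [0, 1].
# Two empty files are treated as identical in size.
sub size_similarity {
    my ($x, $y) = @_;
    return 1 if $x == 0 && $y == 0;
    return min($x, $y) / max($x, $y);
}

# Hypothetical: blend a name score (e.g. soundex-based, already scaled
# to [0, 1]) with the size score, putting weight $p on the name part:
#   p * name + (1 - p) * size
sub combined_similarity {
    my ($name_sim, $size_x, $size_y, $p) = @_;
    return $p * $name_sim
         + (1 - $p) * size_similarity($size_x, $size_y);
}
```

 With p close to 1 the file name dominates; with p close to 0 the file size
 dominates.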
----------------------------
revision 1.14
date: 2002/12/26 18:12:59;  author: tong;  state: Exp;  lines: +53 -15
- minor bug fix
- mainly documentation enhancement
----------------------------
revision 1.13
date: 2002/12/26 01:52:16;  author: tong;  state: Exp;  lines: +47 -19
- clean up debug messages
- quote the file names
- the most robust soundex handling, which supports all cases, e.g. 'aa bb.cc'
- introduced two arrwash_ functions for customization
----------------------------
revision 1.12
date: 2002/12/25 23:24:04;  author: tong;  state: Exp;  lines: +6 -4
- Chinese file names friendly. Previously "Illegal division by zero" when met with Chinese file names.
----------------------------
revision 1.11
date: 2002/12/25 23:10:54;  author: tong;  state: Exp;  lines: +19 -17
- bug fix, more robust. Fix 3 bugs for file names like aa.mp3, 1.mp3:
 * file extensions
 * decompose SuchKindOfWord
 * short file names

- clearer variable names and tech docs
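 Decomposing a name such as SuchKindOfWord can be sketched with one regular
 expression. The helper name decompose_word and the exact splitting rules are
 assumptions for illustration, not the module's own code.

```perl
use strict;
use warnings;

# Hypothetical helper: split a mixed-case token into its component words.
# Alternatives, tried in order at each position:
#   [A-Z][a-z]+        a capitalized word        (Such, Kind)
#   [a-z]+             a lower-case run          (aa)
#   [A-Z]+(?![a-z])    an all-caps run           (MP in MP3)
#   \d+                a run of digits           (3)
sub decompose_word {
    my ($word) = @_;
    return $word =~ /[A-Z][a-z]+|[a-z]+|[A-Z]+(?![a-z])|\d+/g;
}
```

 Short names such as 'aa' or '1' come back as a single-element list, so
 callers do not need a special case for them.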
----------------------------
revision 1.10
date: 2002/09/16 22:55:09;  author: tong;  state: Exp;  lines: +10 -10
- change package name from File::Find::Similars to File::Searcher::Similars
----------------------------
revision 1.9
date: 2002/09/16 22:16:36;  author: tong;  state: Exp;  lines: +2 -4
- remove dependency on fdispatch
----------------------------
revision 1.8
date: 2001/09/26 02:19:34;  author: tong;  state: Exp;  lines: +6 -5
- explain columns
----------------------------
revision 1.7
date: 2001/09/26 01:53:27;  author: tong;  state: Exp;  lines: +67 -151
- change to .pm module
----------------------------
revision 1.6
date: 2001/09/22 04:08:21;  author: tong;  state: Exp;  lines: +41 -18
First public release
----------------------------
revision 1.5
date: 2001/09/21 01:38:55;  author: tong;  state: Exp;  lines: +6 -5
- treat fileInfos[$ii] the same as fileInfos[$jj]
----------------------------
revision 1.4
date: 2001/09/21 01:33:17;  author: tong;  state: Exp;  lines: +62 -37
- increase usability
- --level=1 checking
- more documentation
----------------------------
revision 1.3
date: 2001/09/20 01:18:01;  author: tong;  state: Exp;  lines: +21 -14
- new diff algorithm, name difference is most significant now.
- output file by file size instead of unseen difference level
----------------------------
revision 1.2
date: 2001/09/20 00:07:33;  author: tong;  state: Exp;  lines: +20 -11
All 1st release functionality ready.
----------------------------
revision 1.1
date: 2001/09/19 03:34:25;  author: tong;  state: Exp;
Initial revision
=============================================================================