The Perl Toolchain Summit needs more sponsors. If your company depends on Perl, please support this very important event.
Tom Christiansen <tchrist@jhereg.perl.com> writes:
>  *  This is actually one of the harder ones.

Actually, it wasn't all that hard; there was just a lot of hammering
of square pegs into round holes.  (The differences between +pos1 -pos2
and -k pos1,pos2 are a pain in the rear, believe me.)

This version is still a little rough, but the main bits should be there.

The generic things that still need to be done include
 * a lot of testing and debugging!
 * documentation [I'm working on this]
 * support for more options
 * testing
 * optimizing the code
 * did I mention testing?

I suspect that there are plenty of bugs still lurking around, so if
you can test whether this behaves reasonably in places where you
normally use `sort', it would be most helpful.

Specific ideas for features involve
 * adding the external merge sort (Mark-Jason?)
 * a way to specify regular expressions for field delimiter
   [do we want a new option, or use -t (which raises quoting issues)?]
 * a way to specify negative character offsets (counting from the end
   of the string)
 * finding a way to make -i work without POSIX::isprint

Options that should work:
   -D       debugging (currently just shows split_sub and sort_sub)   

   +pos1 -pos2
            select key fields
   -k pos1,pos2
            select key fields (with different semantics of pos{1,2})

   -t CHAR  use CHAR as field delimiter
   -b       ignore leading whitespace
   -d       dictionaty order (compare letters/digits/blanks only)
   -f       fold upper case to lower
   -i       ignore non-printable characters (currently requires POSIX)
   -n       sort numerically
   -r       reverse sort
   -u       remove multiples of lines which compare as equal

Options that are (still) not supported:
   -c       check that the input-file is sorted
   -m       merge only, the input-files are already sorted
   -H       use a merge sort instead of a radix sort
   -M       compare as month names (hard to do wrt locales)
   -o FILE  write output to FILE instead of STDOUT
   -y KMEM  amount of main memory to use by default
   -R CHAR  set the record separator character (just need to set $/)

If you want more details on any of these, you should check out

> Version 7: http://language.perl.com/v7/sort.1
> BSD: http://www.openbsd.org/cgi-bin/man.cgi?query=sort&sektion=1&apropos=0&manpath=OpenBSD+Current
> POSIX: http://www.opengroup.org/onlinepubs/7908799/xcu/sort.html

--Albert Dvornik
  <bert@genscan.com>