The Perl Toolchain Summit needs more sponsors. If your company depends on Perl, please support this very important event.

NAME

Win32::CommandLine - Retrieve and reparse the Win32 command line

VERSION

 $Win32::CommandLine::VERSION = '0.960';  # Win32-CommandLine-0.960

SYNOPSIS

 @ARGV = Win32::CommandLine::argv() if eval { require Win32::CommandLine; };

or

 use Win32::CommandLine qw( command_line parse );
 my $commandline = command_line();
 my @args = parse( $commandline );

DESCRIPTION

This module provides a simple way for any perl script to reread and reparse the windows command line, adding improved parsing and more robust quote mechanics, augmented with powerful bash-like shell enhancements (including brace and tilde expansion, quote and subshell command substitution, and file name glob expansion).

Use of the companion script, xx.bat (along with doskey), can, transparently, grant those same features to the command line interface of any windows executable.

Command line expansion

1. Brace expansion

Brace expansion is performed before any other expansions, and any characters special to other expansions are preserved in the result. It is strictly textual.

2. Tilde expansion
 ~           Current user home directory
 ~USERNAME   Home directory of USERNAME
 ~TEXT       Environment variable named ~TEXT (aka $ENV{~TEXT}; overrides ~USERNAME expansion)
3. Quote and subshell command substitution
 '...'    literal content (no escapes and no globbing within quotes)
 "..."    literal content (no escapes and no globbing within quotes) (see *NOTE-1)
 $"..."   literal content (no escapes and no globbing within quotes) [same as "..."; same as C<bash>]

 $'...'   interpreted content, including ANSI C string escapes (see *NOTE-2); no globbing within quotes

 $( ... ) command substitution (see *NOTE-3)
 $("...") command substitution (quotes removed; see *NOTE-4)

NOTE-1: DOS character escape sequences (such as "\"") are parsed prior to being put into the command line and, so, are valid and still (unavoidably) interpreted within double-quotes.

NOTE-2: ANSI C string escapes are \a, \b, \e, \f, \n, \r, \t, \v, \\, \', \", octal (\[0-9]{1,3}), hexadecimal (\x[0-9a-fA-F]{1,2}), and control characters (\c[@A-Z[\\\]^_`?); all other escaped characters are left in place without transformation (\<x> => \<x>).

NOTE-3: Command substitution replaces the $(...) argument with the standard output of that argument's execution. Command substitution strings are not, themselves, automatically expanded; use $(xx *COMMAND*) to trigger further expansion of the subshell command line.

NOTE-4: $("...") is present to enable delayed DOS/Windows interpretation of redirection & continuation characters. This allows redirection & continuation characters to be used within the subshell command string.

4. Glob expansion and meta-characters
 ?           Match any single character
 *           Match any string of characters
 []          Character class (including POSIX and negated classes)
 \           Quote the next metacharacter

The multiple pattern meta-notation 'a{b,c,d}e' is a shorthand for 'abe ace ade'. Left to right order is preserved, with results of matches being sorted separately at a low level to preserve this order.

Examples

...

ref: bash ANSI-C Quoting@

xx.bat Usage

 doskey type=call xx type $*
 type [a-c]*.pl

 doskey perl=call xx perl $*
 perl -e 'print "test"'     &@:: would otherwise FAIL

 doskey cpan=call xx cpan $*
 cpan $(dzil listdeps)      &@:: with a CPAN wrapper program

 @:: print all files in current directory [appropriately quoted for the CMD shell]
 xx -e *

 @:: * assumes `ls` is installed
 @:: print all directories in current directory
 xx -e $(" ls -ALp --quoting-style=c --color=no . | grep --color=no [\\/]$ ")

 @:: print all files (non-directories) in current directory
 xx -e $(" ls -ALp --quoting-style=shell --color=no . | grep --color=no -v [\\/]$ ")

INSTALLATION

To install this module, run the following commands:

 perl Build.PL
 perl Build
 perl Build test
 perl Build install

This is minor modification of the usual perl build idiom. This version is portable across multiple platforms.

Alternatively, the standard make idiom is also available (although it is deprecated):

 perl Makefile.PL
 make
 make test
 make install

On Windows platforms, when using this make idiom, replace "make" with the result of 'perl -MConfig -e "print $Config{make}"' (usually, either dmake, gmake, or nmake).

Note that the Makefile.PL script is just a pass-through, and Module::Build is still ultimately required for installation. Makefile.PL will throw an exception if Module::Build is missing from your current installation. cpan will notify the user of the build prerequisites (and install them for the build, if it is setup to do so [see the cpan configuration option build_requires_install_policy]).

PPM installation bundles should also be available in the standard PPM repositories (eg, ActiveState, etc.).

Note: for ActivePerl installations, 'perl ./Build install' will do a full installation using ppm. During the installation, a PPM package is constructed locally and then subsequently used for the final module install. This allows for uninstalls (by using 'ppm uninstall Win32::CommandLine') and also keeps local HTML documentation current.

INTERFACE

argv( [\%options] ) => @ARGS

  • \%options : (optional) reference to hash containing function options

  • @ARGS : [return] revised argument array (may replace @ARGV)

Reparse & glob-expand the original command line, returning a new, revised argument array (which may be used as a drop-in replacement for @ARGV).

 @ARGV = argv();

command_line( ) => $S

  • $S : [return] the original command line for the process (as a string)

Use the Win32 API to recapture the original command line for the current process.

 my $commandline = command_line();

parse( $s [,\%options ] ) => @ARGS

  • $s : string argument to parse/expand

  • \%options : (optional) reference to hash containing function options

  • @ARGS : [return] parsed/expanded arguments

Parse & glob-expand a string argument; returns the results of parsed/expanded argument as an array.

 my @argv_new = parse( command_line() );

Function options ( \%options )

    my %options = (
        remove_exe_prefix => 1,     # = 0/<true> [default = <true>]
        dosquote => 0,              # = 0/<true>/'all' [default = 0]
        dosify => 0,                # = 0/<true>/'all' [default = 0]
        unixify => 0,               # = 0/<true>/'all' [default = 0]
        nullglob => defined($ENV{nullglob}) ? $ENV{nullglob} : 0,   # = 0/<true> [default = 0]
        glob => 1,                  # = 0/<true> [default = <true>]
        croak_unbalanced => 1,      # = 0/true/'quotes'/'subshells' [default = <true>]
        carp_unbalanced => 1,       # = 0/true/'quotes'/'subshells' [default = <true>]
        ## ToDO: add globstar option
        );
  • remove_exe_prefix

    Allowable values: 0, <true>

    Default value: <true>

    If <true>, remove all initial args up to and including the exe name from the returned @ARGS array.

  • dosquote

    Allowable values: 0, <true>, 'all'

    Default value: 0

    If <true>, convert all non-globbed ARGS to DOS/Win32 CLI compatible tokens (escaping internal quotes and quoting whitespace and special characters)

  • dosify

     dosify => 0,                # = 0/<true>/'all' [default = 0]

    - if true, convert all _globbed_ ARGS to DOS/Win32 CLI compatible tokens (escaping internal quotes and quoting whitespace and special characters); 'all' => do so for for _all_ ARGS which are determined to be files

  • unixify

     unixify => 0,               # = 0/<true>/'all' [default = 0]

    - if true, convert all _globbed_ ARGS to UNIX path style; 'all' => do so for for _all_ ARGS which are determined to be files

  • nullglob

     nullglob => defined($ENV{nullglob}) ? $ENV{nullglob} : 0,       # = 0/<true> [default = 0]

    - if true, patterns which match no files are expanded to a null string (no token), rather than the pattern itself ## $ENV{nullglob} (if it exists) overrides the default

  • glob

     glob => 1,                  # = 0/<true> [default = true]

    - when true, globbing is performed

  • croak_unbalanced

     croak_unbalanced => 1,      # = 0/true/'quotes'/'subshells' [default = true]

    - if true, croak for unbalanced command line quotes or subshell blocks (takes precedence over carp_unbalanced)

  • carp_unbalanced

     carp_unbalanced => 1,       # = 0/true/'quotes'/'subshells' [default = true]

    - if true, carp for unbalanced command line quotes or subshell blocks

    -for ToDO: add globstar option

RATIONALE

This began as a simple need to work-around the less-than-stellar COMMAND.COM/CMD.EXE command line parser, just to accomplish more "correct" quotation interpretation. It then grew into a small odyssey: learning XS and how to create a perl module, learning the perl build process and creating a customized build script/environment, researching tools and developing methods for revision control and versioning, learning and creating perl testing processes, and finally learning about PAUSE and perl publishing practices. And, somewhere in the middle, adding some of the bash shell magic to the CMD shell.

Some initial attempts were made using Win32::API and Inline::C. For example, a Win32::API attempt (which caused GPFs):

  @rem = '--*-Perl-*--
  @echo off
  if "%OS%" == "Windows_NT" goto WinNT
  perl -x -S "%0" %1 %2 %3 %4 %5 %6 %7 %8 %9
  goto endofperl
  :WinNT
  perl -x -S %0 %*
  if NOT "%COMSPEC%" == "%SystemRoot%\system32\cmd.exe" goto endofperl
  if %errorlevel% == 9009 echo You do not have Perl in your PATH.
  if errorlevel 1 goto script_failed_so_exit_with_non_zero_val 2>nul
  goto endofperl
  @rem ';
  #!/usr/bin/perl -w
  #line 15
  #
  use Win32::API;
  #
  Win32::API->Import("kernel32", "LPTSTR GetCommandLine()");
  my $string = pack("Z*", GetCommandLine());
  #
  print "string[".length($string)."] = '$string'\n";
  # ------ padding --------------------------------------------------------------------------------------
  __END__
  :endofperl

Unfortunately, Win32::API and Inline::C were shown to be too fragile at the time (in 2007). Win32::API caused occasional (but reproducible) GPFs, and Inline::C was shown to be very brittle on Win32 systems (i.e., not compensating for paths with embedded strings). (See http://www.perlmonks.org/?node_id=625182 for a more full explanation of the problem and initial attempts at a solution.)

So, an initial XS solution was implemented. And from that point, the lure of bash-like command line parsing led slowly, but inexorably, to the full implementation. The parsing logic is unfortunately still complex, but seems to be holding up well under testing.

IMPLEMENTATION and INTERNALS

This is a list of internal XS functions (brief descriptions will be added at a later date):

  SV * _wrap_GetCommandLine() // [XS] Use C and Win32 API to get the command line
  HANDLE _wrap_CreateToolhelp32Snapshot ( dwFlags, th32ProcessID )
  bool _wrap_Process32First ( hSnapshot, lppe )
  bool _wrap_Process32Next ( hSnapshot, lppe )
  bool _wrap_CloseHandle ( hObject )
  // Pass useful CONSTANTS back to perl
  int _const_MAX_PATH ()
  HANDLE _const_INVALID_HANDLE_VALUE ()
  DWORD _const_TH32CS_SNAPPROCESS ()
  // Pass useful sizes back to Perl (for testing) */
  unsigned int _info_SIZEOF_HANDLE ()
  unsigned int _info_SIZEOF_DWORD ()
  // Pass PROCESSENTRY32 structure info back to Perl
  SV * _info_PROCESSENTRY32 ()

CONFIGURATION and ENVIRONMENT

Win32::CommandLine requires no configuration files or environment variables.

OPTIONAL Environment Variables

NULLGLOB

Override the default glob expansion behavior for empty matches
     $ENV{NULLGLOB} = 1; # undef/''/0 | <true>

Default glob expansion, as in bash, expands glob patterns which match nothing into the glob pattern itself. Use $ENV{NULLGLOB} to override this default behavior.

Analogous to the bash command 'shopt -s nullglob', when $ENV{NULLGLOB} is set to a true (defined, non-NULL, non-zero) value, a glob expansion which matches nothing will expand to the null string (aka, q{}).

Note: the default glob expansion behavior can also be modified programmatically via the function option, nullglob, when passed to the argv() and parse() functions. This option, when passed to argv() or parse(), will override both the default behavior and the $ENV{NULLGLOB} setting.

DEPENDENCIES

Win32::CommandLine requires Carp::Assert for internal error checking and warnings.

The optional modules Win32, Win32::Security::SID, and Win32::TieRegistry are recommended to allow full glob tilde expansions for user home directories (eg, ~administrator expands to C:\Users\Administrator). Expansion of the single tilde (~) has a backup implementation based on %ENV variables, and therefore will still work even without the optional modules.

INCOMPATIBILITIES

None reported.

CAVEATS

Operational Notes

IMPORTANT NOTE: Special shell characters (shell redirection, '|', '<', '>', and continuation, '&') must be DOUBLE-quoted to escape shell interpretation (eg, "foo | bar"). The shell does initial parsing and redirection/continuation (stripping away everything after I/O redirection and continuation characters) before any process can get a look at the command line. So, the special shell characters can only be hidden from shell interpretation by quoting them with double-quote characters.

%<X>% is also replaced by the corresponding environment variable by the shell before handing the command line off to the OS. The caret ^ escape character can be used to break the interpretation when needed (eg, %^COMSPEC^% instead of %COMSPEC%).

Brackets ('{' and '}') and braces ('[' and ']') must be quoted (single or double quotes) to be matched literally. This may be a gotcha for some users, although if the filename has internal spaces, tab expansion of filenames for the standard Win32 shell (cmd.exe) or 4NT/TCC/TCMD will automatically surround the entire path with spaces (which corrects the issue).

Some programs may expect their arguments to maintain their surrounding quotes, but argv() parsing only quotes arguments which require it to maintain equivalence for shell parsing (i.e., those containing spaces, special characters, etc). And, since single quotes have no special meaning to the shell, all arguments which require quoting for correct shell interpretation will be quoted with double-quote characters, even if they were originally quoted with single-quotes. Neither of these issues should be a problem for programs using Win32::CommandLine, but may be an issue for 'legacy' applications which have their command line expanded with xx.bat.

Be careful with backslashed quotes within quoted strings. Note that "foo\" is an unbalanced string containing a double quote. Place the backslash outside of the quotation ("foo"\) or use a double backslash within ("foo\\") to include the backslash it in the parsed token. However, backslashes ONLY need to be doubled when placed prior to a quotation mark ("foo\bar" will work as expected).

SUPPORT

Bugs / Feature Requests

Please report any issues through the issue tracker at https://github.com/rivy/perl.Win32-CommandLine/issues. The developers will be notified, and you'll automatically be notified of progress on your issue.

Documentation

You can find documentation for this module with the perldoc command:

 perldoc Win32::CommandLine

Further information

ACKNOWLEDGEMENTS

Thanks to BrowserUK and syphilis (aka SISYPHUS on CPAN) for some helpful ideas (including an initial XS starting point for the module) during a discussion on PerlMonks.

AUTHOR

Roy Ivy III <rivy@cpan.org>

COPYRIGHT

 Copyright (c) 2007-2018, Roy Ivy III <rivy@cpan.org>. All rights reserved.

LICENSE

This module is free software; you can redistribute it and/or modify it under the Perl Artistic License v2.0.

DISCLAIMER OF WARRANTY

THIS PACKAGE IS PROVIDED BY THE COPYRIGHT HOLDER AND CONTRIBUTORS "AS IS" AND WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES. THE IMPLIED WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, OR NON-INFRINGEMENT ARE DISCLAIMED TO THE EXTENT PERMITTED BY YOUR LOCAL LAW. UNLESS REQUIRED BY LAW, NO COPYRIGHT HOLDER OR CONTRIBUTOR WILL BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, OR CONSEQUENTIAL DAMAGES ARISING IN ANY WAY OUT OF THE USE OF THE PACKAGE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

[REFER TO THE FULL LICENSE FOR EXPLICIT DEFINITIONS OF ALL TERMS.]