
NAME

Regexp::Debugger - Visually debug regexes in-place

VERSION

This document describes Regexp::Debugger version 0.001003

SYNOPSIS

    use Regexp::Debugger;

DESCRIPTION

When you load this module, any regex in the same lexical scope will be visually (and interactively) debugged as it matches.
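As a minimal sketch of that scoping (the variable names and sample data are illustrative assumptions), only the regex inside the bare block below should be debugged; the one after it lies outside the scope in which the module was loaded:

```perl
use strict;
use warnings;

my $record = 'width=42';    # illustrative sample data

{
    # Debugging is active from here to the end of this block only
    use Regexp::Debugger;

    my ($key, $value) = $record =~ m{ (\w+) = (\d+) }x;
}

# Outside the block, this regex is matched normally, without visualization
my $has_digit = $record =~ m{ \d }x;
```

Confining the use statement to a small block like this keeps the debugger away from every other regex in the program.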

INTERFACE

The module itself provides no API. You load it and the debugger is automatically activated in that lexical scope.

The debugger offers the following commands:

?

: Print a help message listing these commands

s

: Step forwards

-

: Step backwards

m

: Continue forward to the next event that matches something

c

: Continue forward until the regex matches or completely backtracks

<RETURN>/<ENTER>

: Repeat the previous command

v

: Switch to visualization mode

h

: Switch to heatmapped visualization log

e

: Switch to the event log

j

: Switch to the underlying JSON data

V
H
E
J

: Take a snapshot of the corresponding display mode.

When prompted for a filename:

<RET>

...prints the snapshot to the terminal

<TAB>

...prints the snapshot to a file named "./rxrx_DISPLAY_MODE_TIMESTAMP"

Anything else

...prints the snapshot to that file

q

: Quit the debugger and finish matching without any further visualization

CONFIGURATION

You can configure the debugger by setting up a .rxrx file in the current directory or in your home directory. This configuration consists of key:value pairs (everything else in the file is silently ignored).

Display mode configuration

If the 'display' key is specified, the debugger starts in that mode. The four available modes are:

    # Show dynamic visualization of matching (the default)...
    display : visual

    # Show dynamic heatmap visualization of matching...
    display : heatmap

    # Show multi-line matching event log...
    display : events

    # Show JSON encoding of matching process...
    display : JSON

Whitespace display configuration

Normally, the debugger compacts whitespace (and comments) in the regex down to a single space character, but you can configure that with the show_ws key:

    # Compact whitespace and comments to a single space (the default)...
    show_ws : compact

    # Compact whitespace, but show comments, newlines (\n), and tabs (\t)...
    show_ws : visible

    # Don't compact whitespace, and show newlines and tabs as \n and \t...
    show_ws : original

Colour configuration

The following keys reconfigure the colours with which the debugger displays text messages:

  • try_col

  • match_col

  • fail_col

  • info_col

The corresponding values are any combination of the following (i.e. the colour specifications supported by the Term::ANSIColor module):

         clear           reset             bold            dark
         faint           underline         underscore      blink
         reverse         concealed

         black           red               green           yellow
         blue            magenta           cyan            white
         bright_black    bright_red        bright_green    bright_yellow
         bright_blue     bright_magenta    bright_cyan     bright_white

         on_black        on_red            on_green        on_yellow
         on_blue         on_magenta        on_cyan         on_white
         on_bright_black on_bright_red     on_bright_green on_bright_yellow
         on_bright_blue  on_bright_magenta on_bright_cyan  on_bright_white

The default colour configuration is:

    try_col    :  bold magenta  on_black
    match_col  :  bold cyan     on_black
    fail_col   :       yellow   on_red
    info_col   :       white    on_black
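Putting the display, whitespace, and colour keys together, a complete .rxrx file might look like the following sketch. The particular values are illustrative choices, not recommendations, and any key left out keeps its default:

```
# Start in the event log rather than the default visualization
display    : events

# Keep comments, newlines, and tabs visible in the regex
show_ws    : visible

# Make successes and failures stand out more strongly
match_col  : bold green on_black
fail_col   : bold white on_red
```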

Output configuration

Normally Regexp::Debugger sends its visualizations to the terminal and expects input from the same device.

However, you can configure the module to output its information (in standard JSON format) to a nominated file instead, using the 'save_to' option:

    save_to : filename_to_save_data_to.json

Data saved in this way may be re-animated using the rxrx utility, or by calling Regexp::Debugger::rxrx() directly (see "COMMAND-LINE DEBUGGING" for details).

Configuration API

You can also configure the debugger on a program-by-program basis, by passing any of the above key/value pairs when the module is loaded.

For example:

    use Regexp::Debugger  fail_col => 'bold red',  show_ws => 'compact';

Note that any configuration specified in the user's .rxrx file is overridden by an explicit specification of this type.

The commonest use of this mechanism is to dump regex debugging information from a non-interactive program:

    use Regexp::Debugger  save_to => 'regex_debugged.json';

Note that, when 'save_to' is specified within a program, the value supplied does not have to be a string specifying the filename. You can also provide an actual filehandle (or equivalent). For example:

    use Regexp::Debugger save_to => IO::Socket::INET->new(
                                        Proto     => "tcp",
                                        PeerAddr  => 'localhost:666',
                                    );

COMMAND-LINE DEBUGGING

The module provides a non-exported subroutine (rxrx()) that implements a useful command-line regex debugging utility.

The utility can be invoked with:

    perl -MRegexp::Debugger -E 'Regexp::Debugger::rxrx(@ARGV)'

which is usually aliased in the shell to rxrx (and will be referred to by that name hereafter).
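If a shell alias is inconvenient, the same call can be wrapped in a small script. The following is a sketch only: saving it as an executable file named rxrx somewhere on your PATH is an assumption made for illustration.

```perl
#!/usr/bin/env perl
# Hypothetical wrapper script: equivalent to the one-liner above
use strict;
use warnings;

# Mirrors the -MRegexp::Debugger switch of the one-liner
use Regexp::Debugger;

# Hand the command-line arguments straight to the utility
Regexp::Debugger::rxrx(@ARGV);
```

The script contains no regex literals of its own, so loading the module here should do nothing beyond making rxrx() available.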

Regex debugging REPL

When called without any arguments, rxrx initiates a simple REPL that allows the user to type in regexes and strings and debug matches between them:

  • Any line starting with a / is treated as a new regex to match with.

  • Any line starting with a ' or " is treated as a new string to match against.

  • Any line beginning with m causes the REPL to match the current regex against the current string, visualizing the match in the usual way.

  • Any line beginning with q causes the REPL to quit.

Debugging regexes from a dumped session

When called with a filename, rxrx first checks whether the file contains a JSON dump of a previous debugging session. If it does, rxrx replays the visualization of that regex match interactively.

This is useful for debugging non-interactive programs where the 'save_to' option was used (see "Output configuration" and "Configuration API").

In this mode, all the features of the interactive debugger (as listed under "INTERFACE") are fully available: you can step forwards and backwards through the match, skip to the successful submatch or a breakpoint, swap visualization modes, and take snapshots.

Wrap-around regex debugging

When called with the name of a file that does not contain a JSON dump, rxrx attempts to execute the file as a Perl program, with Regexp::Debugger enabled at the top level. In other words:

    rxrx prog.pl

is a convenient shorthand for:

    perl -MRegexp::Debugger prog.pl

LIMITATIONS

/x-mode comments

The current implementation cannot always distinguish whether a regex has an external /x modifier (and hence, what whitespace and comment characters mean). Whitespace is handled correctly in either case, but comments are not.

When processing a # comment to end of line within a regex, the module currently assumes a /x is in effect at start of the regex. This will cause erroneous behaviour if an unescaped # is used in a non-/x regex. Note that this limitation is likely to be corrected in a future release.

This limitation does not affect the handling of comments in (?x:...) and (?-x:...) blocks within the regex. These are always correctly handled.
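Until the limitation is corrected, the description above suggests one workaround for non-/x regexes: escape any literal #. The sketch below (sample text and variable names are illustrative assumptions) shows only that the escaped form matches the same text in plain Perl. It has not been checked against the debugger itself.

```perl
use strict;
use warnings;

my $text = 'See issue #42 for details';    # illustrative sample data

# Unescaped '#' in a non-/x regex: valid Perl, but the form that the
# debugger may currently misread as the start of a comment
my ($plain) = $text =~ /#(\d+)/;

# Escaped '#': matches exactly the same literal character
my ($escaped) = $text =~ /\#(\d+)/;

print "plain=$plain escaped=$escaped\n";
```

Because the two forms are interchangeable to Perl, escaping the # costs nothing when the debugger is not loaded.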

Multiple 'save_to' with the same target

At present, making the same file the target of two successive save_to requests causes the second JSON data structure to overwrite the first.

This limitation will be removed in a subsequent release (but this will certainly involve a small change to the structure of the JSON data that is written, even when only one save_to is specified).

Variable interpolations

The module handles the interpolation of strings correctly, expanding them in-place before debugging begins.

However, it currently does not correctly handle the interpolation of qr'd regexes. That is, this:

    use Regexp::Debugger;

    my $ident = qr{ [^\W\d]\w* }x;      # a qr'd regex...

    $str =~ m{ ($ident) : (.*) }xms;    # ...interpolated into another regex

does not work correctly...and usually will not even compile.

It is expected that this limitation will be removed in a future release. However, it may only be possible to fix the problem for more recent versions of Perl (i.e. 5.14 and later), in which the regex engine is re-entrant.
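Since plain string interpolation is handled, one possible workaround is to keep the sub-pattern in an ordinary string rather than a qr'd regex. This is a sketch under that assumption and has not been verified against this version of the module. The sample data and variable names are illustrative, and the use line is left commented out so the snippet runs without the debugger.

```perl
use strict;
use warnings;

# use Regexp::Debugger;    # uncomment to step through the match below

# Assumed workaround: a plain single-quoted string, not a qr// object
my $ident = q{ [^\W\d]\w* };

my $str = 'colour:bold red';    # illustrative sample data

# The string is interpolated as raw pattern text, so it relies on the
# /x flag of the enclosing regex to ignore its embedded spaces
my ($key, $value) = $str =~ m{ ($ident) : (.*) }xms;

print "key=$key value=$value\n";
```

Unlike a qr'd regex, a string carries no flags of its own, so this only works when the sub-pattern and the enclosing regex are meant to share the same modifiers.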

DIAGNOSTICS

Odd number of configuration args after "use Regexp::Debugger"

The module expects configuration arguments (see "Configuration API") to be passed as key => value pairs. You passed something else.

Unknown 'show_ws' option: %s

The only valid options for the 'show_ws' configuration option are 'compact', 'visible', or 'original'. You specified something else (or misspelled one of the above).

Unknown 'display' option: %s

The only valid options for the 'display' configuration option are 'visual' or 'heatmap' or 'events' or 'JSON'. You specified something else (or misspelled one of the above).

Invalid 'save_to' option: %s (%s)

The value associated with the 'save_to' option is expected to be a filehandle opened for writing, or else a string containing the name of a file that can be opened for writing. You either passed an unopened filehandle, an unwritable filename, or something that wasn't a plausible file. Alternatively, if you passed a filepath, was the directory not accessible to, or writeable by, you?

DEPENDENCIES

This module only works with Perl 5.10.1 and later.

The following modules are used when available:

Term::ANSIColor

Text colouring only works if this module can be loaded. Otherwise, all output will be monochromatic.

Win32::Console::ANSI

Under Windows, text colouring also requires that this module can be loaded. Otherwise, all output will be monochromatic.

File::HomeDir

If it can't find a useful value for $ENV{HOME}, Regexp::Debugger attempts to use this module to determine the user's home directory, in order to search for a .rxrx config file.

JSON::XS
JSON
JSON::DWIW
JSON::Syck

JSON output (i.e. for the 'save_to' option) is only possible if one of these modules can be loaded. Otherwise, all JSON output will default to an empty {}.

Term::ReadKey

Single-character interactions only work if this module can be loaded. Otherwise, all command interactions will require a <RETURN> after them.

Time::HiRes

Autogenerated timestamps (e.g. for snapshots) will only be sub-second accurate if this module can be loaded. Otherwise, all timestamps will only be to the nearest second.

INCOMPATIBILITIES

None reported, but this module will almost certainly not play nicely with any other that modifies regexes using overload::constant.

BUGS AND LIMITATIONS

No bugs have been reported.

Please report any bugs or feature requests to bug-regexp-debugger@rt.cpan.org, or through the web interface at http://rt.cpan.org.

AUTHOR

Damian Conway <DCONWAY@CPAN.org>

LICENCE AND COPYRIGHT

Copyright (c) 2011-2012, Damian Conway <DCONWAY@CPAN.org>. All rights reserved.

This module is free software; you can redistribute it and/or modify it under the same terms as Perl itself. See perlartistic.

DISCLAIMER OF WARRANTY

BECAUSE THIS SOFTWARE IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY FOR THE SOFTWARE, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES PROVIDE THE SOFTWARE "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE SOFTWARE IS WITH YOU. SHOULD THE SOFTWARE PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING, REPAIR, OR CORRECTION.

IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR REDISTRIBUTE THE SOFTWARE AS PERMITTED BY THE ABOVE LICENCE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL, OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OR INABILITY TO USE THE SOFTWARE (INCLUDING BUT NOT LIMITED TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD PARTIES OR A FAILURE OF THE SOFTWARE TO OPERATE WITH ANY OTHER SOFTWARE), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.