The Perl Toolchain Summit needs more sponsors. If your company depends on Perl, please support this very important event.
package SQL::Interpolate::Filter;

use strict;
use warnings;
use Filter::Simple;
use Text::Balanced qw/extract_quotelike
                      extract_bracketed
                      extract_multiple
                      extract_variable
                      extract_codeblock/;

our $VERSION = '0.33';

# Source filter.
# Note: this could be improved as done in the POD of the development 2.0 version of
# Text::Balanced.
FILTER {
    my $id = qr/\b(?!([ysm]|q[rqxw]?|tr)\b)\w+/;

    # This lexes the Perl source code, replacing quotelike sql//
    # operators with the result of _process_sql().
    while ($_ !~ /\G$/gc) {
        my $sql;
        my $last_pos = pos();
        if    (/\G\s+/gc) { }  # whitespace
        elsif (/\G#.*/gc) { }  # comments
        # sql// operators
        # FIX:should any other quote delimiters be added?
        elsif (/\G\bsql\b\s*(?=[\{\(\[\<\/])/gcs &&
            do {
                my $pos = pos();
                s/\G/ q/;  # convert to Perl quote-like
                pos() = $pos;
                $sql = (extract_quotelike())[5];
                #print "<sql:$sql>";
                if (!$sql) {  # restore
                    s/\G q//;
                    pos() = $pos;
                }
                !!$sql;
            }
        )
        {
            my $pos = pos();
            my $out = _process_sql($sql);
            pos() = $pos;
            substr($_, $last_pos, pos() - $last_pos) = $out;
            pos() = $last_pos + length($out);
        }
        # prevent things like $y = ... = from being interpreted as string.
        elsif (/\G(?<=[\$\@])\w+/gc) {
            #print "[DEBUG:var:$&]";
        }
        elsif (/\G$id/gc) {
            #print "[DEBUG:id:$&]";
        }
        elsif (my $next = (extract_quotelike())[0]) {
            #print "[DEBUG:q:$next]";
        }
        else {
            /\G./gc;
        }
    }
    print STDERR "DEBUG:filter[code=$_]" if $SQL::Interpolate::trace_filter;
};

# Convert the string inside a sql// quote-like operator into
# a list of SQL strings and variable references for interpolation.
sub _process_sql {
    local $_ = shift;

    my @parts;
    my $instr = 0;
    while ($_ !~ /\G$/gc) {
        my $tok;
        my $tok_type;
        my $pos_last = pos();
        if (/\G(\s+|\*)/gc) {
            $tok = $1;
            $tok_type = 's';
        }
        elsif ($tok = (extract_variable($_))[0]) {
            $tok_type = 'v';
        }
        elsif ($tok = (extract_codeblock($_, '{['))[0]) {
            $tok_type = 'c';
        }
        else {
            /\G(.)/gc;
            $tok = $1;
            $tok_type = 's';
        }

        if ($tok_type eq 's') {
            if ($instr) {
                $parts[-1] .= $tok
            }
            else {
                push @parts, $tok
            }
            $instr = 1;
        }
        else {
            $parts[-1] = 'qq[' . $parts[-1] . ']' if $instr;
            $instr = 0;
            if ($tok_type eq 'v') {
                push @parts, '\\' . $tok;
            }
            elsif ($tok_type eq 'c') {
                push @parts, $tok;
            }
            else { die 'assert'; }
        }

    }
    $parts[-1] = 'qq[' . $parts[-1] . ']' if $instr;

    my $out = 'SQL::Interpolate::Filter::_make_sql('
            . join(', ', @parts) . ')';

    return $out;
}

# Generated by the sql// operator when source filtering is enabled.
sub _make_sql {
    my (@list) = @_;

    # Note that sql[INSERT INTO mytable $x] gets translated to
    #   q[INSERT INTO mytable], \$x
    # regardless whether $x is a scalar or reference since it
    # would be difficult to know at source filtering time whether
    # $x is already a reference.  Therefore, we dereference any
    # double reference here (at run-time).
    do { $_ = $$_ if ref($_) eq 'REF' }
        for @list;

    my $o = SQL::Interpolate::SQL->new(@list);
    return $o;
}

1;

# Implementation Notes:
# Sub::Quotelike provides similar functionality to this module,
# but it is not exactly what I need.  Sub::Quotelike allows you to
# replace quote expressions with calls to your own custom function
# that can return itself and expression.  In Sub::Quotelike, the
# return expression is evaluated within the context of the called
# subroutine rather that in the scope of the caller as is typically
# the case with variable interpolation in strings.  Therefore, SQL
# variable interpolation will not work correctly.  Furthermore, the
# current version (0.03) performs fairly simple, and potentially
# error-prone, source filtering.

# We also do not utilize "FILTER_ONLY quotelike" in Filter::Simple
# since its parsing is fairly simplistic and recognizes things like $y
# = ... = as containing a quote (y=...=).

1;

__END__

=head1 NAME

SQL::Interpolate::Filter - Source filtering for SQL::Interpolate

=head1 SYNOPSIS

  # This first line enables source filtering.
  use SQL::Interpolate FILTER => 1, qw(:all);
  
  ($sql, @bind) = sql_interp sql[
      SELECT * FROM mytable WHERE color IN @colors
      AND y = $x OR {z => 3, w => 2}
  ];
  
  ($sql, @bind) = sql_interp sql[
      INSERT INTO table {
          color  => $new_color,
          shape  => $new_shape
          width  => $width,
          height => $height,
          length => $length
      }
  ];

  # Each result above is suitable for passing to DBI:
  my $res = $dbh->selectall_arrayref($sql, undef, @bind);

=head1 DESCRIPTION

This module adds source filtering capability to the
L<SQL::Interpolate|SQL::Interpolate> and
L<DBIx::Interpolate|DBIx::Interpolate> modules.  The source filtering
option provides Perl an additional quote-like operator (see L<perlop>)
denoted sql//.  The quote can contain SQL and Perl variables:

  sql/SELECT * FROM mytable WHERE x = $x/;

Source filtering will transform this construct into an sql() object
containing the filtered interpolation list:

  sql("SELECT * FROM mytable WHERE x = ", \$x);

which C<sql_interp> (or C<dbi_interp>) can then interpolate as usual:

  "SELECT * FROM mytable WHERE x = ?", ($x)

=head2 Usage

To enable the quote-like sql// operator, add a "FILTER => 1" to your use
statement:

  use SQL::Interpolate  FILTER => 1, qw(:all);  # or
  use DBIx::Interpolate FILTER => 1, qw(:all);

Just as it is possible to do with q// or qq// operators, you can use
various delimiters on the sql// operator, such as

  sql[SELECT * from mytable WHERE x = $x]
  sql(SELECT * from mytable WHERE x = $x)
  sql<SELECT * from mytable WHERE x = $x>
  sql/SELECT * from mytable WHERE x = $x/

sql() objects (and sql// string-like operators representing them) come
with a string concatenation operator (.), so you can do things like

  sql[
    SELECT partnum, desc, price, stock
    FROM inventory
  ] . $show_all ? sql[] : sql[WHERE price > $price AND stock IN $stocks]

=head2 Security notes

An sql// object concatenated with a string will append the string
verbatim into your result SQL:

  $dbx->do(sql[UPDATE mytable SET y = 0 WHERE x = ] . $name);  # not good

Future versions of SQL::Interpolate may throw an error if one attempts
to do this.  If you want the value to bind, you must interpolate:

  $dbx->do(sql[UPDATE mytable SET y = 0 WHERE x = ] . sql[$name]);  # or
  $dbx->do(sql[UPDATE mytable SET y = 0 WHERE x = $name]);

=head2 Examples

 INPUT:  sql[WHERE one=$x AND $y]
 OUTPUT: sql("WHERE one=", \$x, " AND ", $y)

 INPUT:  sql[INSERT INTO mytable @x]
 OUTPUT: sql("INSERT INTO mytable ", \@x)

 INPUT:  sql[INSERT INTO mytable [1, 2]]
 OUTPUT: sql("INSERT INTO mytable ", [1, 2])

 INPUT   sql[INSERT INTO mytable %x]
 OUTPUT: sql("INSERT INTO mytable ", \%x)

 INPUT:  sql[INSERT INTO mytable {one => 1, two => 2}]
 OUTPUT: sql("INSERT INTO mytable ", {one => 1, two => 2})

=head2 Exports and Use Parameters

=head3 TRACE_FILTER

To enable tracing on the source code filter, do

 use SQL::Interpolate TRACE_FILTER => 1, FILTER => 1;

The source code of the module after source filtering will be sent
to STDERR.

 ...
 SQL::Interpolate::Filter::_make_sql(qq[SELECT * FROM mytable WHERE x = ], \$x)
 ...

=head1 DEPENDENCIES

This module depends on SQL::Interpolate, Filter::Simple (any),
and Text::Balanced >= 1.87.

=head1 DESIGN NOTES

=head2 Limitations / characteristics

Source filtering is somewhat experimental and has the potential to
give unexpected results because lexing Perl is hard.  The module
relies on Text::Balanced for the lexing.  Even though Text::Balanced
makes a valiant effort, the task is difficult and the results not
always precise, especially for very obscure Perl constructs.  It
should work fine though on many things.  If in doubt, check the output
yourself by enabling the TRACE_FILTER option.

=head2 Proposed enhancements

Support Text::Balanced 2.0 and improved Perl lexing.

Should a distinction be made between q// v.s. qq// for sql//?
Which semantics should sql// have?

How should Perl variables containing SQL literals (rather than
than binding variables) be interpolated?  Maybe with a stringified
macro? e.g. sql/...LITERAL($x).../.  Should stringified macros
be allowed in the interpolated SQL literal? (probably no for security).

Variables inside a stringified macro probably prevents
the macro from being un-stringified.  e.g. sql/...MYMACRO($x).../
--> "...MYMACRO(", \$x, ")..."

=head1 LEGAL

Copyright (c) 2004-2005, David Manura.
This module is free software. It may be used, redistributed
and/or modified under the same terms as Perl itself.
See L<http://www.perl.com/perl/misc/Artistic.html>.

=head1 SEE ALSO

Other modules in this distribution:
L<SQL::Interpolate|SQL::Interpolate>,
L<SQL::Interpolate::Macro|SQL::Interpolate::Macro>,
L<DBIx::Interpolate|DBIx::Interpolate>.

Dependent: L<Text::Balanced|Text::Balanced>, L<Filter::Simple>.