The Perl Toolchain Summit needs more sponsors. If your company depends on Perl, please support this very important event.
package Locale::Maketext::Utils::Phrase::Norm::Escapes;

use strict;
use warnings;

sub normalize_maketext_string {
    my ($filter) = @_;

    my $string_sr = $filter->get_string_sr();

    # This will handle previously altered characters like " in  the aggregate results
    if ( ${$string_sr} =~ s/(?:\\\[(.*?)\])/[comment,escaped sequence “~[$1~]”]/g ) {    # TODO: make the \[.*?\] regex/logic smarter since it will not work with ~[ but why would we do that right :) - may need to wait for phrase-as-class obj
        $filter->add_violation('Contains escape sequence');
    }

    if ( ${$string_sr} =~ s/(?:\\([^NUuxX]))/[comment,escaped sequence “$1”]/g ) {
        $filter->add_violation('Contains escape sequence');
    }

    return $filter->return_value;
}

1;

__END__

=encoding utf-8

=head1 Normalization

Detect escaped character sequences.

=head2 Rationale

An escaped character:

=over 4

=item * adds ambiguity

=item * is an indication of in-string-formatting (e.g. \n)

=item * relies on interpolation

The additional layer of complexity could hinder translators and thus makes room for lower quality translations.

Also, among other things, it could make key lookup erroneously fail.

=item * Indicates use of a markup character (e.g. "You are \"awesome\".") which should be done differently (e.g. since that is will break the syntax if used in an HTML tag title attribute).

=back

=head1 possible violations

If you get false positives then that only goes to help highlight how ambiguity adds to the reason to avoid non-bytes strings!

=over 4

=item Contains escape sequence

A sequence of \n will be replaced w/ [comment,escaped sequence “n”], \" [comment,escaped sequence “"”], etc

=back

=head1 possible warnings

None