NAME

Net::IMP::ProtocolPinning - IMP plugin for simple protocol matching

SYNOPSIS

    use Net::IMP::ProtocolPinning;

    my $factory = Net::IMP::ProtocolPinning->new_factory( rules => [
        # HTTP request from client (dir=0)
        [ 0,9,qr{(GET|POST|OPTIONS) \S} ],
    ]);

    my $factory = Net::IMP::ProtocolPinning->new_factory( rules => [
        # SSHv2 prompt from server
        [ 1,6,qr{SSH-2\.} ],
    ]);

    my $factory = Net::IMP::ProtocolPinning->new_factory(
        rules => [
            # SMTP initial handshake
            # greeting line from server
            { dir => 1, rxlen => 512, rx => qr{220 [^\n]*\n} },
            # HELO|EHLO from client
            { dir => 0, rxlen => 512, rx => qr{(HELO|EHLO)[^\n]*\n}i },
            # response to helo|ehlo
            { dir => 1, rxlen => 512, rx => qr{250-?[^\n]*\n} },
        ],
        # some clients send w/o initially waiting for server
        ignore_order => 1,
        max_unbound => [ 1024,0 ],
    );

DESCRIPTION

Net::IMP::ProtocolPinning implements an analyzer for very simple protocol verification using rules with regular expressions. The idea is to only check the first data in the connection for protocol conformance and then let the rest through without further checks.

Calls to new_factory or new_analyzer can contain the following arguments specific to this module:

rules ARRAY

Specifies the rules to use for protocol verification. Rules are given as an array of direction-specific rules, i.e. each rule consists of [dir,rxlen,rx] (or, as in the SYNOPSIS, a hash with the keys dir, rxlen and rx) with

dir

the direction, i.e. 0 for data from the client and 1 for data from the server

rxlen

the maximum number of bytes the regular expression might need for a match. For example, the regex qr/foo(?=bar)/ needs 6 bytes for a successful match, even though it matches only 3 bytes.

rx

the regular expression itself. The regex will be applied against the not-yet-forwarded data with an implicit \A in front, so look-behind will not work.
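
The interplay of rxlen and the implicit \A can be tried out in plain Perl. The following is a minimal sketch that does not use Net::IMP at all; the variable names are made up for illustration, and the \A is written out explicitly to mimic the anchoring described above.

```perl
use strict;
use warnings;

# The rule's regex, with the anchor spelled out for the demonstration.
my $rx = qr/\Afoo(?=bar)/;

# With only 3 buffered bytes the look-ahead cannot succeed yet.
my $short    = 'foo';
my $short_ok = $short =~ $rx ? 1 : 0;

# With 6 buffered bytes the match succeeds, but consumes only 3 of them.
my $full        = 'foobar';
my $full_ok     = $full =~ $rx ? 1 : 0;
my $matched_len = $full_ok ? $+[0] : 0;   # end offset of the last match

print "3 bytes buffered: match=$short_ok\n";
print "6 bytes buffered: match=$full_ok, consumed=$matched_len\n";
```

This is why rxlen for such a rule has to be 6 rather than 3: the look-ahead needs the extra bytes even though they are not part of the match.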

ignore_order BOOLEAN

If true, it will take the first active rule for the direction from which the data arrive. If false, it will issue a DENY if data arrive from one direction while the current rule is for the other direction.

max_unbound [SIZE0,SIZE1]

If there are no more active rules for a direction and ignore_order is true, the application needs to buffer data until all remaining rules for the other direction are matched. This parameter limits, per direction, the amount of buffered data which cannot be bound to a rule.

If not set, the amount of buffered data is unlimited!

Process of Matching Input Against Rules

  • When new data arrive from a direction and ignore_order is false, it will take the first active rule and compare the direction of the data with the direction of the rule. If they don't match, this is considered a protocol violation and a DENY will be issued.

    When new data arrive from a direction and ignore_order is true, it will pick the first active rule for this direction.

  • If no rule is found for the direction, no action will be taken. This causes the data to be buffered in the application, and they will only be released once all rules have been processed.

    To limit the amount of buffered data in this case, max_unbound should be set. Buffering more data than max_unbound for this direction will cause a DENY.

  • If a rule was found, it will add the new data to the local buffer for the direction and then try to match the first rxlen bytes of the buffer against the rule. The regex of the rule is implicitly anchored at the beginning of the buffer.

  • If the rule matched and cannot match more (rxlen reached), it will

    • remove the rule from the list of active rules

    • remove the matched data from the local buffer

    • issue a PASS for the matched data

    • continue with the next active rule (if any)

    If the rule might still match more data, it will issue a PASS for the matched data, but defer the other steps until the rule is definitely done.

  • If the rule did not match, but the length of the local buffer is greater than or equal to rxlen, it will consider the rule failed and issue a DENY.

    If the rule did not match, but the buffer is smaller than rxlen, it will wait for more data and then try the match again.

  • If all rules matched (i.e. there are no more active rules), it will issue a PASS into the future until the end of the connection, causing all data to get forwarded without further analysis.
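
The steps above can be approximated in a few lines of plain Perl. This is only a simplified sketch and not the module's implementation: make_matcher is a hypothetical helper, actions are returned as plain strings instead of Net::IMP results, a rule is treated as done as soon as it matches (the "might still match more data" case is not modeled), leftover buffered data are simply tried against the next rule, and max_unbound is ignored.

```perl
use strict;
use warnings;

# Hypothetical helper for illustration; not part of Net::IMP.
# Returns a closure: ($dir, $data) -> list of action strings.
sub make_matcher {
    my (%args) = @_;
    my @rules        = @{ $args{rules} };    # each rule: [dir, rxlen, rx]
    my $ignore_order = $args{ignore_order};
    my @buf          = ('', '');             # local buffer per direction

    # Index of the rule to try for $dir: undef if there is none,
    # -1 if strict ordering is violated.
    my $pick = sub {
        my ($dir) = @_;
        if (!$ignore_order) {
            return $rules[0][0] == $dir ? 0 : -1;
        }
        my ($idx) = grep { $rules[$_][0] == $dir } 0 .. $#rules;
        return $idx;
    };

    return sub {
        my ($dir, $data) = @_;
        my @actions;
        while (@rules) {
            my $idx = $pick->($dir);
            return (@actions, 'WAIT') if !defined $idx;   # no rule for $dir
            return (@actions, 'DENY') if $idx < 0;        # wrong direction

            $buf[$dir] .= $data;
            $data = '';
            my (undef, $rxlen, $rx) = @{ $rules[$idx] };
            my $window = substr($buf[$dir], 0, $rxlen);

            if ($window =~ /\A$rx/) {
                my $len = $+[0];                  # bytes consumed by the rule
                substr($buf[$dir], 0, $len, '');  # drop matched data
                splice(@rules, $idx, 1);          # rule is done
                push @actions, "PASS $len";
                last if !length $buf[$dir];       # nothing left to match
            }
            elsif (length($buf[$dir]) >= $rxlen) {
                return (@actions, 'DENY');        # rule can no longer match
            }
            else {
                return (@actions, 'WAIT');        # need more data
            }
        }
        push @actions, 'PASS_FUTURE' if !@rules;  # all rules matched
        return @actions;
    };
}

# The SMTP rules from the SYNOPSIS, with a client that talks first.
my $smtp = make_matcher(
    ignore_order => 1,
    rules        => [
        [ 1, 512, qr{220 [^\n]*\n} ],
        [ 0, 512, qr{(HELO|EHLO)[^\n]*\n}i ],
        [ 1, 512, qr{250-?[^\n]*\n} ],
    ],
);
my @a1 = $smtp->(0, "EHLO client.example\n");
my @a2 = $smtp->(1, "220 mail.example ESMTP\n");
my @a3 = $smtp->(1, "250 mail.example\n");
print "@a1 | @a2 | @a3\n";
```

With ignore_order left false, the same first call would return DENY in this sketch, because the first active rule is for the server direction.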

Rules for Writing the Regular Expressions

Because the match will be tried whenever new data come in (i.e. the buffer might have a size of less than, equal to or greater than rxlen), care should be taken when constructing the regular expression and determining rxlen. The regular expression should not match data longer than rxlen, e.g. instead of specifying \d+ one should specify a bounded size with \d{1,10}.
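
To see why the bound matters, the following sketch compares both patterns against a full buffer and against its first rxlen bytes. It is plain Perl with made-up names (match_len is a hypothetical helper) and only illustrates the regex behavior, not the module itself.

```perl
use strict;
use warnings;

my $rxlen  = 10;
my $buffer = '1234567890123';              # 13 digits have arrived so far
my $window = substr($buffer, 0, $rxlen);   # only these bytes are matched

# Length of the anchored match, or 0 if there is no match.
sub match_len {
    my ($data, $rx) = @_;
    return $data =~ /\A$rx/ ? $+[0] : 0;
}

my %len = (
    unbounded_full   => match_len($buffer, qr/\d+/),
    unbounded_window => match_len($window, qr/\d+/),
    bounded_full     => match_len($buffer, qr/\d{1,10}/),
    bounded_window   => match_len($window, qr/\d{1,10}/),
);

printf "%-17s %d\n", $_, $len{$_} for sort keys %len;
```

The unbounded pattern gives a different result depending on how much data is inspected, while the bounded pattern consumes the same number of bytes in both cases and therefore agrees with rxlen.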

Care should also be taken if you have consecutive rules for the same direction (i.e. either the next rule is for the same direction or ignore_order is true). Here you need to make sure that the first rule will not match data needed by the next rule, e.g. \w{1,2} followed by \d will not work, while [a-z]{1,2} followed by \d will be fine.
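
The problem with overlapping consecutive rules can be reproduced with a short sketch. The helper consume is hypothetical and merely applies a list of anchored regexes one after another to a buffer, which is a simplification of what the analyzer does.

```perl
use strict;
use warnings;

# Apply each regex in turn, anchored at the start of what is left.
# Returns 1 if all of them matched, 0 otherwise.
sub consume {
    my ($buf, @rx) = @_;
    for my $rx (@rx) {
        return 0 unless $buf =~ s/\A$rx//;
    }
    return 1;
}

# \w also matches digits, so the first rule swallows the "1"
# and nothing is left for the second rule.
my $greedy = consume('a1', qr/\w{1,2}/, qr/\d/);

# [a-z] stops before the digit and leaves it for the second rule.
my $disjoint = consume('a1', qr/[a-z]{1,2}/, qr/\d/);

print "\\w{1,2} then \\d:    $greedy\n";
print "[a-z]{1,2} then \\d: $disjoint\n";
```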

Please also note that the regular expression in the rule will be implicitly anchored at the beginning of the buffered data, e.g. \d will only match if the first character is a digit, not if any character but the first in the buffer is a digit. If you want the latter behavior, you have to explicitly allow other characters and limit their number, e.g. "(?s).{0,10}\d".
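
A small sketch of the anchoring behavior, again in plain Perl with the \A written out explicitly and with hypothetical names. Note that a rule using the second pattern would need an rxlen of 11 (up to 10 arbitrary bytes plus the digit).

```perl
use strict;
use warnings;

# Returns 1 if $rx matches at the very beginning of $buf, else 0.
sub anchored_match {
    my ($buf, $rx) = @_;
    return $buf =~ /\A$rx/ ? 1 : 0;
}

my $digit_only  = qr/\d/;
my $with_prefix = qr/(?s).{0,10}\d/;

my $r1 = anchored_match('7abc', $digit_only);           # digit comes first
my $r2 = anchored_match('abc7', $digit_only);           # digit comes later
my $r3 = anchored_match('abc7', $with_prefix);          # prefix is allowed
my $r4 = anchored_match('abcdefghijk7', $with_prefix);  # 11 bytes of prefix

print "r1=$r1 r2=$r2 r3=$r3 r4=$r4\n";
```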

AUTHOR

Steffen Ullrich <sullr@cpan.org>

COPYRIGHT

Copyright by Steffen Ullrich.

This module is free software; you can redistribute it and/or modify it under the same terms as Perl itself.