Module Version: 0.59 (latest release: Net-IMP-0.629)

NAME

Net::IMP - Inspection and Modification Protocol

SYNOPSIS

    ######################################################################
    # implementation of plugin 
    ######################################################################

    package myIMP_Plugin;
    use base 'Net::IMP::Base';
    use Net::IMP;

    # plugin global methods
    # -------------------------------------------------------------------

    sub cfg2str { ... }       # create $string from %config
    sub str2cfg { ... }       # create %config from $string
    sub validate_cfg { ... }  # validate %config

    sub new_factory {         # creates factory object
        my ($class,%factory_args) = @_;
        ...
        return $factory;
    }

    # factory specific methods and calls
    # -------------------------------------------------------------------

    # used in default implementation of method interface
    sub INTERFACE {
        [ undef, [ IMP_PREPASS, IMP_ACCTFIELD ]]
    };

    sub new_analyzer {        # creates analyzer from factory
        my ($factory,%analyzer_args) = @_;
        my $analyzer = $factory->SUPER::new_analyzer( %analyzer_args );
        # maybe prepass everything forever in both directions
        $analyzer->add_results(
            [ IMP_PREPASS, 0, IMP_MAXOFFSET ],  # for dir client->server
            [ IMP_PREPASS, 1, IMP_MAXOFFSET ],  # for dir server->client
        );
        return $analyzer;
    }

    # analyzer specific methods
    # -------------------------------------------------------------------

    # new data for analysis, $offset should only be set if there are gaps
    # (e.g. when we PASSed data with offset in the future)
    sub data {
        my ($analyzer,$dir,$data,$offset,$datatype) = @_;
        ... 
    }

    ######################################################################
    # use of plugin 
    ######################################################################
    package main;

    # check configuration, maybe use str2cfg to get config from string before
    if (my @err = myIMP_Plugin->validate_cfg(%config)) {
        die "@err"
    }

    # create single factory object for each configuration 
    my $factory = myIMP_Plugin->new_factory(%config);

    # enforce the interface the caller will use, e.g. the input protocol/types
    # and the supported output return types
    $factory = $factory->set_interface([ 
        IMP_DATA_STREAM, 
        [ IMP_PASS, IMP_PREPASS, IMP_LOG ]
    ]) or die;

    # create analyzer object from factory for each new analysis (e.g. for
    # each connection)
    my $analyzer = $factory->new_analyzer(...);

    # set callback, which gets called on each result
    $analyzer->set_callback(\&imp_cb,$cb_arg);

    # feed analyzer with data
    $analyzer->data(0,'data from dir 0',0,IMP_DATA_STREAM);
    # ... will call imp_cb as soon as results are there ...
    $analyzer->data(0,'',0,IMP_DATA_STREAM); # eof from dir 0

    # callback for results
    sub imp_cb {
        my $cb_arg = shift;
        for my $rv (@_) {
            my $rtype = shift(@$rv);
            if ( $rtype == IMP_PASS ) { ... }
            ...
        }
    }

    ######################################################################
    # definition of new data types suites
    ######################################################################
    package Net::IMP::HTTP;
    use Net::IMP 'IMP_DATA';
    use Exporter 'import';
    our @EXPORT = IMP_DATA('http',
        'header' => +1,   # packet type
        'body'   => -2,   # streaming type
        ...
    );

DESCRIPTION

IMP is a protocol for inspection, modification and rejection of data between two sides (client and server) using an analyzer implementing this interface.

Basics

IMP is an asynchronous protocol, usually used together with callbacks.

Usage of Terms

Factory

The factory object is used to create analyzers with common properties.

Analyzer

The analyzer is the object which does the analysis of the data within a specific context. It will be created by the factory for a new context.

Context

The context is the environment where the analyzer executes. E.g. when analyzing TCP connections, a new context is created for each TCP connection.

Interface

The interface consists of the data protocols/types (e.g. stream, packet, http...) supported by the analyzer and the return types (IMP_PASS, IMP_PREPASS, IMP_LOG, ...).

Result Types

The results returned inside the callback or via poll_results can be of the following kind:

[ IMP_PASS, $dir, $offset ]

Accept all data up to $offset in the data stream for direction $dir.

If $offset specifies data which were not yet seen by the analyzer, these data don't need to be forwarded to the analyzer. If they are forwarded anyway (because they were already on the way and could not be stopped), the analyzer just throws them away until $offset is reached. This feature is useful for ignoring whole subcontexts (like MIME content based on a Content-length header).

A special case is an $offset of IMP_MAXOFFSET. In this case the analyzer is not interested in further information about the connection.
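The following is a minimal caller-side sketch of how an IMP_PASS offset might be applied, including the case where the offset lies beyond the data seen so far. The state hash and the functions apply_pass and on_read are hypothetical names for illustration and are not part of Net::IMP.

```perl
use strict;
use warnings;

# Hypothetical per-direction caller state (not part of Net::IMP):
#   buf       - data read from the sender that is not yet forwarded
#   buf_start - stream offset of the first octet in buf
#   pass_to   - the analyzer accepted everything before this offset

# Record an IMP_PASS offset and return the octets that may be forwarded now.
sub apply_pass {
    my ($st, $offset) = @_;
    $st->{pass_to} = $offset if $offset > $st->{pass_to};
    my $n = $st->{pass_to} - $st->{buf_start};
    $n = length($st->{buf}) if $n > length($st->{buf});
    return '' if $n <= 0;
    my $out = substr($st->{buf}, 0, $n, '');   # 4-arg substr removes from buf
    $st->{buf_start} += $n;
    return $out;
}

# New data arrived from the sender. Returns the part that can be forwarded
# immediately and the part that still has to be given to the analyzer.
sub on_read {
    my ($st, $data) = @_;
    my $end_before = $st->{buf_start} + length($st->{buf});
    $st->{buf} .= $data;
    my $forward = apply_pass($st, $st->{pass_to});
    my $skip = $st->{pass_to} - $end_before;   # octets already covered by PASS
    $skip = 0 if $skip < 0;
    my $for_analyzer = $skip >= length($data) ? '' : substr($data, $skip);
    return ($forward, $for_analyzer);
}
```

When on_read skips octets because of an earlier pass, the next call to data has to carry an explicit offset so that the analyzer can resynchronize.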

[ IMP_PASS_PATTERN, $dir, $regex, $len ]

This is the same as IMP_PASS, except that a pattern is given instead of an offset. All data up to but not including the pattern don't need to be forwarded to the analyzer. Because $regex might be complex, the analyzer has to specify how many octets the $regex might match at most, so that the caller can adjust its buffer.

Because data might already be on the way to the analyzer, the analyzer needs to check all incoming data without an explicit offset for the pattern. If it gets data with an explicit offset, this means that the pattern was matched by the caller at the specified position. In this case the analyzer should discard all data it got before (even if they already included the offset) and resynchronize at the specified offset.

For better performance the analyzer should check whether the data it already has in its buffer contain the pattern. In this case the issue can be dealt with internally and there is no need to send this reply to the caller.

If the caller receives this reply, it should check whether the data it still has in its buffer (i.e. data which were not passed yet) contain the pattern. If the caller finds the pattern, it should call data with an explicit offset, so that the analyzer can resynchronize its position in the data stream.
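The caller-side search described above could look roughly like the following sketch. Both function names are hypothetical. The rule that the last $len - 1 octets must be kept back (because a match might only be completed by later data) is an assumption derived from the description of $len, not something taken from the Net::IMP sources.

```perl
use strict;
use warnings;

# Search the not yet forwarded buffer for $regex. $buf_start is the stream
# offset of the first octet in $buf. Returns the stream offset at which the
# pattern starts, or nothing if the pattern was not found.
sub find_pattern_offset {
    my ($buf, $buf_start, $regex) = @_;
    return unless $buf =~ $regex;
    return $buf_start + $-[0];      # $-[0] is the start of the last match
}

# Assumption: if no match was found, everything except the last $len - 1
# octets can be passed, since a match of at most $len octets could still
# start inside that tail.
sub safe_to_pass {
    my ($buf, $len) = @_;
    my $keep = $len > 0 ? $len - 1 : 0;
    my $n = length($buf) - $keep;
    return $n > 0 ? $n : 0;
}
```

The offset returned by find_pattern_offset is the value the caller would hand to data so that the analyzer can resynchronize.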

[ IMP_PREPASS, $dir, $offset ]

This is similar to IMP_PASS. If $offset specifies data which were already forwarded to the analyzer, they get accepted. If it specifies data which were not yet forwarded, they also get accepted up to $offset, but contrary to IMP_PASS they are still forwarded to the analyzer.

Thus data can be forwarded before they get inspected, but they get inspected nevertheless. This might be known good data, where inspection is still needed to maintain the state or to log the data.

Alternatively, it might be potentially bad data where low latency is required and small amounts of bad data are acceptable. In this case the window for bad data can be kept small through continuous updates of $offset, which limits the impact of malicious data while keeping latency low.

[ IMP_DENY, $dir, $reason ]

Deny any more data on this context. If $reason is given, it should be used to construct a message to the client.

A deny is carried out by closing the context in a way visible to the client (e.g. closing the connection with RST).

[ IMP_DROP ]

Deny any more data on this context and close the context. The preferred way of closing the context is one that is not visible to the client (e.g. just dropping any further packets of a UDP connection).

[ IMP_REPLACE, $dir, $offset, $data ]

Ignore the original data up to $offset and send $data instead. $offset needs to be in the range of the data the analyzer got through the data method.
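As an illustration, a caller might apply such a replacement to its buffer of unforwarded data as sketched below. The state hash and the function apply_replace are hypothetical names and not part of Net::IMP.

```perl
use strict;
use warnings;

# Hypothetical caller state: buf holds unforwarded data, buf_start is the
# stream offset of its first octet.
# Drops the original data up to $offset and returns the data to send instead.
sub apply_replace {
    my ($st, $offset, $newdata) = @_;
    my $n = $offset - $st->{buf_start};
    die "offset outside of buffered data"
        if $n < 0 || $n > length($st->{buf});
    substr($st->{buf}, 0, $n, '');   # discard the original octets
    $st->{buf_start} = $offset;
    return $newdata;
}
```

Note that stream offsets keep referring to the original data, so the length of the replacement does not change buf_start.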

[ IMP_TOSENDER, $dir, $data ]

Send data back to the sender. This might be used to reject data, e.g. replace them with nothing and send an error message back to the sender. This can be useful to reject single commands in SMTP, FTP...

[ IMP_LOG, $dir, $offset, $len, $level, $msg ]

This contains a log message $msg which is about data in direction $dir, starting at $offset and $len octets long. $level specifies a log level like debug, info, warn, ...

The caller should just log the information in this case.

$level is one of IMP_LOG_*, which are similar to syslog levels, e.g. IMP_LOG_DEBUG, IMP_LOG_INFO, ... These level constants can be imported with use Net::IMP ':log'.

[ IMP_PORT_OPEN|IMP_PORT_CLOSE, $dir, $offset, ... ]

Some protocols like FTP, SIP, H.323 dynamically allocate ports. These results detect when port allocation/destruction is done and should provide enough information for the caller to open/close the ports and track the data through additional analyzers.

TODO: details will be specified when this feature is needed.

[ IMP_ACCTFIELD, $key, $value ]

This specifies a tuple which should be used for accounting (like name of logfile, URL, ...).

API Definition

The following API needs to be implemented by all IMP plugins. $class, $factory and $analyzer in the following might be (objects of) different classes, but don't need to be. The Net::IMP::Base implementation uses the same class for plugin, factory and analyzer.

$class->str2cfg($string) => %config

This creates a config hash from a given string. No verification of the config is done.

$class->cfg2str(%config) => $string

This creates a string from a config hash. No verification of the config is done.

$class->validate_cfg(%config) => @error

This validates the config and returns a list of errors. The config is valid if no errors are returned.

$class->new_factory(%args) => $factory

This creates a new factory object which is later used to create the analyzer. %args are used to describe the properties common for all analyzers created by the same factory.

$factory->get_interface(@caller_if) => @plugin_if

This gets the interfaces supported by the factory. Each interface consists of [ $input_type, \@output_types ], where

$input_type

is either a single input data type (like IMP_DATA_STREAM, IMP_DATA_PACKET) or a protocol type (like IMP_DATA_HTTP) which includes multiple data types.

@output_types

is a list of the return types which are used by the interface, e.g. IMP_PASS, IMP_LOG, ... If \@output_types is not given or is an empty list, it will be assumed that the caller supports any return types.

If called without arguments, the method will return all the interfaces supported by the factory. Only in this case an interface description with no data type/protocol might be returned, which means that all data types/protocols are supported.

If called with a list of interfaces the caller supports, it will return the subset of these interfaces which are also supported by the plugin.
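The matching between caller and plugin interfaces might work roughly as in the following sketch. It assumes that an interface matches if the input types agree (or the plugin leaves the input type undefined) and if every return type of the plugin is accepted by the caller. The function match_interfaces is a hypothetical name, and plain strings stand in for the real IMP_* constants.

```perl
use strict;
use warnings;

# Return those caller interfaces which are also supported by the plugin.
# Each interface is [ $input_type, \@output_types ].
sub match_interfaces {
    my ($plugin_ifs, $caller_ifs) = @_;
    my @match;
    for my $c (@$caller_ifs) {
        my ($c_in, $c_out) = @$c;
        # an empty list means that the caller accepts any return type
        my %ok = map { $_ => 1 } @{ $c_out || [] };
        for my $p (@$plugin_ifs) {
            my ($p_in, $p_out) = @$p;
            next if defined $p_in && $p_in ne $c_in;   # input type differs
            next if %ok && grep { !$ok{$_} } @{ $p_out || [] };
            push @match, $c;
            last;
        }
    }
    return @match;
}

my @plugin = ( [ undef, [ 'PREPASS', 'LOG' ] ] );
my @caller = (
    [ 'STREAM', [ 'PASS', 'PREPASS', 'LOG' ] ],   # accepts all plugin results
    [ 'PACKET', [ 'PASS' ] ],                     # lacks PREPASS and LOG
);
my @common = match_interfaces(\@plugin, \@caller);
```

Under these assumptions only the STREAM interface ends up in @common.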

$factory->set_interface($want_if) => $new_factory

This will return a factory object supporting the given interface. This factory might be the same as the original factory, but might also be a different factory which translates data types.

If the interface is not supported it will return undef.

$factory->new_analyzer(%args) => $analyzer|undef

Creates a new analyzer object. The details for %args depend on the analyzed protocol and the requirements of the analyzer, but usually these are things like source and destination IP and port, URL, MIME type etc.

With a key of cb the callback can already be set here as [$code,@args] instead of later with set_callback.

The factory might decide based on the given context information, that no analysis is needed. In this case it will return undef, otherwise the new analyzer object.

$analyzer->set_callback($code,@args)

Sets or changes the callback of the analyzer object. If results are outstanding, they might be delivered to this callback before the method returns.

$code is a coderef while @args are additional user specified arguments which should be used in the callback (typically object reference or similar). The callback is called with $code->(@args,@results) whenever new results are available.

If $code is undef, an existing callback will be removed.

If no callback is given, the results need to be polled with poll_results.

$analyzer->data($dir,$data,$offset,$type)

Forwards new data to the analyzer. $dir is the direction, e.g. 0 from client and 1 from server. $data are the data. $data of '' means end of data.

$offset is the current position (octet) in the data stream. It must be set to a value greater than 0 after data got omitted as a result of PASS or PASS_PATTERN, so that the analyzer can resynchronize the internal position value with the original position in the data stream. In any other case it should be set to 0.

$type is the type of the data. There are two global data type definitions:

IMP_DATA_STREAM (-1)

This is for generic streaming data, e.g. chunks of this data type can be concatenated and analyzed together, parts can be replaced etc.

IMP_DATA_PACKET (+1)

This is for generic packetized data, where each chunk (e.g. call to data) contains a single packet, which should be analyzed as a separate entity. This means no concatenating with previous or future chunks and no replacing of only parts of the packet.

Also, any offsets given in calls to data or in the results should be at a packet boundary (or IMP_MAXOFFSET), at least for data modifications. A (pre)pass which is not at a packet boundary will be ignored in the hope that more (pre)pass results will follow. A (pre)pass for some parts of a packet followed by a replacement is not allowed and will probably cause an exception.

All other data types are considered subtypes of either IMP_DATA_PACKET (value > 0) or IMP_DATA_STREAM (value < 0) and share their restrictions. Also, only streaming data of the same type can be concatenated and analyzed together.
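The sign convention makes it cheap to decide how a chunk has to be treated. A small sketch, using the plain numeric values given above instead of the real constants (all function names are hypothetical):

```perl
use strict;
use warnings;

# value > 0: packet type, each chunk is a separate entity
sub is_packet_type { return $_[0] > 0 }

# value < 0: streaming type
sub is_stream_type { return $_[0] < 0 }

# Chunks may only be concatenated if both are streaming data of the same type.
sub can_concat {
    my ($type1, $type2) = @_;
    return is_stream_type($type1) && $type1 == $type2;
}
```

For example, can_concat(-1, -1) is true for two IMP_DATA_STREAM chunks, while two chunks of a packet type or of different streaming types must be kept apart.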

Results will be delivered through the callback or via poll_results.

$analyzer->poll_results => @results

Returns outstanding results. If a callback is attached, no results will be delivered this way.
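The interplay between set_callback, queued results and poll_results can be illustrated with a toy analyzer. Toy::Analyzer is a hypothetical stand-in written only for this sketch. It is not Net::IMP::Base and makes no claim about how the real implementation stores results.

```perl
use strict;
use warnings;

package Toy::Analyzer;   # hypothetical, for illustration only

sub new {
    my $class = shift;
    return bless { cb => undef, results => [] }, $class;
}

# Set, change or (with undef) remove the callback. Outstanding results
# are delivered to the new callback before the method returns.
sub set_callback {
    my ($self, $code, @args) = @_;
    $self->{cb} = $code ? [ $code, @args ] : undef;
    $self->_flush;
    return;
}

# Queue results and deliver them if a callback is attached.
sub add_results {
    my ($self, @results) = @_;
    push @{ $self->{results} }, @results;
    $self->_flush;
    return;
}

# Return and remove all outstanding results.
sub poll_results {
    my $self = shift;
    return splice @{ $self->{results} };
}

sub _flush {
    my $self = shift;
    my $cb = $self->{cb} or return;
    my @results = splice @{ $self->{results} } or return;
    my ($code, @args) = @$cb;
    $code->(@args, @results);    # user arguments first, then the results
    return;
}

package main;

my @seen;
my $analyzer = Toy::Analyzer->new;
$analyzer->add_results([ 'pass', 0, 10 ]);   # queued, no callback yet
$analyzer->set_callback(
    sub { my $tag = shift; push @seen, [ $tag, @_ ] },
    'conn1',
);
my @left = $analyzer->poll_results;          # empty, callback took the result
```

Here the queued result is handed to the callback as soon as it is attached, so nothing is left for poll_results.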

Net::IMP->set_debug

This is just a convenient way to call Net::IMP::Debug->set_debug. See Net::IMP::Debug for more information.

TODO

Helper Functions

The function IMP_DATA is provided to simplify definition of new data types, for example:

    our @EXPORT = IMP_DATA('http',
        'header'  => +1,   # packet type
        'body'    => -2,   # streaming type
        ...
    );
    push @EXPORT, IMP_DATA('httprq[http+10]',
        'header'  => +1,   # packet type
        'content' => -2,   # streaming type
        ...
    );

These calls of IMP_DATA are equivalent to the following Perl declarations:

    use Scalar::Util 'dualvar';
    our @EXPORT = (
        'IMP_DATA_HTTP', 'IMP_DATA_HTTP_HEADER','IMP_DATA_HTTP_BODY',...
        'IMP_DATA_HTTPRQ', 'IMP_DATA_HTTPRQ_HEADER','IMP_DATA_HTTPRQ_CONTENT',...
    );

    # getservbyname('http','tcp') -> 80
    use constant IMP_DATA_HTTP           
        => dualvar(80 << 16,'imp.data.http');
    use constant IMP_DATA_HTTP_HEADER    
        => dualvar((80 << 16) + 1,'imp.data.http.header');
    use constant IMP_DATA_HTTP_BODY      
        => dualvar( -( (80 << 16) + 2 ), 'imp.data.http.body');
    ...
    use constant IMP_DATA_HTTPRQ         
        => dualvar((80 << 16) + 10,'imp.data.httprq');
    use constant IMP_DATA_HTTPRQ_HEADER  
        => dualvar((80 << 16) + 10 + 1,'imp.data.httprq.header');
    use constant IMP_DATA_HTTPRQ_CONTENT 
        => dualvar( -( (80 << 16) + 10 + 2 ),'imp.data.httprq.content');
    ...
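To see what such a dualvar constant looks like from the outside, the following sketch builds the three HTTP values from the declarations above and prints both their string and their numeric side. The helper make_type is hypothetical and is not how IMP_DATA is implemented.

```perl
use strict;
use warnings;
use Scalar::Util 'dualvar';

# Hypothetical helper: $sub > 0 is a packet subtype, $sub < 0 a streaming
# subtype, 0 the protocol type itself. Streaming types get a negative value.
sub make_type {
    my ($base, $name, $sub) = @_;
    my $num = $sub < 0 ? -( $base - $sub ) : $base + $sub;
    return dualvar($num, $name);
}

my $base   = 80 << 16;   # 80 is the port number of http
my $http   = make_type($base, 'imp.data.http',         0);
my $header = make_type($base, 'imp.data.http.header', +1);
my $body   = make_type($base, 'imp.data.http.body',   -2);

# %s uses the string side of the dualvar, %d the numeric side
printf "%s = %d\n", $_, $_ for $http, $header, $body;
# imp.data.http = 5242880
# imp.data.http.header = 5242881
# imp.data.http.body = -5242882
```

The numeric side carries the packet/stream distinction through its sign, while the string side gives readable names in debug output.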

AUTHOR

Steffen Ullrich <sullr@cpan.org>

Thanks to everybody who helped with time, ideas, reviews or bug reports, notably Alexander Bluhm and others at genua.de.

COPYRIGHT

Copyright 2012,2013 Steffen Ullrich.

This module is free software; you can redistribute it and/or modify it under the same terms as Perl itself.
