The Perl Toolchain Summit needs more sponsors. If your company depends on Perl, please support this very important event.

NAME

Data::DynamicValidator - JPointer-like and Perl union for flexible perl data structures validation

VERSION

version 0.05

SYNOPSIS

 my $my_complex_config = {
    features => [
        "a/f",
        "application/feature1",
        "application/feature2",
    ],
    service_points => {
        localhost => {
            "a/f" => { job_slots => 3, },
            "application/feature1" => { job_slots => 5 },
            "application/feature2" => { job_slots => 5 },
        },
        "127.0.0.1" => {
            "application/feature2" => { job_slots => 5 },
        },
    },
    mojolicious => {
        hypnotoad => {
            pid_file => '/tmp/hypnotoad-ng.pid',
            listen  => [ 'http://localhost:3000' ],
        },
    },
 };

 use Data::DynamicValidator qw/validator/;
 use Net::hostent;

 my $errors = validator($cfg)->(
   on      => '/features/*',
   should  => sub { @_ > 0 },
   because => "at least one feature should be defined",
   each    => sub {
     my $f = $_->();
     shift->(
       on      => "//service_points/*/`$f`/job_slots",
       should  => sub { defined($_[0]) && $_[0] > 0 },
       because => "at least 1 service point should be defined for feature '$f'",
     )
   }
 )->(
   on      => '/service_points/sp:*',
   should  => sub { @_ > 0 },
   because => "at least one service point should be defined",
   each    => sub {
     my $sp;
     shift->report_error("SP '$sp' isn't resolvable")
       unless gethost($sp);
   }
 )->(
  on      => '/service_points/sp:*/f:*',
  should  => sub { @_ > 0 },
  because => "at least one feature under service point should be defined",
  each    => sub {
    my ($sp, $f);
    shift->(
      on      => "//features/`*[value eq '$f']`",
      should  => sub { 1 },
      because => "Feature '$f' of service point '$sp' should be decrlared in top-level features list",
    )
  },
  })->rebase('/mojolicious/hypnotoad' => sub {
    shift->(
      on      => '/pid_file',
      should  => sub { @_ == 1 },
      because => "hypnotoad pid_file should be defined",
    )->(
      on      => '/listen/*',
      should  => sub { @_ > 0 },
      because => "hypnotoad listening interfaces defined",
    );
  })->errors;

 print "all OK\n"
  unless(@$errors);

RATIONALE

There are complex data configurations, e.g. application configs. Not to check them on applicaiton startup is wrong, because of sudden unexpected runtime errors can occur, which are not-so-pleasent to detect. Write the code, that does full exhaustive checks, is boring.

This module tries to offer to use DLS, that makes data validation fun for developer yet understandable for the person, which provides the data.

DESCRIPTION

First of all, you should create Validator instance:

 use Data::DynamicValidator qw/validator/;

 my $data = { ports => [2222] };
 my $v = validator($data);

Then, actually do validation:

 $v->(
   on      => '/ports/*',
   should  => sub { @_ > 0 },
   because => 'At least one port should be defined at "ports" section',
 );

The on parameter defines the data path, via JSON-pointer like expression; the should parameter provides the closure, which will check the values gathered on via pointer. If the closure returns false, then the error will be recorded, with description, provided by because parameter.

To get the results of validation, you can call:

 $v->is_valid; # returns true, if there is no validation errors
 $v->errors;   # returns array reference, consisting of the met Errors

on/should parameters are convenient for validation of presense of something, but they aren't so handy in checking of individual values. It should be mentioned, that should closure, always takes an array of the selected by on, even if only one element has been selected.

To handle individual values in more convenient way the optional each parameter has been introduced.

 my $data = { ports => [2222, 3333] };
 $v->(
   on      => '/ports/*',
   should  => sub { @_ > 0 },
   because => 'At least one port should be defined at "ports" section',
   each    => sub {
     my $port = $_->();
     $v->report_error("All ports should be greater than 1000")
      unless $port > 1000;
   },
 );

So, report_error could be used for custom errors reporting on current path or current data value. The $_ is the an implicit alias or label to the last componenet of the current path, i.e. on our case the current path in each closure will be /ports/0 and /ports/1, so the $_ will be 0 and 1 respectively. To get the value of the label, you should "invoke" it, as showed previously. A label stringizes to the last data path component, e.g. to "0" and "1" respectively.

The each closure single argrument is the validator instance itself. The previous example could be rewriten with explicit label like:

 $v->(
   on      => '/ports/port:*',
   should  => sub { @_ > 0 },
   because => 'At least one port should be defined at "ports" section',
   each    => sub {
     my $port;
     my $port_value = $port->();
     shift->report_error("All ports should be greater than 1000")
      unless $port_value > 1000;
   },
 );

Providing aliases for array indices may be not so handy as for keys of hashes. Please note, that the label port was previously "declated" in on rule, and only then "injected" into $port variable in each closure.

Consider the following example:

 my $data = {
  ports => [2000, 3000],
  2000  => 'tcp',
  3000  => 'udp',
 };

Let's validate it. The validation rule sounds as: there is 'ports' section, where at least one port > 1000 should be declated, and then the same port should appear at top-level, and it should be either 'tcp' or 'upd' type.

 use List::MoreUtils qw/any/;

 my $errors = validator($data)->(
   on      => '/ports/*[value > 1000 ]',
   should  => sub { @_ > 0 },
   because => 'At least one port > 1000 should be defined in "ports" section',
   each    => sub {
     my $port = $_->();
     shift->(
       on      => "//*[key eq $port]",
       should  => sub { @_ == 1 && any { $_[0] eq $_ } (qw/tcp udp/)  },
       because => "The port $port should be declated at top-level as tcp or udp",
      )
   }
  )->errors;

As you probably noted, the the path expression contains two slashes at on rule inside each rule. This is required to search data from the root, because the current element is been set as base before calling each, so all expressions inside each are relative to the current element (aka base).

You can change the base explicit way via rebase method:

 my $data = {
    mojolicious => {
        hypnotoad => {
            pid_file => '/tmp/hypnotoad-ng.pid',
            listen  => [ 'http://localhost:3000' ],
        },
    },
 };

 $v->rebase('/mojolicious/hypnotoad' => sub {
    shift->(
      on      => '/pid_file',
      should  => sub { @_ == 1 },
      because => "hypnotoad pid_file should be defined",
    )->(
      on      => '/listen/*',
      should  => sub { @_ > 0 },
      because => "hypnotoad listening interfaces defined",
    );
 })->errors;

DATA PATH EXPRESSIONS

 my $data = [qw/a b c d e/];
 '/2'   # selects the 'c' value in $data array
 '/-1'  # selects the 'e' value in $data array

 $data = { abc => 123 };
 '/abc' # selects the '123' value in hashref under key 'abc'

 $data = {
   mojolicious => {
     hypnotoad => {
       pid_file => '/tmp/hypnotoad-ng.pid',
     }
   }
 };
 '/mojolicious/hypnotoad/pid_file'  # point to pid_file
 '//mojolicious/hypnotoad/pid_file' # point to pid_file (independently of current base)

 # Escaping by back-quotes sample
 $data => { "a/b" => { c => 5 } }
 '/`a/b`/c' # selects 5

 $data = {abc => [qw/a b/]};   # 1
 $data = {abc => { c => 'd'}}; # 2
 $data = {abc => 7};           # 3
 '/abc/*' # selects 'a' and 'b' in 1st case
          # the 'd' in 2nd case
          # the number 7 in 3rd case

 # Filtering capabilities samples:

 '/abc/*[size == 5]'    # filter array/hash by size
 '/abc/*[value eq "z"]' # filter array/hash by value equality
 '/abc/*[index > 5]'    # finter array by index
 '/abc/*[key =~ /def/]' # finter hash by key

DEBUGGING

You can set the DATA_DYNAMICVALIDATOR_DEBUG environment variable to get some advanced diagnostics information printed to "STDERR".

 DATA_DYNAMICVALIDATOR_DEBUG=1

METHODS

validate

Performs validation based on on, should, because and optional each parameters. Returns the validator itself ($self), to allow further chain invocations. The validation will not be performed, if some errors already have been detected.

It is recommended to use overloaded function call, instead of this method call. (e.g. $validator-(...)> instead of $validato-validate(...)> )

report_error

The method is used for custom errors reporing. It is mainly usable in each closure.

 validator({ ports => [1000, 2000, 3000] })->(
   on      => '/ports/port:*',
   should  => sub { @_ > 0 },
   because => "At least one listening port should be defined",
   each    => sub {
     my $port;
     my $port_value = $port->();
     shift->report_error("Port value $port_value isn't acceptable, because < 1000")
       if($port_value < 1000);
   }
 );

is_valid

Checks, whether validator already has errors

errors

Returns internal array of errors

rebase

Temporaly sets the new base to the specified route, and invokes the closure with the validator instance, i.e.

 $v->('/a' => $closure->($v))

If the data can't be found at the specified route, the closure is not invoked.

current_base

Returns the current base, which is set only inside rebase call or each closure. Returns undef is there is no current base.

FUNCTIONS

validator

The enter point for DynamicValidator.

 my $errors = validator(...)->(
   on => "...",
   should => sub { ... },
   because => "...",
 )->errors;

RESOURCES

AUTHOR

Ivan Baidakou <dmol@gmx.com>

COPYRIGHT AND LICENSE

This software is copyright (c) 2017 by Ivan Baidakou.

This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.