The Perl Toolchain Summit needs more sponsors. If your company depends on Perl, please support this very important event.
SYNOPSIS

     use Data::Clean::JSON;
     my $cleanser = Data::Clean::JSON->get_cleanser;
     my $data     = { code=>sub {}, re=>qr/abc/i };
    
     my $cleaned;
    
     # modifies data in-place
     $cleaned = $cleanser->clean_in_place($data);
    
     # ditto, but deep clone first, return
     $cleaned = $cleanser->clone_and_clean($data);
    
     # now output it
     use JSON;
     print encode_json($cleaned); # prints '{"code":"CODE","re":"(?^i:abc)"}'

    Functional shortcuts:

     use Data::Clean::JSON qw(clean_json_in_place clone_and_clean_json);
    
     # equivalent to Data::Clean::JSON->get_cleanser->clean_in_place($data)
     clean_json_in_place($data);
    
     # equivalent to Data::Clean::JSON->get_cleanser->clone_and_clean($data)
     $cleaned = clone_and_clean_json($data);

DESCRIPTION

    This class cleans data from anything that might be problematic when
    encoding to JSON. This includes coderefs, globs, and so on. Here's what
    it will do by default:

      * Change DateTime and Time::Moment object to its epoch value

      * Change Regexp and version object to its string value

      * Change scalar references (e.g. \1) to its scalar value (e.g. 1)

      * Change other references (non-hash, non-array) to its ref() value
      (e.g. "GLOB", "CODE")

      * Clone circular references

      * Unbless other types of objects

    Cleaning recurses into objects.

    Data that has been cleaned will probably not be convertible back to the
    original, due to information loss (for example, coderefs converted to
    string "CODE").

    The design goals are good performance, good defaults, and just enough
    flexibility. The original use-case is for returning JSON response in
    HTTP API service.

    This module is significantly faster than modules like Data::Rmap or
    Data::Visitor::Callback because with something like Data::Rmap you
    repeatedly invoke callback for each data item. This module, on the
    other hand, generates a cleanser code using eval(), using native Perl
    for() loops.

    If LOG_CLEANSER_CODE environment is set to true, the generated cleanser
    code will be logged using Log::Any at trace level. You can see it, e.g.
    using Log::Any::App:

     % LOG=1 LOG_CLEANSER_CODE=1 TRACE=1 perl -MLog::Any::App -MData::Clean::JSON \
       -e'$c=Data::Clean::JSON->new; ...'

FUNCTIONS

    None of the functions are exported by default.

 clean_json_in_place($data)

    A shortcut for:

     Data::Clean::JSON->get_cleanser->clean_in_place($data)

 clone_and_clean_json($data) => $cleaned

    A shortcut for:

     $cleaned = Data::Clean::JSON->get_cleanser->clone_and_clean($data)

METHODS

 CLASS->get_cleanser => $obj

    Return a singleton instance, with default options. Use new() if you
    want to customize options.

 CLASS->new() => $obj

    Create a new instance.

 $obj->clean_in_place($data) => $cleaned

    Clean $data. Modify data in-place.

 $obj->clone_and_clean($data) => $cleaned

    Clean $data. Clone $data first.

ENVIRONMENT

    LOG_CLEANSER_CODE

FAQ

 Why clone/modify? Why not directly output JSON?

    So that the data can be used for other stuffs, like outputting to YAML,
    etc.

 Why is it slow?

    If you use new() instead of get_cleanser(), make sure that you do not
    construct the Data::Clean::JSON object repeatedly, as the constructor
    generates the cleanser code first using eval(). A short benchmark (run
    on my slow Atom netbook):

     % bench -MData::Clean::JSON -b'$c=Data::Clean::JSON->new' \
         'Data::Clean::JSON->new->clone_and_clean([1..100])' \
         '$c->clone_and_clean([1..100])'
     Benchmarking sub { Data::Clean::JSON->new->clean_in_place([1..100]) }, sub { $c->clean_in_place([1..100]) } ...
     a: 302 calls (291.3/s), 1.037s (3.433ms/call)
     b: 7043 calls (4996/s), 1.410s (0.200ms/call)
     Fastest is b (17.15x a)

    Second, you can turn off some checks if you are sure you will not be
    getting bad data. For example, if you know that your input will not
    contain circular references, you can turn off circular detection:

     $cleanser = Data::Clean::JSON->new(-circular => 0);

    Benchmark:

     $ perl -MData::Clean::JSON -MBench -E '
       $data = [[1],[2],[3],[4],[5]];
       bench {
           circ   => sub { state $c = Data::Clean::JSON->new;               $c->clone_and_clean($data) },
           nocirc => sub { state $c = Data::Clean::JSON->new(-circular=>0); $c->clone_and_clean($data) }
       }, -1'
     circ: 9456 calls (9425/s), 1.003s (0.106ms/call)
     nocirc: 13161 calls (12885/s), 1.021s (0.0776ms/call)
     Fastest is nocirc (1.367x circ)

    The less number of checks you do, the faster the cleansing process will
    be.

 Why am I getting 'Not a CODE reference at lib/Data/Clean.pm line xxx'?

    [2013-08-07 ] This error message is from Data::Clone::clone() when it
    is cloning an object. If you are cleaning objects, instead of using
    clone_and_clean(), try using clean_in_place(). Or, clone your data
    first using something else like Sereal.

SEE ALSO

    Data::Rmap

    Data::Visitor::Callback

    Data::Abridge is similar in goal, which is to let Perl data structures
    (which might contain stuffs unsupported in JSON) be encodeable to JSON.
    But unlike Data::Clean::JSON, it has some (currently) non-configurable
    rules, like changing a coderef with a hash {CODE=>'\&main::__ANON__'}
    or a scalar ref with {SCALAR=>'value'} and so on. Note that the
    abridging process is similarly unidirectional (you cannot convert back
    the original Perl data structure).