Ben Bullock > JSON-Parse > JSON::Parse

Module Version: 0.41

NAME

JSON::Parse - Read JSON into a Perl variable

SYNOPSIS

    use JSON::Parse 'parse_json';
    my $json = '["golden", "fleece"]';
    my $perl = parse_json ($json);
    # Same effect as $perl = ['golden', 'fleece'];

Convert JSON into Perl.

DESCRIPTION

JSON::Parse offers the function "parse_json", which takes one argument, a string containing JSON, and returns a Perl reference or scalar. The input to parse_json must be a complete JSON structure.

JSON::Parse also offers two high-speed validation functions, "valid_json", which returns true or false, and "assert_valid_json", which produces a descriptive fatal error if the JSON is invalid. These are much faster than "parse_json". See "PERFORMANCE" for a comparison.

JSON::Parse also offers one convenience function to read JSON directly from a file, "json_file_to_perl", and a safer version of "parse_json" called "parse_json_safe" which doesn't throw exceptions.

For special cases, such as JSON objects with non-unique names (key collisions), or round-trips with JSON booleans, there are also "new" and "run", which create a JSON parsing object and run it on text.

JSON::Parse accepts only UTF-8 as input. If its input is marked as Unicode characters, the strings in its output are also marked as Unicode characters. If its input contains Unicode escapes of the form "\u3000", its output is upgraded to Unicode character strings.

(JSON means "JavaScript Object Notation" and it is specified in "RFC 7159".)

FUNCTIONS

parse_json

    use JSON::Parse 'parse_json';
    my $perl = parse_json ('{"x":1, "y":2}');

This function converts JSON into a Perl structure, either an array reference, a hash reference, or a scalar.

If the first argument does not contain a complete valid JSON text, parse_json throws a fatal error ("dies"). If the first argument is the undefined value, an empty string, or a string containing only whitespace, parse_json returns the undefined value.

If the argument contains valid JSON, the return value is either a hash reference, an array reference, or a scalar. If the input JSON text is a serialized object, a hash reference is returned:

    use JSON::Parse ':all';
    my $perl = parse_json ('{"a":1, "b":2}');
    print ref $perl, "\n";
    # Prints "HASH".

If the input JSON text is a serialized array, an array reference is returned:

    use JSON::Parse ':all';
    my $perl = parse_json ('["a", "b", "c"]');
    print ref $perl, "\n";
    # Prints "ARRAY".

Otherwise a Perl scalar is returned.

The behaviour of allowing a scalar was added in version 0.32 of this module. This brings it into line with the new specification for JSON.

The function "parse_json_safe" offers a version of this function with various safety features enabled.
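
Because parse_json dies on invalid input, callers usually wrap it in an eval. The following is only a sketch of one way to do that; try_parse is a hypothetical helper name and is not part of JSON::Parse.

```perl
use strict;
use warnings;
use JSON::Parse 'parse_json';

# try_parse is a hypothetical helper, not part of JSON::Parse.
# It returns (value, undef) on success and (undef, error text) on failure.
sub try_parse
{
    my ($json) = @_;
    my $perl;
    my $ok = eval {
        $perl = parse_json ($json);
        1;
    };
    return (undef, $@) unless $ok;
    return ($perl, undef);
}

my ($good, $good_error) = try_parse ('{"x":1, "y":2}');
print "x is $good->{x}\n" unless defined $good_error;

# The closing brace is missing, so parse_json dies inside the eval.
my ($bad, $bad_error) = try_parse ('{"x":1, "y":2');
print "Parse failed: $bad_error\n" if defined $bad_error;
```

Returning the error text alongside the value lets the caller tell a failed parse apart from valid input which legitimately parses to an undefined value.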

json_file_to_perl

    use JSON::Parse 'json_file_to_perl';
    my $p = json_file_to_perl ('filename');

This is exactly the same as "parse_json" except that it reads the JSON from the specified file rather than a scalar. The file must be in the UTF-8 encoding, and is opened as a character file using :encoding(UTF-8) (see PerlIO::encoding and perluniintro for details). The output is marked as character strings.

This is a convenience function written in Perl. You may prefer to read the file yourself using another module if you need faster performance.
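
If you do read the file yourself, a minimal sketch might look like the following. It assumes the behaviour described above (the file is opened with :encoding(UTF-8) and the whole contents are passed to "parse_json"); read_json_file is a hypothetical name, not a function of this module.

```perl
use strict;
use warnings;
use JSON::Parse 'parse_json';

# read_json_file is a hypothetical helper, not part of JSON::Parse.
sub read_json_file
{
    my ($file) = @_;
    open my $in, '<:encoding(UTF-8)', $file
        or die "Cannot open '$file': $!";
    # Read the whole file in one go by undefining the record separator.
    my $text = do { local $/; <$in> };
    close $in
        or die "Cannot close '$file': $!";
    return parse_json ($text);
}

# Usage, with a hypothetical file name:
# my $p = read_json_file ('config.json');
```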

valid_json

    use JSON::Parse 'valid_json';
    if (valid_json ($json)) {
        # do something
    }

valid_json returns 1 if its argument is valid JSON and 0 if not. It also returns 0 if the input is undefined or the empty string.

This is a high-speed validator which runs between roughly two and eight times faster than "parse_json". This speed gain is obtained by discarding inputs after reading them rather than storing them into Perl variables.

valid_json does not supply the actual errors which caused invalidity. Use "assert_valid_json" to get error messages when the JSON is invalid.

This cannot detect key collisions in the JSON since it does not store values. See "Key collisions" for more on this module's handling of non-unique names in the JSON.

assert_valid_json

    use JSON::Parse 'assert_valid_json';
    eval {
        assert_valid_json ('["xyz":"b"]');
    };
    if ($@) {
        print "Your JSON was invalid: $@\n";
    }
    # Prints a message containing "Unexpected character ':' parsing array"

This is the underlying function for "valid_json". It runs at the same high speed, but throws an error if the JSON is wrong, rather than returning 1 or 0. See "DIAGNOSTICS" for the error format, which is identical to that of "parse_json".

This cannot detect key collisions in the JSON since it does not store values. See "Key collisions" for more on this module's handling of non-unique names in the JSON.
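
The two validators can be combined: use "valid_json" as the fast check, and call "assert_valid_json" only for inputs which fail, to recover the error message. This is a sketch of that pattern rather than a recommendation from the module itself; the variable names are illustrative.

```perl
use strict;
use warnings;
use JSON::Parse qw(valid_json assert_valid_json);

my %report;
for my $json ('["a", "b"]', '["a" "b"]') {
    if (valid_json ($json)) {
        $report{$json} = 'OK';
    }
    else {
        # Run the slower, message-producing check only on failure.
        eval { assert_valid_json ($json) };
        $report{$json} = "Invalid: $@";
    }
    print "$json: $report{$json}\n";
}
```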

parse_json_safe

This is almost the same thing as "parse_json", but has the following differences:

Does not throw exceptions

If the JSON is invalid, a warning is printed and the undefined value is returned, as if calling "parse_json" like

    eval {
        parse_json ($json);
    };
    if ($@) {
        warn $@;
    }

Detects key collisions

This switches on "detect_collisions", so that if the JSON contains non-unique names, a warning is printed and the undefined value is returned.

Booleans are not read-only

This switches on "copy_literals" so that JSON true, false and null values are copied. These values can be modified, but they will not be converted back into true and false by JSON::Create.

As the name implies, this is meant to be a "safety-first" version of "parse_json".

This function was added in version 0.38.
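
The following sketch illustrates the differences listed above, assuming they behave as described: the parsed literal can be overwritten, and input with a repeated name produces a warning and an undefined return value rather than an exception.

```perl
use strict;
use warnings;
use JSON::Parse 'parse_json_safe';

# Literals are copied, so overwriting the parsed "true" is allowed.
my $p = parse_json_safe ('{"hocus":true,"focus":null}');
$p->{hocus} = 99;
print "hocus is now $p->{hocus}\n";

# The name "a" appears twice, so a warning is printed to standard
# error and the undefined value is returned.
my $q = parse_json_safe ('{"a":1,"a":2}');
print "Collision: nothing was returned\n" unless defined $q;
```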

OLD INTERFACE

The following alternative function names are accepted. These are the names used for the functions in old versions of this module. These names are not deprecated and will never be removed from the module.

json_to_perl

This is exactly the same function as "parse_json".

validate_json

This is exactly the same function as "assert_valid_json".

Mapping from JSON to Perl

JSON elements are mapped to Perl as follows:

JSON numbers

JSON numbers become Perl numbers, either integers or double-precision floating point numbers, or possibly strings containing the number if parsing of a number by the usual methods fails somehow.

JSON does not allow leading zeros, like 0123, or leading plus signs, like +100, in numbers, so these cause an "Unexpected character" error. JSON also does not allow numbers of the form 1., but it does allow things like 0e0 or 1E999999. As far as possible these are accepted by JSON::Parse.
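
These rules can be checked with "valid_json". In this sketch each candidate number is wrapped in an array so that the input is a complete JSON text; the list of candidates is illustrative only.

```perl
use strict;
use warnings;
use JSON::Parse 'valid_json';

my %result;
for my $number ('0123', '+100', '1.', '0e0', '-1.5', '100') {
    # Wrap each candidate in an array to make a complete JSON text.
    $result{$number} = valid_json ("[$number]") ? 'valid' : 'invalid';
    printf "%-5s %s\n", $number, $result{$number};
}
```

Following the rules above, the first three candidates should be reported as invalid and the last three as valid.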

JSON strings

JSON strings become Perl strings. The JSON escape characters such as \t for the tab character (see section 7 of "RFC 7159") are mapped to the equivalent ASCII character.

Handling of Unicode

If the input to "parse_json" is marked as Unicode characters, the output strings will be marked as Unicode characters. If the input is not marked as Unicode characters, the output strings will not be marked as Unicode characters. Thus,

    use JSON::Parse ':all';
    # The scalar $sasori looks like Unicode to Perl
    use utf8;
    my $sasori = '["蠍"]';
    my $p = parse_json ($sasori);
    print utf8::is_utf8 ($p->[0]);
    # Prints 1.

but

    use JSON::Parse ':all';
    # The scalar $ebi does not look like Unicode to Perl
    no utf8;
    my $ebi = '["海老"]';
    my $p = parse_json ($ebi);
    print utf8::is_utf8 ($p->[0]);
    # Prints nothing.

Escapes of the form \uXXXX (see page three of "RFC 7159") are mapped to ASCII if XXXX is less than 0x80, or to UTF-8 if XXXX is greater than or equal to 0x80.

Strings containing \uXXXX escapes where XXXX is 0x80 or greater are also upgraded to character strings, regardless of whether the input is a character string or a byte string. Thus, regardless of whether Perl thinks the input string is Unicode, escapes like \u87f9 are converted into the equivalent UTF-8 bytes, and the particular string in which they occur is marked as a character string:

    use JSON::Parse ':all';
    no utf8;
    # 蟹
    my $kani = '["\u87f9"]';
    my $p = parse_json ($kani);
    print "It's marked as a character string" if utf8::is_utf8 ($p->[0]);
    # Prints "It's marked as a character string" because it's upgraded
    # regardless of the input string's flags.

This is modelled on the behaviour of Perl's chr:

    no utf8;
    my $kani = '87f9';
    print "hex is character string\n" if utf8::is_utf8 ($kani);
    # prints nothing
    $kani = chr (hex ($kani));
    print "chr makes it a character string\n" if utf8::is_utf8 ($kani);
    # prints "chr makes it a character string"

Since every byte of input is validated as UTF-8 (see "UTF-8 only"), this hopefully will not upgrade invalid strings.

Surrogate pairs in the form \uD834\uDD1E are also handled. If the second half of the surrogate pair is missing, an "Unexpected character" or "Unexpected end of input" error is thrown. If the second half of the surrogate pair is present but contains an impossible value, a "Not surrogate pair" error is thrown.
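
As an illustration, the pair \uD834\uDD1E combines into the single code point U+1D11E. The sketch below computes that value with the standard UTF-16 surrogate arithmetic and compares it with what "parse_json" returns; the variable names are illustrative.

```perl
use strict;
use warnings;
use JSON::Parse 'parse_json';

# Standard UTF-16 arithmetic for combining a surrogate pair.
my ($high, $low) = (0xD834, 0xDD1E);
my $expected = 0x10000 + (($high - 0xD800) << 10) + ($low - 0xDC00);

# Single quotes keep the backslashes, so the parser sees the escapes.
my $p = parse_json ('["\uD834\uDD1E"]');
my $parsed = ord $p->[0];

printf "Computed U+%X, parsed U+%X\n", $expected, $parsed;
```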

JSON arrays

JSON arrays become Perl array references. The elements of the Perl array are in the same order as they appear in the JSON.

Thus

    my $p = parse_json ('["monday", "tuesday", "wednesday"]');

has the same result as a Perl declaration of the form

    my $p = [ 'monday', 'tuesday', 'wednesday' ];

JSON objects

JSON objects become Perl hashes. The members of the JSON object become key and value pairs in the Perl hash. The string part of each object member becomes the key of the Perl hash. The value part of each member is mapped to the value of the Perl hash.

Thus

    my $j = <<EOF;
    {"monday":["blue", "black"],
     "tuesday":["grey", "heart attack"],
     "friday":"Gotta get down on Friday"}
    EOF

    my $p = parse_json ($j);

has the same result as a Perl declaration of the form

    my $p = {
        monday => ['blue', 'black'],
        tuesday => ['grey', 'heart attack'],
        friday => 'Gotta get down on Friday',
    };

Key collisions

In the event of a key collision within the JSON object, something like

     my $j = '{"a":1, "a":2}';
     my $p = parse_json ($j);
     print $p->{a}, "\n";
     # Prints 2.

"parse_json" overwrites the first value with the second value. "parse_json_safe" halts and prints a warning. If you use "new" you can switch key collision detection on and off with the "detect_collisions" method.

The rationale for "parse_json" not to give warnings is that Perl doesn't give information about collisions when storing into hash values, and checking for collisions for every key will degrade performance for the sake of an unlikely occurrence.

Note that the JSON specification says "The names within an object SHOULD be unique." (see "RFC 7159", page 5), although it's not a requirement.

For performance, "valid_json" and "assert_valid_json" do not store hash keys, thus they cannot detect this variety of problem.

Literals

null

"parse_json" maps the JSON null literal to a readonly scalar $JSON::Parse::null which evaluates to undef. "parse_json_safe" maps the JSON literal to the undefined value. If you use a parser created with "new", you can choose either of these behaviours with "copy_literals", or you can tell JSON::Parse to put your own value in place of nulls using the "set_null" method.

true

"parse_json" maps the JSON true literal to a readonly scalar which evaluates to 1. "parse_json_safe" maps the JSON literal to the value 1. If you use a parser created with "new", you can choose either of these behaviours with "copy_literals", or you can tell JSON::Parse to put your own value in place of trues using the "set_true" method.

false

"parse_json" maps the JSON false literal to a readonly scalar which evaluates to the empty string, or to zero in a numeric context. (This behaviour changed from version 0.36 to 0.37. In versions up to 0.36, the false literal was mapped to a readonly scalar which evaluated to 0 only.) "parse_json_safe" maps the JSON literal to a similar scalar without the readonly constraints. If you use a parser created with "new", you can choose either of these behaviours with "copy_literals", or you can tell JSON::Parse to put your own value in place of falses using the "set_false" method.

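The default mappings of the three literals described above can be inspected as follows. This is a small sketch assuming the behaviour documented for version 0.37 and later, where false evaluates to the empty string, or to zero in numeric context.

```perl
use strict;
use warnings;
use JSON::Parse 'parse_json';

my $p = parse_json ('{"t":true,"f":false,"n":null}');

print "true is numerically 1\n" if $p->{t} == 1;
print "false is the empty string\n" if $p->{f} eq '';
print "false is numerically 0\n" if $p->{f} == 0;
print "null is undefined\n" unless defined $p->{n};
```
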
Round trips and compatibility

The Perl versions of literals produced by "parse_json" will be converted back to JSON literals if you use JSON::Create's create_json. However, JSON::Parse's literals are incompatible with the other CPAN JSON modules. For compatibility with other CPAN modules, create a JSON::Parse object with "new", and set JSON::Parse's literals with "set_true", "set_false", and "set_null".

This example demonstrates round-trip compatibility using JSON::Tiny version 0.54:

    use JSON::Tiny '0.54', qw(decode_json encode_json);
    use JSON::Parse;
    use JSON::Create;
    sub l { print "\n@_:\n\n"; }
    sub i { print "    @_\n"; }
    my $cream = '{"clapton":true,"hendrix":false,"bruce":true,"fripp":false}';
    my $jp = JSON::Parse->new ();
    my $jc = JSON::Create->new ();
    l "First do a round-trip of our modules";
    i $jc->run ($jp->run ($cream));
    l "Now do a round-trip of JSON::Tiny";
    i encode_json (decode_json ($cream));
    l "First, incompatible mode";
    i 'tiny(parse):', encode_json ($jp->run ($cream));
    i 'create(tiny):', $jc->run (decode_json ($cream));
    l "Compatibility with JSON::Parse";
    $jp->set_true (JSON::Tiny::true);
    $jp->set_false (JSON::Tiny::false);
    i 'tiny(parse):', encode_json ($jp->run ($cream));
    l "Compatibility with JSON::Create";
    $jc->bool ('JSON::Tiny::_Bool');
    i 'create(tiny):', $jc->run (decode_json ($cream));
    l "JSON::Parse and JSON::Create are still compatible too";
    i $jc->run ($jp->run ($cream));
    exit;

The output looks like this:

First do a round-trip of our modules:

    {"bruce":true,"clapton":true,"fripp":false,"hendrix":false}

Now do a round-trip of JSON::Tiny:

    {"fripp":false,"bruce":true,"clapton":true,"hendrix":false}

First, incompatible mode:

    tiny(parse): {"hendrix":"","fripp":"","bruce":1,"clapton":1}
    create(tiny): {"bruce":1,"clapton":1,"fripp":0,"hendrix":0}

Compatibility with JSON::Parse:

    tiny(parse): {"fripp":false,"clapton":true,"bruce":true,"hendrix":false}

Compatibility with JSON::Create:

    create(tiny): {"bruce":true,"clapton":true,"fripp":false,"hendrix":false}

JSON::Parse and JSON::Create are still compatible too:

    {"hendrix":false,"bruce":true,"clapton":true,"fripp":false}

Most of the other CPAN modules use similar methods to JSON::Tiny, so the above example can easily be adapted. See also the documentation of JSON::Create under "Interoperability" for various examples.

Modifying the values

"parse_json" maps all the literals to read-only values. Because of this, attempting to modify the boolean values in the hash reference returned by "parse_json" will cause "Modification of a read-only value attempted" errors:

    my $in = '{"hocus":true,"pocus":false,"focus":null}';
    my $p = parse_json ($in);
    $p->{hocus} = 99;
    # "Modification of a read-only value attempted" error occurs

Since the hash values are read-only scalars, $p->{hocus} = 99 is like this:

    undef = 99;

If you need to modify the returned hash reference, then delete the value first:

    my $in = '{"hocus":true,"pocus":false,"focus":null}';
    my $p = parse_json ($in);
    delete $p->{pocus};
    $p->{pocus} = 99;
    # OK

Similarly with array references, delete the value before altering:

    my $in = '[true,false,null]';
    my $q = parse_json ($in);
    delete $q->[1];
    $q->[1] = 'magic';

Note that the return values from parsing bare literals are not read-only scalars, so

    my $true = JSON::Parse::parse_json ('true');
    $true = 99;

produces no error. This is because Perl copies the scalar.

METHODS

If you need to parse JSON and you are not satisfied with the parsing options offered by "parse_json" and "parse_json_safe", you can create a JSON parsing object with "new" and set various options on the object, then use it with "run". These options include the ability to copy JSON literals with "copy_literals", switch off fatal errors with "warn_only", detect key collisions in objects with "detect_collisions", and set the JSON literals to user defined values with the methods described under "Methods for manipulating literals".

These methods only work on an object created with "new"; they do not affect the behaviour of "parse_json" or "parse_json_safe".

new

    my $jp = JSON::Parse->new ();

Create a new JSON::Parse object.

This method was added in version 0.38.

run

    my $out = $jp->run ($json);

Exactly the same thing as "parse_json", except its behaviour can be modified using the following methods.

This method was added in version 0.38.

copy_literals

    $jp->copy_literals (1);

With a true value, copy JSON literal values (null, true, and false) into new Perl scalar values, and don't put read-only values into the output.

With a false value, use read-only scalars:

    $jp->copy_literals (0);

The copy_literals (1) behaviour is the behaviour of "parse_json_safe". The copy_literals (0) behaviour is the behaviour of "parse_json".

If the user also sets user-defined literals with "set_true", "set_false" and "set_null", that takes precedence over this.

This method was added in version 0.38.
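
A short sketch of the effect, assuming the behaviour described above: with copy_literals switched on, the value parsed from a JSON true can be overwritten directly, without deleting it first.

```perl
use strict;
use warnings;
use JSON::Parse;

my $jp = JSON::Parse->new ();
$jp->copy_literals (1);

my $p = $jp->run ('{"hocus":true,"pocus":false,"focus":null}');
# No "Modification of a read-only value attempted" error here,
# because the literal was copied into an ordinary scalar.
$p->{hocus} = 99;
print "hocus is now $p->{hocus}\n";
```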

warn_only

    $jp->warn_only (1);

Warn, don't die, on error. Failed parsing returns the undefined value, undef, and prints a warning.

This can be switched off again using any false value:

    $jp->warn_only ('');

This method was documented in version 0.38, but only implemented in version 0.41.
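
The following sketch shows the intended use, assuming version 0.41 or later: the invalid input produces a warning on standard error and an undefined return value, and the program carries on.

```perl
use strict;
use warnings;
use JSON::Parse;

my $jp = JSON::Parse->new ();
$jp->warn_only (1);

# The comma between "a" and "b" is missing, so this is invalid JSON.
my $out = $jp->run ('["a" "b"]');
if (! defined $out) {
    print "Parsing failed, but the program is still running.\n";
}
```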

detect_collisions

    $jp->detect_collisions (1);

This switches on a check for hash key collisions (non-unique names in JSON objects). If a collision is found, an error message "Name is not unique" is printed, which also gives the non-unique name and the byte position where the start of the colliding string was found:

    use JSON::Parse;
    my $jp = JSON::Parse->new ();
    $jp->detect_collisions (1);
    eval {
        $jp->run ('{"animals":{"cat":"moggy","cat":"feline","cat":"neko"}}');
    };
    print "$@\n" if $@;

produces

    JSON error at line 1, byte 28/55: Name is not unique: "cat" parsing object starting from byte 12 at examples/collide.pl line 8.

The detect_collisions (1) behaviour is the behaviour of "parse_json_safe". The detect_collisions (0) behaviour is the behaviour of "parse_json".

This method was added in version 0.38.

Methods for manipulating literals

These methods alter what is written into the Perl structure when the parser sees a literal value, true, false or null in the input JSON.

This number of methods is unfortunately necessary, since a user might want to call set_false (undef) to make JSON false values turn into undefs.

    $jp->set_false (undef);

Thus, we cannot use a single function $jp->false (undef) to cover both setting and deleting of values.

These methods were added in version 0.38.

set_true

    $jp->set_true ("Yes, that is so true");

Supply a scalar to be used in place of the JSON true literal.

This example puts the string "Yes, that is so true" into the hash or array when we hit a "true" literal, rather than the default read-only scalar:

    use JSON::Parse;
    my $json = '{"yes":true,"no":false}';
    my $jp = JSON::Parse->new ();
    $jp->set_true ('Yes, that is so true');
    my $out = $jp->run ($json);
    print $out->{yes}, "\n";

prints

    Yes, that is so true

To override the previous value, call it again with a new value. To delete the value and revert to the default behaviour, use "delete_true".

If you give this a value which is not "true", that is, one which Perl will evaluate as false in an if statement, it prints a warning "User-defined value for JSON true evaluates as false". You can switch this warning off with "no_warn_literals".

This method was added in version 0.38.

delete_true

    $jp->delete_true ();

Delete the user-defined true value. See "set_true".

This method is "safe" in that it has absolutely no effect if no user-defined value is in place. It does not return a value.

This method was added in version 0.38.

set_false

    $jp->set_false (JSON::PP::Boolean::false);

Supply a scalar to be used in place of the JSON false literal.

In the above example, when we hit a "false" literal, we put JSON::PP::Boolean::false in the output, similar to JSON::PP and other CPAN modules like Mojo::JSON or JSON::XS.

To override the previous value, call it again with a new value. To delete the value and revert to the default behaviour, use "delete_false".

If you give this a value which is not "false", that is, one which Perl will evaluate as true in an if statement, it prints a warning "User-defined value for JSON false evaluates as true". You can switch this warning off with "no_warn_literals".

This method was added in version 0.38.

delete_false

    $jp->delete_false ();

Delete the user-defined false value. See "set_false".

This method is "safe" in that it has absolutely no effect if no user-defined value is in place. It does not return a value.

This method was added in version 0.38.

set_null

    $jp->set_null (0);

Supply a scalar to be used in place of the JSON null literal.

To override the previous value, call it again with a new value. To delete the value and revert to the default behaviour, use "delete_null".

This method was added in version 0.38.
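
For instance, a placeholder string can be substituted for null. This is a sketch only; the placeholder 'N/A' and the input are illustrative.

```perl
use strict;
use warnings;
use JSON::Parse;

my $jp = JSON::Parse->new ();
# 'N/A' is an arbitrary placeholder chosen for this example.
$jp->set_null ('N/A');

my $out = $jp->run ('{"name":"Jason","age":null}');
print "$out->{name}: $out->{age}\n";

# Revert to the default handling of null.
$jp->delete_null ();
my $default = $jp->run ('{"age":null}');
print "age is undefined again\n" unless defined $default->{age};
```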

delete_null

    $jp->delete_null ();

Delete the user-defined null value. See "set_null".

This method is "safe" in that it has absolutely no effect if no user-defined value is in place. It does not return a value.

This method was added in version 0.38.

no_warn_literals

    $jp->no_warn_literals (1);

Use a true value to switch off warnings about setting boolean values to contradictory things. For example if you want to set the JSON false literal to turn into the string "false",

    $jp->no_warn_literals (1);
    $jp->set_false ("false");

See also "Contradictory values for "true" and "false"".

This also switches off the warning "User-defined value overrules copy_literals".

This method was added in version 0.38.

RESTRICTIONS

This module imposes the following restrictions on its input.

JSON only

JSON::Parse is a strict parser. It only accepts input which exactly meets the criteria of "RFC 7159". That means, for example, JSON::Parse does not accept single quotes (') instead of double quotes ("), or numbers with leading zeros, like 0123. JSON::Parse does not accept control characters (0x00 - 0x1F) in strings, missing commas between array or hash elements like ["a" "b"], or trailing commas like ["a","b","c",]. It also does not accept trailing non-whitespace, like the second "]" in ["a"]].
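
The cases listed above can be checked with "valid_json". The inputs in this sketch are taken from the description; the labels and the @cases array are illustrative names.

```perl
use strict;
use warnings;
use JSON::Parse 'valid_json';

my @cases = (
    [ 'single quotes',     q(['a'])         ],
    [ 'leading zero',      '[0123]'         ],
    [ 'control character', "[\"a\tb\"]"     ],  # a literal tab inside a string
    [ 'missing comma',     '["a" "b"]'      ],
    [ 'trailing comma',    '["a","b","c",]' ],
    [ 'trailing bracket',  '["a"]]'         ],
);

my $rejected = 0;
for my $case (@cases) {
    my ($label, $json) = @$case;
    my $ok = valid_json ($json);
    $rejected++ unless $ok;
    printf "%-18s %s\n", $label, $ok ? 'accepted' : 'rejected';
}
```

Under the restrictions described above, every one of these inputs should be rejected.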

No incremental parsing

JSON::Parse does not parse incrementally. It only parses fully-formed JSON strings which include all opening and closing brackets. This is an inherent part of the design of the module. Incremental parsing in the style of XML::Parser would (obviously) require some kind of callback structure to deal with the elements of the partially digested structures, but JSON::Parse was never designed to do this; it merely converts what it sees into a Perl structure. Claims to offer incremental JSON parsing in other modules' documentation should be diligently verified.

UTF-8 only

Although JSON may come in various encodings of Unicode, JSON::Parse only parses the UTF-8 format. If input is in a different Unicode encoding than UTF-8, convert the input before handing it to this module. For example, for the UTF-16 format,

    use Encode 'decode';
    my $input_utf8 = decode ('UTF-16', $input);
    my $perl = parse_json ($input_utf8);

or, for a file, use :encoding (see PerlIO::encoding and perluniintro):

    open my $input, "<:encoding(UTF-16)", 'some-json-file'
        or die "Cannot open some-json-file: $!";

JSON::Parse does not determine the nature of the octet stream, as described in section 3 of RFC 4627, the predecessor of "RFC 7159".

This restriction to UTF-8 applies regardless of whether Perl thinks that the input string is a character string or a byte string. Non-UTF-8 input will cause an "Unexpected character" error to be thrown.

DIAGNOSTICS

"valid_json" does not produce error messages. "parse_json" and "assert_valid_json" die on encountering invalid input.

Error messages have the line number, and the byte number where appropriate, of the input which caused the problem. The line number is formed simply by counting the number of "\n" (linefeed, ASCII 0x0A) characters in the whitespace part of the JSON.

Parsing errors are fatal, so to continue after an error occurs, put the parsing into an eval block:

    my $p;
    eval {
        $p = parse_json ($j);
    };
    if ($@) {
        # handle error
    }

The following error messages are produced:

Unexpected character

An unexpected character (byte) was encountered in the input. For example, at the beginning of a string supposedly containing JSON, only a limited set of characters is possible: the four JSON whitespace characters, "[" and "{", and the characters which can start a string, number, or literal. If the module encounters a plus sign there, it gives an error like this:

    assert_valid_json ('+');

gives output

    JSON error at line 1, byte 1/1: Unexpected character '+' parsing initial state: expecting whitespace: '\n', '\r', '\t', ' ' or start of string: '"' or digit: '0-9' or minus: '-' or start of an array or object: '{', '[' or start of literal: 't', 'f', 'n' 

The message always includes a list of what characters are allowed.

If there is some recognizable structure being parsed, the error message will include its starting point in the form "starting from byte n":

    assert_valid_json ('{"this":"\a"}');

gives output

    JSON error at line 1, byte 11/13: Unexpected character 'a' parsing string starting from byte 9: expecting escape: '\', '/', '"', 'b', 'f', 'n', 'r', 't', 'u' 

A feature of JSON is that parsing it requires only one byte to be examined at a time. Thus almost all parsing problems can be handled using the "Unexpected character" error type, including spelling errors in literals:

    assert_valid_json ('[true,folse]');

gives output

    JSON error at line 1, byte 8/12: Unexpected character 'o' parsing literal starting from byte 7: expecting 'a' 

and the missing second half of a surrogate pair:

    assert_valid_json ('["\udc00? <-- should be a second half here"]');

gives output

    JSON error at line 1, byte 9/44: Unexpected character '?' parsing unicode escape starting from byte 3: expecting '\' 

All kinds of errors can occur parsing numbers, for example a missing fraction,

    assert_valid_json ('[1.e9]');

gives output

    JSON error at line 1, byte 4/6: Unexpected character 'e' parsing number starting from byte 2: expecting digit: '0-9' 

and a leading zero,

    assert_valid_json ('[0123]');

gives output

    JSON error at line 1, byte 3/6: Unexpected character '1' parsing number starting from byte 2: expecting whitespace: '\n', '\r', '\t', ' ' or comma: ',' or end of array: ']' or dot: '.' or exponential sign: 'e', 'E' 

The error message is this complicated because all of the following are valid here: whitespace: [0 ]; comma: [0,1]; end of array: [0]; dot: [0.1]; or exponential: [0e0].

These are all handled by this error. Thus the error messages are a little confusing as diagnostics.

Versions of this module prior to 0.29 gave more informative messages like "leading zero in number". (The messages weren't documented.) The reason for changing over to the single message was that it makes the parsing code simpler, and that the testing code described in "TESTING" makes use of the internals of this error to check that the error messages produced actually do correspond to the invalid and valid bytes allowed by the parser, at the exact byte given.

This is a bytewise error. Thus, for example, if miscoded UTF-8 appears in the input, an error message saying what bytes would be valid at that point is printed.

    no utf8;
    use JSON::Parse 'assert_valid_json';
    
    # Error in first byte:
    
    my $bad_utf8_1 = chr (hex ("81"));
    eval { assert_valid_json ("[\"$bad_utf8_1\"]"); };
    print "$@\n";
    
    # Error in third byte:
    
    my $bad_utf8_2 = chr (hex ('e2')) . chr (hex ('9C')) . 'b';
    eval { assert_valid_json ("[\"$bad_utf8_2\"]"); };
    print "$@\n";

prints

    JSON error at line 1, byte 3/5: Unexpected character 0x81 parsing string starting from byte 2: expecting printable ASCII or first byte of UTF-8: '\x20-\x7f', '\xC2-\xF4' at examples/bad-utf8.pl line 10.
    
    JSON error at line 1, byte 5/7: Unexpected character 'b' parsing string starting from byte 2: expecting bytes in range 80-bf: '\x80-\xbf' at examples/bad-utf8.pl line 16.

Unexpected end of input

The end of the input was reached before the end of whatever was being parsed. For example, if a quote is missing from the end of the string, it will give an error like this:

    assert_valid_json ('{"first":"Suzuki","second":"Murakami","third":"Asada}');

gives output

    JSON error at line 1: Unexpected end of input parsing string starting from byte 47 

Not surrogate pair

While parsing a string, a surrogate pair was encountered. While trying to turn this into UTF-8, the second half of the surrogate pair turned out to be an invalid value.

    assert_valid_json ('["\uDC00\uABCD"]');

gives output

    JSON error at line 1: Not surrogate pair parsing unicode escape starting from byte 11 

Empty input

This error occurs for "assert_valid_json" when it's given an empty or undefined value. Given empty input, "parse_json" returns an undefined value rather than throwing an error.

Name is not unique

This error occurs when parsing JSON when the user has chosen "detect_collisions". For example an input like

    my $p = JSON::Parse->new ();
    $p->detect_collisions (1);
    $p->run ('{"hocus":1,"pocus":2,"hocus":3}');

gives output

    JSON error at line 1, byte 23/31: Name is not unique: "hocus" parsing object starting from byte 1 at blib/lib/JSON/Parse.pm line 101.

where the JSON object has two keys with the same name, hocus. The terminology "name is not unique" is from the JSON specification.

$json_diagnostics

Experimentally, there is a global variable $JSON::Parse::json_diagnostics, which, if true, causes errors to be output as JSON rather than text:

    $JSON::Parse::json_diagnostics = 1;
    assert_valid_json ("{'not':'valid'}");

This outputs the following:

    {"input length":15,"bad type":"object","error":"Unexpected character","bad byte position":2,"bad byte contents":39,"start of broken component":1,"valid bytes":[0,0,0,0,0,0,0,0,0,1,1,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0]}

That means that, in a string of length 15 bytes, the JSON component which looked like an object starting from byte 1 is broken at byte 2, because it has a bad character there of ASCII value 39 (a single quote mark), where the bytes allowed were as described in the array of valid bytes. "valid bytes" is a 256-item array whose values are 1 for allowed bytes and 0 otherwise.

This is intended for people who want to make, say, a "broken JSON repair" module, so that they can analyze errors without having to parse the above kinds of diagnostic string. The contents of the JSON diagnostics are not currently documented and are subject to change, so please view the source code (file json-common.c) or see what the errors look like by adding incorrect JSON and viewing the results.

Contradictory values for "true" and "false"

User-defined value for JSON false evaluates as true

This happens if you set JSON false to map to a true value:

    $jp->set_false (1);

To switch off this warning, use "no_warn_literals".

This warning was added in version 0.38.

User-defined value for JSON true evaluates as false

This happens if you set JSON true to map to a false value:

    $jp->set_true (undef);

To switch off this warning, use "no_warn_literals".

This warning was added in version 0.38.

User-defined value overrules copy_literals

This warning is given if you set up literals with "copy_literals" and then also set up your own true, false, or null values with "set_true", "set_false", or "set_null".

This warning was added in version 0.38.
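
The literal-related warnings above can be explored with a short sketch like the following. It assumes that "no_warn_literals" takes a true argument and that the methods behave as described in their own sections of this document. The value 'no' is an arbitrary example.

```perl
use strict;
use warnings;
use JSON::Parse;

my $jp = JSON::Parse->new ();
# Without this call, the set_false below is expected to give the warning
# "User-defined value for JSON false evaluates as true".
$jp->no_warn_literals (1);
# 'no' is an arbitrary example value. As a non-empty string, it is a
# true value in Perl, which is what the warning is about.
$jp->set_false ('no');
my $out = $jp->run ('[true, false]');
print "JSON false became '$out->[1]'\n";
```

Mapping JSON false to a true Perl value can be useful for round trips, but code which tests the value with "if" no longer sees it as false, which is why the module warns by default.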

PERFORMANCE ^

On the author's computer, the module's speed of parsing is approximately the same as JSON::XS, with small variations depending on the type of input. For validation, "valid_json" is faster than any other module known to the author, and up to ten times faster than JSON::XS.

Some special types of input, such as floating point numbers containing an exponential part, like "1e09", seem to be about two or three times faster to parse with this module than with JSON::XS. In JSON::Parse, parsing of exponentials is done by the system's strtod function, but JSON::XS contains its own parser for exponentials, so these results may be system-dependent.

At the moment the main place JSON::XS wins over JSON::Parse is in strings containing escape characters, where JSON::XS is about 10% faster on the module author's computer and compiler. As of version 0.33, despite some progress in improving JSON::Parse, I haven't been able to fully work out the reason behind the better speed.

There is some benchmarking code in the GitHub repository under the directory "benchmarks" for those wishing to test these claims. The script benchmarks/bench is an adaptation of the similar script in the JSON::XS distribution. The script pub-bench.pl runs the benchmarks and prints them out as POD.

The following benchmark tests used version 0.38 of JSON::Parse and version 3.01 of JSON::XS on Perl version 5.18.2, compiled with Clang version 3.4.1 on FreeBSD 10.1. The files are in the "benchmarks" directory of JSON::Parse. "short.json" and "long.json" are the benchmarks used by JSON::XS.

short.json
    Repetitions: 10 x 100 = 1000
    --------------+------------+------------+
    module        |      1/min |        min |
    --------------|------------|------------|
    JP::valid     | 838860.800 |  0.0000119 |
    JSON::Parse   | 277768.477 |  0.0000360 |
    JSON::XS      | 257319.264 |  0.0000389 |
    --------------+------------+------------+
long.json
    Repetitions: 10 x 100 = 1000
    --------------+------------+------------+
    module        |      1/min |        min |
    --------------|------------|------------|
    JP::valid     |  14009.031 |  0.0007138 |
    JSON::Parse   |   5047.905 |  0.0019810 |
    JSON::XS      |   5602.116 |  0.0017850 |
    --------------+------------+------------+
words-array.json
    Repetitions: 10 x 100 = 1000
    --------------+------------+------------+
    module        |      1/min |        min |
    --------------|------------|------------|
    JP::valid     | 287281.096 |  0.0000348 |
    JSON::Parse   |  32488.799 |  0.0003078 |
    JSON::XS      |  31441.559 |  0.0003181 |
    --------------+------------+------------+
exp.json
    Repetitions: 10 x 100 = 1000
    --------------+------------+------------+
    module        |      1/min |        min |
    --------------|------------|------------|
    JP::valid     | 133576.561 |  0.0000749 |
    JSON::Parse   |  52363.346 |  0.0001910 |
    JSON::XS      |  19803.135 |  0.0005050 |
    --------------+------------+------------+
literals.json
    Repetitions: 10 x 100 = 1000
    --------------+------------+------------+
    module        |      1/min |        min |
    --------------|------------|------------|
    JP::valid     | 303935.072 |  0.0000329 |
    JSON::Parse   |  47662.545 |  0.0002098 |
    JSON::XS      |  28493.913 |  0.0003510 |
    --------------+------------+------------+
cpantesters.json
    Repetitions: 10 x 100 = 1000
    --------------+------------+------------+
    module        |      1/min |        min |
    --------------|------------|------------|
    JP::valid     |   1401.371 |  0.0071359 |
    JSON::Parse   |    209.319 |  0.0477741 |
    JSON::XS      |    207.542 |  0.0481830 |
    --------------+------------+------------+
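
For a quick comparison on your own machine without the repository scripts, something like the following sketch may be enough. It is not the benchmarking code used for the tables above. It uses the core Benchmark module, an invented sample input, and an arbitrary repetition count, and it includes JSON::XS only if that module happens to be installed.

```perl
use strict;
use warnings;
use Benchmark 'timethese';
use JSON::Parse qw(parse_json valid_json);

# Invented sample input. Replace it with the contents of a real file.
my $json = '{"numbers":[1,2.5,1e09],"text":"golden fleece","flag":true}';

my %tests = (
    'JP::valid'   => sub { valid_json ($json) },
    'JSON::Parse' => sub { parse_json ($json) },
);
# JSON::XS is optional here, so only time it if it can be loaded.
if (eval { require JSON::XS; 1 }) {
    $tests{'JSON::XS'} = sub { JSON::XS::decode_json ($json) };
}

# 200_000 repetitions is an arbitrary choice. timethese prints the timings.
my $results = timethese (200_000, \%tests);
```

The numbers depend heavily on the input, the compiler, and the machine, so they should not be expected to match the tables.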

SEE ALSO ^

RFC 7159

JSON is specified in RFC 7159 "The JavaScript Object Notation (JSON) Data Interchange Format".

json.org

http://json.org is the website for JSON, authored by Douglas Crockford.

JSON::Create

JSON::Create is a companion module to JSON::Parse by the same author. As of version 0.08, I'm using it everywhere, but it should still be considered to be in a testing stage. Please feel free to try it out.

Other CPAN modules for parsing and producing JSON

JSON

This is actually a combination module for JSON::PP and JSON::XS.

JSON::PP

Part of the Perl core. This is a JSON module written in Perl only, without the XS (C-based) parsing. It is slower but may be necessary if you cannot install modules requiring a C compiler.

JSON::XS

All-purpose JSON module in XS (requires a C compiler to install).

Cpanel::JSON::XS

Fork of JSON::XS related to a disagreement about how to report bugs. Please see the module for details.

JSON::DWIW

"Does what I want" module.

JSON::YAJL

Wraps a C library called yajl.

JSON::Util

Relies on JSON::MaybeXS.

Pegex::JSON

Based on Pegex.

JSON::Streaming::Reader and JSON::Streaming::Writer

JSON::Syck

Takes advantage of a similarity between YAML ("YAML Ain't Markup Language", originally "Yet Another Markup Language") and JSON to provide a JSON parser/producer using YAML::Syck.

JSON::SL

Inline::JSON

Relies on JSON.

JBD::JSON

The module is undocumented so I am not sure what it does.

Glib::JSON

Uses the JSON library from Glib, a library of C functions for the Linux GNOME desktop project.

Mojo::JSON

Part of the Mojolicious standalone web framework, "pure Perl" JSON reader/writer. As of version 6.25 of Mojolicious, this actually depends on JSON::PP.

JSON::Tiny

A fork of Mojo::JSON.

Special-purpose modules

JSON::MultiValueOrdered and JSON::Tiny::Subclassable

JSON::MultiValueOrdered is a special-purpose module for parsing JSON objects which have key collisions (something like {"a":1,"a":2}) within objects.

(JSON::Parse's handling of key collisions is discussed in "Key collisions" in this document.)

Test::JSON

This offers a way to compare two different JSON strings to see if they refer to the same object.

JSON::XS::Sugar

boolean

This module offers true and false literals similar to JSON.

Type-related modules

JSON::Types

This untangles the messy Perl representation of numbers, strings, and booleans into JSON types.

JSON::TypeInference

JSON::Typist

Combination modules

These modules present a more consistent and improved interface which can rely on more than one of the above back-end modules at once. This protects the user from incompatible changes in module APIs, and by relying on more than one back-end the users are also protected from the personality clashes between various temperamental module maintainers. Many CPAN modules involving JSON now rely on a "master module" rather than using the above JSON modules directly.

JSON::MaybeXS

A "combination module", the currently fashionable choice, which combines Cpanel::JSON::XS, JSON::XS, and the original JSON.

JSON::Any

A now-deprecated "combination module" which combines JSON::DWIW, JSON::XS versions one and two, and JSON::Syck.

JSON::XS::VersionOneAndTwo

A "combination module" which supports two different interfaces of JSON::XS. However, JSON::XS is now onto version 3.

Mojo::JSON::MaybeXS

Pulls in JSON::MaybeXS instead of Mojo::JSON.

JSON extensions

These modules extend JSON with comments and other things.

JSON::Relaxed

"An extension of JSON that allows for better human-readability".

JSONY

"Relaxed JSON with a little bit of YAML"

JSON::Diffable

"A relaxed and easy diffable JSON variant"

There are also a lot of modules in the CPAN JSON:: namespace which use JSON as a basis for other things, but with apologies I don't try to cover those modules here, since there are so many of them.

SCRIPT ^

A script "validjson" is supplied with the module. This runs "assert_valid_json" on its inputs, so run it like this:

    validjson *.json

The default behaviour is to just do nothing if the input is valid. For invalid input it prints what the problem is:

    validjson ids.go 
    ids.go: JSON error at line 1, byte 1/7588: Unexpected character '/' parsing initial state: expecting whitespace: '\n', '\r', '\t', ' ' or start of string: '"' or digit: '0-9' or minus: '-' or start of an array or object: '{', '[' or start of literal: 't', 'f', 'n' at /home/ben/software/install/bin/validjson line 21.

If you need confirmation, use its --verbose option, abbreviated here to -v:

    validjson -v *.json

    atoms.json is valid JSON.
    ids.json is valid JSON.
    kanjidic.json is valid JSON.
    linedecomps.json is valid JSON.
    radkfile-radicals.json is valid JSON.

The script uses Path::Tiny for reading files, which is not a dependency of this module, so if you want to use the script, you also need to install Path::Tiny.
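
If you would rather not install Path::Tiny, a script with similar behaviour can be sketched using only "assert_valid_json" and core Perl. The following is not the source of validjson, and the function name check_file is invented for this example.

```perl
use strict;
use warnings;
use JSON::Parse 'assert_valid_json';

# Return an error message if $file cannot be read or does not contain
# valid JSON, or the undefined value if it is valid.
sub check_file
{
    my ($file) = @_;
    open my $in, '<:raw', $file or return "Cannot open: $!";
    # Read the whole file at once, as bytes.
    my $text = do { local $/; <$in> };
    close $in;
    # assert_valid_json dies with a descriptive message for invalid input.
    my $ok = eval { assert_valid_json ($text); 1 };
    return undef if $ok;
    my $error = $@;
    chomp $error;
    return $error;
}

for my $file (@ARGV) {
    my $error = check_file ($file);
    print "$file: $error\n" if defined $error;
}
```

Like validjson in its default mode, this prints nothing for valid files.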

TEST RESULTS ^

The CPAN testers results are at the usual place.

The ActiveState test results are at http://code.activestate.com/ppm/JSON-Parse/.

EXPORTS ^

The module exports nothing by default. Functions "parse_json", "parse_json_safe", "json_file_to_perl", "valid_json" and "assert_valid_json", as well as the old function names "validate_json" and "json_to_perl", can be exported on request.

All of the functions can be exported using the tag ':all':

    use JSON::Parse ':all';
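
To import only the functions you need, list them by name. A minimal sketch, with an invented sample input:

```perl
use strict;
use warnings;
# Import just these two functions instead of everything.
use JSON::Parse qw(parse_json valid_json);

my $text = '{"golden":"fleece"}';
my $perl;
# Check the input with the fast validator before parsing it.
if (valid_json ($text)) {
    $perl = parse_json ($text);
    print $perl->{golden}, "\n";
}
```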

TESTING ^

The module incorporates extensive testing related to the production of error messages and validation of input. Some of the testing code is supplied with the module in the /t/ subdirectory of the distribution.

More extensive testing code is in the git repository. This is not supplied in the CPAN distribution. A script, randomjson.pl, generates a set number of bytes of random JSON and checks that the module's bytewise validation of input is correct. It does this by taking a valid fragment, then adding each possible byte from 0 to 255 to see whether the module correctly identifies it as valid or invalid at that point, then randomly picking one of the valid bytes and adding it to the fragment and continuing the process until a complete valid JSON input is formed. The module has undergone about a billion repetitions of this test.

This setup relies on a C file, json-random-test.c, which isn't in the CPAN distribution, and it also requires Json3.xs to be edited to make the macro TESTRANDOM true (uncomment line 7 of the file). The testing code uses C setjmp/longjmp, so it's not guaranteed to work on all operating systems and is commented out for CPAN releases.

A pure C version called random-test.c also exists. This applies exactly the same tests, and requires no Perl at all.

If you're interested in testing your own JSON parser, the outputs generated by randomjson.pl are quite a good place to start. The default is to produce UTF-8 output, which looks pretty horrible since it tends to produce long strings of UTF-8 garbage. (This is because it chooses randomly from 256 bytes and the end-of-string marker " has only a 1/256 chance of being chosen, so the strings tend to get long and messy). You can mess with the internals of JSON::Parse by setting MAXBYTE in json-common.c to 0x80, recompiling (you can ignore the compiler warnings), and running randomjson.pl again to get just ASCII random JSON things. This breaks the UTF-8 functionality of JSON::Parse, so please don't install that version.
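
The byte-by-byte procedure described in this section can be illustrated with a toy example. The following sketch is not randomjson.pl and does not use this module. In place of a JSON parser it uses an invented function, accepts_prefix, which only knows JSON texts made of nested empty arrays, such as [[],[[]]]. The depth limit is an arbitrary assumption which keeps the output short.

```perl
use strict;
use warnings;

my $max_depth = 3;    # Arbitrary limit on nesting.

# Toy oracle: true if $text is, or can be extended to, a JSON text
# consisting only of nested empty arrays.
sub accepts_prefix
{
    my ($text) = @_;
    my ($depth, $prev) = (0, '');
    for my $c (split //, $text) {
        if ($c eq '[') {
            return 0 unless $prev eq '' || $prev eq '[' || $prev eq ',';
            return 0 if ++$depth > $max_depth;
        }
        elsif ($c eq ']') {
            return 0 unless $depth > 0 && ($prev eq '[' || $prev eq ']');
            $depth--;
        }
        elsif ($c eq ',') {
            return 0 unless $prev eq ']' && $depth > 0;
        }
        else {
            return 0;
        }
        $prev = $c;
    }
    return 1;
}

# Grow a random text one byte at a time.
sub random_text
{
    my $fragment = '';
    while (1) {
        # Try each possible byte from 0 to 255 after the fragment.
        my @valid = grep { accepts_prefix ($fragment . chr $_) } 0 .. 255;
        # A real test would compare @valid with the parser's own idea
        # of the valid bytes at this point.
        last unless @valid;    # Nothing can follow, so the text is complete.
        $fragment .= chr $valid[rand @valid];
    }
    return $fragment;
}

print random_text (), "\n" for 1 .. 3;
```

With a full JSON grammar the valid bytes inside strings include almost everything, which is why the real outputs tend to contain the long strings described above.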

HISTORY ^

Or "why did you make yet another JSON module?"

This module started out under the name JSON::Argo. It was originally a way to escape from having to use the other JSON modules on CPAN.

It only parsed JSON because, when I started this, I didn't know the Perl extension language XS very well and was not confident about making a JSON producer, and parsing was the main job I needed to do. It originally used lex and yacc in the form of flex and bison, since discarded. I also found out that someone else had a JSON parser called Argo in Java, so to save confusion I dropped the name JSON::Argo and renamed this JSON::Parse, keeping the version numbers continuous.

The module has since been completely rewritten, twice, mostly in an attempt to improve performance, after I found that JSON::XS was much faster than the original JSON::Parse. (The first rewrite of the module was not released to CPAN; this is the second one, which explains why some files have names like Json3.xs.) I also hoped to make something useful which wasn't in any existing CPAN module by offering the high-speed validator, "valid_json".

I also rewrote the module due to some bugs I found, for example up to version 0.09 it was failing to accept whitespace after an object key string, so a JSON input of the form { "x" : "y" }, with whitespace between the "x" and the colon, :, would cause it to fail. That was one big reason I created the random testing regime described in "TESTING" above. I believe that the module is now compliant with the JSON specification.

After starting JSON::Create, I realised that some edge case handling in JSON::Parse needed to be improved. This resulted in the addition of the key collision and literal-overriding methods introduced in versions 0.37 and 0.38 of this module.

ACKNOWLEDGEMENTS ^

Shlomi Fish (SHLOMIF) fixed some memory leaks in version 0.40.

AUTHOR ^

Ben Bullock, <bkb@cpan.org>

Request

If you'd like to see this module continued, let me know that you're using it. For example, send an email, write a bug report, star the project's github repository, add a patch, add a ++ on Metacpan.org, or write a rating at CPAN ratings. It really does make a difference. Thanks.

COPYRIGHT & LICENCE ^

This package and associated files are copyright (C) 2013-2016 Ben Bullock.

You can use, copy, modify and redistribute this package and associated files under the Perl Artistic Licence or the GNU General Public Licence.

TERMINOLOGY ^

This defines the terminology used in this document.

Convenience function

In this document, a "convenience function" indicates a function which solves some of the problems, some of the time, for some of the people, but which may not be good enough for all envisaged uses. A convenience function is an 80/20 solution, something which solves (about) 80% of the problems with 20% of the effort. It does the obvious things, but may not do all the things you might want. It is a time-saver for the most basic use cases.

BUGS

In this document, the section BUGS describes possible deficiencies, problems, and workarounds with the module. It's not a guide to bug reporting, or even a list of actual bugs. The name "BUGS" is the traditional name for this sort of section in a Unix manual page.
