NAME

Zoidberg::StringParser - Simple string parser

SYNOPSIS

        use Zoidberg::StringParser;

        my $base_gram = {
            esc => '\\',
            quotes => {
                q{"} => q{"},
                q{'} => q{'},
            },
        };

        my $parser = Zoidberg::StringParser->new($base_gram);

        my @blocks = $parser->split(
            qr/\|/, 
            qq{ls -al | cat > "somefile with a pipe | in it"} );

        # @blocks is now:
        # ('ls -al ', ' cat > "somefile with a pipe | in it"')
        # So it works like split, but respects quotes

DESCRIPTION

This module is a simple syntax parser. It was originally designed to work like the built-in split function, but to respect quotes. The current version is a little more advanced: it uses user-defined grammars to deal with delimiters, an escape char, quotes and braces.
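To illustrate the core idea only, here is a minimal pure-Perl sketch of a split that respects quotes and an escape char. The helper name quote_aware_split is hypothetical and this is not how Zoidberg::StringParser is implemented; it assumes a single-character delimiter and the two quote characters from the SYNOPSIS, and it treats the escape char the same way inside and outside quotes.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Zoidberg::StringParser: split $input on
# the single character $delim, ignoring delimiters that are quoted or escaped.
sub quote_aware_split {
    my ( $delim, $input ) = @_;
    my @parts = ('');
    my $quote = '';    # the quote char we are currently inside, if any
    my @chars = split //, $input;

    while (@chars) {
        my $c = shift @chars;
        if ( $c eq '\\' && @chars ) {
            # keep the escape char and the char it protects verbatim
            $parts[-1] .= $c . shift @chars;
        }
        elsif ($quote) {
            $quote = '' if $c eq $quote;    # closing quote
            $parts[-1] .= $c;
        }
        elsif ( $c eq q{"} || $c eq q{'} ) {
            $quote = $c;                    # opening quote
            $parts[-1] .= $c;
        }
        elsif ( $c eq $delim ) {
            push @parts, '';                # unquoted delimiter: new part
        }
        else {
            $parts[-1] .= $c;
        }
    }

    die "unmatched quote $quote\n" if $quote;
    return @parts;
}

my @blocks = quote_aware_split( '|', q{ls -al | cat > "a | b"} );
print "[$_]\n" for @blocks;
```

Like the parser in the SYNOPSIS, the sketch leaves the quotes in the returned parts and only uses them to decide whether a delimiter counts.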

Yes, I know of the existence of Text::Balanced, but I wanted to do this the hard way :)

All grammars and collections of grammars should be considered PRIVATE once they are used by a Zoidberg::StringParser object.

EXPORT

None by default.

GRAMMARS

TODO

esc: FIXME

Collection

The collection hash is simply a hash of grammars with the grammar names as keys. When a collection is given, all methods accept a grammar name in place of a grammar.

Base grammar

This can be seen as the default grammar. To use it, leave the grammar undefined when calling a method. If a base grammar is defined and you specify a grammar in a method call, the specified grammar will overload the base grammar.
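As a rough sketch of how a base grammar and a collection fit together, the following builds both hashes and hands them to the constructor. Only the esc and quotes keys shown in the SYNOPSIS are used; the grammar names double_only and single_only are made up for this example, and the parser is only created when the module is actually installed.

```perl
use strict;
use warnings;

# Base grammar, as in the SYNOPSIS.
my %base_gram = (
    esc    => '\\',
    quotes => { q{"} => q{"}, q{'} => q{'} },
);

# Collection: grammar names as keys, grammars as values.
# The names 'double_only' and 'single_only' are made up for this example.
my %collection = (
    double_only => { quotes => { q{"} => q{"} } },
    single_only => { quotes => { q{'} => q{'} } },
);

# Only build a parser when the module is actually installed.
my $parser;
if ( eval { require Zoidberg::StringParser; 1 } ) {
    $parser = Zoidberg::StringParser->new( \%base_gram, \%collection );
}

print join( ', ', sort keys %collection ), "\n";
```

With such a parser, a name like 'single_only' could be passed wherever a method expects a grammar, and its values would overload those of the base grammar.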

METHODS

new(\%base_grammar, \%collection, \%settings)

Simple constructor. See "Collection", "Base grammar" and "settings" for explanation of the arguments.

split($grammar, $input, $int)

Splits $input as specified by $grammar.

$input can be either a string or a reference to an array of strings. Such an array reference is used as provided, so it should be possible to use, for example, tied arrays here.

$int is an optional argument specifying the maximum number of parts the input should be split into. Remaining strings are joined and returned as the last part. If you use a grammar with named tokens, these are not counted as a part of the string.

Blocks will by default be passed as scalar refs (unless the grammar's meta function altered them) and tokens as scalars. To be a little compatible with CORE::split, all items (blocks and tokens) are passed as plain scalars if $grammar is or was a Regexp reference. (This behaviour can be faked by giving your grammar a value called 'was_regexp'.) This behaviour is turned off by the "no_split_intel" setting.
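The following is a small sketch of the $int argument and of array input, using a Regexp grammar as in the SYNOPSIS so that all items come back as plain scalars. The input strings are made up for the example, no particular output is claimed, and the calls are skipped when the module is not installed.

```perl
use strict;
use warnings;

# Base grammar, as in the SYNOPSIS.
my %base_gram = (
    esc    => '\\',
    quotes => { q{"} => q{"}, q{'} => q{'} },
);

my $input = q{a | "b | c" | d | e};
my @parts;

# Skip the calls when the module is not installed.
if ( eval { require Zoidberg::StringParser; 1 } ) {
    my $parser = Zoidberg::StringParser->new( \%base_gram );

    # Ask for at most two parts; the remainder is joined into the last one.
    @parts = $parser->split( qr/\|/, $input, 2 );

    # The same kind of call with a reference to an array of strings as input.
    my @from_array = $parser->split( qr/\|/, [ 'a | b', 'c | d' ] );
    print scalar(@from_array), " item(s) from the array input\n";
}

print "$_\n" for @parts;
```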

settings

The %settings hash contains options that control the general behaviour of the parser. Supported settings are:

allow_broken: If this value is set, the parser will not throw an exception if, for example, an unmatched quote occurs.
no_esc_rm: Boolean that tells the parser not to remove the escape char when an escaped token is encountered. Double escapes won't be replaced either. Useful when a string needs to go through a chain of parsers.
no_split_intel: Boolean, disables "intelligent" behaviour of split() when set.
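A minimal sketch of passing settings, assuming that an empty hash is acceptable as the collection argument. It compares a default parser with one that has allow_broken set, on an input with an unmatched quote; the input string is made up, and the calls are skipped when the module is not installed.

```perl
use strict;
use warnings;

# Base grammar, as in the SYNOPSIS.
my %base_gram = (
    esc    => '\\',
    quotes => { q{"} => q{"}, q{'} => q{'} },
);

my %settings = ( allow_broken => 1 );
my $broken   = q{echo "unterminated | cat};
my ( $strict_died, $lenient_died ) = ( 0, 0 );

# Skip the calls when the module is not installed.
if ( eval { require Zoidberg::StringParser; 1 } ) {
    my $strict = Zoidberg::StringParser->new( \%base_gram );

    # Assumption: an empty collection hash is fine when only settings are needed.
    my $lenient = Zoidberg::StringParser->new( \%base_gram, {}, \%settings );

    # Record whether each parser threw an exception on the broken input.
    $strict_died  = eval { $strict->split( qr/\|/, $broken );  1 } ? 0 : 1;
    $lenient_died = eval { $lenient->split( qr/\|/, $broken ); 1 } ? 0 : 1;
}

print "strict died: $strict_died, lenient died: $lenient_died\n";
```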

AUTHOR

Jaap Karssenberg || Pardus [Larus] <pardus@cpan.org>

Contains some code derived from Tie-Hash-Stack-0.09 by Michael K. Neylon.
