
# Marpa::R3 is Copyright (C) 2018, Jeffrey Kegler.
#
# This module is free software; you can redistribute it and/or modify it
# under the same terms as Perl 5.10.1. For more details, see the full text
# of the licenses in the directory LICENSES.
#
# This program is distributed in the hope that it will be
# useful, but it is provided "as is" and without any express
# or implied warranties. For details, see the full text of
# the licenses in the directory LICENSES.

=head1 NAME

Marpa::R3::Event - Parse events

=head1 Synopsis

=for Marpa::R3::Display
name: Event synopsis
normalize-whitespace: 1
partial: 1

    my @events = ();
    my @event_history = ();
    my $next_lexeme_length = undef;

    my $slr = Marpa::R3::Recognizer->new( { grammar => $grammar,
        event_handlers => {
            "'default" => sub () {
                my ($slr, $event_name, undef, undef, undef, $length) = @_;
                if ($event_name eq 'insert d') {
                   $next_lexeme_length = $length;
                }
                push @events, $event_name;
                'pause';
            }
        }
    } );

    push @event_history, join q{ }, "Events at position 0:", sort @events
       if @events;
    @events = ();

    my $input = q{a b c "insert d here" e e f h};
    my $length = length $input;
    my $pos    = $slr->read( \$input );

    READ: while (1) {

        push @event_history, join q{ }, "Events at position $pos:", sort @events
           if @events;
        @events = ();

        if (defined $next_lexeme_length) {
            $slr->lexeme_read_literal('real d', undef, undef, $next_lexeme_length);
            (undef, $pos) = $slr->block_progress();
            $next_lexeme_length = undef;
            next READ;
        }
        if ($pos < $length) {
            $pos = $slr->resume();
            next READ;
        }
        last READ;
    } ## end READ: while (1)

=for Marpa::R3::Display::End

The synopsis is extracted from an example given in full
L<below|/"A messy example">.

=head1 About this document

This document is an overview of B<parse events>.
One example of a circumstance for
which an event can be created is the prediction of a symbol
in the parse.
Another example is the recognition of a symbol in the parse.
A third example is parse exhaustion.

Parse events are often used to allow an application to
switch over to its own custom procedural logic.
Among other things,
an application can do its own "external" scanning of lexemes.
An application may ask Marpa to resume internal scanning at
any point.

=head1 Terminology

=over 4

=item * In contexts where the meaning is clear,
parse events are called B<events>.

=item * In this document, an B<instance> of a symbol
in a parse
means an occurrence
of the symbol in that parse
with a specific start location and
a specific length.
An instance of a symbol is also called
a B<symbol instance>.
A consequence of this definition
is that every symbol instance
has exactly one end location.

=item * A B<potential symbol instance> is one
which might appear in the parse.

=item * An B<actual symbol instance> is one which
actually appears in the parse.

=item * For a specific string
and a specific grammar,
we say that a parse is B<valid>
if some parse of the string
exists
according to the grammar.

=item * In a parse,
a B<nulled symbol instance>,
or B<nulled symbol>, is a symbol instance with a length of zero.

=item * A B<non-nulled symbol instance>,
non-nulled instance,
or B<non-nulled symbol>, is a symbol instance which is not
nulled.

=item * A symbol in a grammar is
a B<nullable symbol> if it has
at least one nulled instance
in at least one valid parse of at least one string.

=item * A symbol in a grammar is
a B<nulling symbol> if,
for all strings,
all of its instances in valid parses are nulled instances.

=item * A symbol in a grammar is
B<non-nulling> if it is not a nulling symbol.

=item * A string is B<actual to location L> if we
know the actual input as far as I<L>,
and the string is consistent with the
actual input.

=item * If we know the input as far as location I<L>
and a symbol I<< <S> >>
that starts at I<L>
is in some parse of some string actual to I<L>,
then
we say that symbol I<< <S> >> is B<acceptable at
a location L>.
Where location I<L> is understood, we just say
that symbol I<< <S> >> is B<acceptable>.

=item * If a symbol I<< <S> >> ends at I<L>,
and we know the parse as far
as location I<L>,
we say that symbol I<< <S> >>
is B<recognized at
a location L>.

=back

For those readers who find formal definitions useful,
some of this terminology is defined more
carefully
L<below|"Details">.

=head1 Types of parse event

Parse events are of two types:

=over 4

=item * B<Instance events> report
the status of one particular symbol instance,
which may be actual or potential.
Prediction events,
for example,
report potential symbol instances.
Instance events are declared in the Marpa::R3 DSL.

=item * B<Location events> report the general status
of the parse at a particular location,
rather than the status of one particular instance.
The symbol instance whose status the event
reports does not have to be an
actual symbol instance.

=back

Any event which is not an instance event is a
location event.
There are two types
of location event:
rejection and exhaustion.
A rejection event is declared using
the grammar's C<rejection> named argument.
An exhaustion event is declared using
the grammar's C<exhaustion> named argument.

An instance event is either
a B<lexeme event> or
a B<non-lexeme event>.
A lexeme event is declared using a
L<C<:lexeme> pseudo-rule|Marpa::R3::DSL/"Lexeme pseudo-rule">.
A non-lexeme event is declared using a
L<named event statement|Marpa::R3::DSL/"Named event statement">.

The various types of parse events are described in detail
L<below|/"Types of parse event">.
The description of
each type of parse event
will state whether it is a location event,
a lexeme event
or a non-lexeme event.

=head1 The life cycle of events

=over 4

=item * A parse event must be B<declared>.

=item * A declared event may B<trigger>.

=back

Once declared, a parse event
may trigger during any event-triggering
recognizer method.
The event-triggering recognizer methods are
L<the recognizer's C<new()>
constructor|Marpa::R3::Recognizer/Constructor>,
L<C<read()>|Marpa::R3::Recognizer/"read()">,
L<C<resume()>|Marpa::R3::Recognizer/"resume()">,
L<C<lexeme_read_block()>|Marpa::R3::Recognizer/"lexeme_read_block()">,
L<C<lexeme_read_literal()>|Marpa::R3::Recognizer/"lexeme_read_literal()">,
L<C<lexeme_read_string()>|Marpa::R3::Recognizer/"lexeme_read_string()">
and the
L<C<lexeme_complete()>|Marpa::R3::Recognizer/"lexeme_complete()"> method.

The location at which a parse event triggers is the B<trigger location>.
An event may trigger at any parse location, including location 0.
It is important to keep in mind that location 0 events will
trigger in
L<the recognizer's C<new()>
constructor|Marpa::R3::Recognizer/Constructor>.

In addition to a trigger location,
an event has an B<event location>.
The event location is the same as
the trigger location,
except for pre-lexeme events.
For pre-lexeme events,
the event location is before
the trigger location.

When an event triggers,
the event handler may B<pause> the recognizer.
If an event handler pauses the recognizer,
it causes the
event-triggering method
to return as soon as all processing at the
B<event location>
is finished, but before
any processing at later locations.

Some event-triggering methods always return
after processing at each parse location is finished,
and for those methods, it makes no difference
if one of the event handlers
pauses the recognizer.
But,
in the case of the
L<C<read()>|Marpa::R3::Recognizer/"read()"> and
the L<C<resume()>|Marpa::R3::Recognizer/"resume()">
methods,
if an event handler pauses the recognizer,
the method may return control to the user
earlier than it otherwise would.

Recall that, in the case of
L<pre-lexeme events|/"Pre-lexeme events">,
the event location is before
the trigger location.
This means,
by using a pre-lexeme event to pause
the recognizer,
a Marpa::R3 app can enforce a form of look-ahead.

An event will only trigger if activated.
By default,
all events are activated when declared.
The triggering of
parse instance events may be controlled with
the L<activate() method|Marpa::R3::Recognizer/"activate()">.

=head1 Event handlers

Events are processed using Perl closures 
called B<event handlers>.
Event handlers are specified using the
L<C<event_handlers> named
argument|Marpa::R3::Recognizer/"event_handlers">
of
L<the recognizer's C<new()>
constructor|Marpa::R3::Recognizer/Constructor>.

Two or more arguments are
passed to the event handler.
The first argument is always the recognizer.
Subsequent arguments are the event data.
The first element of the event data,
which is the second argument of the event handler,
is always the event name.
Whether there is more than one element in the
event data,
and therefore what the third and later arguments
to the event handler are,
depends on the type of event.
L<Below|/"Types of parse event">,
the details of the event data arguments are
described
for each type of event.

An event handler is expected to return a single
value.
This value may be either a string,
or a Perl C<undef>.
If the return value of the event handler is
a string,
it must be either C<'pause'> or C<'ok'>.
If the return value of the event handler is
a Perl C<undef>,
it will be treated as if it was the string
C<'ok'>.

Let I<L> be a parse location.
The recognizer pauses at I<L>
if any of the event handlers triggered
at I<L> return C<'pause'>.
If none of the event handlers triggered
at I<L> return
C<'pause'>,
the recognizer returns control to
the app at I<L>
only if the method
that triggered the event would ordinarily
return control to the app at I<L>.

A return of C<'ok'>
by an event handler
can be thought of as a command to
"continue as you ordinarily would",
and a C<'pause'> return
as an instruction to pause
the recognizer at the current
location.
From this point of view,
a C<'pause'> return value is seen to
override
any number of C<'ok'> return values
at the same parse location.
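
The combining rule just described can be modeled in a few lines of plain Perl.
This is only a sketch of the rule as stated in this section,
not Marpa::R3's internal code.
The function name C<pause_requested> is hypothetical,
and the values passed to it stand in for the return values of the
handlers triggered at a single location.

    use strict;
    use warnings;

    # Hypothetical helper: given the return values of all the handlers
    # triggered at one location, report whether the recognizer pauses.
    sub pause_requested {
        my @returns = @_;
        my $pause   = 0;
        for my $value (@returns) {

            # An undef return is treated as if it were 'ok'.
            my $result = defined $value ? $value : 'ok';
            die qq{Bad handler return value: "$result"\n}
                if $result ne 'ok' and $result ne 'pause';

            # A single 'pause' overrides any number of 'ok' values.
            $pause = 1 if $result eq 'pause';
        }
        return $pause;
    }

    print pause_requested( 'ok', undef, 'pause' ) ? "pause\n" : "continue\n";
    print pause_requested( 'ok', undef )          ? "pause\n" : "continue\n";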

Event handlers are Perl closures,
and an event handler has access to those variables
in scope at the point where it is defined.
When a handler needs to access data
visible only in
multiple, distant scopes,
a "factory" can be used.
L<Below|/"Examples">
are several examples of event
handlers,
including one where
the event handlers are instantiated
by a factory closure.
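
As a minimal illustration of the factory idea,
independent of the fuller examples in this document,
the following plain Perl sketch builds handlers that share a log.
The names C<make_handler>, C<@log> and the labels are hypothetical.
The handlers are invoked by hand here;
in a real application the hash would be passed as the
C<event_handlers> named argument
and the recognizer would make the calls.

    use strict;
    use warnings;

    # Hypothetical factory: each call returns a new event handler that
    # closes over its own label and over a log shared with its siblings.
    sub make_handler {
        my ( $label, $log ) = @_;
        return sub {
            my ( $recce, $event_name ) = @_;
            push @{$log}, "$label: $event_name";
            return 'ok';
        };
    }

    my @log            = ();
    my %event_handlers = (
        'a'  => make_handler( 'completion', \@log ),
        '!a' => make_handler( 'nulling',    \@log ),
    );

    # Stand-in for the call the recognizer would make.  The first
    # argument would be the recognizer; undef is used as a placeholder.
    $event_handlers{'a'}->( undef, 'a' );
    print "$_\n" for @log;

The factory keeps the handler bodies short and lets data from a
distant scope, here the log, be supplied once at construction time.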

An event handler
may not call, directly or indirectly,
certain methods of the same recognizer.
The restricted methods include all those
recognizer methods which may themselves
trigger parse events.
The restricted methods are
L<C<read()>|Marpa::R3::Recognizer/"read()">,
L<C<resume()>|Marpa::R3::Recognizer/"resume()">,
L<C<lexeme_read_block()>|Marpa::R3::Recognizer/"lexeme_read_block()">,
L<C<lexeme_read_literal()>|Marpa::R3::Recognizer/"lexeme_read_literal()">,
L<C<lexeme_read_string()>|Marpa::R3::Recognizer/"lexeme_read_string()">
and
L<C<lexeme_complete()>|Marpa::R3::Recognizer/"lexeme_complete()">.

This restriction applies only to methods of the recognizer
to which the event handler belongs.
Calls to methods of a different recognizer are
not restricted.

=head1 Types of parse event

=head2 Completion events

Completion events are declared in the Marpa::R3 DSL
using the
L<named event statement|Marpa::R3::DSL/"Named event statement">:

=for Marpa::R3::Display
name: completed event statement synopsis
partial: 1
normalize-whitespace: 1

    event 'a' = completed A
    event 'b'=off = completed B
    event 'c'=on = completed C
    event 'd' = completed D

=for Marpa::R3::Display::End

A completion event triggers
when a non-nulled instance of its symbol
is recognized at the current location.
Completion events are non-lexeme events.
A completion parse event can be specified for any
symbol that is not a lexeme.

When a completion event triggers,
its event location
is set to the current location,
which will be the end location
of the instance that triggered the event.
The event is called a "completion"
because, at the event location,
the recognition of its symbol
is "complete".

A completion event passes two arguments to
its event handler:
the recognizer,
and the event name.

=head2 Discard events

=for Marpa::R3::Display
name: discard event statement synopsis 2
partial: 1
normalize-whitespace: 1

    :discard ~ ws event => ws
    ws ~ [\s]+
    :discard ~ [,] event => comma=off
    :discard ~ [;] event => 'semicolon'=on
    :discard ~ [.] event => period

=for Marpa::R3::Display::End

Discard events are specified in
L<discard pseudo-rules|Marpa::R3::DSL/"Discard pseudo-rule">.
They are non-lexeme events.
This may seem counter-intuitive,
but a lexeme must be a symbol visible to the G1
grammar --
discarded symbols
are discarded
before the G1 grammar can see them.

When a discard event triggers,
its event location is set to the current location.
This will be the end location of the discarded text.

A discard event passes
the following arguments to its event handler,
in order:

=over 4

=item *

The recognizer.

=item *

The name of the discard event.

=item *

The index of the block containing the discarded text.

=item *

The start position of the discarded text,
as a zero-based offset from the start of the
block.

=item *

The length, in codepoints, of the discarded text.

=item *

The G1 location of the last lexeme read.

=back

An intended purpose of the G1 location is to allow
the synchronization of data taken from discard events
with the parse tree.
L0 locations are not sufficient to do this,
because Marpa::R3 allows an app to move
backward and forward both within
and between blocks of input.
G1 locations are always in left-to-right order
from the point of view of parse tree.

Note that, since discarded text is never
actually seen by G1,
in the strict sense
it cannot have a G1 location.
The G1 location
reported with the discard event
is that of the last lexeme read before
the discarded text.
All lexemes have G1 locations.
If the discarded text is at the beginning of the parse,
before any lexemes have been read,
the G1 location is reported as zero.
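
The following plain Perl sketch shows one way a discard event handler
might file discarded spans by the G1 location reported with them.
It relies only on the argument order listed above.
The variable names and the simulated argument values are hypothetical,
and the handler is called by hand instead of by a recognizer.

    use strict;
    use warnings;

    # Discarded spans, keyed by the G1 location reported with the event.
    my %discards_by_g1 = ();

    my $on_discard = sub {
        my ( $recce, $event_name, $block_ix, $start, $length, $g1_location )
            = @_;
        push @{ $discards_by_g1{$g1_location} },
            [ $event_name, $block_ix, $start, $length ];
        return 'ok';
    };

    # Simulated calls with made-up values; undef stands in for the
    # recognizer.
    $on_discard->( undef, 'ws',    0, 1, 1, 1 );
    $on_discard->( undef, 'comma', 0, 5, 1, 2 );
    $on_discard->( undef, 'ws',    0, 6, 2, 2 );

    for my $g1 ( sort { $a <=> $b } keys %discards_by_g1 ) {
        my @names = map { $_->[0] } @{ $discards_by_g1{$g1} };
        print "G1 location $g1: @names\n";
    }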

=head2 Nulling events

A nulling event is declared in the Marpa::R3 DSL
using the
L<named event statement|Marpa::R3::DSL/"Named event statement">:

=for Marpa::R3::Display
name: nulled event statement synopsis
partial: 1
normalize-whitespace: 1

    event '!a' = nulled A
    event '!b'=off = nulled B
    event '!c'=on = nulled C
    event '!d' = nulled D

=for Marpa::R3::Display::End

A nulling parse event occurs when a nulled instance
of its symbol is recognized at the current location.
When a nulling event triggers,
its event location
is set to the current location,
which will be
the location where the triggering instance both begins and ends.

A nulling event is a non-lexeme event.
A nulling parse event can be specified for any
symbol that is not a lexeme.
A nulled symbol may derive other nulled symbols,
producing one or more nulled trees;
because a null derivation may be ambiguous,
a nulled symbol may derive more than one nulled
tree.
A set of one or more nulled trees
is called a B<nulled forest>.

When a nulling event triggers for a symbol instance,
all activated nulling events declared
for symbols derived
from the triggered symbol instance will
also trigger.
The triggering of nulling events is recursive,
so that when a nulled symbol instance
triggers an event, it triggers all the events
in the nulled forest derived
from the triggering symbol instance.
Nulled forests are described in more detail
in L<a separate
section|"Nulled forests">.

A nulling event passes two arguments to
its event handler:
the recognizer,
and the event name.

=head2 Prediction events

A prediction event is declared in the Marpa::R3 DSL
using the
L<named event statement|Marpa::R3::DSL/"Named event statement">:

=for Marpa::R3::Display
name: predicted event statement synopsis
partial: 1
normalize-whitespace: 1

    event '^a' = predicted A

=for Marpa::R3::Display::End

A prediction event triggers when
a non-nulling symbol is acceptable at the current location.
When a prediction event triggers,
its event location
is set to the current location.
A prediction may not result in an actual instance of the symbol,
but no actual symbol instance can start
at the event location unless a prediction,
if properly declared and activated,
would trigger at that location.

Prediction parse events may be defined for any symbol,
whether it is a lexeme or not.
But prediction events are non-lexeme events,
even when their symbol is a lexeme.

A prediction event passes two arguments to
its event handler:
the recognizer,
and the event name.

=head2 Pre-lexeme events

=for Marpa::R3::Display
name: Event synopsis
partial: 1
normalize-whitespace: 1

    :lexeme ~ <insert d> pause => before event => 'insert d'

=for Marpa::R3::Display::End

A pre-lexeme event is a lexeme event.
It triggers if the lexeme is scanned at the current location.
When a pre-lexeme event triggers,
its event location
is set to the location where the lexeme starts,
which will be before the trigger location.

The recognizer will B<not> have
read the lexeme
when its pre-lexeme event triggers.
In effect,
a pre-lexeme event "rewinds" the scanning.

For most events, the trigger location is the event location,
but pre-lexeme events are the exception.
Pre-lexeme events set
the event location to the start of the lexeme,
which is consistent with the pre-lexeme event's behavior as a "rewind".
An intended use of pre-lexeme events
is catching a lexeme which
is about to be read, and giving it special treatment.
For more on this, see
L<below|"External scanning">.

There is a lot of similarity
between pre-lexeme events and predictions,
but there are also very important differences.

=over 4

=item *

A pre-lexeme event does not occur
unless the triggering lexeme is actually
found in the input.
On the other hand,
a prediction event is,
as the name suggests, only a prediction --
the triggering lexeme may never
actually be found in the input.

=item *

Even though they have the same
event location,
pre-lexeme and prediction events
do not trigger at the same time,
because pre-lexeme events
require a scan of the lexeme,
while prediction events do not.
If both
are defined for a symbol,
the prediction event will trigger first,
B<before> the lexeme is scanned.
The pre-lexeme event will trigger later,
B<after> the lexeme is scanned.

=item *

Pre-lexeme events can be defined only
for lexemes.
Prediction events can be defined for any
symbol.

=back

A pre-lexeme event passes
the following arguments to its event handler,
in order:

=over 4

=item *

The recognizer.

=item *

The name of the pre-lexeme event.

=item *

The symbol ID of the lexeme.

=item *

The index of the block containing the lexeme.

=item *

The start position of the lexeme,
as a zero-based offset from the start of the
block.

=item *

The length, in codepoints, of the lexeme.

=back

=head2 Post-lexeme events

=for Marpa::R3::Display
name: Event synopsis
partial: 1
normalize-whitespace: 1

    :lexeme ~ <a> pause => after event => '"a"'

=for Marpa::R3::Display::End

A post-lexeme event is a lexeme event.
It triggers if the lexeme is scanned at the current location.
The recognizer will have
already read the lexeme
when its post-lexeme event triggers.

When a post-lexeme event triggers,
its event location
is set to the current location,
which will also be the location where the lexeme ends.

A post-lexeme event passes
the following arguments to its event handler,
in order:

=over 4

=item *

The recognizer.

=item *

The name of the post-lexeme event.

=item *

The symbol ID of the lexeme.

=item *

The index of the block containing the lexeme.

=item *

The start position of the lexeme,
as a zero-based offset from the start of the
block.

=item *

The length, in codepoints, of the lexeme.

=back

=head2 Exhaustion events

=for Marpa::R3::Display
name: exhaustion grammar setting synopsis part 1

    my $g = Marpa::R3::Grammar->new(
        {
            source     => \$dsl,
            exhaustion => 'event',
            rejection  => 'event',
        }
    );

=for Marpa::R3::Display::End

=for Marpa::R3::Display
name: exhaustion grammar setting synopsis part 2
partial: 1
normalize-whitespace: 1

        my @shortest_span = ();

        my %event_handlers1 = (
              'target' => sub {
                   my ($slr) = @_;
                   my (undef, $pos) = $slr->block_progress();
                   @shortest_span = $slr->last_completed('target');
                   diag(
                       "Preliminary target at $pos: ",
                       $slr->g1_literal(@shortest_span)
                   ) if $verbose;
                  return 'pause';
              },
              q{'exhausted} => sub {
                  return 'pause';
              }
        );

        my $recce =
          Marpa::R3::Recognizer->new( { grammar => $g,
              event_handlers => \%event_handlers1,
          }, $recce_debug_args );
        my $pos = $recce->read( \$string, $target_start );

=for Marpa::R3::Display::End

Exhaustion events are location events.
An exhaustion parse event triggers on asynchronous parse exhaustion,
if the grammar's C<exhaustion> setting is "C<event>".
The name of the event is
"C<'exhausted>".
(The initial single quote is part of the event's name,
and indicates it is a reserved name,
which will not conflict with
the name of any user-named event.)

Intuitively, parse exhaustion events are created only when
needed for control to return to the application.
More formally,
parse exhaustion is called B<asynchronous> if it
occurs
in a method, and at a location,
where the method would have continued
reading under "ordinary circumstances".
In this context, "ordinary circumstances" means

=over 4

=item * that parse exhaustion has not occurred, and

=item * that no event handler has paused the recognizer.

=back

Parse exhaustion is called B<synchronous> if it is not
asynchronous.

Parse exhaustion in the
L<C<lexeme_read_block()>|Marpa::R3::Recognizer/"lexeme_read_block()">,
L<C<lexeme_read_literal()>|Marpa::R3::Recognizer/"lexeme_read_literal()">,
L<C<lexeme_read_string()>|Marpa::R3::Recognizer/"lexeme_read_string()">
and the
L<C<lexeme_complete()>|Marpa::R3::Recognizer/"lexeme_complete()">
methods
is always synchronous,
because they always
return control to the app after every attempt to
read input -- they never try to continue reading input.
Parse exhaustion in
L<the recognizer's C<new()>
constructor|Marpa::R3::Recognizer/Constructor>
is always synchronous, because it can only occur
if the grammar is nulling.
Parse exhaustion in
the L<C<read()>|Marpa::R3::Recognizer/"read()">
or the L<C<resume()>|Marpa::R3::Recognizer/"resume()">
methods
may be either synchronous or
asynchronous.

Exhaustion events
are typically used to ensure that the recognizer
pauses,
instead of throwing a fatal error.
For purposes of subsequent processing,
an app often cares whether a parse is exhausted
or not,
but far less often cares
whether that exhaustion is synchronous or asynchronous.
Since an exhaustion event
only occurs on asynchronous exhaustion,
apps will often not use the event to determine whether
the parse is exhausted,
but will instead call
the L<C<exhausted()> method|Marpa::R3::Recognizer/"exhausted()">.
The C<exhausted()> method
reports
both asynchronous and synchronous parse exhaustion.
Exhaustion is discussed further in
L<a separate document|Marpa::R3::Exhaustion>.

=head2 Rejection events

=for Marpa::R3::Display
name: rejection grammar setting synopsis part 1
normalize-whitespace: 1

    my $g = Marpa::R3::Grammar->new(
        {
            source    => \($grammar),
            rejection => 'event'
        }
    );

=for Marpa::R3::Display::End

=for Marpa::R3::Display
name: rejection grammar setting synopsis part 2
partial: 1
normalize-whitespace: 1

    my $rejection = 0;
    my $pos;

    my $recce =
      Marpa::R3::Recognizer->new( { grammar => $g,
         event_handlers => {
             q{'rejected} => sub {
                $rejection = 1;
                diag("You fool! you forget the semi-colon at location $pos!")
                    if $verbose;
                return 'pause'
             }
         },
      }, $recce_debug_args );
    $pos = $recce->read( \$suffixed_string, 0, $original_length );

    READ_LOOP: while (1) {
        (undef, $pos) = $recce->block_progress();
        last READ_LOOP if not $rejection;
        $recce->resume( $original_length, 1 );
        diag("I fixed it for you.  Now you owe me.") if $verbose;
        $rejection = 0;
        $recce->resume( $pos, $original_length - $pos );
    } ## end READ_LOOP: while (1)

=for Marpa::R3::Display::End

Rejection events are location events.
A rejection event triggers if
all lexemes at a G1 location are rejected,
and the grammar's C<rejection> setting is "C<event>".
The name of the event is
"C<'rejected>".
(The initial single quote is part of the event's name,
and indicates it is a reserved name,
which will not conflict with the name
of any user-named event.)

=head1 Triggering

=head2 Lexeme events

A lexeme event will trigger at the current location
if all of the following criteria,
applied in order, are true:

=over 4

=item *

It is declared in a
L<C<:lexeme> pseudo-rule|Marpa::R3::DSL/"Lexeme pseudo-rule">.

=item *

Its lexeme has been scanned by the L0 grammar at that location.

=item *

The G1 grammar would accept its lexeme at that location.

=item *

The event is activated.
An event is activated by default when it
is declared.
Deactivation and reactivation of events is
done with the recognizer's
L<activate() method|Marpa::R3::Recognizer/"activate()">.

=item *

Its lexeme priority is higher than, or equal to,
that of any other lexeme
remaining after the previous criteria
have been applied.

=item *

If it is a post-lexeme event,
none of the other remaining events are pre-lexeme events.
(In other words, a pre-lexeme event prevents
post-lexeme events from triggering at the same location.)

=back

Marpa allows ambiguous lexemes and,
even after all the above criteria have been applied,
there may be more than one lexeme event at a location.
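
The criteria above can be read as a series of filters.
The following toy model in plain Perl applies them in the order listed.
It is an illustration of this section's wording only
and makes no claim about how Marpa::R3 implements the checks.
Each candidate is a hypothetical hash describing one lexeme with one event,
and all the field names are assumptions made for this sketch.

    use strict;
    use warnings;

    # Toy model: return the names of the events that would trigger.
    sub triggering_events {
        my @candidates = @_;

        # Declared, scanned by L0, acceptable to G1, and activated.
        my @live = grep {
                    $_->{declared}
                and $_->{scanned}
                and $_->{acceptable}
                and $_->{activated}
        } @candidates;
        return if not @live;

        # Keep only candidates at the highest remaining priority.
        my ($top) = sort { $b <=> $a } map { $_->{priority} } @live;
        @live = grep { $_->{priority} == $top } @live;

        # A pre-lexeme event suppresses post-lexeme events.
        if ( grep { $_->{pause} eq 'before' } @live ) {
            @live = grep { $_->{pause} ne 'after' } @live;
        }
        return map { $_->{name} } @live;
    }

    my %defaults = (
        declared   => 1,
        scanned    => 1,
        acceptable => 1,
        activated  => 1,
        priority   => 0,
    );
    my @names = triggering_events(
        { %defaults, name => 'pre x',  pause => 'before', priority => 1 },
        { %defaults, name => 'post y', pause => 'after',  priority => 1 },
        { %defaults, name => 'pre z',  pause => 'before' },
        { %defaults, name => 'pre w',  pause => 'before', activated => 0 },
    );
    print join( q{, }, @names ), "\n";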

=head2 Non-lexeme events

Prediction, completion and nulling events are non-lexeme events.
The conditions for a non-lexeme event are simpler than those for
a lexeme event, because they do not involve lexical processing.

A non-lexeme event will trigger at the current location
if all of the following are true:

=over 4

=item *

It is declared in a
L<named event statement|Marpa::R3::DSL/"Named event statement">.

=item *

Its B<triggering condition> is true.  Specifically,

=over 4

=item *

It is a prediction and its symbol is acceptable at the current location; or

=item *

it is a completion or a nulling event and its symbol is recognized
at the current location.

=back

=item *

The event is activated.
An event is activated by default when it
is declared.
Deactivation and reactivation of events is
done with the recognizer's
L<activate() method|Marpa::R3::Recognizer/"activate()">.

=back

=head2 Location events

A location event will trigger at the current location
if all of the following are true:

=over 4

=item *

It is declared using
L<the C<event_handlers> named
argument|Marpa::R3::Recognizer/"event_handlers">
of
L<the recognizer's C<new()>
constructor|Marpa::R3::Recognizer/Constructor>.

=item *

Its B<triggering condition> is true.  Specifically,

=over 4

=item *

it is an exhaustion event, and asynchronous parse exhaustion,
L<as defined above|/"Exhaustion events">,
occurs at the current location; or

=item *

it is a rejection event, and all lexeme alternatives are rejected
at the current location.

=back

=item *

The event is activated.
An event is activated by default when it
is declared.
Deactivation and reactivation of events is
done with the recognizer's
L<activate() method|Marpa::R3::Recognizer/"activate()">.

=back

=head1 Techniques

=head2 External scanning

Switching to external scanning is an intended use case
for all events, other than exhaustion events.
In particular,
the behavior of pre-lexeme events
is most intuitive when thought about with
external scanning in mind.

L<The example code for this document|/"A messy example">
contains an artificially simple example of external
scanning.
The symbol C<< <insert d> >> has a pre-lexeme event
declared:

=for Marpa::R3::Display
name: Event synopsis
partial: 1
normalize-whitespace: 1

    :lexeme ~ <insert d> pause => before event => 'insert d'

=for Marpa::R3::Display::End

When this event triggers,

=over 4

=item * Marpa::R3 returns control to the app
without reading the lexeme actually found
in the physical input;

=item * the app
reads a C<< <d> >> symbol externally,
and

=item * the app resumes internal scanning.

=back

=head2 Markers

It is quite reasonable to create "markers" --
nulling symbols
whose primary (or sole) purpose
is to have nulling events declared for them.
Markers are the only way to declare events that trigger in
the middle of a rule.

=head2 Rules

There are no events explicitly defined in terms of rules,
but every rule event that is wanted can be achieved in
one or more ways.
The most flexible of these, and the best for many purposes,
is to use L<markers|/"Markers">.

Another method is to use the LHS of a rule to track rule
predictions and completions.
This requires that the LHS symbol of the rule be unique to that
rule.

=head1 Implications

This section describes
some implications of the parse events mechanism
that may be unexpected at first.
These implications are Marpa working as designed and,
I hope the reader will agree,
as is desirable.

=head2 Ambiguity

If a parse is ambiguous, events trigger for
B<all> the possible symbols.
A user thinking in terms of one of the parses,
and unaware of the ambiguity, may find this unexpected.
In L<one of the examples|/"A messy example">,
events for both the symbols C<< <ambig1> >>
and C<< <ambig2> >>, as well as all their
derived symbols, trigger.

=head2 Tentative events

Marpa's events are left-eidetic but right-blind.
Left of the event location, Marpa's events are 100% accurate.
Right of the event location, they are totally unaware of
what the actual input will be --
there is no "lookahead".
Because events trigger based on input action
only up to the event location,
events are B<tentative>.

Once the parse is complete,
and the actual input to the right of the event
location is taken into account,
it is quite possible that
none of the parse trees
will actually contain the symbol instance
that triggered an event.

In L<one of the examples|/"A messy example">,
prediction and completion
events are reported for the symbols
C<< <start1> >>,
C<< <start2> >>,
C<< <mid1> >> and
C<< <mid2> >>,
but none of these symbols
winds up in
any of the parse trees.
This is because they are derived from
C<< <ambig1> >> or
C<< <ambig2> >>,
and C<< <ambig1> >> or
C<< <ambig2> >> will never be fully recognized
in any of the parse trees.
The reason that
C<< <ambig1> >> and
C<< <ambig2> >> will not be fully recognized
is that their full recognition requires
that there be a
C<< <z> >> symbol in the input and the
input stream in the example does not contain a
C<< <z> >> symbol.

Exhaustion events are not tentative.
All other parse events are tentative.

In the example, the predictions
for
C<< <mid1> >> and
C<< <mid2> >> do not match anything in
the final parse tree,
because the locations where
C<< <mid1> >> and
C<< <mid2> >> would be predicted are not reached in
those trees.
For similar reasons, nulling events are tentative.

Lexemes can be ambiguous and
when they are ambiguous
one or more of the lexeme alternatives
may not be used in any final parse tree.
Because of this,
lexeme events are also tentative.

After rejection events,
input can be,
and typically is,
retried at the same G1 location.
This is what happens when the Ruby Slippers technique
is used.
Often, on the second or later attempt, one or
more lexemes are found that are acceptable
to the grammar.
For this reason,
rejection events are tentative.

=head2 Nulled forests

When a symbol is nulled, any symbol which can be null-derived
from it
is also nulled.
In L<one of the examples|/"A messy example">,
when the
symbol C<< <g> >> is nulled,
derived symbols
C<< <g1> >>,
C<< <g2> >>,
C<< <g3> >>,
C<< <g4> >>
are also nulled.

Note that what was said about
L<ambiguity|/"Ambiguity">
applies here.
In the example, the symbols
C<< <g1> >> and
C<< <g2> >> are in one derivation,
while C<< <g3> >> and
C<< <g4> >> are in another,
so that not just a parse tree,
but an entire parse forest
is nulled.
(Pedantically, a nulled tree is a forest
of a single tree.)

More precisely,

=over 4

=item *

If the grammar allows
any derivation of the symbol
I<Y> from I<X> in which I<X> and I<Y> are both
nulled; and

=item *

a nulling parse event
is declared for I<Y> and activated; and

=item *

a nulled instance of I<X> is encountered
in the parse at location I<L>; then

=item *

a nulling parse event for I<Y>
will trigger at location I<L>.

=back
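
The rule above can be illustrated outside of Marpa.
The following is a minimal sketch in plain Perl,
and is not Marpa's own algorithm.
It hard-codes the rules for C<< <g> >> from
L<one of the examples|/"A messy example">,
finds the nullable symbols,
and then collects every symbol that can take part in a nulled
derivation from a given symbol.
The names C<@rules>, C<%nullable> and C<null_derived()>
are invented for this illustration.

    use strict;
    use warnings;

    # Hypothetical encoding of the rules: [ LHS, RHS symbols ... ].
    # An entry with only a LHS is an empty rule.
    my @rules = (
        [ 'g',  'g1' ], [ 'g',  'g3' ],
        [ 'g1', 'g2' ], ['g2'],
        [ 'g3', 'g4' ], ['g4'],
    );

    # Step 1: find the nullable symbols by iterating to a fixed point.
    # A symbol is nullable if one of its rules has an all-nullable RHS.
    my %nullable;
    my $changed = 1;
    while ($changed) {
        $changed = 0;
        for my $rule (@rules) {
            my ( $lhs, @rhs ) = @{$rule};
            next if $nullable{$lhs};
            next if grep { !$nullable{$_} } @rhs;
            $nullable{$lhs} = 1;
            $changed = 1;
        }
    }

    # Step 2: starting from a nullable symbol, follow only those rules
    # whose RHS is entirely nullable, and collect every symbol seen.
    sub null_derived {
        my ($start) = @_;
        my %seen = ( $start => 1 );
        my @todo = ($start);
        while ( defined( my $symbol = shift @todo ) ) {
            for my $rule (@rules) {
                my ( $lhs, @rhs ) = @{$rule};
                next if $lhs ne $symbol;
                next if grep { !$nullable{$_} } @rhs;
                for my $rhs_symbol (@rhs) {
                    next if $seen{$rhs_symbol}++;
                    push @todo, $rhs_symbol;
                }
            }
        }
        return sort keys %seen;
    }

    my @nulled_with_g = null_derived('g');
    print join( q{ }, @nulled_with_g ), "\n";

For C<< <g> >>, the sketch lists C<< <g> >> itself and
C<< <g1> >> through C<< <g4> >>,
which matches the nulling events reported at position 27
in the example.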

=head2 Events and instances

As stated above, only nulled instances generate nulling events,
and only non-nulled instances generate prediction events
and completion events.
Since lexemes cannot be zero length, this means that,
for a given symbol instance,
nulling events and all other events
are mutually exclusive.
In other words, if a nulling event occurs for an
instance, no other event will trigger for that instance.

Some cases may seem to violate this rule.
For example,
at position 23
in the parse in
L<the code below|/"A messy example">,
we have four events
of four different types,
all for the symbol C<< <e> >>.
In addition to
a nulling event, there is
a post-lexeme event,
a prediction event
and a completion event:

=for Marpa::R3::Display
name: Event synopsis
normalize-whitespace: 1
partial: 1

    Events at position 23: "e" ^e ^f e$ e[]

=for Marpa::R3::Display::End

The reason for this is that these events are
for three different symbol instances, all of which
share the same trigger location:

=over 4

=item 1

A nulled instance at location 23.

=item 2

A potential non-nulled instance, which may begin
at location 23.

=item 3

A non-nulled instance, which begins at location 22
and ends at location 23.

=back

The prediction of the second instance is, in fact,
fulfilled, as reported at location 25:

=for Marpa::R3::Display
name: Event synopsis
normalize-whitespace: 1
partial: 1

    Events at position 25: "e" ^f e$

=for Marpa::R3::Display::End

The second instance is length 1 and predicted at location
23, but its completion is reported at location 25.
This is because whitespace delayed its start by one position.

=for Marpa::R3::Display
name: Event synopsis
normalize-whitespace: 1
partial: 1

    Events at position 21: ^e ^f d$ e[] mid1$ mid2$

=for Marpa::R3::Display::End

The third instance is reported as predicted at position 21,
even though it actually begins at position 22.
The delayed start is
because of whitespace.

Prediction and completion events exclude
nulled symbols,
because there is no practical distinction between predicting
a nulled symbol, and actually seeing one.
This means that the prediction and completion of a nulled symbol
would always occur together.
This very special nature of nulled symbols motivates their
treatment as a special case.

=head1 Examples

=head2 Grammar 1

This grammar will be used in the next few examples.

=for Marpa::R3::Display
name: event examples; grammar 1
normalize-whitespace: 1

    my $dsl1 = <<'END_OF_DSL';
        top ::= A B C
        A ::= 'a'
        B ::= 'b'
        C ::= 'c'
        event A = completed A
        event B = completed B
        event C = completed C
        :discard ~ ws
        ws ~ [\s]+
    END_OF_DSL

    my $grammar1 = Marpa::R3::Grammar->new( { source => \$dsl1 } );

=for Marpa::R3::Display::End

=head2 Basic example

=for Marpa::R3::Display
name: event examples - basic
normalize-whitespace: 1

    @results = ();
    $recce   = Marpa::R3::Recognizer->new(
        {
            grammar        => $grammar1,
            event_handlers => {
                A => sub () { push @results, 'A'; 'ok' },
                B => sub () { push @results, 'B'; 'ok' },
                C => sub () { push @results, 'C'; 'ok' },
            }
        }
    );

=for Marpa::R3::Display::End

=head2 A default event handler

=for Marpa::R3::Display
name: event examples - default
normalize-whitespace: 1

    @results = ();
    $recce = Marpa::R3::Recognizer->new(
        {
            grammar        => $grammar1,
            event_handlers => {
                "'default" => sub () {
                    my ( $slr, $event_name ) = @_;
                    push @results, $event_name;
                    'ok';
                },
            }
        }
    );

=for Marpa::R3::Display::End

=head2 Using both default and explicit handlers

The next example processes event "C<A>" with
an explicit handler,
and leaves the others to a default handler.

=for Marpa::R3::Display
name: event examples - default and explicit
normalize-whitespace: 1

    @results = ();
    $recce = Marpa::R3::Recognizer->new(
        {
            grammar        => $grammar1,
            event_handlers => {
                A => sub () { push @results, 'A'; 'ok' },
                "'default" => sub () {
                    my ( $slr, $event_name ) = @_;
                    push @results, "!A=$event_name";
                    'ok';
                },
            }
        }
    );

=for Marpa::R3::Display::End

=head2 Exhaustion and rejection events

The next example shows how exhaustion and
rejection events can be handled.
Note that, in this example, the event
handler pauses the recognizer so that
it can process these events outside the
recognizer, perhaps ending or abending
the parse.

=for Marpa::R3::Display
name: event examples - rejected and exhausted
normalize-whitespace: 1

    my $dsl2 = <<'END_OF_DSL';
            top ::= A B C
            A ::= 'a'
            B ::= 'b'
            C ::= 'c'
            :discard ~ ws
            ws ~ [\s]+
    END_OF_DSL

    my $grammar2 = Marpa::R3::Grammar->new(
        {
            source => \$dsl2,
            rejection => 'event',
            exhaustion => 'event',
        },
    );

    @results = ();
    $recce = Marpa::R3::Recognizer->new(
        {
            grammar        => $grammar2,
            event_handlers => {
                "'rejected" => sub () { @results = ('rejected'); 'pause' },
                "'exhausted" => sub () { @results = ('exhausted'); 'pause' },
            }
        }
    );

=for Marpa::R3::Display::End

=head2 Events with associated data

The next two examples show how event handlers
access data.
This example shows an event which has event data
directly associated with it.
This data is passed in the arguments to the handler.

=for Marpa::R3::Display
name: event examples - event with data
normalize-whitespace: 1

    my $dsl3 = <<'END_OF_DSL';
            top ::= A B C
            A ~ 'a' B ~ 'b' C ~ 'c'
            :lexeme ~ <A> pause => after event => 'A'
            :lexeme ~ <B> pause => after event => 'B'
            :lexeme ~ <C> pause => after event => 'C'
            :discard ~ ws
            ws ~ [\s]+
    END_OF_DSL

    my $grammar3 = Marpa::R3::Grammar->new(
        {
            source => \$dsl3,
            rejection => 'event',
            exhaustion => 'event',
        },
    );

    @results = ();
    $recce = Marpa::R3::Recognizer->new(
        {
            grammar        => $grammar3,
            event_handlers => {
                "'default" => sub () {
                    my ( $slr, $event_name, $symid, $block_ix, $start, $length ) = @_;
                    my $symbol_name = $grammar3->symbol_name($symid);
                    push @results,
                        "event $event_name for symbol $symbol_name; block $block_ix, location $start, length=$length";
                    'ok';
                },
            }
        }
    );

=for Marpa::R3::Display::End

=head2 Handlers requiring local and non-local data

We've already shown handlers which use global data.
We took advantage of the fact that handlers, like all
Perl functions, are closures.

The next example shows the use of a "factory" for making
handlers.
This is the most powerful and general method,
capable of taking
data that is only available
in several widely dispersed scopes,
and making it available
to the handler at event processing time.

The anonymous handler returned
by C<factory()> requires a global, a non-local
and a local variable to do its job.
Call this anonymous handler I<anon>.
The global C<$A_global> is accessible to I<anon>
at event processing time because I<anon> is a
closure.

The variable C<$B_non_local> is not a global --
it is local to
C<factory()>.
But C<$B_non_local> is non-local to
I<anon>.
Nonetheless,
because I<anon> is a closure,
C<$B_non_local> is available to I<anon>
at event processing time.

Finally, C<$C_local> is not available
when I<anon> is defined,
but is local in the scope
in which I<anon> is instantiated.
C<$C_local> is made available
to I<anon> by passing it
as an argument
to C<factory()>.

=for Marpa::R3::Display
name: event examples - data, using factory
normalize-whitespace: 1

    @results = ();
    my $A_global = 'A';

    sub factory {
        my ($local_arg) = @_;
        my $B_non_local = 'B';
        return sub () {
           my ($slr, $event_name) = @_;
           my $result;
           $result = $A_global if $event_name eq 'A';
           $result = $B_non_local if $event_name eq 'B';
           $result = $local_arg if $event_name eq 'C';
           push @results, $result;
           'ok';
        }
    }


    sub example_closure {
        my $C_local = 'C';
        return Marpa::R3::Recognizer->new(
            {
                grammar        => $grammar1,
                event_handlers => {
                    "'default" => factory($C_local),
                }
            }
        );
    }

=for Marpa::R3::Display::End

=head2 Per-location event processing, using pause

Multiple events can occur at one parse location.
Call this parse location I<pos>.
Callback handlers easily handle situations where the
events are indifferent to whether they occur together
at I<pos> or not.
But, for many apps, it is very important
to be able to handle all the events occurring
at I<pos> as a group.

There are two ways to do this.

=over 4

=item * B<Pause> method:
Return C<'pause'> from any of the event
handlers at I<pos>.
The recognizer will return control to the app
once all processing at I<pos> is complete,
and the app can do as it wishes.

=item * B<AoA> method:
Store the data for the event in
an AoA (array of arrays) indexed by I<pos>,
and process the events later,
from the AoA.

=back

The "pause" method is the more general of the two.
The next example illustrates it.

=for Marpa::R3::Display
name: event examples - per-location using pause
normalize-whitespace: 1

    my $dsl4 = <<'END_OF_DSL';
            top ::= A B C
            A ::= 'a' B ::= 'b' C ::= 'c'
            event '^A' = predicted A
            event 'A$' = completed A
            event '^B' = predicted B
            event 'B$' = completed B
            event '^C' = predicted C
            event 'C$' = completed C
            :discard ~ ws
            ws ~ [\s]+
    END_OF_DSL

    my $grammar4 = Marpa::R3::Grammar->new(
        {
            source => \$dsl4,
        },
    );

    @results = ();
    $recce = Marpa::R3::Recognizer->new(
        {
            grammar        => $grammar4,
            event_handlers => {
                "'default" => sub () {
                    my ( $slr, $event_name ) = @_;
                    push @results, $event_name;
                    'pause';
                },
            }
        }
    );

=for Marpa::R3::Display::End

=head2 Per-location event processing, using an AoA

This example
illustrates the AoA (Array Of Arrays)
method of handling events
that need to be processed in sets, grouped
by trigger location.
The events are gathered into an array of arrays,
and processed when the recognizer is finished.

The previous example illustrated the more powerful
and general "pause" method.
The AoA method can be more elegant,
and is powerful enough for many cases.

=for Marpa::R3::Display
name: event examples - per-location processing, using AoA
normalize-whitespace: 1

    @results = ();
    $recce = Marpa::R3::Recognizer->new(
        {
            grammar        => $grammar4,
            event_handlers => {
                "'default" => sub () {
                    my ( $slr, @event_data ) = @_;
                    my (undef, $pos) = $slr->block_progress();
                    $pos //= 0;
                    push @{$results[$pos]}, \@event_data;
                    'ok';
                },
            }
        }
    );

=for Marpa::R3::Display::End
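
Once the recognizer has finished, the AoA can be walked in
position order.
The following is a minimal sketch in plain Perl.
It assumes that C<@results> was filled by a handler like the one
above, so that each defined entry is an array of references to
C<@event_data> arrays whose first element is the event name.
The sample data and its positions are hypothetical,
chosen only to make the sketch self-contained.
They are not output from Marpa.

    use strict;
    use warnings;

    # Hypothetical sample data, shaped like the AoA built by the handler:
    # $results[$pos] is an array of event data arrays.
    my @results;
    push @{ $results[1] }, ['A$'], ['^B'];
    push @{ $results[3] }, ['B$'], ['^C'];
    push @{ $results[5] }, ['C$'];

    # Process the events in groups, one group per trigger location.
    my @report;
    POS: for my $pos ( 0 .. $#results ) {
        my $events_at_pos = $results[$pos];

        # Locations where no event triggered are undefined in the AoA.
        next POS if not defined $events_at_pos;

        my @names = sort map { $_->[0] } @{$events_at_pos};
        push @report, "Events at position $pos: " . join( q{ }, @names );
    }
    print "$_\n" for @report;

Because the grouping happens after the fact,
this approach suits apps that do not need to change the course
of the parse in response to the events.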

=head2 A messy example

The Marpa::R3 DSL script in this example
is designed to
include the unusual
cases described in this document.
It is also a second example of the "pause"
method for per-location processing.

These "corner" cases are unlikely to occur
all in a single app.
Hopefully
this grammar
will not resemble any grammar that you
encounter in practice.

=for Marpa::R3::Display
name: Event synopsis
normalize-whitespace: 1

    sub forty_two { return 42; };

    use Marpa::R3;

    my $dsl = <<'END_OF_DSL';

    test ::= a b c d e e f g h action => main::forty_two
        | a ambig1 | a ambig2
    e ::= <real e> | <null e>
    <null e> ::=
    g ::= g1 | g3
    g1 ::= g2
    g2 ::=
    g3 ::= g4
    g4 ::=
    d ::= <real d> | <insert d>
    ambig1 ::= start1 mid1 z
    ambig2 ::= start2 mid2 z
    start1 ::= b  mid1 ::= c d
    start2 ::= b c  mid2 ::= d

    a ~ 'a' b ~ 'b' c ~ 'c'
    <real d> ~ 'd'
    <insert d> ~ ["] 'insert d here' ["]
    <real e> ~ 'e'
    f ~ 'f'
    h ~ 'h'
    z ~ 'z'

    :lexeme ~ <a> pause => after event => '"a"'
    :lexeme ~ <b> pause => after event => '"b"'
    :lexeme ~ <c> pause => after event => '"c"'
    :lexeme ~ <real d> pause => after event => '"d"'
    :lexeme ~ <insert d> pause => before event => 'insert d'
    :lexeme ~ <real e> pause => after event => '"e"'
    :lexeme ~ <f> pause => after event => '"f"'
    :lexeme ~ <h> pause => after event => '"h"'

    event '^test' = predicted test
    event 'test$' = completed test
    event '^start1' = predicted start1
    event 'start1$' = completed start1
    event '^start2' = predicted start2
    event 'start2$' = completed start2
    event '^mid1' = predicted mid1
    event 'mid1$' = completed mid1
    event '^mid2' = predicted mid2
    event 'mid2$' = completed mid2

    event '^a' = predicted a
    event '^b' = predicted b
    event '^c' = predicted c
    event 'd[]' = nulled d
    event 'd$' = completed d
    event '^d' = predicted d
    event '^e' = predicted e
    event 'e[]' = nulled e
    event 'e$' = completed e
    event '^f' = predicted f
    event 'g[]' = nulled g
    event '^g' = predicted g
    event 'g$' = completed g
    event 'g1[]' = nulled g1
    event 'g2[]' = nulled g2
    event 'g3[]' = nulled g3
    event 'g4[]' = nulled g4
    event '^h' = predicted h

    :discard ~ whitespace
    whitespace ~ [\s]+
    END_OF_DSL

    my $grammar = Marpa::R3::Grammar->new(
        {
            source            => \$dsl,
            semantics_package => 'My_Actions'
        }
    );

    my @events = ();
    my @event_history = ();
    my $next_lexeme_length = undef;

    my $slr = Marpa::R3::Recognizer->new( { grammar => $grammar,
        event_handlers => {
            "'default" => sub () {
                my ($slr, $event_name, undef, undef, undef, $length) = @_;
                if ($event_name eq 'insert d') {
                   $next_lexeme_length = $length;
                }
                push @events, $event_name;
                'pause';
            }
        }
    } );

    push @event_history, join q{ }, "Events at position 0:", sort @events
       if @events;
    @events = ();

    my $input = q{a b c "insert d here" e e f h};
    my $length = length $input;
    my $pos    = $slr->read( \$input );

    READ: while (1) {

        push @event_history, join q{ }, "Events at position $pos:", sort @events
           if @events;
        @events = ();

        if (defined $next_lexeme_length) {
            $slr->lexeme_read_literal('real d', undef, undef, $next_lexeme_length);
            (undef, $pos) = $slr->block_progress();
            $next_lexeme_length = undef;
            next READ;
        }
        if ($pos < $length) {
            $pos = $slr->resume();
            next READ;
        }
        last READ;
    } ## end READ: while (1)

    my $expected_events = <<'=== EOS ===';
    Events at position 0: ^a ^test
    Events at position 1: "a" ^b ^start1 ^start2
    Events at position 3: "b" ^c ^mid1 start1$
    Events at position 5: "c" ^d ^mid2 start2$
    Events at position 6: insert d
    Events at position 21: ^e ^f d$ e[] mid1$ mid2$
    Events at position 23: "e" ^e ^f e$ e[]
    Events at position 25: "e" ^f e$
    Events at position 27: "f" ^h g1[] g2[] g3[] g4[] g[]
    Events at position 29: "h" test$
    === EOS ===

=for Marpa::R3::Display::End

=head1 Details

This section contains additional explanations, not essential to understanding
the rest of this document.
Often they are formal or mathematical.
While some people find these helpful, others find them distracting,
which is why
they are segregated here.

=head2 Terminology

The following are more formal definitions of terms
previously defined in intuitive terms.

B<Actual input>:
Let C<A> be the actual input to a parse.
Let C<prefix(A, L)> be the prefix of C<A> of length I<L>.
Let C<S> be an arbitrary string in the same alphabet as C<A>.
Let C<prefix(S, L)> be the prefix of C<S> of length I<L>.
If C<prefix(A, L) = prefix(S, L)>,
then we say that
string C<S> is B<consistent with
the actual input up to location L>.
For brevity,
we say
that C<S> is B<actual to L>.
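
As an informal illustration of "actual to I<L>",
the following sketch treats the input as a Perl string
and I<L> as a count of characters.
The function name C<is_actual_to()> is invented for this example.
Returning false when either string is shorter than I<L>
is an assumption,
since the definition above does not address that case.

    use strict;
    use warnings;

    # Return 1 if $string is "actual to $l" with respect to $actual,
    # that is, if both have the same prefix of length $l.
    # Assumption: a string shorter than $l has no prefix of length $l.
    sub is_actual_to {
        my ( $actual, $string, $l ) = @_;
        return 0 if length($actual) < $l or length($string) < $l;
        return substr( $actual, 0, $l ) eq substr( $string, 0, $l ) ? 1 : 0;
    }

    # 'a b x' agrees with the actual input on its first 4 characters,
    # but not on its first 5.
    print is_actual_to( 'a b c d', 'a b x', 4 ), "\n";    # prints 1
    print is_actual_to( 'a b c d', 'a b x', 5 ), "\n";    # prints 0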

B<Acceptable>:
If an instance of symbol I<< <S> >>
that starts at location I<L> is in a valid parse
of some string actual to I<L>,
we say that the instance of symbol I<< <S> >> is B<acceptable> at
location I<L>.
We say that a symbol
is B<acceptable> at
a location I<L> if it is the symbol of
a symbol instance acceptable at location I<L>.

B<Recognized>:
If an instance of symbol I<< <S> >>
that ends at location I<L> is in a valid parse
of some string actual to I<L>,
we say that the instance of symbol I<< <S> >> is B<recognized> at
location I<L>.
We say that a symbol
is B<recognized> at
a location I<L> if it is the symbol of
a symbol instance recognized at location I<L>.

=head1 COPYRIGHT AND LICENSE

=for Marpa::R3::Display
ignore: 1

  Marpa::R3 is Copyright (C) 2018, Jeffrey Kegler.

  This module is free software; you can redistribute it and/or modify it
  under the same terms as Perl 5.10.1. For more details, see the full text
  of the licenses in the directory LICENSES.

  This program is distributed in the hope that it will be
  useful, but without any warranty; without even the implied
  warranty of merchantability or fitness for a particular purpose.

=for Marpa::R3::Display::End

=cut

# vim: expandtab shiftwidth=4: