#LyX 1.6.4 created this file. For more info see http://www.lyx.org/
\lyxformat 345
\begin_document
\begin_header
\textclass article
\begin_preamble
% Copyright 2012 Jeffrey Kegler
% This document is not part of the Marpa or Marpa::XS source.
% Although it may be included with a Marpa distribution that
% is under an open source license, this document is
% not under that open source license.
% Jeffrey Kegler retains full rights.
\end_preamble
\use_default_options false
\language english
\inputencoding auto
\font_roman default
\font_sans default
\font_typewriter default
\font_default_family default
\font_sc false
\font_osf false
\font_sf_scale 100
\font_tt_scale 100
\graphics default
\paperfontsize 12
\spacing single
\use_hyperref false
\papersize default
\use_geometry true
\use_amsmath 1
\use_esint 1
\cite_engine basic
\use_bibtopic false
\paperorientation portrait
\leftmargin 0.75in
\topmargin 0.75in
\rightmargin 0.75in
\bottommargin 0.75in
\secnumdepth 3
\tocdepth 3
\paragraph_separation indent
\defskip medskip
\quotes_language english
\papercolumns 1
\papersides 1
\paperpagestyle default
\tracking_changes false
\output_changes false
\author ""
\author ""
\end_header
\begin_body
\begin_layout Title
Marpa Correctness Proof
\end_layout
\begin_layout Author
Jeffrey Kegler
\end_layout
\begin_layout Abstract
This is a
\begin_inset Quotes eld
\end_inset
correctness
\begin_inset Quotes erd
\end_inset
proof for the Marpa algorithm.
\begin_inset Quotes eld
\end_inset
Correctness
\begin_inset Quotes erd
\end_inset
is shown by demonstrating equivalence to the Aycock & Horspool's modification
of Earley's algorithm in their 2002 paper.
\end_layout
\begin_layout Abstract
\begin_inset CommandInset toc
LatexCommand tableofcontents
\end_inset
\end_layout
\begin_layout Section
THIS IS A DRAFT
\end_layout
\begin_layout Standard
This paper is in late stages, but not yet finished.
It will contain many errors of the kind that the last drafts should remove.
\end_layout
\begin_layout Section
Introduction
\end_layout
\begin_layout Standard
The paper proves the equivalence of Marpa to Aycock and Horspool's 2002
revision of Earley's algorithm (AH2002).
The intent is to establish confidence that the Marpa algorithm is correct.
The reader is assumed to be familiar with standard grammar notation, LR(0)
automata, Earley's algorithm and Aycock and Horspool's 2002 paper
\begin_inset Foot
status collapsed
\begin_layout Plain Layout
Aycock and Horspool 2002
\end_layout
\end_inset
.
\end_layout
\begin_layout Section
The Two Algorithms
\end_layout
\begin_layout Standard
In this paper, the logic of both the Marpa and the AH2002 algorithm is divided
among five sections
\end_layout
\begin_layout Itemize
\family typewriter
Initializer
\end_layout
\begin_layout Itemize
\family typewriter
Recognizer
\end_layout
\begin_layout Itemize
\family typewriter
Earley Set Processing
\end_layout
\begin_layout Itemize
\family typewriter
Scanner
\end_layout
\begin_layout Itemize
\family typewriter
Completer
\end_layout
\begin_layout Standard
The
\family typewriter
Initializer
\family default
,
\family typewriter
Recognizer
\family default
,
\family typewriter
Scanner
\family default
and
\family typewriter
Completer
\family default
are the same for both algorithms.
\family typewriter
Marpa
\family default
and
\family typewriter
AH2002
\family default
differ only in their
\family typewriter
Earley Set Processing
\family default
.
\end_layout
\begin_layout Standard
Pseudocode is provided for
\family typewriter
Recognizer
\family default
,
\family typewriter
Scanner
\family default
,
\family typewriter
Completer
\family default
and each algorithm's
\family typewriter
Earley Set Processing
\family default
.
All processing is deterministic.
Variables are either global to the algorithm or local to their section.
\end_layout
\begin_layout Standard
Sections
\begin_inset Quotes eld
\end_inset
call
\begin_inset Quotes erd
\end_inset
other sections.
When called, some sections take arguments, which become input variables
in the called section.
Input variables are local to the section, and their semantics is call-by-value.
Despite the similarity to the semantics of subroutines in certain languages,
no implementation is implied.
Sections do not return a value.
\end_layout
\begin_layout Section
Earley Sets
\end_layout
\begin_layout Standard
Each Earley set is an ordered list of Earley items.
In AH2002 and Marpa, every Earley item is a
\family typewriter
(state, parent)
\family default
duple, where
\family typewriter
state
\family default
is an AHFA state.
AHFA states are states of the split
\begin_inset Formula $\epsilon$
\end_inset
-DFA automatons first described by Aycock and Horspool.
In this document a split
\begin_inset Formula $\epsilon$
\end_inset
-DFA automaton is called an AHFA (Aycock-Horspool Finite Automaton) and
its states are called AHFA states.
Details of AHFA's and their states are not relevant to this proof and may
be found in Aycock and Horspool 2002.
\end_layout
\begin_layout Standard
In the
\family typewriter
(state, parent)
\family default
Earley item duple,
\family typewriter
parent
\family default
is a parent pointer.
A parent pointer is a pointer to an Earley set, possibly the same one of
which the Earley item is a member.
\end_layout
\begin_layout Subsection
Worklists
\end_layout
\begin_layout Standard
In both the AH2002 and Marpa algorithms, the Earley sets need to be arrays
that obey the semantics of worklists.
Worklists are arrays which contain work items.
Work items in a work list never disappear, unless the entire work list
goes out of scope.
Worklists only change in one way -- by the addition of new work items at
the end of the work list.
\end_layout
\begin_layout Standard
A dynamic array can be implemented so that is never changed except to grow
it at the end.
Such an array would provide worklist semantics.
\end_layout
\begin_layout Subsection
AHFA States Versus LR(0) Items
\end_layout
\begin_layout Standard
The most visible difference between AH2002 and the standard Earley algorithm
is in the form of the Earley items.
In both AH2002 and the standard Earley algorithm, Earley items are duples.
Also, in both AH2002 and the standard Earley algorithm, the second element
of the duple is a parent pointer.
\end_layout
\begin_layout Standard
But the first element of the duple in AH2002 is not the same as in the standard
Earley algorithm.
In the standard Earley algorithm, the first element of the duple is an
LR(0)
\series bold
item
\series default
.
In the AH2002 algorithm the first element of the Earley item is an AHFA
\series bold
state
\series default
.
An AHFA state is a
\series bold
set
\series default
of one or more LR(0) items.
\end_layout
\begin_layout Standard
Working with sets of LR(0) items instead of single items makes the AH2002
algorithm more complex, but it also makes it much more efficient in practice.
Earley items in the Marpa algorithm are the same as in the AH2002 algorithm
-- duples of AHFA state and parent pointer.
\end_layout
\begin_layout Section
Initializer
\end_layout
\begin_layout Standard
In both Marpa and AH2002, the Initializer accepts a token stream and a grammar
as input.
The token stream is an array of
\family typewriter
n
\family default
input tokens
\family roman
\series medium
\shape up
\size normal
\emph off
\bar no
\noun off
\color none
,
\family typewriter
\series default
\shape default
\size default
\emph default
\bar default
\noun default
\color inherit
x[1], x[2], ..., x[n]
\family default
.
\end_layout
\begin_layout Standard
In each algorithm, the Recognizer Main Processing initializes
\family typewriter
S[0]
\family default
with Earley items of the form
\family typewriter
(start_state, 0)
\family default
where
\family typewriter
start_state
\family default
is one of the AHFA start states and the parent pointer points to Earley
set 0.
There will be one or two start states, depending on the grammar.
\end_layout
\begin_layout Subsection
Pure Functions of the Grammar
\end_layout
\begin_layout Standard
The Initializer creates the functions
\family typewriter
COMPLETED
\family default
,
\family typewriter
LHS
\family default
and
\family typewriter
GOTO
\family default
.
These are pure functions which depend only on the grammar.
These are partial functions with a finite and, in fact, very limited domain.
Since these functions depend only on the grammar, they may be precomputed.
Since they are only defined for a fixed list of arguments, these functions
may be implemented as table lookups.
\end_layout
\begin_layout Standard
\family typewriter
LHS
\family default
is a function from a rule to its left hand side symbol.
\family typewriter
LHS
\family default
is defined for every rule of the grammar.
\end_layout
\begin_layout Standard
\family typewriter
COMPLETED
\family default
is a function from an AHFA state to a list of the rules which have completed
LR(0) items in that AHFA state.
A completed LR(0) item is one with the dot at the end.
The list of rules returned by
\family typewriter
COMPLETED
\family default
may be empty.
\end_layout
\begin_layout Standard
\family typewriter
GOTO
\family default
is a transition function from an AHFA state and a symbol (which may be
undefined) to another AHFA state.
Most AHFA states only have transitions on a few of the grammar's symbols,
so for many combinations of
\family typewriter
state
\family default
and
\family typewriter
symbol
\family default
, the result of
\family typewriter
GOTO(state, symbol)
\family default
will be undefined.
\end_layout
\begin_layout Section
Recognizer
\end_layout
\begin_layout Standard
The Recognizer builds an array of
\family typewriter
n+1
\family default
Earley sets,
\family typewriter
S[0], S[1], ..., S[SIZE(x)]
\family default
, where
\family typewriter
SIZE(x)
\family default
is the size of the token stream array,
\family typewriter
x
\family default
.
In both algorithms, the Earley sets need to be implemented so they can
be treated as worklists.
\end_layout
\begin_layout Standard
In both Marpa and AH2002, the Recognizer Main Processing then does the Earley
Set Processing.
The Earley Set Processing is performed for each of the Earley sets from
0 to
\family typewriter
SIZE(x)
\family default
, in order.
\end_layout
\begin_layout Standard
\begin_inset Float algorithm
placement h
wide false
sideways false
status open
\begin_layout Plain Layout
\begin_inset listings
inline false
status open
\begin_layout Plain Layout
Recognizer Loop: for every (i) in (0 ..
SIZE(x)) {
\end_layout
\begin_layout Plain Layout
do Earley Set i Processing
\end_layout
\begin_layout Plain Layout
}
\end_layout
\end_inset
\end_layout
\begin_layout Plain Layout
\begin_inset Caption
\begin_layout Plain Layout
Recognizer
\end_layout
\end_inset
\end_layout
\end_inset
\end_layout
\begin_layout Section
The AH2002 Algorithm
\end_layout
\begin_layout Standard
In the proofs, it is convenient to refer to a
\family typewriter
Scanner Loop
\family default
and a
\family typewriter
Completer Loop
\family default
for both
\family typewriter
AH2002
\family default
and
\family typewriter
Marpa
\family default
.
\family typewriter
Marpa
\family default
has separate Scanner and Completer Loops defined in its Earley Set Processing
pseudocode.
For
\family typewriter
AH2002
\family default
, both the
\family typewriter
Scanner Loop
\family default
and the
\family typewriter
Completer Loop
\family default
is defined as equivalent to the
\family typewriter
Combined Loop
\family default
.
\end_layout
\begin_layout Standard
\noun on
Definition
\noun default
:
\family typewriter
Scanner Loop @ AH2002 @ Earley Set i Processing
\family default
is defined as
\family typewriter
Combined Loop @ AH2002 @ Earley Set i Processing
\family default
.
\end_layout
\begin_layout Standard
\noun on
Definition
\noun default
:
\family typewriter
Completer Loop @ AH2002 @ Earley Set i Processing
\family default
is defined as
\family typewriter
Combined Loop @ AH2002 @ Earley Set i Processing
\family default
.
\end_layout
\begin_layout Standard
\begin_inset Float algorithm
placement h
wide false
sideways false
status open
\begin_layout Plain Layout
\begin_inset listings
inline false
status open
\begin_layout Plain Layout
Input variable: i
\end_layout
\begin_layout Plain Layout
\end_layout
\begin_layout Plain Layout
Combined Loop: for every (state, parent) in S[i] {
\end_layout
\begin_layout Plain Layout
do Scanner( state, parent )
\end_layout
\begin_layout Plain Layout
if parent != i {
\end_layout
\begin_layout Plain Layout
do Completer( state, parent )
\end_layout
\begin_layout Plain Layout
}
\end_layout
\begin_layout Plain Layout
}
\end_layout
\end_inset
\end_layout
\begin_layout Plain Layout
\begin_inset Caption
\begin_layout Plain Layout
AH2002 Earley Set Processing
\end_layout
\end_inset
\end_layout
\end_inset
\end_layout
\begin_layout Section
The Marpa Algorithm
\end_layout
\begin_layout Standard
The
\family typewriter
Marpa
\family default
algorithm is the
\family typewriter
AH2002
\family default
algorithm
\begin_inset Quotes eld
\end_inset
upside-down and inside-out
\begin_inset Quotes erd
\end_inset
.
It pulls the scanner and the completer out of the main loop and gives each
its own loop.
And
\family typewriter
Marpa
\family default
reverses the order -- the completer is called first to process all items
in an Earley set, including those items added by the completer itself.
After the completer is finished, the scanner is called to process all items
in the Earley set.
The scanner never adds items to the current Earley set, so the current
Earley set,
\family typewriter
S[i]
\family default
remains constant in
\family typewriter
Scanner Loop
\family default
.
\end_layout
\begin_layout Standard
\begin_inset Float algorithm
placement h
wide false
sideways false
status open
\begin_layout Plain Layout
\begin_inset listings
inline false
status open
\begin_layout Plain Layout
Input variable: i
\end_layout
\begin_layout Plain Layout
\end_layout
\begin_layout Plain Layout
Completer Loop: for every (state, parent) in S[i] {
\end_layout
\begin_layout Plain Layout
if parent != i {
\end_layout
\begin_layout Plain Layout
do Completer(state, parent)
\end_layout
\begin_layout Plain Layout
}
\end_layout
\begin_layout Plain Layout
} # End of Completer Loop
\end_layout
\begin_layout Plain Layout
\end_layout
\begin_layout Plain Layout
Scanner Loop: for every (state, parent) in S[i] {
\end_layout
\begin_layout Plain Layout
do Scanner(state, parent)
\end_layout
\begin_layout Plain Layout
} # End of Scanner Loop
\end_layout
\end_inset
\end_layout
\begin_layout Plain Layout
\begin_inset Caption
\begin_layout Plain Layout
Marpa Earley Set Processing
\end_layout
\end_inset
\end_layout
\end_inset
\end_layout
\begin_layout Section
Scanner
\end_layout
\begin_layout Standard
The
\family typewriter
Scanner
\family default
takes three input variables.
The first two are the elements of an Earley item:
\family typewriter
state
\family default
is an LR(0) state, and
\family typewriter
parent
\family default
is the index of an Earley set.
The third input variable, i, is the index of the current Earley set.
\end_layout
\begin_layout Standard
\family typewriter
k
\family default
and
\family typewriter
nk
\family default
are temporaries, local to the subroutine, which hold AHFA state values
returned by
\family typewriter
GOTO
\family default
.
\family typewriter
S
\family default
is a global array containing the Earley sets.
\family typewriter
x
\family default
is global array containing the tokens.
\end_layout
\begin_layout Standard
\begin_inset Float algorithm
placement h
wide false
sideways false
status open
\begin_layout Plain Layout
\begin_inset listings
inline false
status open
\begin_layout Plain Layout
Input variables: state, parent, i
\end_layout
\begin_layout Plain Layout
\end_layout
\begin_layout Plain Layout
set k = GOTO(state, x[i+1])
\end_layout
\begin_layout Plain Layout
if state k is not defined {
\end_layout
\begin_layout Plain Layout
return
\end_layout
\begin_layout Plain Layout
}
\end_layout
\begin_layout Plain Layout
add (k, parent) to S[i+1] at end, if it is not already present
\end_layout
\begin_layout Plain Layout
set nk = GOTO(k, undefined)
\end_layout
\begin_layout Plain Layout
if state nk is not defined {
\end_layout
\begin_layout Plain Layout
return
\end_layout
\begin_layout Plain Layout
}
\end_layout
\begin_layout Plain Layout
add (nk, i+1) to S[i+1] at end, if it is not already present
\end_layout
\begin_layout Plain Layout
return
\end_layout
\end_inset
\end_layout
\begin_layout Plain Layout
\begin_inset Caption
\begin_layout Plain Layout
Scanner
\end_layout
\end_inset
\end_layout
\end_inset
\end_layout
\begin_layout Section
Completer
\end_layout
\begin_layout Standard
The
\family typewriter
Completer
\family default
takes three input variables.
The first two are the elements of an Earley item:
\family typewriter
state
\family default
is an LR(0) state, and
\family typewriter
parent
\family default
is the index of an Earley set.
The third input variable, i, is the index of the current Earley set.
\end_layout
\begin_layout Standard
\family typewriter
k
\family default
and
\family typewriter
nk
\family default
are temporaries, local to the subroutine, which hold LR(0) state values
returned by
\family typewriter
GOTO
\family default
.
\family typewriter
rule
\family default
is a rule, returned by
\family typewriter
COMPLETED
\family default
.
\family typewriter
S
\family default
is a global array containing the Earley sets.
\end_layout
\begin_layout Standard
\begin_inset Float algorithm
placement h
wide false
sideways false
status open
\begin_layout Plain Layout
\begin_inset listings
inline false
status open
\begin_layout Plain Layout
Input variables: state, parent, i
\end_layout
\begin_layout Plain Layout
\end_layout
\begin_layout Plain Layout
foreach rule in COMPLETED(state) {
\end_layout
\begin_layout Plain Layout
foreach ( pstate, pparent ) in S[parent] {
\end_layout
\begin_layout Plain Layout
set k = GOTO( pstate, LHS(rule) )
\end_layout
\begin_layout Plain Layout
if k is defined {
\end_layout
\begin_layout Plain Layout
add ( k, pparent ) to S[i] at end, if it is not already present
\end_layout
\begin_layout Plain Layout
set nk = GOTO( k, undefined )
\end_layout
\begin_layout Plain Layout
if nk is defined {
\end_layout
\begin_layout Plain Layout
add ( nk, i ) to S[i] at end, if it is not already present
\end_layout
\begin_layout Plain Layout
}
\end_layout
\begin_layout Plain Layout
} # if k is defined
\end_layout
\begin_layout Plain Layout
} # foreach rule
\end_layout
\end_inset
\end_layout
\begin_layout Plain Layout
\begin_inset Caption
\begin_layout Plain Layout
Completer
\end_layout
\end_inset
\end_layout
\end_inset
\end_layout
\begin_layout Section
The Proof: Short Version
\end_layout
\begin_layout Standard
After, Initialization, the only changes to globals are to the Earley sets,
and these occur in the Earley Set Processing.
The Processing of Earley set i can change S[i] (called the current Earley
set) and S[i+1] (the next Earley set).
The proof is organized around these facts.
\end_layout
\begin_layout Standard
First, two lemmas are proved showing that if, before each Earley Set Processing
step, the Earley sets in both algorithms are identical, then, after the
Earley Set Processing Step, the current Earley sets and the next Earley
sets remain identical .
\end_layout
\begin_layout Subsection
The Current Earley Set Lemma
\end_layout
\begin_layout Standard
\noun on
Statement of the Lemma
\noun default
: Assume that, before each Earley set is processed, the Earley sets in AH2002
are identical to the Earley sets in Marpa.
Then, after Earley Set i Processing, S[i] in AH2002 is identical to S[i]
in Marpa.
\end_layout
\begin_layout Standard
\noun on
Proof
\noun default
: In the Earley Set Processing, S[i] will only be changed by the Completer.
The Completer can be treated as a pure deterministic functions whose inputs
are its input variables (state, parent and i), the current Earley set itself
(S[i]) and the pure functions of the grammar (GOTO, LHS and COMPLETED).
GOTO, LHS, COMPLETED do not change after Initialization and the input variable
i will not change throughout the Earley Set Processing, so they may treated
as constants for this purpose.
\end_layout
\begin_layout Standard
The input variables (state, parent) and the current Earley set (S[i]) do
change but, in both AH2002 and Marpa, the calls to the Completer will be
with same values for these and in the same sequence.
This can be shown by induction on the calls to the Completer.
\end_layout
\begin_layout Standard
There may be zero such calls, and if this is the case, S[i] after Earley
Set i Processing will clearly be unchanged from S[i] before Earley Set
i Processing.
Assume there have been n calls to the completer.
By the induction hypothesis, after these n calls, S[i] is identical in
both algorithms.
The input variables state and parent depend only on S[i], and on i.
i was already shown to be constant during the Earley Set i Processing.
S[i] is identical in both algorithms by the induction hypothesis.
The Completer is the same in both algorithms.
Therefore, since the Completer and its inputs are identical, and the Completer
is deterministic, the results of the Completer will be the same in both
algorithms.
\end_layout
\begin_layout Standard
The induction above assumed that there is a n+1'th call to the Completer
in both algorithms.
This actually depends on the domain for the Completer Loop.
But this domain is also the current Earley set, S[i].
So by the same induction as above, it is not possible there will be a n+1'th
call to the Completer in one algorithm, without there being an n+1'th call
to the Completer in the other algorithm.
In other words, either there will be a n+1'st call to the Completer in
both algorithms, or the n'th call to the Completer in both algorithms will
have been the final one.
\end_layout
\begin_layout Standard
This completes the induction and shows that, after the Earley Set i Processing,
the current Earley sets (S[i]) in both algorithms must be identical to
each other.
QED.
\end_layout
\begin_layout Subsection
The Next Earley Set Lemma
\end_layout
\begin_layout Standard
\noun on
Statement of the Lemma
\noun default
: Assume that, before each Earley set is processed, the Earley sets in AH2002
are identical to the Earley sets in Marpa.
Then, after Earley Set i Processing, S[i+1] in AH2002 is identical to S[i+1]
in Marpa.
\end_layout
\begin_layout Standard
\noun on
Proof
\noun default
: First, recall the Current Earley Set Lemma.
By it, we know that
\series bold
after
\series default
Earley Set i Processing, the
\series bold
current
\series default
Earley set (S[i]) is identical in both AH2002 and Marpa.
We will make use of this fact below.
\end_layout
\begin_layout Standard
In the Earley Set Processing, S[i+1] will only be changed by the Scanner.
The Scanner can be treated as a pure deterministic function whose inputs
are its input variables (state, parent and i), the next Earley set itself
(S[i+1]), the token stream (x) and a pure function of the grammar (GOTO).
GOTO and x do not change after Initialization and the input variable i
will not change throughout the Earley Set Processing, so they may treated
as constants for this purpose.
\end_layout
\begin_layout Standard
The input variables (state, parent) and the next Earley set (S[i+1]) do
change but, in both AH2002 and Marpa, the calls to the Scanner will be
with same values for these, will be made in the same sequence, and will
have the same results.
This can be shown by induction on the calls to the Scanner.
\end_layout
\begin_layout Standard
There may be zero such calls, and if this is the case, S[i+1] after Earley
Set i Processing will clearly be unchanged from S[i+1] before Earley Set
i Processing.
Assume there have been n calls to the Scanner.
By the induction hypothesis, after these n calls, S[i+1] is identical in
both algorithms.
\end_layout
\begin_layout Standard
The input variables state and parent are the n+1'st work item in the current
Earley set S[i].
Call this work item S[i][n+1].
As a work item, once S[i][n+1] is added to a work list it is never deleted
and never changes.
Since by the Current Earley Set Lemma, S[i] in AH2002 will be identical
to S[i] in Marpa after Earley Set i Processing, if S[i][n+1] exists in
both algorithms it must be identical.
\end_layout
\begin_layout Standard
S[i+1] is identical in both algorithms by the induction hypothesis.
The Scanner is the same in both algorithms.
Therefore, since the Scanner and its inputs are identical, and the Scanner
is deterministic, the results of the Scanner will be the same in both algorithm
s.
\end_layout
\begin_layout Standard
The induction above assumed that there is a n+1'th call to the Scanner in
both algorithms.
This actually depends on the domain for the Scanner Loop.
This domain is the current Earley set, S[i].
\end_layout
\begin_layout Standard
But what if there was an S[i][n+1] in AH2002 but no S[i][n+1] in Marpa,
or vice versa? We can show this won't happen, by a reduction to absurdity.
Suppose, just after the n'th call to the Scanner, S[i][n+1] exists in AH2002,
but not in Marpa.
By the Current Earley Set Lemma S[i] in both algorithms is identical after
the Earley Set i Processing.
But if S[i][n+1] does not exist for the Scanner Loop in Marpa, examination
of the pseudocode will show that S[i][n+1] will not exist in Marpa at the
End of Earley Set i Processing either.
On the other hand, if S[i][n+1] exists at the End of Earley Set i Processing
in AH2002, it will still exist at the End of Earley Set i Processing in
AH2002.
This means that S[i] at the End of Earley Set i Processing in AH2002 will
not be the same as S[i] at the End of Earley Set i Processing in Marpa.
\end_layout
\begin_layout Standard
But, by the Current Earley Set Lemma, at the End of Earley Set i Processing,
S[i] in each of the two algorithms must be identical to S[i] in the other.
This means that the assumption for the reduction to absurdity in the paragraph
above must be false.
For the reduction to absurdity, we assumed that, just after the n'th call
to the Scanner, there was an S[i][n+1] in AH2002, but not in Marpa.
Therefore if, just after the n'th call to the Scanner, there is an S[i][n+1]
in the AH2002, there must also be one in Marpa.
\end_layout
\begin_layout Standard
The same reduction to absurdity works in reverse, and shows the if, just
after the n'th call to the Scanner, there is an S[i][n+1] in Marpa, there
must also be one in AH2002.
Putting the two together, just after the n'th call to the Scanner, there
is an S[i][n+1] in AH2002 if and only if there is an S[i][n+1] in Marpa.
\end_layout
\begin_layout Standard
This means that it is not possible there will be a n+1'th call to the Scanner
in one algorithm, without there being an n+1'st call to the Scanner in
the other algorithm.
In other words, either there will be a n+1'st call to the Scanner in both
algorithms, or the n'th call to the Scanner in both algorithms will have
been the final one.
\end_layout
\begin_layout Standard
This completes the induction and shows that, after the Earley Set i Processing,
the next Earley sets (S[i+1]) in both algorithms must be identical to each
other.
QED.
\end_layout
\begin_layout Subsection
Showing Equivalence
\end_layout
\begin_layout Standard
We say that AH2002 and Marpa are equivalent if, when they are run on the
same token streams and grammars, they produce the same Earley sets.
Given the Current Earley Set Lemma and the Next Earley Set Lemma, the proof
follows quickly.
The proof is by induction on the Earley Set Processing.
\end_layout
\begin_layout Standard
Initialization is the same in AH2002 and Marpa, and the input to the algorithms
(their tokens streams and grammars) are assumed to be identical.
Initialization is deterministic, so before the Processing of Earley Set
0, all globals are identical.
\end_layout
\begin_layout Standard
Suppose before the Earley Set n Processing, all globals are identical.
We know from examination of the pseudocode that, during Earley Set n Processing
, the only globals which change are S[n] and S[n+1].
\end_layout
\begin_layout Standard
We have from the induction hypothesis that, before Earley Set n Processing,
S[n] in AH2002 is identical is S[n] in Marpa.
From this and the Current Earley Set Lemma, we can conclude that, after
Earley Set n Processing, S[n] in each of the algorithms will be identical
to S[n] in the other algorithm.
\end_layout
\begin_layout Standard
We have from the induction hypothesis that, before Earley Set n Processing,
S[n+1] in AH2002 is identical is S[n+1] in Marpa.
From this, the Current Earley Set Lemma and the Next Earley Set Lemma,
we can conclude that, after Earley Set n Processing, S[n+1] in each of
the algorithms will be identical to S[n+1] in the other algorithm.
\end_layout
\begin_layout Standard
Above we have that:
\end_layout
\begin_layout Itemize
All globals remains constant during Earley Set Processing except S[n] and
S[n+1].
\end_layout
\begin_layout Itemize
After Earley Set n Processing, S[n] in each of the algorithms will be identical
to S[n] in the other algorithm.
\end_layout
\begin_layout Itemize
After Earley Set n Processing, S[n+1] in each of the algorithms will be
identical to S[n+1] in the other algorithm.
\end_layout
\begin_layout Standard
Combining these three, we have that if, after Earley Set n Processing, all
globals in AH2002 will be identical to the corresponding global in Marpa.
The point after Earley Set n Processing is the same as the point before
Earley Set n+1 Processing.
This proves the induction.
\end_layout
\begin_layout Standard
The Recognizer will process l+1 Earley sets, where l is the length of the
token stream.
From the induction, we know that, after the processing of the l+1 Earley
sets, the Earley sets in Marpa will be identical to those in AH2002.
This is the equivalence we were after.
\end_layout
\begin_layout Standard
QED.
\end_layout
\begin_layout Section
Why a Detailed Proof?
\end_layout
\begin_layout Standard
The rest of this paper repeats the proof already given, more carefully
and in more detail.
It describes some new notations, useful for detailed proofs of this kind.
Readers, if satisfied with what they have seen so far, will not necessarily
want to read any further.
\end_layout
\begin_layout Standard
The form of the short proof above is similar to that of the correctness
proofs as they usually appear in journals.
Proofs in this form are problematic.
English-language descriptions of program locations, even when carefully
written, are verbose.
Even so, these descriptions are always in danger of being ambiguous.
\end_layout
\begin_layout Standard
More importantly, the reader can not easily confirm that all crucial details
are being established.
For example, in this proof, the existence or non-existence of loop passes
and worklist items at certain points is as crucial as any issue in the
proof, but very easy to overlook.
\end_layout
\begin_layout Standard
Short, open-form, English-language proofs of this kind, while they dominate
the academic literature, are error-prone and hard to read.
In the journals, perhaps reasons of space force their acceptance.
Page-count in academic journals is a scarce resource, inelastic to the
demands of academics.
Detailed proofs would perhaps double the size of the corresponding articles.
Doubling the page count means halving the number of articles published.
Everyone in the field realizes that that one or more of the articles left
unpublished might well be his own.
\end_layout
\begin_layout Standard
Web space is not nearly so expensive as space in printed academic journals.
The rest of this paper contains the detailed working out of the correctness
proof just given in short form.
The short proof was written after, and based on, the detailed proof.
\end_layout
\begin_layout Section
Notation for the Detailed Proof
\end_layout
\begin_layout Subsection
Grafs
\end_layout
\begin_layout Standard
Most detailed proofs are in table form, with one equation per line.
Equations are right aligned and their explanations left aligned.
Program correctness proofs require a degree of explanation which easily
overwhelms the one-line-per-equation format.
\end_layout
\begin_layout Standard
This proof is organized with one equation per paragraph, usually abbreviated
\begin_inset Quotes eld
\end_inset
graf
\begin_inset Quotes erd
\end_inset
.
The first sentence is an equation, and the others justify its derivation
or provide explanations.
\end_layout
\begin_layout Subsection
Time-Sets
\end_layout
\begin_layout Standard
This proof I introduce time-sets notation.
Time-set notation can be looked at as an abbreviation of English, or as
a semi-formalized, notation.
I hope it is an improvement in precision.
I would also claim it is more readable than its expansion into ordinary
English would be.
\end_layout
\begin_layout Standard
A key to this notation is that the idea of sequence is an afterthought in
time-set notation.
Time-sets are unordered sets of point in the execution -- time-points.
Statements about their sequence -- which time-point is before what other
time-point, can obviously be important in analyzing programs.
But I think premature introduction of the sequencing into the semantics
is counterproductive.
\end_layout
\begin_layout Standard
The notation for locations in the code uses sets of time-points.
The points in time, or time-points, are points in the processing -- not
locations in the pseudocode.
Every operation of any kind separates time-points, so many time-points
can correspond to the same pseudo-code location.
Additionally, if a lexical location in the pseudocode is passed through
several times, as it would be in a loop, each pass creates a new set of
time-points for that location.
\end_layout
\begin_layout Standard
For time-points to be separate, some operation must happen between them.
This means that sometimes the same time-point can be identified with two
different lexical locations in the pseudocode, and have two different names.
This occurs frequently in loops.
More on this below.
\end_layout
\begin_layout Subsection
Names of Sets
\end_layout
\begin_layout Subsubsection
Algorithms
\end_layout
\begin_layout Standard
Each algorithm's name is used for the time-set of all the time-points in
that algorithm.
So AH2002 is the name of the time-set of all the time-points in the execution
of the AH2002 algorithm.
Marpa is the name of the time-set of all the time-points in the execution
of the Marpa algorithm.
Start of Marpa is the time point before any operations in the Marpa algorithm.
End of Marpa is the time-point after any operations.
\end_layout
\begin_layout Subsubsection
Pseudocode Sections
\end_layout
\begin_layout Standard
Each pseudocode section's name is used for the time-set of all time-points
in that pseudo-code section, including all time-points in either algorithm.
Recognizer is the name of the time-set of all time-points in the Recognizer,
whether for AH2002 or Marpa.
When it is useful to distinguish the @-notation can be used:
\family typewriter
Recognizer @ AH2002
\family default
or
\family typewriter
Recognizer @ Marpa
\family default
.
\end_layout
\begin_layout Standard
\family typewriter
Start of Recognizer
\family default
is the set of the time-points before any operation in the Recognizers,
including both algorithms.
\family typewriter
End of Recognizer @ Marpa
\family default
is the singleton set containing the time-point after all operations in
the Marpa algorithm.
\end_layout
\begin_layout Standard
Pseudocode sections can be called multiple times.
For example,
\family typewriter
Scanner @ Marpa
\family default
refers to all time-points in any call of
\family typewriter
Scanner
\family default
in the
\family typewriter
Marpa
\family default
algorithm.
\end_layout
\begin_layout Standard
Where it is useful to specifically distinguish one call to a section, this
can be done in the time-set's name.
For example, the call the
\family typewriter
Earley Set Processing
\family default
in the
\family typewriter
AH2002
\family default
algorithm for Earley Set
\family typewriter
s
\family default
is
\family typewriter
Earley Set s Processing @ AH2002
\family default
.
\end_layout
\begin_layout Subsection
Loops
\end_layout
\begin_layout Standard
Loops have not only a
\family typewriter
Start
\family default
and
\family typewriter
End
\family default
, but a
\family typewriter
Top
\family default
and a
\family typewriter
Bottom
\family default
.
The
\family typewriter
Start of Loop L
\family default
is before execution of any instructions in the loop.
The
\family typewriter
End of Loop L
\family default
is after execution of any instructions.
The
\family typewriter
Top of Loop L(c)
\family default
is before the execution of pass
\family typewriter
c
\family default
through
\family typewriter
Loop L
\family default
.
The
\family typewriter
Bottom of Loop L(c)
\family default
is after the execution of pass
\family typewriter
c
\family default
through Loop
\family typewriter
L
\family default
.
\end_layout
\begin_layout Standard
Passes through loops are numbered starting at 0.
The final test which terminates a loop is considered a pass, but a pass
which does not execute any instructions inside the loop.
A loop over the numbers from 0 to 10 will have 12 passes, with pass 0 being
the first pass, and pass 11 being the last pass.
Pass 0 will have the loop variable with the value 0.
In pass n, the value of the loop variable will be n, unless n is the last
pass.
In the last pass, only the test is performed -- the fails, none of the
instructions inside the loop are executed, and the loop terminates.
Where z is the last pass, there is still considered to be a both
\family typewriter
Top of Loop(z)
\family default
and
\family typewriter
Bottom of Loop(z)
\family default
.
\end_layout
\begin_layout Standard
\noun on
Definition
\noun default
: Let L be the name of a loop.
\family typewriter
DOMAIN(L)
\family default
is the array over which loop
\family typewriter
L
\family default
iterates.
\end_layout
\begin_layout Standard
In this definition, it is important that L is not a timeset -- it is the
name of a loop.
Also, it is important to note the value of
\family typewriter
DOMAIN
\family default
can change if the loop is over a worklist.
\end_layout
\begin_layout Standard
\noun on
Definition
\noun default
: Let L be the name of a loop.
Then,
\family typewriter
PASS(p, L)
\family default
\begin_inset Formula $\Leftrightarrow$
\end_inset
\family typewriter
0 <= p <= SIZE(DOMAIN(L) @ End of L))
\family default
.
\end_layout
\begin_layout Standard
Informally,
\family typewriter
PASS(p,L)
\family default
is true if
\family typewriter
p
\family default
is the number of a valid pass of
\family typewriter
L
\family default
.
If
\family typewriter
D == DOMAIN(L)
\family default
, then the loop variable for pass
\family typewriter
p
\family default
is
\family typewriter
D[p]
\family default
.
\end_layout
\begin_layout Standard
For any loop
\family typewriter
L, PASS(p, L)
\family default
\begin_inset Formula $\wedge$
\end_inset
\family typewriter
PASS(p+1, L)
\family default
\begin_inset Formula $\Leftrightarrow$
\end_inset
\family typewriter
(Bottom(p) of L == Top(p+1) of L
\family default
).
\end_layout
\begin_layout Standard
\noun on
Definition
\noun default
:
\family typewriter
FINAL(L) == SIZE( DOMAIN(L) @ End of L )
\end_layout
\begin_layout Subsection
Operations on Time-Sets
\end_layout
\begin_layout Standard
Subset:
\family typewriter
S1
\begin_inset Formula $\subseteq$
\end_inset
S2
\end_layout
\begin_layout Standard
The notation remains semi-formal.
@-notation is often written so that it's intended to be distributed over
other notation.
For example:
\end_layout
\begin_layout Standard
\family typewriter
(SIZE(x @ A) == SIZE(x @ B)) @ P
\end_layout
\begin_layout Standard
is an abbreviation for
\end_layout
\begin_layout Standard
\family typewriter
SIZE(x @ A @ P) == SIZE(x @ B @ P)
\end_layout
\begin_layout Standard
The general idea is that in
\family typewriter
(expression) @ P
\family default
, where
\family typewriter
expression
\family default
is not a term or another time-set,
\family typewriter
P
\family default
should be
\begin_inset Quotes eld
\end_inset
pulled inside
\begin_inset Quotes erd
\end_inset
\family typewriter
expression
\family default
.
Rules for this are not explicitly formalized, but use of this syntax is
limited to simple cases where it should be clear what is intended -- certainly
far more clear than the usual English language description would be.
\end_layout
\begin_layout Section
Work List Lemma
\end_layout
\begin_layout Standard
Statement of the Lemma:
\end_layout
\begin_layout Standard
Let
\family typewriter
L
\family default
be a work list iteration.
\end_layout
\begin_layout Standard
Let
\family typewriter
W
\family default
be a work list.
\end_layout
\begin_layout Standard
Then
\family typewriter
0 <= j <= SIZE( W @ End of L )
\begin_inset Formula $\Rightarrow$
\end_inset
PASS( j, L )
\end_layout
\begin_layout Standard
And
\family typewriter
FINAL( SIZE( W @ End of L ), L)
\end_layout
\begin_layout Standard
\noun on
Proof
\noun default
: The proof is from the definitions of loop and work list.
\family typewriter
L
\family default
is defined as being an iteration over every item in
\family typewriter
W
\family default
, even those added during
\family typewriter
L
\family default
, so that any item added to
\family typewriter
W
\family default
before
\family typewriter
End of L
\family default
must be included in the iteration.
\end_layout
\begin_layout Standard
Only one operation is allowed on
\family typewriter
W
\family default
-- adding items to the end.
This operation does not move or delete any items, so no item in
\family typewriter
W
\family default
will ever be moved or deleted.
That means any iterated item will be present in
\family typewriter
W @( End of L)
\family default
, and in the same position as when it was iterated.
Therefore, if there is a
\family typewriter
W[j]@(End of L)
\family default
, the loop variable for pass j will be
\family typewriter
W[j]
\family default
.
\end_layout
\begin_layout Standard
\noun on
QED.
\end_layout
\begin_layout Section
Globals
\end_layout
\begin_layout Subsection
The Globals Assumption
\end_layout
\begin_layout Standard
For the purpose of the proof in this paper, it is assumed that AH2002 and
Marpa are initialized with the same input token stream and grammar.
\end_layout
\begin_layout Subsection
The Globals Lemmas
\end_layout
\begin_layout Standard
Statement of Lemma 1: Let G be a global.
\end_layout
\begin_layout Standard
Then G @ Start of Recognizer @ AH2002 == G @ Start of Recognizer @ Marpa.
\end_layout
\begin_layout Standard
Proof: From the definitions, we know that the Initialization code is shared
by the two algorithms.
From the Globals Assumption, we have that the results of Initialization
depends only on the input tokens and the grammar.
\end_layout
\begin_layout Standard
QED.
\end_layout
\begin_layout Standard
\noun on
Statement of Lemma
\noun default
2: Let G be global not in the Earley sets.
\end_layout
\begin_layout Standard
Let P be a time-set, P
\begin_inset Formula $\subseteq$
\end_inset
Recognizer.
\end_layout
\begin_layout Standard
Let Q be a time-set, P
\begin_inset Formula $\subseteq$
\end_inset
Recognizer.
\end_layout
\begin_layout Standard
Then For all G, where G is a global not in the Earley sets, G @ P == G @
Q.
\end_layout
\begin_layout Standard
Proof: The pseudocode and the definition of the Recognizer show that the
only globals that change in the Recognizer are those in the Earley Sets.
\end_layout
\begin_layout Standard
\noun on
Statement of Lemma 3
\noun default
: The Earley sets are changed only in the the following ways:
\end_layout
\begin_layout Itemize
S[0] is changed in Initialization;
\end_layout
\begin_layout Itemize
S[i] is changed in Completer @ Earley Set i Processing; and
\end_layout
\begin_layout Itemize
S[i+1] is changed in Scanner @ Earley Set i Processing.
\end_layout
\begin_layout Standard
\noun on
Proof
\noun default
: From the pseudocode and the definition of Initialization.
\end_layout
\begin_layout Standard
QED
\end_layout
\begin_layout Section
The Detailed Proof
\end_layout
\begin_layout Subsection
AH2002 Halting Axiom
\end_layout
\begin_layout Standard
In this paper, AH2002 will be assumed to halt for all finite grammars and
input token streams.
AH2002 in their paper do not explicitly prove that this is true in general,
but it is at the very least a safe conjecture.
\end_layout
\begin_layout Subsection
The Completer Lemma
\end_layout
\begin_layout Standard
In this lemma A may be either Marpa or AH2002.
\end_layout
\begin_layout Standard
\noun on
Statement of the Lemma
\noun default
:
\end_layout
\begin_layout Standard
Assume p is a pass in the Completer Loop.
\end_layout
\begin_layout Standard
Assume that
\family typewriter
((state, parent, i) @ AH2002 == (state, parent, i) @ Marpa) @ Top of Completer
Loop(p)
\family default
.
\end_layout
\begin_layout Standard
Assume
\family typewriter
(S[i] @ A == S[i] @ B) @ Top of Completer Loop(p)
\family default
.
\end_layout
\begin_layout Standard
Then
\family typewriter
(S[i] @ A == S[i] @ B) @ Bottom of Completer Loop(p)
\family default
.
\end_layout
\begin_layout Standard
\noun on
Starting the proof
\noun default
:
\end_layout
\begin_layout Standard
\family typewriter
0 <= p <= SIZE(S[i] @ End of Completer Loop)
\family default
.
By the Loop Range Property.
\end_layout
\begin_layout Standard
The proof proceeds in two cases -- the Final Case
\family typewriter
(p == SIZE(S[i] @ End of Completer Loop))
\family default
and the Kernel Case
\family typewriter
(0 <= p < SIZE(S[i] @ End of Completer Loop))
\family default
.
\end_layout
\begin_layout Standard
\noun on
Kernel case
\noun default
:
\end_layout
\begin_layout Standard
\family typewriter
((state, parent, i) @ AH2002 == (state, parent, i) @ Marpa) @ Top of Completer
Loop(p)
\family default
.
By assumption in the statement of the lemma.
\end_layout
\begin_layout Standard
\family typewriter
((state, parent, i) @ AH2002 == (state, parent, i) @ Marpa) @ Top of Completer
@ Completer Loop(p)
\family default
.
From the statement just previous and inspection of the psuedocode.
These variables are never changed between
\family typewriter
Top of Completer Loop(p)
\family default
and
\family typewriter
Top of Completer @ Completer Loop(p)
\family default
.
\end_layout
\begin_layout Standard
For
\family typewriter
G
\family default
in LHS, GOTO, COMPLETED,
\family typewriter
(G @ Marpa == G @ AH2002) @ Top of Completer @ Completer Loop(p)
\family default
.
By the Globals Lemma.
\end_layout
\begin_layout Standard
\family typewriter
(S[i] @ AH2002 == S[i] @ Marpa) @ Start of Completer @ Completer Loop(p
\family default
).
By assumption in the statement of the lemma.
\end_layout
\begin_layout Standard
\family typewriter
(S[i] @ AH2002 == S[i] @ Marpa) @ End of Completer @ Completer Loop(p)
\family default
.
The Completer pseudocode shows that S[i] @ End of Completer is a deterministic
function of S[i] itself, the Completer's input variables and these globals:
LHS, GOTO, COMPLETED.
By the previous 3 grafs, these are the same in both algorithms.
The Completer is also the same in both algorithms.
Therefore the effects of the Completer must be the same in both algorithms.
\end_layout
\begin_layout Standard
\family typewriter
(S[i] @ AH2002 == S[i+f] @ Marpa) @ Bottom of Completer Loop(p)
\family default
.
From the pseudocode.
The current Earley set is never changed between
\family typewriter
Bottom of Completer @ Completer Loop(p)
\family default
and
\family typewriter
Bottom of Completer Loop(p)
\family default
.
\end_layout
\begin_layout Standard
\noun on
This proves the Kernel case
\noun default
.
\end_layout
\begin_layout Standard
\noun on
Final case
\noun default
:
\end_layout
\begin_layout Standard
\family typewriter
FINAL(p, Completer Loop)
\family default
.
\end_layout
\begin_layout Standard
\family typewriter
(S[i] @ Marpa == S[i] @ AH2002) @ Bottom of Completer Loop(p)
\family default
.
From the previous graf and an assumption in the statement of this lemma.
By the definition of the final pass, it is a no-op.
The Completer is not called and
\family typewriter
S[i]
\family default
remains to its value at the
\family typewriter
Top Completer Loop(p)
\family default
.
\end_layout
\begin_layout Standard
\noun on
This proves the final case
\noun default
.
\end_layout
\begin_layout Standard
With the two cases proved, the Lemma is shown.
\end_layout
\begin_layout Standard
QED.
\end_layout
\begin_layout Subsection
The Completer Loop Lemma
\end_layout
\begin_layout Standard
When a task property Lemma is used in this Lemma as a justification, and
no task is mentioned, the task is the Completer pseudocode, the loop is
the Completer Loop and the offset is 1, and the Completer Task Lemma is
to be considered part of the justification.
\end_layout
\begin_layout Standard
\noun on
Statement of the Lemma
\noun default
:
\end_layout
\begin_layout Standard
Let
\family typewriter
p
\family default
be a pass in
\family typewriter
Completer Loop @ Earley Set i Processing
\family default
.
\end_layout
\begin_layout Standard
Assume for all globals G not in the Earley sets,
\family typewriter
(G @ Marpa == G @ AH2002) @ Earley Set i Processing)
\family default
.
\end_layout
\begin_layout Standard
Assume
\family typewriter
(i @ Marpa == i @ AH2002) @ Earley Set Processing
\family default
.
\end_layout
\begin_layout Standard
Assume
\family typewriter
(S[i] @ Marpa == S[i] @ AH2002) @ Top(p) of Completer Loop
\family default
(p).
\end_layout
\begin_layout Standard
Then
\family typewriter
(S[i] @ Marpa == S[i] @ AH2002) @ Bottom of Completer Loop(p)
\family default
.
\end_layout
\begin_layout Standard
\noun on
Proof:
\end_layout
\begin_layout Standard
\family typewriter
(S[i][p] @ AH2002 == S[i][p] @ Marpa) @ Top of Completer Loop(p)
\family default
.
By assumptions in for the lemma.
\end_layout
\begin_layout Standard
\family typewriter
((state, parent) @ AH2002 == (state, parent) @ Marpa) @ Top of Completer
Loop(p)
\family default
.
From the statement just previous.
\family typewriter
(state, parent) == S[i][p]
\family default
because both are the loop variable.
\end_layout
\begin_layout Standard
\family typewriter
(i @ AH2002 == i @ Marpa) @ Top of Completer Loop (p)
\family default
.
From an assumption for this lemma.
\end_layout
\begin_layout Standard
\family typewriter
(S[i] @ AH2002 == S[i] @ Marpa) @ Top of Completer Loop(p)
\family default
.
An assumption for this lemma.
\end_layout
\begin_layout Standard
\family typewriter
(S[i] @ AH2002 == S[i] @ Marpa) @ Bottom of Completer Loop(p)
\family default
.
By the Completer Lemma with
\family typewriter
p
\family default
set to its value in this lemma, and by the three previous grafs.
\end_layout
\begin_layout Standard
\noun on
This proves the lemma.
\end_layout
\begin_layout Subsection
The Current Earley Set Lemma
\end_layout
\begin_layout Standard
\noun on
Statement of the Lemma
\noun default
:
\end_layout
\begin_layout Standard
Assume
\family typewriter
0 <= i <= SIZE(x)
\family default
.
\end_layout
\begin_layout Standard
Assume
\family typewriter
(S @ AH2002 == S @ Marpa) @ Start of Earley Set i Processing
\family default
.
\end_layout
\begin_layout Standard
Then,
\family typewriter
(S[i] @ AH2002 == S[i] @ Marpa) @ End of Earley Set i Processing
\family default
.
\end_layout
\begin_layout Standard
\noun on
Starting the Proof
\noun default
:
\end_layout
\begin_layout Standard
\noun on
Sublemma
\noun default
:
\end_layout
\begin_layout Standard
Assume
\family typewriter
(S[i] @ AH2002 == S[i] @ Marpa) @ Top of Completer Loop(0)
\family default
.
\end_layout
\begin_layout Standard
Assume
\family typewriter
p <= SIZE(S[i]) @ AH2002 @ Bottom of Completer Loop(p)
\family default
or
\family typewriter
p <= SIZE(S[i]) @ Marpa @ Bottom of Completer Loop(p)
\family default
.
\end_layout
\begin_layout Standard
Then
\family typewriter
(S[i] @ AH2002 == S[i] @ Marpa) @ Bottom of Completer Loop(p)
\family default
.
\end_layout
\begin_layout Standard
\noun on
Note
\noun default
: In this sublemma, it is assumed in the statement of the sublemma that
pass p exists for one algorithm, but not for the other.
Loop operations are not defined for non-existent passes, and it must be
proved that pass p exists for both AH2002 and Marpa.
\end_layout
\begin_layout Standard
\noun on
Starting the proof
\noun default
:
\end_layout
\begin_layout Standard
\noun on
Induction Property
\noun default
:
\family typewriter
(S[i] @ A == S[i] @ Marpa) @ Bottom of Completer Loop(p)
\family default
, where the induction is on
\family typewriter
p
\family default
.
\end_layout
\begin_layout Standard
\noun on
Induction
\noun default
\noun on
Basis
\noun default
:
\family typewriter
(S[i] @ A == S[i] @ Marpa) @ Bottom of Completer Loop(0)
\family default
.
By the assumptions in the statement of the sublemma and the Completer Loop
Lemma with
\family typewriter
p := 0
\family default
.
Pass 0 exists for every loop, by definition.
\end_layout
\begin_layout Standard
\noun on
This proves the induction basis.
\end_layout
\begin_layout Standard
\noun on
Induction hypothesis
\noun default
:
\end_layout
\begin_layout Standard
Assume
\family typewriter
(S[i] @ AH2002 == S[i] @ Marpa) @ Bottom of Completer Loop(n)
\family default
.
\end_layout
\begin_layout Standard
\noun on
Inductive step
\noun default
:
\end_layout
\begin_layout Standard
Note: Next it must be shown that
\family typewriter
n+1
\family default
is a pass in both AH2002 and Marpa.
\end_layout
\begin_layout Standard
\noun on
Subsublemma
\noun default
: Let
\family typewriter
(A,B)
\family default
be either
\family typewriter
(AH2002, Marpa)
\family default
or
\family typewriter
(Marpa, AH2002)
\family default
.
\end_layout
\begin_layout Standard
Assume
\family typewriter
PASS(n+1, Completer Loop @ A)
\family default
.
\end_layout
\begin_layout Standard
Then
\family typewriter
PASS(n+1, Completer Loop @ B)
\family default
.
\end_layout
\begin_layout Standard
Starting the Proof:
\end_layout
\begin_layout Standard
\family typewriter
0 <= n < SIZE(DOMAIN(Completer Loop) @ Bottom of Completer Loop(n) @ A))
\family default
.
By the assumption for the sublemma, and the definition of PASS.
\end_layout
\begin_layout Standard
\family typewriter
DOMAIN(Completer Loop)
\family default
is
\family typewriter
S[i].
\family default
By the definition of
\family typewriter
DOMAIN
\family default
.
\end_layout
\begin_layout Standard
\family typewriter
0 <= n+1 <= SIZE(S[i] @ Bottom of Completer Loop(n) @ A)
\family default
.
By the previous 2 grafs.
[
\noun on
Current Set Size Graf
\noun default
.
]
\end_layout
\begin_layout Standard
\family typewriter
(S[i] @ AH2002 == S[i] @ Marpa) @ Bottom of Completer Loop(n).
\family default
By the Induction Hypothesis.
The assumption for the case is that f == 0.
\end_layout
\begin_layout Standard
\family typewriter
(S[i] @ A == S[i] @ B) @ Bottom of Completer Loop(n).
\family default
By the previous, substituting A and B, and transposing the equality if
necessary.
\end_layout
\begin_layout Standard
\family typewriter
0 < n+1 <= SIZE(S[i] @ Bottom of Completer Loop(n) @ B)
\family default
.
By the previous graf and Current Set Size Graf.
\end_layout
\begin_layout Standard
\family typewriter
0 < n+1 <= SIZE(DOMAIN(Completer Loop) @ Bottom of Completer Loop(n) @ A))
\family default
.
By the previous graf and the definition of
\family typewriter
DOMAIN
\family default
.
The domain of the Completer Loop is
\family typewriter
S[i]
\family default
.
\end_layout
\begin_layout Standard
\family typewriter
PASS(n+1, Completer Loop @ B)
\family default
.
By the previous graf and the definition of PASS.
\end_layout
\begin_layout Standard
\noun on
This proves the subsublemma
\noun default
.
\end_layout
\begin_layout Standard
\family typewriter
PASS(n+1, Completer Loop @ AH2002)
\begin_inset Formula $\Rightarrow$
\end_inset
PASS(n+1, Completer Loop @ Marpa)
\family default
.
By the sublemma, with
\family typewriter
(A, B) := (AH2002, Marpa)
\family default
.
\end_layout
\begin_layout Standard
\family typewriter
PASS(n+1, Completer Loop @ Marpa)
\begin_inset Formula $\Rightarrow$
\end_inset
PASS(n+1, Completer Loop @ AH2002)
\family default
.
By the sublemma, with
\family typewriter
(A, B) := (Marpa, AH2002)
\family default
.
\end_layout
\begin_layout Standard
Pass Equivalence Graf:
\family typewriter
PASS(n+1, Completer Loop @ AH2002)
\begin_inset Formula $\Leftrightarrow$
\end_inset
PASS(n+1, Completer Loop @ Marpa)
\family default
.
By the previous two grafs.
\end_layout
\begin_layout Standard
\family typewriter
PASS(n+1, Completer Loop @ AH2002)
\family default
.
By the assumption for Sublemma 1 and the Pass Equivalence Graf.
\end_layout
\begin_layout Standard
\family typewriter
PASS(n+1, Completer Loop @ Marpa)
\family default
.
By the assumption for Sublemma 1 and the Pass Equivalence Graf.
\end_layout
\begin_layout Standard
\family typewriter
(S[i+f] @ AH2002 == S[i+f] @ Marpa) @ Bottom of Completer Loop(p)
\family default
.
By the previous 2 grafs and the Earley Item Induction Property
\end_layout
\begin_layout Standard
\noun on
This proves the induction
\noun default
.
\end_layout
\begin_layout Standard
\noun on
This proves the sublemma
\noun default
.
\end_layout
\begin_layout Standard
\noun on
Note
\noun default
: The sublemma does most of the work of this Lemma.
In statements that follow, I show
\family typewriter
(S[i] @ AH2002 == S[i] @ Marpa)
\family default
holds as we proceed through the Earley Set Processing.
To do this, I use the sublemma for the portion of the Earley Set Processing
where
\family typewriter
S[i]
\family default
might change.
Where
\family typewriter
S[i]
\family default
remains unchanged in both algorithms, this is established by referring
to the pseudocode.
\end_layout
\begin_layout Standard
(
\family typewriter
S[i] @ AH2002 == S[i] @ Marpa) @ Start of Earley Set i Processing
\family default
.
From the assumption in the Statement of the Lemma, substituting
\family typewriter
S[i]
\family default
for
\family typewriter
S
\family default
.
\end_layout
\begin_layout Standard
\family typewriter
((S[i] @ AH2002 == S[i] @ Marpa) @ Top of Completer Loop(0)) @ Earley Set
i Processing
\family default
.
From the pseudocode and the previous graf.
[ Top Identical Graf.
]
\end_layout
\begin_layout Standard
Let
\family typewriter
z
\family default
be the pass such that
\family typewriter
(AH2002 @ End of Completer Loop == AH2002 @ Bottom of Completer Loop(z))
@ Earley Set i Processing
\family default
.
By the AH2002 Halting Axiom there is such a pass.
\end_layout
\begin_layout Standard
\family typewriter
(S[i] @ AH2002 @ Bottom of Completer Loop(z) == S[i] @ Marpa @ Bottom of
Completer Loop(z)) @ Earley Set i Processing
\family default
.
By the previous two statements and sublemma.
Pass
\family typewriter
z
\family default
exists for AH2002 by definition.
That it exists for Marpa will be shown by the sublemma.
\end_layout
\begin_layout Standard
\family typewriter
(S[i] @ Marpa @ End of Completer Loop == S[i] @ Marpa @ Bottom of Completer
Loop(z)) @ Earley Set i Processing
\family default
.
By the last two statements and the loop definitions.
If at pass z the Earley sets are identical, and the Earley set in one algorithm
ends the loop, it must also end the loop in the other algorithm at pass
z.
\end_layout
\begin_layout Standard
\family typewriter
(S[i] @ AH2002 == S[i] @ Marpa) @ End of Completer Loop @ Earley Set i Processin
g
\family default
.
By the three statements just previous.
\end_layout
\begin_layout Standard
\family typewriter
(S[i] @ AH2002 == S[i] @ Marpa) @ End of Earley Set i Processing
\family default
.
From the pseudocode.
\end_layout
\begin_layout Standard
\noun on
This proves the Lemma.
\end_layout
\begin_layout Subsection
The Scanner Lemma
\end_layout
\begin_layout Standard
\noun on
Statement of the Lemma
\noun default
:
\end_layout
\begin_layout Standard
Assume
\family typewriter
PASS(p,
\family default
\family typewriter
Scanner Loop
\family default
).
\end_layout
\begin_layout Standard
Assume that
\family typewriter
((state, parent) @ AH2002 == (state, parent) @ Marpa) @ Top of Scanner Loop
\family default
(p).
\end_layout
\begin_layout Standard
Assume that
\family typewriter
(i @ AH2002 == i @ Marpa) @ Top of Scanner Loop(p)
\family default
.
\end_layout
\begin_layout Standard
Assume
\family typewriter
(S[i+1] @ AH2002 == S[i+1] @ Marpa) @ Top of Scanner Loop(p)
\family default
.
\end_layout
\begin_layout Standard
Then
\family typewriter
(S[i+1] @ AH2002 == S[i+1] @ Marpa) @ Bottom of Scanner Loop(p).
\end_layout
\begin_layout Standard
\noun on
Starting the proof
\noun default
:
\end_layout
\begin_layout Standard
\family typewriter
0 <= p <= SIZE(S[i] @ End of Scanner Loop)
\family default
.
From as assumption for the lemma and the definition of
\family typewriter
PASS
\family default
.
\end_layout
\begin_layout Standard
The proof proceeds in two cases -- the Final Case (
\family typewriter
p == SIZE(S[i] @ End of Scanner Loop)
\family default
) and the Kernel Case (
\family typewriter
0 <= p < SIZE(S[i] @ End of Scanner Loop)
\family default
).
\end_layout
\begin_layout Standard
\noun on
Kernel case
\noun default
:
\end_layout
\begin_layout Standard
\family typewriter
((state, parent, i) @ AH2002 == (state, parent, i) @ Marpa) @ Top of Scanner
Loop(p)
\family default
.
By assumption in the statement of the lemma.
\end_layout
\begin_layout Standard
\family typewriter
((state, parent, i) @ AH2002 == (state, parent, i) @ Marpa) @ Top of Scanner
@ Scanner Loop(p)
\family default
.
From the graf just previous and the psuedocode.
These variables are never changed between
\family typewriter
Top of Scanner Loop(p)
\family default
and
\family typewriter
Top of Scanner @ Scanner Loop(p)
\family default
.
[
\noun on
Input Variables Graf ]
\end_layout
\begin_layout Standard
\family typewriter
G
\family default
\begin_inset Formula $\in$
\end_inset
\family typewriter
(x, GOTO)
\family default
\begin_inset Formula $\Rightarrow$
\end_inset
\family typewriter
(G @ Marpa == G @ AH2002) @ Top of Scanner @ Scanner Loop(p)
\family default
.
By the Globals Lemmas.
[
\noun on
Globals Graf ]
\end_layout
\begin_layout Standard
\family typewriter
(S[i+1] @ AH2002 == S[i+1] @ Marpa) @ Top of Scanner Loop(p)
\family default
.
By an assumption in the statement of the lemma.
\end_layout
\begin_layout Standard
\family typewriter
(S[i+1] @ AH2002 == S[i+1] @ Marpa) @ Top of Scanner @ Scanner Loop(p)
\family default
.
From the graf just previous and inspection of the pseudocode.
This Earley set is never changed outside of Scanner.
[
\noun on
Next Earley Set Graf ]
\end_layout
\begin_layout Standard
\family typewriter
S[i+1] @ AH2002 @ Bottom of Scanner @ Scanner Loop(p) == S[i+1] @ Marpa
@ Bottom of Scanner @ Scanner Loop(p)
\family default
.
From the pseudocode, the Input Variables Statement, the Globals Statement
and Destination Statement.
Inspection of Scanner shows that
\family typewriter
S[i+1] @ End of Scanner
\family default
is a deterministic function.
Inputs of the function are
\family typewriter
S[i+1]
\family default
itself, the
\family typewriter
Scanner
\family default
's input variables and the globals
\family typewriter
x
\family default
and
\family typewriter
GOTO
\family default
.
These inputs are the same in both algorithms by the Next Earley Set Graf,
the Input Variables Graf and the Globals Graf.
The
\family typewriter
Scanner
\family default
is also the same in both algorithms.
Therefore the result must be the same.
\end_layout
\begin_layout Standard
\family typewriter
(S[i+1] @ AH2002 == S[i+f] @ Marpa) @ Bottom of Scanner Loop(p)
\family default
.
From the pseudocode.
This Earley set is never changed between
\family typewriter
Bottom of Scanner @ Scanner Loop(p)
\family default
and
\family typewriter
Bottom of Scanner Loop(p)
\family default
.
\end_layout
\begin_layout Standard
\noun on
This proves the Kernel case
\noun default
.
\end_layout
\begin_layout Standard
\noun on
Final case
\noun default
:
\end_layout
\begin_layout Standard
\noun on
Assumption for this case
\noun default
:
\family typewriter
FINAL(p, Scanner Loop).
\end_layout
\begin_layout Standard
\family typewriter
(S[i+1] @ Marpa == S[i+1] @ AH2002) @ Bottom of Scanner Loop(p)
\family default
.
From an assumption in the statement of this lemma, the graf just previous
and the loop definitions.
The final pass of a loop is always a no-op.
\end_layout
\begin_layout Standard
\noun on
This proves the final case
\noun default
.
\end_layout
\begin_layout Standard
With the two cases proved, the Lemma is shown.
\end_layout
\begin_layout Standard
QED.
\end_layout
\begin_layout Subsection
The Scanner Loop Lemma
\end_layout
\begin_layout Standard
\noun on
Statement of the Lemma
\noun default
:
\end_layout
\begin_layout Standard
Assume that
\family typewriter
PASS(p,
\family default
\family typewriter
Scanner Loop @ Earley Set i Processing
\family default
).
\end_layout
\begin_layout Standard
Assume that
\family typewriter
(S @ AH2002 == S @ Marpa) @ End of Earley Set i Processing
\family default
.
\end_layout
\begin_layout Standard
Assume
\family typewriter
(i @ Marpa == i @ AH2002) @ Earley Set Processing
\family default
.
\end_layout
\begin_layout Standard
Assume
\family typewriter
(S[i+1] @ Marpa == S[i+1] @ AH2002) @ Top of Scanner Loop(p)
\family default
.
\end_layout
\begin_layout Standard
Then
\family typewriter
(S[i+1] @ Marpa == S[i+1] @ AH2002) @ Bottom of Scanner Loop (p)
\family default
.
\end_layout
\begin_layout Standard
\noun on
Proof:
\end_layout
\begin_layout Standard
\family typewriter
(S[i] @ AH2002 == S[i] @ Marpa) @ End of Earley Set i Processing
\family default
.
By an assumption for the lemma, with S[i] substituted for S.
\end_layout
\begin_layout Standard
\family typewriter
(S[i][p] @ AH2002 == S[i][p] @ Marpa) @ End of Earley Set i Processing
\family default
.
From the statement just previous.
p exists in both algorithms because S[i] is the domain of Scanner Loop,
and by assumption for the lemma, p is a pass.
\end_layout
\begin_layout Standard
\family typewriter
End of Scanner Loop @ Earley Set i Processing == End of Earley Set i Processing
\family default
.
From the pseudocode.
Recall that
\family typewriter
Combined Loop @ AH2002 == Scanner Loop @ AH2002
\family default
.
\end_layout
\begin_layout Standard
\family typewriter
(S[i][p] @ AH2002 == S[i][p] @ Marpa) @ End of Scanner Loop @ Earley Set
i Processing
\family default
.
From the previous 2 grafs.
\end_layout
\begin_layout Standard
\family typewriter
(S[i][p] @ Top of Scanner Loop(p) == S[i][p] @ End of Scanner Loop) @ Earley
Set i Processing
\family default
.
S[i][p] exists at
\family typewriter
Top of Scanner Loop(p)
\family default
by the definition of PASS.
It is identical to S[i][p] at End of Scanner Loop because worklist items,
once they exist, are never changed or deleted.
\end_layout
\begin_layout Standard
\family typewriter
(S[i][p] @ AH2002 == S[i][p] @ Marpa) @ Top of Scanner Loop(p) @ Earley
Set i Processing
\family default
.
From the previous 2 grafs.
\end_layout
\begin_layout Standard
\family typewriter
((state, parent) == S[i][p]) @ Top of Scanner Loop(p) @ Earley Set i Processing
\family default
.
Because (state, parent) and S[i][p] are two different names for the loop
variable, they must be identical.
\end_layout
\begin_layout Standard
\family typewriter
((state, parent) @ AH2002 == (state, parent) @ Marpa) @ Top of Scanner Loop(p)
@ Earley Set i Processing
\family default
.
From the previous two statements.
\end_layout
\begin_layout Standard
\family typewriter
(i @ AH2002 == i @ Marpa) @ Top of Scanner Loop (p)
\family default
.
From an assumption in the statement of this lemma.
\end_layout
\begin_layout Standard
\family typewriter
(S[i+1] @ AH2002 == S[i+1] @ Marpa) @ Top of Scanner Loop(p)
\family default
.
An assumption in the statement of this lemma.
\end_layout
\begin_layout Standard
\family typewriter
(S[i+1] @ AH2002 == S[i+1] @ Marpa) @ Bottom of Scanner Loop(p)
\family default
.
By the Scanner Lemma with p set to its value in this Lemma, and by the
three statements just previous.
\end_layout
\begin_layout Standard
\noun on
This proves the lemma.
\end_layout
\begin_layout Subsection
The Next Earley Set Lemma
\end_layout
\begin_layout Standard
\noun on
Statement of the Lemma
\noun default
:
\end_layout
\begin_layout Standard
Assume
\family typewriter
SIZE(S[i] @ AH2002) == SIZE(S[i] @ Marpa) @ End of Earley Set i Processing
\family default
.
\end_layout
\begin_layout Standard
Assume
\family typewriter
0 <= i <= SIZE(x)
\family default
.
\end_layout
\begin_layout Standard
Assume
\family typewriter
(S @ AH2002 == S @ Marpa) @ Start of Earley Set i Processing
\family default
.
\end_layout
\begin_layout Standard
Then,
\family typewriter
(S[i+1] @ AH2002 == S[i+1] @ Marpa) @ End of Earley Set i Processing
\family default
.
\end_layout
\begin_layout Standard
\noun on
Starting the Proof
\noun default
:
\end_layout
\begin_layout Standard
\noun on
Sublemma
\noun default
:
\end_layout
\begin_layout Standard
Assume
\family typewriter
(S[i+1] @ AH2002 == S[i+1] @ Marpa) @ Top of Scanner Loop(0)
\family default
.
\end_layout
\begin_layout Standard
Assume
\family typewriter
p <= SIZE(S[i+f]) @ AH2002 @ Bottom of Scanner Loop(p)
\family default
or
\family typewriter
p <= SIZE(S[i+f]) @ Marpa @ Bottom of Scanner Loop(p)
\family default
.
\end_layout
\begin_layout Standard
Then
\family typewriter
(S[i+f] @ AH2002 == S[i+f] @ Marpa) @ Bottom of Scanner Loop(p)
\family default
.
\end_layout
\begin_layout Standard
\noun on
Note
\noun default
: In this sublemma, it is assumed in the statement of the sublemma that
pass p exists for one algorithm, but not for the other.
Loop operations are not defined for non-existent passes, and it must be
proved that pass p exists for both AH2002 and Marpa.
\end_layout
\begin_layout Standard
\noun on
Starting the proof
\noun default
:
\end_layout
\begin_layout Standard
\noun on
Induction Property
\noun default
:
\family typewriter
(S[i+f] @ A == S[i+f] @ Marpa) @ Bottom of Scanner Loop(p)
\family default
, where the induction is on
\family typewriter
p
\family default
.
\end_layout
\begin_layout Standard
\noun on
Induction
\noun default
\noun on
Basis
\noun default
:
\family typewriter
(S[i+f] @ A == S[i+f] @ Marpa) @ Bottom of Scanner Loop(0)
\family default
.
By the assumptions in the statement of Sublemma 1 and the Earley Item Induction
Property Lemma with p == 0.
Pass 0 always exists for every loop, by definition.
\end_layout
\begin_layout Standard
\noun on
This proves the induction basis.
\end_layout
\begin_layout Standard
\noun on
Induction hypothesis
\noun default
:
\end_layout
\begin_layout Standard
Assume
\family typewriter
(S[i+f] @ AH2002 == S[i+f] @ Marpa) @ Bottom of Scanner Loop(n)
\family default
.
\end_layout
\begin_layout Standard
\noun on
Inductive step
\noun default
:
\end_layout
\begin_layout Standard
Note: Next it must be shown that
\family typewriter
n+1
\family default
is a pass in both
\family typewriter
AH2002
\family default
and
\family typewriter
Marpa
\family default
.
\end_layout
\begin_layout Standard
\noun on
Subsublemma
\noun default
: Let
\family typewriter
(A,B)
\family default
be either
\family typewriter
(AH2002, Marpa)
\family default
or
\family typewriter
(Marpa, AH2002)
\family default
.
\end_layout
\begin_layout Standard
Assume
\family typewriter
PASS(n+1, Scanner Loop @ A)
\family default
.
\end_layout
\begin_layout Standard
Then
\family typewriter
PASS(n+1, Scanner Loop @ B)
\family default
.
\end_layout
\begin_layout Standard
\family typewriter
SIZE(S[i] @ AH2002) == SIZE(S[i] @ Marpa) @ End of Earley Set i Processing
\family default
.
From an assumption for this lemma.
\end_layout
\begin_layout Standard
\family typewriter
(S[i] @ A == S[i] @ B) @ End of Earley Set i Processing.
\family default
By the previous, substituting A and B, and transposing the equality if
necessary.
\end_layout
\begin_layout Standard
\family typewriter
(S[i] @ A == S[i] @ B) @ End of Scanner Loop.
\family default
From the previous graf and the pseudocode.
Nothing between
\family typewriter
End of Scanner Loop
\family default
and
\family typewriter
End of Earley Set i Processing
\family default
changes
\family typewriter
S[i].
[
\family default
\noun on
End of Scanner Loop Equality Graf
\noun default
]
\end_layout
\begin_layout Standard
\family typewriter
0 <= n < SIZE(DOMAIN(Scanner Loop) @ Bottom of Scanner Loop(n) @ A))
\family default
.
By assumption for the sublemma and the definition of
\family typewriter
PASS
\family default
.
\end_layout
\begin_layout Standard
\family typewriter
0 <= n+1 <= SIZE(S[i] @ Bottom of Scanner Loop(n) @ A)
\family default
.
By the previous graf, the loop definitions.
By the definition of
\family typewriter
DOMAIN
\family default
,
\family typewriter
DOMAIN(Scanner Loop) == S[i]
\family default
.
\end_layout
\begin_layout Standard
\family typewriter
SIZE(S[i] @ Bottom of Scanner Loop(n) @ A) <= SIZE(S[i] @ End of Scanner
Loop @ A)
\family default
.
By the loop definitions.
End of Scanner Loop is at or after Bottom of Scanner Loop(n) and the loop
domain never decreases in size.
\end_layout
\begin_layout Standard
\family typewriter
0 <= n+1 <= SIZE(S[i] @ End of Scanner Loop @ A)
\family default
.
From the previous 2 grafs.
\end_layout
\begin_layout Standard
\family typewriter
0 <= n+1 <= SIZE(S[i] @ End of Scanner Loop @ B)
\family default
.
From the previous and the
\family typewriter
End of Scanner Loop Equality Graf
\family default
.
\end_layout
\begin_layout Standard
\family typewriter
PASS(n+1, Scanner Loop @ B)
\family default
.
From the previous graf and definition of
\family typewriter
PASS
\family default
.
\end_layout
\begin_layout Standard
\noun on
This proves the subsublemma
\noun default
.
\end_layout
\begin_layout Standard
\family typewriter
PASS(n+1, Scanner Loop @ AH2002)
\begin_inset Formula $\Rightarrow$
\end_inset
PASS(n+1, Scanner Loop @ Marpa)
\family default
.
By the subsublemma, with
\family typewriter
(A, B) := (AH2002, Marpa)
\family default
.
\end_layout
\begin_layout Standard
\family typewriter
PASS(n+1, Scanner Loop @ Marpa)
\begin_inset Formula $\Rightarrow$
\end_inset
PASS(n+1, Scanner Loop @ AH2002)
\family default
.
By the subsublemma, with
\family typewriter
(A, B) := (Marpa, AH2002)
\family default
.
\end_layout
\begin_layout Standard
\family typewriter
PASS(n+1, Scanner Loop @ AH2002)
\begin_inset Formula $\Leftrightarrow$
\end_inset
PASS(n+1, Scanner Loop @ Marpa)
\family default
.
By the previous two grafs.
[
\noun on
Pass Equivalence Graf
\noun default
]
\end_layout
\begin_layout Standard
\family typewriter
PASS(n+1, Scanner Loop @ AH2002)
\family default
.
By the assumption for the sublemma and the Pass Equivalence Graf.
\end_layout
\begin_layout Standard
\family typewriter
PASS(n+1, Scanner Loop @ Marpa)
\family default
.
By the assumption for the sublemma and the Pass Equivalence Graf.
\end_layout
\begin_layout Standard
\family typewriter
(S[i+f] @ AH2002 == S[i+f] @ Marpa) @ Top of Scanner Loop(p+1)
\family default
.
By the loop definitions and the induction hypothesis.
The two grafs just previous show that pass
\family typewriter
n+1
\family default
exists for both algorithms.
\end_layout
\begin_layout Standard
\family typewriter
(S[i+f] @ AH2002 == S[i+f] @ Marpa) @ Bottom of Scanner Loop(p+1)
\family default
.
By the previous graf and the Scanner Loop Lemma.
\end_layout
\begin_layout Standard
\noun on
This proves the induction
\noun default
.
\end_layout
\begin_layout Standard
\noun on
This proves sublemma
\noun default
.
\end_layout
\begin_layout Standard
\noun on
Note
\noun default
: The sublemma does most of the work of this lemma.
In statements that follow, I show
\family typewriter
(S[i+1] @ AH2002 == S[i+1] @ Marpa)
\family default
holds as we proceed through the
\family typewriter
Earley Set Processing
\family default
.
To do this, I use the sublemma for the portion of the
\family typewriter
Earley Set Processing
\family default
where
\family typewriter
S[i+1]
\family default
might change.
In most cases
\family typewriter
S[i+1]
\family default
remains unchanged in both algorithms, as is clear from the pseudocode.
\end_layout
\begin_layout Standard
\family typewriter
(S[i+1] @ AH2002 == S[i+1] @ Marpa) @ Start of Earley Set i Processing
\family default
.
From the assumption in the Statement of the Lemma, substituting
\family typewriter
S[i+1]
\family default
for
\family typewriter
S
\family default
.
\end_layout
\begin_layout Standard
\family typewriter
((S[i+1] @ AH2002 == S[i+1] @ Marpa) @ Top of Scanner Loop(0)) @ Earley
Set i Processing
\family default
.
From the pseudocode and the previous graf.
[
\noun on
Top Identical Graf
\noun default
]
\end_layout
\begin_layout Standard
Let
\family typewriter
z
\family default
be the pass such that
\family typewriter
(AH2002 @ End of Scanner Loop == AH2002 @ Bottom of Scanner Loop(z)) @ Earley
Set i Processing
\family default
.
By the AH2002 Halting Axiom there is such a pass.
\end_layout
\begin_layout Standard
\family typewriter
(S[i+1] @ AH2002 == S[i+f] @ Marpa) @ Bottom of Scanner Loop(z) @ Earley
Set i Processing
\family default
.
By the previous two statements and sublemma.
Pass
\family typewriter
z
\family default
exists for
\family typewriter
AH2002
\family default
by definition.
That it exists for
\family typewriter
Marpa
\family default
will be shown by the sublemma.
\end_layout
\begin_layout Standard
\family typewriter
(S[i+1] @ Marpa @ End of Scanner Loop == S[i+1] @ Marpa @ Bottom of Scanner
Loop(z)) @ Earley Set i Processing
\family default
.
By the last two statements and the loop definitions.
If at pass
\family typewriter
z
\family default
the Earley sets are identical, and the Earley set in one algorithm ends
the loop, it must also end the loop in the other algorithm at pass
\family typewriter
z
\family default
.
\end_layout
\begin_layout Standard
\family typewriter
(S[i+1] @ AH2002 == S[i+1] @ Marpa) @ End of Scanner Loop @ Earley Set i
Processing
\family default
.
By the three statements just previous.
\end_layout
\begin_layout Standard
\family typewriter
(S[i+1] @ AH2002 == S[i+1] @ Marpa) @ End of Earley Set i Processing
\family default
.
From the pseudocode.
\end_layout
\begin_layout Standard
\noun on
This proves the Lemma.
\end_layout
\begin_layout Subsection
Earley Set Processing Lemma
\end_layout
\begin_layout Standard
\noun on
Statement of the Lemma
\noun default
:
\end_layout
\begin_layout Standard
Assume for
\family typewriter
(S @ Marpa == S @ AH2002) @ Start of Earley Set n Processing
\family default
.
\end_layout
\begin_layout Standard
Assume also, that
\family typewriter
0 < n < SIZE(x)
\family default
.
\end_layout
\begin_layout Standard
Then
\family typewriter
(S @ Marpa == S @ AH2002) @ End of Earley Set n Processing
\family default
.
\end_layout
\begin_layout Standard
\noun on
Starting the Proof
\noun default
:
\end_layout
\begin_layout Standard
For all
\family typewriter
x
\family default
,
\family typewriter
(x == n or x == n+1 or S[x] @ Marpa == S[x] @ AH2002) @ End of Earley Set
n Processing
\family default
.
From the Globals Lemma, and the assumption in the Statement of the Lemma.
\end_layout
\begin_layout Standard
\family typewriter
(S[n] @ Marpa == S[n] @ AH2002) @ End of Earley Set n Processing
\family default
.
From the Current Earley Set Lemma
\family typewriter
(i := n)
\family default
, and the assumptions in the Statement of the Lemma.
\end_layout
\begin_layout Standard
\family typewriter
(S[n+1] @ Marpa == S[n+1] @ AH2002) @ End of Earley Set n Processing
\family default
.
From the Next Earley Set Lemma
\family typewriter
(i := n)
\family default
, the previous graf, and the assumptions in the Statement of the Lemma.
\end_layout
\begin_layout Standard
For all x,
\family typewriter
(S[x] @ Marpa == S[x] @ AH2002) @ End of Earley Set n Processing
\family default
.
Combining the three statements just previous.
\end_layout
\begin_layout Standard
\family typewriter
(S @ Marpa == S @ AH2002) @ End of Earley Set n Processing
\family default
.
From the previous statement.
\end_layout
\begin_layout Standard
QED.
\end_layout
\begin_layout Subsection
AH2002 and Marpa are Equivalent
\end_layout
\begin_layout Standard
We regard AH2002 and Marpa as equivalent if, given the Globals Assumption,
the Earley sets are identical at the end of each algorithm.
The Globals Assumption has been assumed throughout this proof.
\end_layout
\begin_layout Standard
\noun on
Statement of the Theorem
\noun default
:
\family typewriter
S @ End of AH2002 == S @ End of Marpa
\family default
.
\end_layout
\begin_layout Standard
\noun on
lemma
\noun default
: Assume
\family typewriter
0 < s < SIZE(x)
\family default
.
\end_layout
\begin_layout Standard
Then
\family typewriter
(S @ AH2002 == S @ Marpa) @ End of Recognizer Step s
\family default
.
\end_layout
\begin_layout Standard
\noun on
Starting the Proof
\noun default
: The proof is by induction on the Recognizer Steps.
\end_layout
\begin_layout Standard
\noun on
Induction Property
\noun default
:
\family typewriter
(S @ AH2002 == S @ Marpa) @ End of Recognizer Step t
\family default
, where the induction is on
\family typewriter
t
\family default
.
\end_layout
\begin_layout Standard
\noun on
Induction Base
\noun default
:
\end_layout
\begin_layout Standard
\family typewriter
(S @ AH2002 == S @ Marpa) @ Start of Recognizer Step 0
\family default
.
By the Globals Lemmas.
\end_layout
\begin_layout Standard
\family typewriter
(S @ AH2002 == S @ Marpa) @ End of Recognizer Step 0
\family default
.
By the just previous statement, the Globals Lemmas and the Earley Sets
Processing Lemma with
\family typewriter
n := 0
\family default
.
There will always be an Earley set 0 and therefore by the definition of
the Recognizer, a Recognizer Step 0.
\end_layout
\begin_layout Standard
\noun on
Induction Hypothesis
\noun default
:
\end_layout
\begin_layout Standard
\family typewriter
(S @ AH2002 == S @ Marpa) @ End of Recognizer Step t
\family default
.
\end_layout
\begin_layout Standard
\noun on
Inductive Step
\noun default
:
\end_layout
\begin_layout Standard
\family typewriter
0 <= t+1 < SIZE(x)
\family default
.
From the statement of the Sublemma.
[
\noun on
Induction Limit Graf ]
\end_layout
\begin_layout Standard
\family typewriter
(S @ AH2002 == S @ Marpa) @ Start of Recognizer Step t+
\family default
1.
By the definition of the Recognizer, the Induction Hypothesis and the Induction
Limit Graf.
\end_layout
\begin_layout Standard
\family typewriter
(S @ AH2002 == S @ Marpa) @ End of Recognizer Step t+1
\family default
.
By the just previous statement, the Induction Limit Graf, and the Earley
Set Processing Lemma with
\family typewriter
(n := s+1)
\family default
.
\end_layout
\begin_layout Standard
\noun on
This completes the induction
\noun default
.
\end_layout
\begin_layout Standard
\noun on
This proves the lemma
\noun default
.
\end_layout
\begin_layout Standard
\family typewriter
(S @ AH2002 == S @ Marpa) @ End of Recognizer Loop(SIZE(x) - 1)
\family default
.
By the lemma with
\family typewriter
s := SIZE(x)-1
\family default
.
\end_layout
\begin_layout Standard
\family typewriter
(S @ AH2002 == S @ Marpa) @ Start of Recognizer Loop(SIZE(x))
\family default
.
By the graf just previous and the loop definitions.
SIZE(x) is a pass by the definition of the Recognizer.
\end_layout
\begin_layout Standard
\family typewriter
(S @ AH2002 == S @ Marpa) @ End of Recognizer Loop(SIZE(x))
\family default
.
By the loop definitions and the previous graf.
SIZE(x) is the final pass of Recognizer Loop, by its definition.
The final pass is a no-op, so the values of S will not change in either
algorithm.
\end_layout
\begin_layout Standard
\family typewriter
(S @ AH2002 == S @ Marpa) @ End of Recognizer
\family default
.
By the definition of the Recognizer and the previous graf.
\end_layout
\begin_layout Standard
\family typewriter
S @ End of AH2002 == S @ End of Marpa
\family default
.
By the graf just previous and by the definition of the two algorithms.
\end_layout
\begin_layout Standard
\noun on
QED
\noun default
.
\end_layout
\end_body
\end_document