The Perl Toolchain Summit needs more sponsors. If your company depends on Perl, please support this very important event.
#LyX 1.6.4 created this file. For more info see http://www.lyx.org/
\lyxformat 345
\begin_document
\begin_header
\textclass article
\begin_preamble
% Copyright 2012 Jeffrey Kegler
% This document is not part of the Marpa or Marpa::XS source.
% Although it may be included with a Marpa distribution that
% is under an open source license, this document is
% not under that open source license.
% Jeffrey Kegler retains full rights.
\end_preamble
\use_default_options false
\language english
\inputencoding auto
\font_roman default
\font_sans default
\font_typewriter default
\font_default_family default
\font_sc false
\font_osf false
\font_sf_scale 100
\font_tt_scale 100

\graphics default
\paperfontsize 12
\spacing single
\use_hyperref false
\papersize default
\use_geometry true
\use_amsmath 1
\use_esint 1
\cite_engine basic
\use_bibtopic false
\paperorientation portrait
\leftmargin 0.75in
\topmargin 0.75in
\rightmargin 0.75in
\bottommargin 0.75in
\secnumdepth 3
\tocdepth 3
\paragraph_separation indent
\defskip medskip
\quotes_language english
\papercolumns 1
\papersides 1
\paperpagestyle default
\tracking_changes false
\output_changes false
\author "" 
\author "" 
\end_header

\begin_body

\begin_layout Title
Marpa Correctness Proof
\end_layout

\begin_layout Author
Jeffrey Kegler
\end_layout

\begin_layout Abstract
This is a 
\begin_inset Quotes eld
\end_inset

correctness
\begin_inset Quotes erd
\end_inset

 proof for the Marpa algorithm.
 
\begin_inset Quotes eld
\end_inset

Correctness
\begin_inset Quotes erd
\end_inset

 is shown by demonstrating equivalence to the Aycock & Horspool's modification
 of Earley's algorithm in their 2002 paper.
\end_layout

\begin_layout Abstract
\begin_inset CommandInset toc
LatexCommand tableofcontents

\end_inset


\end_layout

\begin_layout Section
THIS IS A DRAFT
\end_layout

\begin_layout Standard
This paper is in late stages, but not yet finished.
 It will contain many errors of the kind that the last drafts should remove.
\end_layout

\begin_layout Section
Introduction
\end_layout

\begin_layout Standard
The paper proves the equivalence of Marpa to Aycock and Horspool's 2002
 revision of Earley's algorithm (AH2002).
 The intent is to establish confidence that the Marpa algorithm is correct.
 The reader is assumed to be familiar with standard grammar notation, LR(0)
 automata, Earley's algorithm and Aycock and Horspool's 2002 paper
\begin_inset Foot
status collapsed

\begin_layout Plain Layout
Aycock and Horspool 2002
\end_layout

\end_inset

.
\end_layout

\begin_layout Section
The Two Algorithms
\end_layout

\begin_layout Standard
In this paper, the logic of both the Marpa and the AH2002 algorithm is divided
 among five sections
\end_layout

\begin_layout Itemize

\family typewriter
Initializer
\end_layout

\begin_layout Itemize

\family typewriter
Recognizer
\end_layout

\begin_layout Itemize

\family typewriter
Earley Set Processing
\end_layout

\begin_layout Itemize

\family typewriter
Scanner
\end_layout

\begin_layout Itemize

\family typewriter
Completer
\end_layout

\begin_layout Standard
The 
\family typewriter
Initializer
\family default
, 
\family typewriter
Recognizer
\family default
, 
\family typewriter
Scanner
\family default
 and 
\family typewriter
Completer
\family default
 are the same for both algorithms.
 
\family typewriter
Marpa
\family default
 and 
\family typewriter
AH2002
\family default
 differ only in their 
\family typewriter
Earley Set Processing
\family default
.
\end_layout

\begin_layout Standard
Pseudocode is provided for 
\family typewriter
Recognizer
\family default
, 
\family typewriter
Scanner
\family default
, 
\family typewriter
Completer
\family default
 and each algorithm's 
\family typewriter
Earley Set Processing
\family default
.
 All processing is deterministic.
 Variables are either global to the algorithm or local to their section.
\end_layout

\begin_layout Standard
Sections 
\begin_inset Quotes eld
\end_inset

call
\begin_inset Quotes erd
\end_inset

 other sections.
 When called, some sections take arguments, which become input variables
 in the called section.
 Input variables are local to the section, and their semantics is call-by-value.
 Despite the similarity to the semantics of subroutines in certain languages,
 no implementation is implied.
 Sections do not return a value.
\end_layout

\begin_layout Section
Earley Sets
\end_layout

\begin_layout Standard
Each Earley set is an ordered list of Earley items.
 In AH2002 and Marpa, every Earley item is a 
\family typewriter
(state, parent)
\family default
 duple, where 
\family typewriter
state
\family default
 is an AHFA state.
 AHFA states are states of the split 
\begin_inset Formula $\epsilon$
\end_inset

-DFA automatons first described by Aycock and Horspool.
 In this document a split 
\begin_inset Formula $\epsilon$
\end_inset

-DFA automaton is called an AHFA (Aycock-Horspool Finite Automaton) and
 its states are called AHFA states.
 Details of AHFA's and their states are not relevant to this proof and may
 be found in Aycock and Horspool 2002.
\end_layout

\begin_layout Standard
In the 
\family typewriter
(state, parent)
\family default
 Earley item duple, 
\family typewriter
parent
\family default
 is a parent pointer.
 A parent pointer is a pointer to an Earley set, possibly the same one of
 which the Earley item is a member.
\end_layout

\begin_layout Subsection
Worklists
\end_layout

\begin_layout Standard
In both the AH2002 and Marpa algorithms, the Earley sets need to be arrays
 that obey the semantics of worklists.
 Worklists are arrays which contain work items.
 Work items in a work list never disappear, unless the entire work list
 goes out of scope.
 Worklists only change in one way -- by the addition of new work items at
 the end of the work list.
\end_layout

\begin_layout Standard
A dynamic array can be implemented so that is never changed except to grow
 it at the end.
 Such an array would provide worklist semantics.
\end_layout

\begin_layout Subsection
AHFA States Versus LR(0) Items
\end_layout

\begin_layout Standard
The most visible difference between AH2002 and the standard Earley algorithm
 is in the form of the Earley items.
 In both AH2002 and the standard Earley algorithm, Earley items are duples.
 Also, in both AH2002 and the standard Earley algorithm, the second element
 of the duple is a parent pointer.
\end_layout

\begin_layout Standard
But the first element of the duple in AH2002 is not the same as in the standard
 Earley algorithm.
 In the standard Earley algorithm, the first element of the duple is an
 LR(0) 
\series bold
item
\series default
.
 In the AH2002 algorithm the first element of the Earley item is an AHFA
 
\series bold
state
\series default
.
 An AHFA state is a 
\series bold
set
\series default
 of one or more LR(0) items.
\end_layout

\begin_layout Standard
Working with sets of LR(0) items instead of single items makes the AH2002
 algorithm more complex, but it also makes it much more efficient in practice.
 Earley items in the Marpa algorithm are the same as in the AH2002 algorithm
 -- duples of AHFA state and parent pointer.
\end_layout

\begin_layout Section
Initializer
\end_layout

\begin_layout Standard
In both Marpa and AH2002, the Initializer accepts a token stream and a grammar
 as input.
 The token stream is an array of 
\family typewriter
n
\family default
 input tokens
\family roman
\series medium
\shape up
\size normal
\emph off
\bar no
\noun off
\color none
, 
\family typewriter
\series default
\shape default
\size default
\emph default
\bar default
\noun default
\color inherit
x[1], x[2], ..., x[n]
\family default
.
 
\end_layout

\begin_layout Standard
In each algorithm, the Recognizer Main Processing initializes 
\family typewriter
S[0]
\family default
 with Earley items of the form 
\family typewriter
(start_state, 0)
\family default
 where 
\family typewriter
start_state
\family default
 is one of the AHFA start states and the parent pointer points to Earley
 set 0.
 There will be one or two start states, depending on the grammar.
\end_layout

\begin_layout Subsection
Pure Functions of the Grammar
\end_layout

\begin_layout Standard
The Initializer creates the functions 
\family typewriter
COMPLETED
\family default
, 
\family typewriter
LHS
\family default
 and 
\family typewriter
GOTO
\family default
.
 These are pure functions which depend only on the grammar.
 These are partial functions with a finite and, in fact, very limited domain.
 Since these functions depend only on the grammar, they may be precomputed.
 Since they are only defined for a fixed list of arguments, these functions
 may be implemented as table lookups.
\end_layout

\begin_layout Standard

\family typewriter
LHS
\family default
 is a function from a rule to its left hand side symbol.
 
\family typewriter
LHS
\family default
 is defined for every rule of the grammar.
\end_layout

\begin_layout Standard

\family typewriter
COMPLETED
\family default
 is a function from an AHFA state to a list of the rules which have completed
 LR(0) items in that AHFA state.
 A completed LR(0) item is one with the dot at the end.
 The list of rules returned by 
\family typewriter
COMPLETED
\family default
 may be empty.
\end_layout

\begin_layout Standard

\family typewriter
GOTO
\family default
 is a transition function from an AHFA state and a symbol (which may be
 undefined) to another AHFA state.
 Most AHFA states only have transitions on a few of the grammar's symbols,
 so for many combinations of 
\family typewriter
state
\family default
 and 
\family typewriter
symbol
\family default
, the result of 
\family typewriter
GOTO(state, symbol)
\family default
 will be undefined.
\end_layout

\begin_layout Section
Recognizer
\end_layout

\begin_layout Standard
The Recognizer builds an array of 
\family typewriter
n+1
\family default
 Earley sets, 
\family typewriter
S[0], S[1], ..., S[SIZE(x)]
\family default
, where 
\family typewriter
SIZE(x)
\family default
 is the size of the token stream array, 
\family typewriter
x
\family default
.
 In both algorithms, the Earley sets need to be implemented so they can
 be treated as worklists.
\end_layout

\begin_layout Standard
In both Marpa and AH2002, the Recognizer Main Processing then does the Earley
 Set Processing.
 The Earley Set Processing is performed for each of the Earley sets from
 0 to 
\family typewriter
SIZE(x)
\family default
, in order.
\end_layout

\begin_layout Standard
\begin_inset Float algorithm
placement h
wide false
sideways false
status open

\begin_layout Plain Layout
\begin_inset listings
inline false
status open

\begin_layout Plain Layout

Recognizer Loop: for every (i) in (0 ..
 SIZE(x)) {
\end_layout

\begin_layout Plain Layout

    do Earley Set i Processing
\end_layout

\begin_layout Plain Layout

}
\end_layout

\end_inset


\end_layout

\begin_layout Plain Layout
\begin_inset Caption

\begin_layout Plain Layout
Recognizer
\end_layout

\end_inset


\end_layout

\end_inset


\end_layout

\begin_layout Section
The AH2002 Algorithm
\end_layout

\begin_layout Standard
In the proofs, it is convenient to refer to a 
\family typewriter
Scanner Loop
\family default
 and a 
\family typewriter
Completer Loop
\family default
 for both 
\family typewriter
AH2002
\family default
 and 
\family typewriter
Marpa
\family default
.
 
\family typewriter
Marpa
\family default
 has separate Scanner and Completer Loops defined in its Earley Set Processing
 pseudocode.
 For 
\family typewriter
AH2002
\family default
, both the 
\family typewriter
Scanner Loop
\family default
 and the 
\family typewriter
Completer Loop
\family default
 is defined as equivalent to the 
\family typewriter
Combined Loop
\family default
.
\end_layout

\begin_layout Standard

\noun on
Definition
\noun default
: 
\family typewriter
Scanner Loop @ AH2002 @ Earley Set i Processing
\family default
 is defined as 
\family typewriter
Combined Loop @ AH2002 @ Earley Set i Processing
\family default
.
\end_layout

\begin_layout Standard

\noun on
Definition
\noun default
: 
\family typewriter
Completer Loop @ AH2002 @ Earley Set i Processing
\family default
 is defined as 
\family typewriter
Combined Loop @ AH2002 @ Earley Set i Processing
\family default
.
\end_layout

\begin_layout Standard
\begin_inset Float algorithm
placement h
wide false
sideways false
status open

\begin_layout Plain Layout
\begin_inset listings
inline false
status open

\begin_layout Plain Layout

Input variable: i
\end_layout

\begin_layout Plain Layout

\end_layout

\begin_layout Plain Layout

Combined Loop: for every (state, parent) in S[i] {
\end_layout

\begin_layout Plain Layout

    do Scanner( state, parent )
\end_layout

\begin_layout Plain Layout

    if parent != i {
\end_layout

\begin_layout Plain Layout

       do Completer( state, parent )
\end_layout

\begin_layout Plain Layout

    }
\end_layout

\begin_layout Plain Layout

}
\end_layout

\end_inset


\end_layout

\begin_layout Plain Layout
\begin_inset Caption

\begin_layout Plain Layout
AH2002 Earley Set Processing
\end_layout

\end_inset


\end_layout

\end_inset


\end_layout

\begin_layout Section
The Marpa Algorithm
\end_layout

\begin_layout Standard
The 
\family typewriter
Marpa
\family default
 algorithm is the 
\family typewriter
AH2002
\family default
 algorithm 
\begin_inset Quotes eld
\end_inset

upside-down and inside-out
\begin_inset Quotes erd
\end_inset

.
 It pulls the scanner and the completer out of the main loop and gives each
 its own loop.
 And 
\family typewriter
Marpa 
\family default
reverses the order -- the completer is called first to process all items
 in an Earley set, including those items added by the completer itself.
 After the completer is finished, the scanner is called to process all items
 in the Earley set.
 The scanner never adds items to the current Earley set, so the current
 Earley set, 
\family typewriter
S[i]
\family default
 remains constant in 
\family typewriter
Scanner Loop
\family default
.
\end_layout

\begin_layout Standard
\begin_inset Float algorithm
placement h
wide false
sideways false
status open

\begin_layout Plain Layout
\begin_inset listings
inline false
status open

\begin_layout Plain Layout

Input variable: i
\end_layout

\begin_layout Plain Layout

\end_layout

\begin_layout Plain Layout

Completer Loop: for every (state, parent) in S[i] {
\end_layout

\begin_layout Plain Layout

    if parent != i {
\end_layout

\begin_layout Plain Layout

       do Completer(state, parent)
\end_layout

\begin_layout Plain Layout

    }
\end_layout

\begin_layout Plain Layout

} # End of Completer Loop
\end_layout

\begin_layout Plain Layout

\end_layout

\begin_layout Plain Layout

Scanner Loop: for every (state, parent) in S[i] {
\end_layout

\begin_layout Plain Layout

    do Scanner(state, parent)
\end_layout

\begin_layout Plain Layout

} # End of Scanner Loop
\end_layout

\end_inset


\end_layout

\begin_layout Plain Layout
\begin_inset Caption

\begin_layout Plain Layout
Marpa Earley Set Processing
\end_layout

\end_inset


\end_layout

\end_inset


\end_layout

\begin_layout Section
Scanner
\end_layout

\begin_layout Standard
The 
\family typewriter
Scanner
\family default
 takes three input variables.
 The first two are the elements of an Earley item: 
\family typewriter
state
\family default
 is an LR(0) state, and 
\family typewriter
parent
\family default
 is the index of an Earley set.
 The third input variable, i, is the index of the current Earley set.
\end_layout

\begin_layout Standard

\family typewriter
k
\family default
 and 
\family typewriter
nk
\family default
 are temporaries, local to the subroutine, which hold AHFA state values
 returned by 
\family typewriter
GOTO
\family default
.
 
\family typewriter
S
\family default
 is a global array containing the Earley sets.
 
\family typewriter
x
\family default
 is global array containing the tokens.
\end_layout

\begin_layout Standard
\begin_inset Float algorithm
placement h
wide false
sideways false
status open

\begin_layout Plain Layout
\begin_inset listings
inline false
status open

\begin_layout Plain Layout

Input variables: state, parent, i
\end_layout

\begin_layout Plain Layout

\end_layout

\begin_layout Plain Layout

set k = GOTO(state, x[i+1])
\end_layout

\begin_layout Plain Layout

if state k is not defined {
\end_layout

\begin_layout Plain Layout

    return
\end_layout

\begin_layout Plain Layout

}
\end_layout

\begin_layout Plain Layout

add (k, parent) to S[i+1] at end, if it is not already present
\end_layout

\begin_layout Plain Layout

set nk = GOTO(k, undefined)                                   
\end_layout

\begin_layout Plain Layout

if state nk is not defined {
\end_layout

\begin_layout Plain Layout

    return
\end_layout

\begin_layout Plain Layout

}
\end_layout

\begin_layout Plain Layout

add (nk, i+1) to S[i+1] at end, if it is not already present
\end_layout

\begin_layout Plain Layout

return
\end_layout

\end_inset


\end_layout

\begin_layout Plain Layout
\begin_inset Caption

\begin_layout Plain Layout
Scanner
\end_layout

\end_inset


\end_layout

\end_inset


\end_layout

\begin_layout Section
Completer
\end_layout

\begin_layout Standard
The 
\family typewriter
Completer
\family default
 takes three input variables.
 The first two are the elements of an Earley item: 
\family typewriter
state
\family default
 is an LR(0) state, and 
\family typewriter
parent
\family default
 is the index of an Earley set.
 The third input variable, i, is the index of the current Earley set.
\end_layout

\begin_layout Standard

\family typewriter
k
\family default
 and 
\family typewriter
nk
\family default
 are temporaries, local to the subroutine, which hold LR(0) state values
 returned by 
\family typewriter
GOTO
\family default
.
 
\family typewriter
rule
\family default
 is a rule, returned by 
\family typewriter
COMPLETED
\family default
.
 
\family typewriter
S
\family default
 is a global array containing the Earley sets.
\end_layout

\begin_layout Standard
\begin_inset Float algorithm
placement h
wide false
sideways false
status open

\begin_layout Plain Layout
\begin_inset listings
inline false
status open

\begin_layout Plain Layout

Input variables: state, parent, i
\end_layout

\begin_layout Plain Layout

\end_layout

\begin_layout Plain Layout

foreach rule in COMPLETED(state) {
\end_layout

\begin_layout Plain Layout

    foreach ( pstate, pparent ) in S[parent] {
\end_layout

\begin_layout Plain Layout

    set k = GOTO( pstate, LHS(rule) )
\end_layout

\begin_layout Plain Layout

    if k is defined {
\end_layout

\begin_layout Plain Layout

         add ( k, pparent ) to S[i] at end, if it is not already present
\end_layout

\begin_layout Plain Layout

         set nk = GOTO( k, undefined )
\end_layout

\begin_layout Plain Layout

         if nk is defined {
\end_layout

\begin_layout Plain Layout

              add ( nk, i ) to S[i] at end, if it is not already present
\end_layout

\begin_layout Plain Layout

         }
\end_layout

\begin_layout Plain Layout

     } # if k is defined
\end_layout

\begin_layout Plain Layout

}  # foreach rule
\end_layout

\end_inset


\end_layout

\begin_layout Plain Layout
\begin_inset Caption

\begin_layout Plain Layout
Completer
\end_layout

\end_inset


\end_layout

\end_inset


\end_layout

\begin_layout Section
The Proof: Short Version
\end_layout

\begin_layout Standard
After, Initialization, the only changes to globals are to the Earley sets,
 and these occur in the Earley Set Processing.
 The Processing of Earley set i can change S[i] (called the current Earley
 set) and S[i+1] (the next Earley set).
 The proof is organized around these facts.
\end_layout

\begin_layout Standard
First, two lemmas are proved showing that if, before each Earley Set Processing
 step, the Earley sets in both algorithms are identical, then, after the
 Earley Set Processing Step, the current Earley sets and the next Earley
 sets remain identical .
\end_layout

\begin_layout Subsection
The Current Earley Set Lemma
\end_layout

\begin_layout Standard

\noun on
Statement of the Lemma
\noun default
: Assume that, before each Earley set is processed, the Earley sets in AH2002
 are identical to the Earley sets in Marpa.
 Then, after Earley Set i Processing, S[i] in AH2002 is identical to S[i]
 in Marpa.
\end_layout

\begin_layout Standard

\noun on
Proof
\noun default
: In the Earley Set Processing, S[i] will only be changed by the Completer.
 The Completer can be treated as a pure deterministic functions whose inputs
 are its input variables (state, parent and i), the current Earley set itself
 (S[i]) and the pure functions of the grammar (GOTO, LHS and COMPLETED).
 GOTO, LHS, COMPLETED do not change after Initialization and the input variable
 i will not change throughout the Earley Set Processing, so they may treated
 as constants for this purpose.
\end_layout

\begin_layout Standard
The input variables (state, parent) and the current Earley set (S[i]) do
 change but, in both AH2002 and Marpa, the calls to the Completer will be
 with same values for these and in the same sequence.
 This can be shown by induction on the calls to the Completer.
\end_layout

\begin_layout Standard
There may be zero such calls, and if this is the case, S[i] after Earley
 Set i Processing will clearly be unchanged from S[i] before Earley Set
 i Processing.
 Assume there have been n calls to the completer.
 By the induction hypothesis, after these n calls, S[i] is identical in
 both algorithms.
 The input variables state and parent depend only on S[i], and on i.
 i was already shown to be constant during the Earley Set i Processing.
 S[i] is identical in both algorithms by the induction hypothesis.
 The Completer is the same in both algorithms.
 Therefore, since the Completer and its inputs are identical, and the Completer
 is deterministic, the results of the Completer will be the same in both
 algorithms.
\end_layout

\begin_layout Standard
The induction above assumed that there is a n+1'th call to the Completer
 in both algorithms.
 This actually depends on the domain for the Completer Loop.
 But this domain is also the current Earley set, S[i].
 So by the same induction as above, it is not possible there will be a n+1'th
 call to the Completer in one algorithm, without there being an n+1'th call
 to the Completer in the other algorithm.
 In other words, either there will be a n+1'st call to the Completer in
 both algorithms, or the n'th call to the Completer in both algorithms will
 have been the final one.
\end_layout

\begin_layout Standard
This completes the induction and shows that, after the Earley Set i Processing,
 the current Earley sets (S[i]) in both algorithms must be identical to
 each other.
 QED.
\end_layout

\begin_layout Subsection
The Next Earley Set Lemma
\end_layout

\begin_layout Standard

\noun on
Statement of the Lemma
\noun default
: Assume that, before each Earley set is processed, the Earley sets in AH2002
 are identical to the Earley sets in Marpa.
 Then, after Earley Set i Processing, S[i+1] in AH2002 is identical to S[i+1]
 in Marpa.
\end_layout

\begin_layout Standard

\noun on
Proof
\noun default
: First, recall the Current Earley Set Lemma.
 By it, we know that 
\series bold
after
\series default
 Earley Set i Processing, the 
\series bold
current
\series default
 Earley set (S[i]) is identical in both AH2002 and Marpa.
 We will make use of this fact below.
\end_layout

\begin_layout Standard
In the Earley Set Processing, S[i+1] will only be changed by the Scanner.
 The Scanner can be treated as a pure deterministic function whose inputs
 are its input variables (state, parent and i), the next Earley set itself
 (S[i+1]), the token stream (x) and a pure function of the grammar (GOTO).
 GOTO and x do not change after Initialization and the input variable i
 will not change throughout the Earley Set Processing, so they may treated
 as constants for this purpose.
\end_layout

\begin_layout Standard
The input variables (state, parent) and the next Earley set (S[i+1]) do
 change but, in both AH2002 and Marpa, the calls to the Scanner will be
 with same values for these, will be made in the same sequence, and will
 have the same results.
 This can be shown by induction on the calls to the Scanner.
\end_layout

\begin_layout Standard
There may be zero such calls, and if this is the case, S[i+1] after Earley
 Set i Processing will clearly be unchanged from S[i+1] before Earley Set
 i Processing.
 Assume there have been n calls to the Scanner.
 By the induction hypothesis, after these n calls, S[i+1] is identical in
 both algorithms.
\end_layout

\begin_layout Standard
The input variables state and parent are the n+1'st work item in the current
 Earley set S[i].
 Call this work item S[i][n+1].
 As a work item, once S[i][n+1] is added to a work list it is never deleted
 and never changes.
 Since by the Current Earley Set Lemma, S[i] in AH2002 will be identical
 to S[i] in Marpa after Earley Set i Processing, if S[i][n+1] exists in
 both algorithms it must be identical.
\end_layout

\begin_layout Standard
S[i+1] is identical in both algorithms by the induction hypothesis.
 The Scanner is the same in both algorithms.
 Therefore, since the Scanner and its inputs are identical, and the Scanner
 is deterministic, the results of the Scanner will be the same in both algorithm
s.
\end_layout

\begin_layout Standard
The induction above assumed that there is a n+1'th call to the Scanner in
 both algorithms.
 This actually depends on the domain for the Scanner Loop.
 This domain is the current Earley set, S[i].
\end_layout

\begin_layout Standard
But what if there was an S[i][n+1] in AH2002 but no S[i][n+1] in Marpa,
 or vice versa? We can show this won't happen, by a reduction to absurdity.
 Suppose, just after the n'th call to the Scanner, S[i][n+1] exists in AH2002,
 but not in Marpa.
 By the Current Earley Set Lemma S[i] in both algorithms is identical after
 the Earley Set i Processing.
 But if S[i][n+1] does not exist for the Scanner Loop in Marpa, examination
 of the pseudocode will show that S[i][n+1] will not exist in Marpa at the
 End of Earley Set i Processing either.
 On the other hand, if S[i][n+1] exists at the End of Earley Set i Processing
 in AH2002, it will still exist at the End of Earley Set i Processing in
 AH2002.
 This means that S[i] at the End of Earley Set i Processing in AH2002 will
 not be the same as S[i] at the End of Earley Set i Processing in Marpa.
\end_layout

\begin_layout Standard
But, by the Current Earley Set Lemma, at the End of Earley Set i Processing,
 S[i] in each of the two algorithms must be identical to S[i] in the other.
 This means that the assumption for the reduction to absurdity in the paragraph
 above must be false.
 For the reduction to absurdity, we assumed that, just after the n'th call
 to the Scanner, there was an S[i][n+1] in AH2002, but not in Marpa.
 Therefore if, just after the n'th call to the Scanner, there is an S[i][n+1]
 in the AH2002, there must also be one in Marpa.
\end_layout

\begin_layout Standard
The same reduction to absurdity works in reverse, and shows the if, just
 after the n'th call to the Scanner, there is an S[i][n+1] in Marpa, there
 must also be one in AH2002.
 Putting the two together, just after the n'th call to the Scanner, there
 is an S[i][n+1] in AH2002 if and only if there is an S[i][n+1] in Marpa.
\end_layout

\begin_layout Standard
This means that it is not possible there will be a n+1'th call to the Scanner
 in one algorithm, without there being an n+1'st call to the Scanner in
 the other algorithm.
 In other words, either there will be a n+1'st call to the Scanner in both
 algorithms, or the n'th call to the Scanner in both algorithms will have
 been the final one.
\end_layout

\begin_layout Standard
This completes the induction and shows that, after the Earley Set i Processing,
 the next Earley sets (S[i+1]) in both algorithms must be identical to each
 other.
 QED.
\end_layout

\begin_layout Subsection
Showing Equivalence
\end_layout

\begin_layout Standard
We say that AH2002 and Marpa are equivalent if, when they are run on the
 same token streams and grammars, they produce the same Earley sets.
 Given the Current Earley Set Lemma and the Next Earley Set Lemma, the proof
 follows quickly.
 The proof is by induction on the Earley Set Processing.
\end_layout

\begin_layout Standard
Initialization is the same in AH2002 and Marpa, and the input to the algorithms
 (their tokens streams and grammars) are assumed to be identical.
 Initialization is deterministic, so before the Processing of Earley Set
 0, all globals are identical.
\end_layout

\begin_layout Standard
Suppose before the Earley Set n Processing, all globals are identical.
 We know from examination of the pseudocode that, during Earley Set n Processing
, the only globals which change are S[n] and S[n+1].
\end_layout

\begin_layout Standard
We have from the induction hypothesis that, before Earley Set n Processing,
 S[n] in AH2002 is identical is S[n] in Marpa.
 From this and the Current Earley Set Lemma, we can conclude that, after
 Earley Set n Processing, S[n] in each of the algorithms will be identical
 to S[n] in the other algorithm.
\end_layout

\begin_layout Standard
We have from the induction hypothesis that, before Earley Set n Processing,
 S[n+1] in AH2002 is identical is S[n+1] in Marpa.
 From this, the Current Earley Set Lemma and the Next Earley Set Lemma,
 we can conclude that, after Earley Set n Processing, S[n+1] in each of
 the algorithms will be identical to S[n+1] in the other algorithm.
\end_layout

\begin_layout Standard
Above we have that:
\end_layout

\begin_layout Itemize
All globals remains constant during Earley Set Processing except S[n] and
 S[n+1].
\end_layout

\begin_layout Itemize
After Earley Set n Processing, S[n] in each of the algorithms will be identical
 to S[n] in the other algorithm.
\end_layout

\begin_layout Itemize
After Earley Set n Processing, S[n+1] in each of the algorithms will be
 identical to S[n+1] in the other algorithm.
\end_layout

\begin_layout Standard
Combining these three, we have that if, after Earley Set n Processing, all
 globals in AH2002 will be identical to the corresponding global in Marpa.
 The point after Earley Set n Processing is the same as the point before
 Earley Set n+1 Processing.
 This proves the induction.
\end_layout

\begin_layout Standard
The Recognizer will process l+1 Earley sets, where l is the length of the
 token stream.
 From the induction, we know that, after the processing of the l+1 Earley
 sets, the Earley sets in Marpa will be identical to those in AH2002.
 This is the equivalence we were after.
\end_layout

\begin_layout Standard
QED.
\end_layout

\begin_layout Section
Why a Detailed Proof?
\end_layout

\begin_layout Standard
The rest of this paper repeats the  proof already given, more carefully
 and in more detail.
 It describes some new notations, useful for detailed proofs of this kind.
 Readers, if satisfied with what they have seen so far, will not necessarily
 want to read any further.
\end_layout

\begin_layout Standard
The form of the short proof above is similar to that of the correctness
 proofs as they usually appear in journals.
 Proofs in this form are problematic.
 English-language descriptions of program locations, even when carefully
 written, are verbose.
 Even so, these descriptions are always in danger of being ambiguous.
\end_layout

\begin_layout Standard
More importantly, the reader can not easily confirm that all crucial details
 are being established.
 For example, in this proof, the existence or non-existence of loop passes
 and worklist items at certain points is as crucial as any issue in the
 proof, but very easy to overlook.
\end_layout

\begin_layout Standard
Short, open-form, English-language proofs of this kind, while they dominate
 the academic literature, are error-prone and hard to read.
 In the journals, perhaps reasons of space force their acceptance.
 Page-count in academic journals is a scarce resource, inelastic to the
 demands of academics.
 Detailed proofs would perhaps double the size of the corresponding articles.
 Doubling the page count means halving the number of articles published.
 Everyone in the field realizes that that one or more of the articles left
 unpublished might well be his own.
\end_layout

\begin_layout Standard
Web space is not nearly so expensive as space in printed academic journals.
 The rest of this paper contains the detailed working out of the correctness
 proof just given in short form.
 The short proof was written after, and based on, the detailed proof.
\end_layout

\begin_layout Section
Notation for the Detailed Proof
\end_layout

\begin_layout Subsection
Grafs
\end_layout

\begin_layout Standard
Most detailed proofs are in table form, with one equation per line.
 Equations are right aligned and their explanations left aligned.
 Program correctness proofs require a degree of explanation which easily
 overwhelms the one-line-per-equation format.
\end_layout

\begin_layout Standard
This proof is organized with one equation per paragraph, usually abbreviated
 
\begin_inset Quotes eld
\end_inset

graf
\begin_inset Quotes erd
\end_inset

.
 The first sentence is an equation, and the others justify its derivation
 or provide explanations.
\end_layout

\begin_layout Subsection
Time-Sets
\end_layout

\begin_layout Standard
This proof I introduce time-sets notation.
 Time-set notation can be looked at as an abbreviation of English, or as
 a semi-formalized, notation.
 I hope it is an improvement in precision.
 I would also claim it is more readable than its expansion into ordinary
 English would be.
\end_layout

\begin_layout Standard
A key to this notation is that the idea of sequence is an afterthought in
 time-set notation.
 Time-sets are unordered sets of point in the execution -- time-points.
 Statements about their sequence -- which time-point is before what other
 time-point, can obviously be important in analyzing programs.
 But I think premature introduction of the sequencing into the semantics
 is counterproductive.
\end_layout

\begin_layout Standard
The notation for locations in the code uses sets of time-points.
 The points in time, or time-points, are points in the processing -- not
 locations in the pseudocode.
 Every operation of any kind separates time-points, so many time-points
 can correspond to the same pseudo-code location.
 Additionally, if a lexical location in the pseudocode is passed through
 several times, as it would be in a loop, each pass creates a new set of
 time-points for that location.
\end_layout

\begin_layout Standard
For time-points to be separate, some operation must happen between them.
 This means that sometimes the same time-point can be identified with two
 different lexical locations in the pseudocode, and have two different names.
 This occurs frequently in loops.
 More on this below.
\end_layout

\begin_layout Subsection
Names of Sets
\end_layout

\begin_layout Subsubsection
Algorithms
\end_layout

\begin_layout Standard
Each algorithm's name is used for the time-set of all the time-points in
 that algorithm.
 So AH2002 is the name of the time-set of all the time-points in the execution
 of the AH2002 algorithm.
 Marpa is the name of the time-set of all the time-points in the execution
 of the Marpa algorithm.
 Start of Marpa is the time point before any operations in the Marpa algorithm.
 End of Marpa is the time-point after any operations.
\end_layout

\begin_layout Subsubsection
Pseudocode Sections
\end_layout

\begin_layout Standard
Each pseudocode section's name is used for the time-set of all time-points
 in that pseudo-code section, including all time-points in either algorithm.
 Recognizer is the name of the time-set of all time-points in the Recognizer,
 whether for AH2002 or Marpa.
 When it is useful to distinguish the @-notation can be used: 
\family typewriter
Recognizer @ AH2002
\family default
 or 
\family typewriter
Recognizer @ Marpa
\family default
.
\end_layout

\begin_layout Standard

\family typewriter
Start of Recognizer
\family default
 is the set of the time-points before any operation in the Recognizers,
 including both algorithms.
 
\family typewriter
End of Recognizer @ Marpa
\family default
 is the singleton set containing the time-point after all operations in
 the Marpa algorithm.
\end_layout

\begin_layout Standard
Pseudocode sections can be called multiple times.
 For example, 
\family typewriter
Scanner @ Marpa
\family default
 refers to all time-points in any call of 
\family typewriter
Scanner
\family default
 in the 
\family typewriter
Marpa
\family default
 algorithm.
\end_layout

\begin_layout Standard
Where it is useful to specifically distinguish one call to a section, this
 can be done in the time-set's name.
 For example, the call the 
\family typewriter
Earley Set Processing
\family default
 in the 
\family typewriter
AH2002
\family default
 algorithm for Earley Set 
\family typewriter
s
\family default
 is 
\family typewriter
Earley Set s Processing @ AH2002
\family default
.
\end_layout

\begin_layout Subsection
Loops
\end_layout

\begin_layout Standard
Loops have not only a 
\family typewriter
Start
\family default
 and 
\family typewriter
End
\family default
, but a 
\family typewriter
Top
\family default
 and a 
\family typewriter
Bottom
\family default
.
 The 
\family typewriter
Start of Loop L
\family default
 is before execution of any instructions in the loop.
 The 
\family typewriter
End of Loop L
\family default
 is after execution of any instructions.
 The 
\family typewriter
Top of Loop L(c)
\family default
 is before the execution of pass 
\family typewriter
c
\family default
 through 
\family typewriter
Loop L
\family default
.
 The 
\family typewriter
Bottom of Loop L(c)
\family default
 is after the execution of pass 
\family typewriter
c 
\family default
through Loop 
\family typewriter
L
\family default
.
\end_layout

\begin_layout Standard
Passes through loops are numbered starting at 0.
 The final test which terminates a loop is considered a pass, but a pass
 which does not execute any instructions inside the loop.
 A loop over the numbers from 0 to 10 will have 12 passes, with pass 0 being
 the first pass, and pass 11 being the last pass.
 Pass 0 will have the loop variable with the value 0.
 In pass n, the value of the loop variable will be n, unless n is the last
 pass.
 In the last pass, only the test is performed -- the fails, none of the
 instructions inside the loop are executed, and the loop terminates.
 Where z is the last pass, there is still considered to be a both 
\family typewriter
Top of Loop(z)
\family default
 and 
\family typewriter
Bottom of Loop(z)
\family default
.
\end_layout

\begin_layout Standard

\noun on
Definition
\noun default
: Let L be the name of a loop.
 
\family typewriter
DOMAIN(L)
\family default
 is the array over which loop 
\family typewriter
L
\family default
 iterates.
\end_layout

\begin_layout Standard
In this definition, it is important that L is not a timeset -- it is the
 name of a loop.
 Also, it is important to note the value of 
\family typewriter
DOMAIN
\family default
 can change if the loop is over a worklist.
\end_layout

\begin_layout Standard

\noun on
Definition
\noun default
: Let L be the name of a loop.
 Then,
\family typewriter
 PASS(p, L)
\family default
 
\begin_inset Formula $\Leftrightarrow$
\end_inset

 
\family typewriter
0 <= p <= SIZE(DOMAIN(L) @ End of L))
\family default
.
\end_layout

\begin_layout Standard
Informally, 
\family typewriter
PASS(p,L)
\family default
 is true if 
\family typewriter
p
\family default
 is the number of a valid pass of 
\family typewriter
L
\family default
.
 If 
\family typewriter
D == DOMAIN(L)
\family default
, then the loop variable for pass 
\family typewriter
p
\family default
 is 
\family typewriter
D[p]
\family default
.
\end_layout

\begin_layout Standard
For any loop 
\family typewriter
L, PASS(p, L)
\family default
 
\begin_inset Formula $\wedge$
\end_inset

 
\family typewriter
PASS(p+1, L)
\family default
 
\begin_inset Formula $\Leftrightarrow$
\end_inset

 
\family typewriter
(Bottom(p) of L == Top(p+1) of L
\family default
).
\end_layout

\begin_layout Standard

\noun on
Definition
\noun default
:
\family typewriter
 FINAL(L) == SIZE( DOMAIN(L) @ End of L )
\end_layout

\begin_layout Subsection
Operations on Time-Sets
\end_layout

\begin_layout Standard
Subset: 
\family typewriter
S1 
\begin_inset Formula $\subseteq$
\end_inset

 S2
\end_layout

\begin_layout Standard
The notation remains semi-formal.
 @-notation is often written so that it's intended to be distributed over
 other notation.
 For example:
\end_layout

\begin_layout Standard

\family typewriter
(SIZE(x @ A) == SIZE(x @ B)) @ P
\end_layout

\begin_layout Standard
is an abbreviation for
\end_layout

\begin_layout Standard

\family typewriter
SIZE(x @ A @ P) == SIZE(x @ B @ P)
\end_layout

\begin_layout Standard
The general idea is that in 
\family typewriter
(expression) @ P
\family default
, where 
\family typewriter
expression
\family default
 is not a term or another time-set, 
\family typewriter
P 
\family default
should be 
\begin_inset Quotes eld
\end_inset

pulled inside
\begin_inset Quotes erd
\end_inset

 
\family typewriter
expression
\family default
.
 Rules for this are not explicitly formalized, but use of this syntax is
 limited to simple cases where it should be clear what is intended -- certainly
 far more clear than the usual English language description would be.
\end_layout

\begin_layout Section
Work List Lemma
\end_layout

\begin_layout Standard
Statement of the Lemma:
\end_layout

\begin_layout Standard
Let 
\family typewriter
L
\family default
 be a work list iteration.
\end_layout

\begin_layout Standard
Let 
\family typewriter
W
\family default
 be a work list.
\end_layout

\begin_layout Standard
Then 
\family typewriter
0 <= j <= SIZE( W @ End of L )
\begin_inset Formula $\Rightarrow$
\end_inset

PASS( j, L )
\end_layout

\begin_layout Standard
And 
\family typewriter
FINAL( SIZE( W @ End of L ), L)
\end_layout

\begin_layout Standard

\noun on
Proof
\noun default
: The proof is from the definitions of loop and work list.
 
\family typewriter
L
\family default
 is defined as being an iteration over every item in 
\family typewriter
W
\family default
, even those added during 
\family typewriter
L
\family default
, so that any item added to 
\family typewriter
W
\family default
 before 
\family typewriter
End of L
\family default
 must be included in the iteration.
\end_layout

\begin_layout Standard
Only one operation is allowed on 
\family typewriter
W
\family default
 -- adding items to the end.
 This operation does not move or delete any items, so no item in 
\family typewriter
W
\family default
 will ever be moved or deleted.
 That means any iterated item will be present in 
\family typewriter
W @( End of L)
\family default
, and in the same position as when it was iterated.
 Therefore, if there is a 
\family typewriter
W[j]@(End of L)
\family default
, the loop variable for pass j will be 
\family typewriter
W[j]
\family default
.
\end_layout

\begin_layout Standard

\noun on
QED.
\end_layout

\begin_layout Section
Globals
\end_layout

\begin_layout Subsection
The Globals Assumption
\end_layout

\begin_layout Standard
For the purpose of the proof in this paper, it is assumed that AH2002 and
 Marpa are initialized with the same input token stream and grammar.
\end_layout

\begin_layout Subsection
The Globals Lemmas
\end_layout

\begin_layout Standard
Statement of Lemma 1: Let G be a global.
\end_layout

\begin_layout Standard
Then G @ Start of Recognizer @ AH2002 == G @ Start of Recognizer @ Marpa.
\end_layout

\begin_layout Standard
Proof: From the definitions, we know that the Initialization code is shared
 by the two algorithms.
 From the Globals Assumption, we have that the results of Initialization
 depends only on the input tokens and the grammar.
\end_layout

\begin_layout Standard
QED.
\end_layout

\begin_layout Standard

\noun on
Statement of Lemma
\noun default
 2: Let G be global not in the Earley sets.
\end_layout

\begin_layout Standard
Let P be a time-set, P 
\begin_inset Formula $\subseteq$
\end_inset

Recognizer.
\end_layout

\begin_layout Standard
Let Q be a time-set, P 
\begin_inset Formula $\subseteq$
\end_inset

Recognizer.
\end_layout

\begin_layout Standard
Then For all G, where G is a global not in the Earley sets, G @ P == G @
 Q.
\end_layout

\begin_layout Standard
Proof: The pseudocode and the definition of the Recognizer show that the
 only globals that change in the Recognizer are those in the Earley Sets.
\end_layout

\begin_layout Standard

\noun on
Statement of Lemma 3
\noun default
: The Earley sets are changed only in the the following ways:
\end_layout

\begin_layout Itemize
S[0] is changed in Initialization;
\end_layout

\begin_layout Itemize
S[i] is changed in Completer @ Earley Set i Processing; and
\end_layout

\begin_layout Itemize
S[i+1] is changed in Scanner @ Earley Set i Processing.
\end_layout

\begin_layout Standard

\noun on
Proof
\noun default
: From the pseudocode and the definition of Initialization.
\end_layout

\begin_layout Standard
QED
\end_layout

\begin_layout Section
The Detailed Proof
\end_layout

\begin_layout Subsection
AH2002 Halting Axiom
\end_layout

\begin_layout Standard
In this paper, AH2002 will be assumed to halt for all finite grammars and
 input token streams.
 AH2002 in their paper do not explicitly prove that this is true in general,
 but it is at the very least a safe conjecture.
\end_layout

\begin_layout Subsection
The Completer Lemma
\end_layout

\begin_layout Standard
In this lemma A may be either Marpa or AH2002.
\end_layout

\begin_layout Standard

\noun on
Statement of the Lemma
\noun default
:
\end_layout

\begin_layout Standard
Assume p is a pass in the Completer Loop.
\end_layout

\begin_layout Standard
Assume that 
\family typewriter
((state, parent, i) @ AH2002 == (state, parent, i) @ Marpa) @ Top of Completer
 Loop(p)
\family default
.
\end_layout

\begin_layout Standard
Assume 
\family typewriter
(S[i] @ A == S[i] @ B) @ Top of Completer Loop(p)
\family default
.
\end_layout

\begin_layout Standard
Then 
\family typewriter
(S[i] @ A == S[i] @ B) @ Bottom of Completer Loop(p)
\family default
.
\end_layout

\begin_layout Standard

\noun on
Starting the proof
\noun default
:
\end_layout

\begin_layout Standard

\family typewriter
0 <= p <= SIZE(S[i] @ End of Completer Loop)
\family default
.
 By the Loop Range Property.
\end_layout

\begin_layout Standard
The proof proceeds in two cases -- the Final Case 
\family typewriter
(p == SIZE(S[i] @ End of Completer Loop))
\family default
 and the Kernel Case 
\family typewriter
(0 <= p < SIZE(S[i] @ End of Completer Loop))
\family default
.
\end_layout

\begin_layout Standard

\noun on
Kernel case
\noun default
:
\end_layout

\begin_layout Standard

\family typewriter
((state, parent, i) @ AH2002 == (state, parent, i) @ Marpa) @ Top of Completer
 Loop(p)
\family default
.
 By assumption in the statement of the lemma.
\end_layout

\begin_layout Standard

\family typewriter
((state, parent, i) @ AH2002 == (state, parent, i) @ Marpa) @ Top of Completer
 @ Completer Loop(p)
\family default
.
 From the statement just previous and inspection of the psuedocode.
 These variables are never changed between 
\family typewriter
Top of Completer Loop(p)
\family default
 and 
\family typewriter
Top of Completer @ Completer Loop(p)
\family default
.
\end_layout

\begin_layout Standard
For 
\family typewriter
G
\family default
 in LHS, GOTO, COMPLETED, 
\family typewriter
(G @ Marpa == G @ AH2002) @ Top of Completer @ Completer Loop(p)
\family default
.
 By the Globals Lemma.
\end_layout

\begin_layout Standard

\family typewriter
(S[i] @ AH2002 == S[i] @ Marpa) @ Start of Completer @ Completer Loop(p
\family default
).
 By assumption in the statement of the lemma.
\end_layout

\begin_layout Standard

\family typewriter
(S[i] @ AH2002 == S[i] @ Marpa) @ End of Completer @ Completer Loop(p)
\family default
.
 The Completer pseudocode shows that S[i] @ End of Completer is a deterministic
 function of S[i] itself, the Completer's input variables and these globals:
 LHS, GOTO, COMPLETED.
 By the previous 3 grafs, these are the same in both algorithms.
 The Completer is also the same in both algorithms.
 Therefore the effects of the Completer must be the same in both algorithms.
\end_layout

\begin_layout Standard

\family typewriter
(S[i] @ AH2002 == S[i+f] @ Marpa) @ Bottom of Completer Loop(p)
\family default
.
 From the pseudocode.
 The current Earley set is never changed between 
\family typewriter
Bottom of Completer @ Completer Loop(p)
\family default
 and 
\family typewriter
Bottom of Completer Loop(p)
\family default
.
\end_layout

\begin_layout Standard

\noun on
This proves the Kernel case
\noun default
.
\end_layout

\begin_layout Standard

\noun on
Final case
\noun default
:
\end_layout

\begin_layout Standard

\family typewriter
FINAL(p, Completer Loop)
\family default
.
\end_layout

\begin_layout Standard

\family typewriter
(S[i] @ Marpa == S[i] @ AH2002) @ Bottom of Completer Loop(p)
\family default
.
 From the previous graf and an assumption in the statement of this lemma.
 By the definition of the final pass, it is a no-op.
 The Completer is not called and 
\family typewriter
S[i]
\family default
 remains to its value at the 
\family typewriter
Top Completer Loop(p)
\family default
.
\end_layout

\begin_layout Standard

\noun on
This proves the final case
\noun default
.
\end_layout

\begin_layout Standard
With the two cases proved, the Lemma is shown.
\end_layout

\begin_layout Standard
QED.
\end_layout

\begin_layout Subsection
The Completer Loop Lemma
\end_layout

\begin_layout Standard
When a task property Lemma is used in this Lemma as a justification, and
 no task is mentioned, the task is the Completer pseudocode, the loop is
 the Completer Loop and the offset is 1, and the Completer Task Lemma is
 to be considered part of the justification.
\end_layout

\begin_layout Standard

\noun on
Statement of the Lemma
\noun default
:
\end_layout

\begin_layout Standard
Let 
\family typewriter
p
\family default
 be a pass in 
\family typewriter
Completer Loop @ Earley Set i Processing
\family default
.
\end_layout

\begin_layout Standard
Assume for all globals G not in the Earley sets, 
\family typewriter
(G @ Marpa == G @ AH2002) @ Earley Set i Processing)
\family default
.
\end_layout

\begin_layout Standard
Assume 
\family typewriter
(i @ Marpa == i @ AH2002) @ Earley Set Processing
\family default
.
\end_layout

\begin_layout Standard
Assume 
\family typewriter
(S[i] @ Marpa == S[i] @ AH2002) @ Top(p) of Completer Loop
\family default
(p).
\end_layout

\begin_layout Standard
Then 
\family typewriter
(S[i] @ Marpa == S[i] @ AH2002) @ Bottom of Completer Loop(p)
\family default
.
\end_layout

\begin_layout Standard

\noun on
Proof:
\end_layout

\begin_layout Standard

\family typewriter
(S[i][p] @ AH2002 == S[i][p] @ Marpa) @ Top of Completer Loop(p)
\family default
.
 By assumptions in for the lemma.
\end_layout

\begin_layout Standard

\family typewriter
((state, parent) @ AH2002 == (state, parent) @ Marpa) @ Top of Completer
 Loop(p)
\family default
.
 From the statement just previous.
 
\family typewriter
(state, parent) == S[i][p]
\family default
 because both are the loop variable.
\end_layout

\begin_layout Standard

\family typewriter
(i @ AH2002 == i @ Marpa) @ Top of Completer Loop (p)
\family default
.
 From an assumption for this lemma.
\end_layout

\begin_layout Standard

\family typewriter
(S[i] @ AH2002 == S[i] @ Marpa) @ Top of Completer Loop(p)
\family default
.
 An assumption for this lemma.
\end_layout

\begin_layout Standard

\family typewriter
(S[i] @ AH2002 == S[i] @ Marpa) @ Bottom of Completer Loop(p)
\family default
.
 By the Completer Lemma with 
\family typewriter
p
\family default
 set to its value in this lemma, and by the three previous grafs.
\end_layout

\begin_layout Standard

\noun on
This proves the lemma.
\end_layout

\begin_layout Subsection
The Current Earley Set Lemma
\end_layout

\begin_layout Standard

\noun on
Statement of the Lemma
\noun default
:
\end_layout

\begin_layout Standard
Assume 
\family typewriter
0 <= i <= SIZE(x)
\family default
.
\end_layout

\begin_layout Standard
Assume 
\family typewriter
(S @ AH2002 == S @ Marpa) @ Start of Earley Set i Processing
\family default
.
\end_layout

\begin_layout Standard
Then, 
\family typewriter
(S[i] @ AH2002 == S[i] @ Marpa) @ End of Earley Set i Processing
\family default
.
\end_layout

\begin_layout Standard

\noun on
Starting the Proof
\noun default
:
\end_layout

\begin_layout Standard

\noun on
Sublemma
\noun default
:
\end_layout

\begin_layout Standard
Assume 
\family typewriter
(S[i] @ AH2002 == S[i] @ Marpa) @ Top of Completer Loop(0)
\family default
.
\end_layout

\begin_layout Standard
Assume 
\family typewriter
p <= SIZE(S[i]) @ AH2002 @ Bottom of Completer Loop(p)
\family default
 or 
\family typewriter
p <= SIZE(S[i]) @ Marpa @ Bottom of Completer Loop(p)
\family default
.
\end_layout

\begin_layout Standard
Then 
\family typewriter
(S[i] @ AH2002 == S[i] @ Marpa) @ Bottom of Completer Loop(p)
\family default
.
\end_layout

\begin_layout Standard

\noun on
Note
\noun default
: In this sublemma, it is assumed in the statement of the sublemma that
 pass p exists for one algorithm, but not for the other.
 Loop operations are not defined for non-existent passes, and it must be
 proved that pass p exists for both AH2002 and Marpa.
\end_layout

\begin_layout Standard

\noun on
Starting the proof
\noun default
:
\end_layout

\begin_layout Standard

\noun on
Induction Property
\noun default
: 
\family typewriter
(S[i] @ A == S[i] @ Marpa) @ Bottom of Completer Loop(p)
\family default
, where the induction is on 
\family typewriter
p
\family default
.
\end_layout

\begin_layout Standard

\noun on
Induction
\noun default
 
\noun on
Basis
\noun default
: 
\family typewriter
(S[i] @ A == S[i] @ Marpa) @ Bottom of Completer Loop(0)
\family default
.
 By the assumptions in the statement of the sublemma and the Completer Loop
 Lemma with 
\family typewriter
p := 0
\family default
.
 Pass 0 exists for every loop, by definition.
\end_layout

\begin_layout Standard

\noun on
This proves the induction basis.
\end_layout

\begin_layout Standard

\noun on
Induction hypothesis
\noun default
:
\end_layout

\begin_layout Standard
Assume 
\family typewriter
(S[i] @ AH2002 == S[i] @ Marpa) @ Bottom of Completer Loop(n)
\family default
.
\end_layout

\begin_layout Standard

\noun on
Inductive step
\noun default
: 
\end_layout

\begin_layout Standard
Note: Next it must be shown that 
\family typewriter
n+1
\family default
 is a pass in both AH2002 and Marpa.
 
\end_layout

\begin_layout Standard

\noun on
Subsublemma
\noun default
: Let 
\family typewriter
(A,B)
\family default
 be either 
\family typewriter
(AH2002, Marpa)
\family default
 or 
\family typewriter
(Marpa, AH2002)
\family default
.
\end_layout

\begin_layout Standard
Assume 
\family typewriter
PASS(n+1, Completer Loop @ A)
\family default
.
\end_layout

\begin_layout Standard
Then 
\family typewriter
PASS(n+1, Completer Loop @ B)
\family default
.
\end_layout

\begin_layout Standard
Starting the Proof:
\end_layout

\begin_layout Standard

\family typewriter
0 <= n < SIZE(DOMAIN(Completer Loop) @ Bottom of Completer Loop(n) @ A))
\family default
.
 By the assumption for the sublemma, and the definition of PASS.
\end_layout

\begin_layout Standard

\family typewriter
DOMAIN(Completer Loop)
\family default
 is 
\family typewriter
S[i].

\family default
 By the definition of 
\family typewriter
DOMAIN
\family default
.
\end_layout

\begin_layout Standard

\family typewriter
0 <= n+1 <= SIZE(S[i] @ Bottom of Completer Loop(n) @ A)
\family default
.
 By the previous 2 grafs.
 [ 
\noun on
Current Set Size Graf
\noun default
.
 ]
\end_layout

\begin_layout Standard

\family typewriter
(S[i] @ AH2002 == S[i] @ Marpa) @ Bottom of Completer Loop(n).

\family default
 By the Induction Hypothesis.
 The assumption for the case is that f == 0.
\end_layout

\begin_layout Standard

\family typewriter
(S[i] @ A == S[i] @ B) @ Bottom of Completer Loop(n).

\family default
 By the previous, substituting A and B, and transposing the equality if
 necessary.
\end_layout

\begin_layout Standard

\family typewriter
0 < n+1 <= SIZE(S[i] @ Bottom of Completer Loop(n) @ B)
\family default
.
 By the previous graf and Current Set Size Graf.
\end_layout

\begin_layout Standard

\family typewriter
0 < n+1 <= SIZE(DOMAIN(Completer Loop) @ Bottom of Completer Loop(n) @ A))
\family default
.
 By the previous graf and the definition of 
\family typewriter
DOMAIN
\family default
.
 The domain of the Completer Loop is 
\family typewriter
S[i]
\family default
.
\end_layout

\begin_layout Standard

\family typewriter
PASS(n+1, Completer Loop @ B)
\family default
.
 By the previous graf and the definition of PASS.
\end_layout

\begin_layout Standard

\noun on
This proves the subsublemma
\noun default
.
\end_layout

\begin_layout Standard

\family typewriter
PASS(n+1, Completer Loop @ AH2002) 
\begin_inset Formula $\Rightarrow$
\end_inset

 PASS(n+1, Completer Loop @ Marpa)
\family default
.
 By the sublemma, with 
\family typewriter
(A, B) := (AH2002, Marpa)
\family default
.
\end_layout

\begin_layout Standard

\family typewriter
PASS(n+1, Completer Loop @ Marpa) 
\begin_inset Formula $\Rightarrow$
\end_inset

 PASS(n+1, Completer Loop @ AH2002)
\family default
.
 By the sublemma, with 
\family typewriter
(A, B) := (Marpa, AH2002)
\family default
.
\end_layout

\begin_layout Standard
Pass Equivalence Graf: 
\family typewriter
PASS(n+1, Completer Loop @ AH2002) 
\begin_inset Formula $\Leftrightarrow$
\end_inset

PASS(n+1, Completer Loop @ Marpa)
\family default
.
 By the previous two grafs.
\end_layout

\begin_layout Standard

\family typewriter
PASS(n+1, Completer Loop @ AH2002)
\family default
.
 By the assumption for Sublemma 1 and the Pass Equivalence Graf.
\end_layout

\begin_layout Standard

\family typewriter
PASS(n+1, Completer Loop @ Marpa)
\family default
.
 By the assumption for Sublemma 1 and the Pass Equivalence Graf.
\end_layout

\begin_layout Standard

\family typewriter
(S[i+f] @ AH2002 == S[i+f] @ Marpa) @ Bottom of Completer Loop(p)
\family default
.
 By the previous 2 grafs and the Earley Item Induction Property
\end_layout

\begin_layout Standard

\noun on
This proves the induction
\noun default
.
\end_layout

\begin_layout Standard

\noun on
This proves the sublemma
\noun default
.
\end_layout

\begin_layout Standard

\noun on
Note
\noun default
: The sublemma does most of the work of this Lemma.
 In statements that follow, I show 
\family typewriter
(S[i] @ AH2002 == S[i] @ Marpa)
\family default
 holds as we proceed through the Earley Set Processing.
 To do this, I use the sublemma for the portion of the Earley Set Processing
 where 
\family typewriter
S[i]
\family default
 might change.
 Where 
\family typewriter
S[i]
\family default
 remains unchanged in both algorithms, this is established by referring
 to the pseudocode.
\end_layout

\begin_layout Standard
(
\family typewriter
S[i] @ AH2002 == S[i] @ Marpa) @ Start of Earley Set i Processing
\family default
.
 From the assumption in the Statement of the Lemma, substituting 
\family typewriter
S[i]
\family default
 for 
\family typewriter
S
\family default
.
\end_layout

\begin_layout Standard

\family typewriter
((S[i] @ AH2002 == S[i] @ Marpa) @ Top of Completer Loop(0)) @ Earley Set
 i Processing
\family default
.
 From the pseudocode and the previous graf.
 [ Top Identical Graf.
 ]
\end_layout

\begin_layout Standard
Let 
\family typewriter
z
\family default
 be the pass such that 
\family typewriter
(AH2002 @ End of Completer Loop == AH2002 @ Bottom of Completer Loop(z))
 @ Earley Set i Processing
\family default
.
 By the AH2002 Halting Axiom there is such a pass.
\end_layout

\begin_layout Standard

\family typewriter
(S[i] @ AH2002 @ Bottom of Completer Loop(z) == S[i] @ Marpa @ Bottom of
 Completer Loop(z)) @ Earley Set i Processing
\family default
.
 By the previous two statements and sublemma.
 Pass 
\family typewriter
z
\family default
 exists for AH2002 by definition.
 That it exists for Marpa will be shown by the sublemma.
\end_layout

\begin_layout Standard

\family typewriter
(S[i] @ Marpa @ End of Completer Loop == S[i] @ Marpa @ Bottom of Completer
 Loop(z)) @ Earley Set i Processing
\family default
.
 By the last two statements and the loop definitions.
 If at pass z the Earley sets are identical, and the Earley set in one algorithm
 ends the loop, it must also end the loop in the other algorithm at pass
 z.
\end_layout

\begin_layout Standard

\family typewriter
(S[i] @ AH2002 == S[i] @ Marpa) @ End of Completer Loop @ Earley Set i Processin
g
\family default
.
 By the three statements just previous.
\end_layout

\begin_layout Standard

\family typewriter
(S[i] @ AH2002 == S[i] @ Marpa) @ End of Earley Set i Processing
\family default
.
 From the pseudocode.
\end_layout

\begin_layout Standard

\noun on
This proves the Lemma.
\end_layout

\begin_layout Subsection
The Scanner Lemma
\end_layout

\begin_layout Standard

\noun on
Statement of the Lemma
\noun default
:
\end_layout

\begin_layout Standard
Assume 
\family typewriter
PASS(p,
\family default
 
\family typewriter
Scanner Loop
\family default
).
\end_layout

\begin_layout Standard
Assume that 
\family typewriter
((state, parent) @ AH2002 == (state, parent) @ Marpa) @ Top of Scanner Loop
\family default
(p).
\end_layout

\begin_layout Standard
Assume that 
\family typewriter
(i @ AH2002 == i @ Marpa) @ Top of Scanner Loop(p)
\family default
.
\end_layout

\begin_layout Standard
Assume 
\family typewriter
(S[i+1] @ AH2002 == S[i+1] @ Marpa) @ Top of Scanner Loop(p)
\family default
.
\end_layout

\begin_layout Standard
Then 
\family typewriter
(S[i+1] @ AH2002 == S[i+1] @ Marpa) @ Bottom of Scanner Loop(p).
\end_layout

\begin_layout Standard

\noun on
Starting the proof
\noun default
:
\end_layout

\begin_layout Standard

\family typewriter
0 <= p <= SIZE(S[i] @ End of Scanner Loop)
\family default
.
 From as assumption for the lemma and the definition of 
\family typewriter
PASS
\family default
.
\end_layout

\begin_layout Standard
The proof proceeds in two cases -- the Final Case (
\family typewriter
p == SIZE(S[i] @ End of Scanner Loop)
\family default
) and the Kernel Case (
\family typewriter
0 <= p < SIZE(S[i] @ End of Scanner Loop)
\family default
).
\end_layout

\begin_layout Standard

\noun on
Kernel case
\noun default
:
\end_layout

\begin_layout Standard

\family typewriter
((state, parent, i) @ AH2002 == (state, parent, i) @ Marpa) @ Top of Scanner
 Loop(p)
\family default
.
 By assumption in the statement of the lemma.
\end_layout

\begin_layout Standard

\family typewriter
((state, parent, i) @ AH2002 == (state, parent, i) @ Marpa) @ Top of Scanner
 @ Scanner Loop(p)
\family default
.
 From the graf just previous and the psuedocode.
 These variables are never changed between 
\family typewriter
Top of Scanner Loop(p)
\family default
 and 
\family typewriter
Top of Scanner @ Scanner Loop(p)
\family default
.
 [ 
\noun on
Input Variables Graf ]
\end_layout

\begin_layout Standard

\family typewriter
G
\family default
 
\begin_inset Formula $\in$
\end_inset

 
\family typewriter
(x, GOTO)
\family default
 
\begin_inset Formula $\Rightarrow$
\end_inset

 
\family typewriter
(G @ Marpa == G @ AH2002) @ Top of Scanner @ Scanner Loop(p)
\family default
.
 By the Globals Lemmas.
 [ 
\noun on
Globals Graf ]
\end_layout

\begin_layout Standard

\family typewriter
(S[i+1] @ AH2002 == S[i+1] @ Marpa) @ Top of Scanner Loop(p)
\family default
.
 By an assumption in the statement of the lemma.
\end_layout

\begin_layout Standard

\family typewriter
(S[i+1] @ AH2002 == S[i+1] @ Marpa) @ Top of Scanner @ Scanner Loop(p)
\family default
.
 From the graf just previous and inspection of the pseudocode.
 This Earley set is never changed outside of Scanner.
 [ 
\noun on
Next Earley Set Graf ]
\end_layout

\begin_layout Standard

\family typewriter
S[i+1] @ AH2002 @ Bottom of Scanner @ Scanner Loop(p) == S[i+1] @ Marpa
 @ Bottom of Scanner @ Scanner Loop(p)
\family default
.
 From the pseudocode, the Input Variables Statement, the Globals Statement
 and Destination Statement.
 Inspection of Scanner shows that 
\family typewriter
S[i+1] @ End of Scanner
\family default
 is a deterministic function.
 Inputs of the function are 
\family typewriter
S[i+1]
\family default
 itself, the 
\family typewriter
Scanner
\family default
's input variables and the globals 
\family typewriter
x
\family default
 and 
\family typewriter
GOTO
\family default
.
 These inputs are the same in both algorithms by the Next Earley Set Graf,
 the Input Variables Graf and the Globals Graf.
 The 
\family typewriter
Scanner
\family default
 is also the same in both algorithms.
 Therefore the result must be the same.
\end_layout

\begin_layout Standard

\family typewriter
(S[i+1] @ AH2002 == S[i+f] @ Marpa) @ Bottom of Scanner Loop(p)
\family default
.
 From the pseudocode.
 This Earley set is never changed between 
\family typewriter
Bottom of Scanner @ Scanner Loop(p)
\family default
 and 
\family typewriter
Bottom of Scanner Loop(p)
\family default
.
\end_layout

\begin_layout Standard

\noun on
This proves the Kernel case
\noun default
.
\end_layout

\begin_layout Standard

\noun on
Final case
\noun default
:
\end_layout

\begin_layout Standard

\noun on
Assumption for this case
\noun default
: 
\family typewriter
FINAL(p, Scanner Loop).
\end_layout

\begin_layout Standard

\family typewriter
(S[i+1] @ Marpa == S[i+1] @ AH2002) @ Bottom of Scanner Loop(p)
\family default
.
 From an assumption in the statement of this lemma, the graf just previous
 and the loop definitions.
 The final pass of a loop is always a no-op.
\end_layout

\begin_layout Standard

\noun on
This proves the final case
\noun default
.
\end_layout

\begin_layout Standard
With the two cases proved, the Lemma is shown.
\end_layout

\begin_layout Standard
QED.
\end_layout

\begin_layout Subsection
The Scanner Loop Lemma
\end_layout

\begin_layout Standard

\noun on
Statement of the Lemma
\noun default
:
\end_layout

\begin_layout Standard
Assume that 
\family typewriter
PASS(p,
\family default
 
\family typewriter
Scanner Loop @ Earley Set i Processing
\family default
).
\end_layout

\begin_layout Standard
Assume that 
\family typewriter
(S @ AH2002 == S @ Marpa) @ End of Earley Set i Processing
\family default
.
\end_layout

\begin_layout Standard
Assume 
\family typewriter
(i @ Marpa == i @ AH2002) @ Earley Set Processing
\family default
.
\end_layout

\begin_layout Standard
Assume 
\family typewriter
(S[i+1] @ Marpa == S[i+1] @ AH2002) @ Top of Scanner Loop(p)
\family default
.
\end_layout

\begin_layout Standard
Then 
\family typewriter
(S[i+1] @ Marpa == S[i+1] @ AH2002) @ Bottom of Scanner Loop (p)
\family default
.
\end_layout

\begin_layout Standard

\noun on
Proof:
\end_layout

\begin_layout Standard

\family typewriter
(S[i] @ AH2002 == S[i] @ Marpa) @ End of Earley Set i Processing
\family default
.
 By an assumption for the lemma, with S[i] substituted for S.
\end_layout

\begin_layout Standard

\family typewriter
(S[i][p] @ AH2002 == S[i][p] @ Marpa) @ End of Earley Set i Processing
\family default
.
 From the statement just previous.
 p exists in both algorithms because S[i] is the domain of Scanner Loop,
 and by assumption for the lemma, p is a pass.
\end_layout

\begin_layout Standard

\family typewriter
End of Scanner Loop @ Earley Set i Processing == End of Earley Set i Processing
\family default
.
 From the pseudocode.
 Recall that 
\family typewriter
Combined Loop @ AH2002 == Scanner Loop @ AH2002
\family default
.
\end_layout

\begin_layout Standard

\family typewriter
(S[i][p] @ AH2002 == S[i][p] @ Marpa) @ End of Scanner Loop @ Earley Set
 i Processing
\family default
.
 From the previous 2 grafs.
\end_layout

\begin_layout Standard

\family typewriter
(S[i][p] @ Top of Scanner Loop(p) == S[i][p] @ End of Scanner Loop) @ Earley
 Set i Processing
\family default
.
 S[i][p] exists at 
\family typewriter
Top of Scanner Loop(p)
\family default
 by the definition of PASS.
 It is identical to S[i][p] at End of Scanner Loop because worklist items,
 once they exist, are never changed or deleted.
\end_layout

\begin_layout Standard

\family typewriter
(S[i][p] @ AH2002 == S[i][p] @ Marpa) @ Top of Scanner Loop(p) @ Earley
 Set i Processing
\family default
.
 From the previous 2 grafs.
\end_layout

\begin_layout Standard

\family typewriter
((state, parent) == S[i][p]) @ Top of Scanner Loop(p) @ Earley Set i Processing
\family default
.
 Because (state, parent) and S[i][p] are two different names for the loop
 variable, they must be identical.
\end_layout

\begin_layout Standard

\family typewriter
((state, parent) @ AH2002 == (state, parent) @ Marpa) @ Top of Scanner Loop(p)
 @ Earley Set i Processing
\family default
.
 From the previous two statements.
\end_layout

\begin_layout Standard

\family typewriter
(i @ AH2002 == i @ Marpa) @ Top of Scanner Loop (p)
\family default
.
 From an assumption in the statement of this lemma.
\end_layout

\begin_layout Standard

\family typewriter
(S[i+1] @ AH2002 == S[i+1] @ Marpa) @ Top of Scanner Loop(p)
\family default
.
 An assumption in the statement of this lemma.
\end_layout

\begin_layout Standard

\family typewriter
(S[i+1] @ AH2002 == S[i+1] @ Marpa) @ Bottom of Scanner Loop(p)
\family default
.
 By the Scanner Lemma with p set to its value in this Lemma, and by the
 three statements just previous.
\end_layout

\begin_layout Standard

\noun on
This proves the lemma.
\end_layout

\begin_layout Subsection
The Next Earley Set Lemma
\end_layout

\begin_layout Standard

\noun on
Statement of the Lemma
\noun default
:
\end_layout

\begin_layout Standard
Assume 
\family typewriter
SIZE(S[i] @ AH2002) == SIZE(S[i] @ Marpa) @ End of Earley Set i Processing
\family default
.
\end_layout

\begin_layout Standard
Assume 
\family typewriter
0 <= i <= SIZE(x)
\family default
.
\end_layout

\begin_layout Standard
Assume 
\family typewriter
(S @ AH2002 == S @ Marpa) @ Start of Earley Set i Processing
\family default
.
\end_layout

\begin_layout Standard
Then, 
\family typewriter
(S[i+1] @ AH2002 == S[i+1] @ Marpa) @ End of Earley Set i Processing
\family default
.
\end_layout

\begin_layout Standard

\noun on
Starting the Proof
\noun default
:
\end_layout

\begin_layout Standard

\noun on
Sublemma
\noun default
:
\end_layout

\begin_layout Standard
Assume 
\family typewriter
(S[i+1] @ AH2002 == S[i+1] @ Marpa) @ Top of Scanner Loop(0)
\family default
.
\end_layout

\begin_layout Standard
Assume 
\family typewriter
p <= SIZE(S[i+f]) @ AH2002 @ Bottom of Scanner Loop(p)
\family default
 or 
\family typewriter
p <= SIZE(S[i+f]) @ Marpa @ Bottom of Scanner Loop(p)
\family default
.
\end_layout

\begin_layout Standard
Then 
\family typewriter
(S[i+f] @ AH2002 == S[i+f] @ Marpa) @ Bottom of Scanner Loop(p)
\family default
.
\end_layout

\begin_layout Standard

\noun on
Note
\noun default
: In this sublemma, it is assumed in the statement of the sublemma that
 pass p exists for one algorithm, but not for the other.
 Loop operations are not defined for non-existent passes, and it must be
 proved that pass p exists for both AH2002 and Marpa.
\end_layout

\begin_layout Standard

\noun on
Starting the proof
\noun default
:
\end_layout

\begin_layout Standard

\noun on
Induction Property
\noun default
: 
\family typewriter
(S[i+f] @ A == S[i+f] @ Marpa) @ Bottom of Scanner Loop(p)
\family default
, where the induction is on 
\family typewriter
p
\family default
.
\end_layout

\begin_layout Standard

\noun on
Induction
\noun default
 
\noun on
Basis
\noun default
: 
\family typewriter
(S[i+f] @ A == S[i+f] @ Marpa) @ Bottom of Scanner Loop(0)
\family default
.
 By the assumptions in the statement of Sublemma 1 and the Earley Item Induction
 Property Lemma with p == 0.
 Pass 0 always exists for every loop, by definition.
\end_layout

\begin_layout Standard

\noun on
This proves the induction basis.
\end_layout

\begin_layout Standard

\noun on
Induction hypothesis
\noun default
:
\end_layout

\begin_layout Standard
Assume 
\family typewriter
(S[i+f] @ AH2002 == S[i+f] @ Marpa) @ Bottom of Scanner Loop(n)
\family default
.
\end_layout

\begin_layout Standard

\noun on
Inductive step
\noun default
: 
\end_layout

\begin_layout Standard
Note: Next it must be shown that 
\family typewriter
n+1
\family default
 is a pass in both 
\family typewriter
AH2002
\family default
 and 
\family typewriter
Marpa
\family default
.
 
\end_layout

\begin_layout Standard

\noun on
Subsublemma
\noun default
: Let 
\family typewriter
(A,B)
\family default
 be either 
\family typewriter
(AH2002, Marpa) 
\family default
or 
\family typewriter
(Marpa, AH2002)
\family default
.
\end_layout

\begin_layout Standard
Assume 
\family typewriter
PASS(n+1, Scanner Loop @ A)
\family default
.
\end_layout

\begin_layout Standard
Then 
\family typewriter
PASS(n+1, Scanner Loop @ B)
\family default
.
\end_layout

\begin_layout Standard

\family typewriter
SIZE(S[i] @ AH2002) == SIZE(S[i] @ Marpa) @ End of Earley Set i Processing
\family default
.
 From an assumption for this lemma.
\end_layout

\begin_layout Standard

\family typewriter
(S[i] @ A == S[i] @ B) @ End of Earley Set i Processing.

\family default
 By the previous, substituting A and B, and transposing the equality if
 necessary.
\end_layout

\begin_layout Standard

\family typewriter
(S[i] @ A == S[i] @ B) @ End of Scanner Loop.

\family default
 From the previous graf and the pseudocode.
 Nothing between 
\family typewriter
End of Scanner Loop
\family default
 and 
\family typewriter
End of Earley Set i Processing
\family default
 changes 
\family typewriter
S[i].
 [ 
\family default
\noun on
End of Scanner Loop Equality Graf
\noun default
 ]
\end_layout

\begin_layout Standard

\family typewriter
0 <= n < SIZE(DOMAIN(Scanner Loop) @ Bottom of Scanner Loop(n) @ A))
\family default
.
 By assumption for the sublemma and the definition of 
\family typewriter
PASS
\family default
.
\end_layout

\begin_layout Standard

\family typewriter
0 <= n+1 <= SIZE(S[i] @ Bottom of Scanner Loop(n) @ A)
\family default
.
 By the previous graf, the loop definitions.
 By the definition of 
\family typewriter
DOMAIN
\family default
, 
\family typewriter
DOMAIN(Scanner Loop) == S[i]
\family default
.
\end_layout

\begin_layout Standard

\family typewriter
SIZE(S[i] @ Bottom of Scanner Loop(n) @ A) <= SIZE(S[i] @ End of Scanner
 Loop @ A)
\family default
.
 By the loop definitions.
 End of Scanner Loop is at or after Bottom of Scanner Loop(n) and the loop
 domain never decreases in size.
\end_layout

\begin_layout Standard

\family typewriter
0 <= n+1 <= SIZE(S[i] @ End of Scanner Loop @ A)
\family default
.
 From the previous 2 grafs.
\end_layout

\begin_layout Standard

\family typewriter
0 <= n+1 <= SIZE(S[i] @ End of Scanner Loop @ B)
\family default
.
 From the previous and the 
\family typewriter
End of Scanner Loop Equality Graf
\family default
.
\end_layout

\begin_layout Standard

\family typewriter
PASS(n+1, Scanner Loop @ B)
\family default
.
 From the previous graf and definition of 
\family typewriter
PASS
\family default
.
\end_layout

\begin_layout Standard

\noun on
This proves the subsublemma
\noun default
.
\end_layout

\begin_layout Standard

\family typewriter
PASS(n+1, Scanner Loop @ AH2002) 
\begin_inset Formula $\Rightarrow$
\end_inset

 PASS(n+1, Scanner Loop @ Marpa)
\family default
.
 By the subsublemma, with 
\family typewriter
(A, B) := (AH2002, Marpa)
\family default
.
\end_layout

\begin_layout Standard

\family typewriter
PASS(n+1, Scanner Loop @ Marpa) 
\begin_inset Formula $\Rightarrow$
\end_inset

 PASS(n+1, Scanner Loop @ AH2002)
\family default
.
 By the subsublemma, with 
\family typewriter
(A, B) := (Marpa, AH2002)
\family default
.
\end_layout

\begin_layout Standard

\family typewriter
PASS(n+1, Scanner Loop @ AH2002) 
\begin_inset Formula $\Leftrightarrow$
\end_inset

PASS(n+1, Scanner Loop @ Marpa)
\family default
.
 By the previous two grafs.
 [ 
\noun on
Pass Equivalence Graf
\noun default
 ]
\end_layout

\begin_layout Standard

\family typewriter
PASS(n+1, Scanner Loop @ AH2002)
\family default
.
 By the assumption for the sublemma and the Pass Equivalence Graf.
\end_layout

\begin_layout Standard

\family typewriter
PASS(n+1, Scanner Loop @ Marpa)
\family default
.
 By the assumption for the sublemma and the Pass Equivalence Graf.
\end_layout

\begin_layout Standard

\family typewriter
(S[i+f] @ AH2002 == S[i+f] @ Marpa) @ Top of Scanner Loop(p+1)
\family default
.
 By the loop definitions and the induction hypothesis.
 The two grafs just previous show that pass 
\family typewriter
n+1
\family default
 exists for both algorithms.
\end_layout

\begin_layout Standard

\family typewriter
(S[i+f] @ AH2002 == S[i+f] @ Marpa) @ Bottom of Scanner Loop(p+1)
\family default
.
 By the previous graf and the Scanner Loop Lemma.
\end_layout

\begin_layout Standard

\noun on
This proves the induction
\noun default
.
\end_layout

\begin_layout Standard

\noun on
This proves sublemma
\noun default
.
\end_layout

\begin_layout Standard

\noun on
Note
\noun default
: The sublemma does most of the work of this lemma.
 In statements that follow, I show 
\family typewriter
(S[i+1] @ AH2002 == S[i+1] @ Marpa)
\family default
 holds as we proceed through the 
\family typewriter
Earley Set Processing
\family default
.
 To do this, I use the sublemma for the portion of the 
\family typewriter
Earley Set Processing
\family default
 where 
\family typewriter
S[i+1]
\family default
 might change.
 In most cases 
\family typewriter
S[i+1] 
\family default
remains unchanged in both algorithms, as is clear from the pseudocode.
\end_layout

\begin_layout Standard

\family typewriter
(S[i+1] @ AH2002 == S[i+1] @ Marpa) @ Start of Earley Set i Processing
\family default
.
 From the assumption in the Statement of the Lemma, substituting 
\family typewriter
S[i+1]
\family default
 for 
\family typewriter
S
\family default
.
\end_layout

\begin_layout Standard

\family typewriter
((S[i+1] @ AH2002 == S[i+1] @ Marpa) @ Top of Scanner Loop(0)) @ Earley
 Set i Processing
\family default
.
 From the pseudocode and the previous graf.
 [ 
\noun on
Top Identical Graf
\noun default
 ]
\end_layout

\begin_layout Standard
Let 
\family typewriter
z
\family default
 be the pass such that 
\family typewriter
(AH2002 @ End of Scanner Loop == AH2002 @ Bottom of Scanner Loop(z)) @ Earley
 Set i Processing
\family default
.
 By the AH2002 Halting Axiom there is such a pass.
\end_layout

\begin_layout Standard

\family typewriter
(S[i+1] @ AH2002 == S[i+f] @ Marpa) @ Bottom of Scanner Loop(z) @ Earley
 Set i Processing
\family default
.
 By the previous two statements and sublemma.
 Pass 
\family typewriter
z
\family default
 exists for 
\family typewriter
AH2002
\family default
 by definition.
 That it exists for 
\family typewriter
Marpa
\family default
 will be shown by the sublemma.
\end_layout

\begin_layout Standard

\family typewriter
(S[i+1] @ Marpa @ End of Scanner Loop == S[i+1] @ Marpa @ Bottom of Scanner
 Loop(z)) @ Earley Set i Processing
\family default
.
 By the last two statements and the loop definitions.
 If at pass 
\family typewriter
z
\family default
 the Earley sets are identical, and the Earley set in one algorithm ends
 the loop, it must also end the loop in the other algorithm at pass 
\family typewriter
z
\family default
.
\end_layout

\begin_layout Standard

\family typewriter
(S[i+1] @ AH2002 == S[i+1] @ Marpa) @ End of Scanner Loop @ Earley Set i
 Processing
\family default
.
 By the three statements just previous.
\end_layout

\begin_layout Standard

\family typewriter
(S[i+1] @ AH2002 == S[i+1] @ Marpa) @ End of Earley Set i Processing
\family default
.
 From the pseudocode.
\end_layout

\begin_layout Standard

\noun on
This proves the Lemma.
\end_layout

\begin_layout Subsection
Earley Set Processing Lemma
\end_layout

\begin_layout Standard

\noun on
Statement of the Lemma
\noun default
:
\end_layout

\begin_layout Standard
Assume for 
\family typewriter
(S @ Marpa == S @ AH2002) @ Start of Earley Set n Processing
\family default
.
\end_layout

\begin_layout Standard
Assume also, that 
\family typewriter
0 < n < SIZE(x)
\family default
.
\end_layout

\begin_layout Standard
Then 
\family typewriter
(S @ Marpa == S @ AH2002) @ End of Earley Set n Processing
\family default
.
\end_layout

\begin_layout Standard

\noun on
Starting the Proof
\noun default
:
\end_layout

\begin_layout Standard
For all 
\family typewriter
x
\family default
, 
\family typewriter
(x == n or x == n+1 or S[x] @ Marpa == S[x] @ AH2002) @ End of Earley Set
 n Processing
\family default
.
 From the Globals Lemma, and the assumption in the Statement of the Lemma.
\end_layout

\begin_layout Standard

\family typewriter
(S[n] @ Marpa == S[n] @ AH2002) @ End of Earley Set n Processing
\family default
.
 From the Current Earley Set Lemma 
\family typewriter
(i := n)
\family default
, and the assumptions in the Statement of the Lemma.
\end_layout

\begin_layout Standard

\family typewriter
(S[n+1] @ Marpa == S[n+1] @ AH2002) @ End of Earley Set n Processing
\family default
.
 From the Next Earley Set Lemma 
\family typewriter
(i := n)
\family default
, the previous graf, and the assumptions in the Statement of the Lemma.
\end_layout

\begin_layout Standard
For all x, 
\family typewriter
(S[x] @ Marpa == S[x] @ AH2002) @ End of Earley Set n Processing
\family default
.
 Combining the three statements just previous.
\end_layout

\begin_layout Standard

\family typewriter
(S @ Marpa == S @ AH2002) @ End of Earley Set n Processing
\family default
.
 From the previous statement.
\end_layout

\begin_layout Standard
QED.
\end_layout

\begin_layout Subsection
AH2002 and Marpa are Equivalent
\end_layout

\begin_layout Standard
We regard AH2002 and Marpa as equivalent if, given the Globals Assumption,
 the Earley sets are identical at the end of each algorithm.
 The Globals Assumption has been assumed throughout this proof.
\end_layout

\begin_layout Standard

\noun on
Statement of the Theorem
\noun default
: 
\family typewriter
S @ End of AH2002 == S @ End of Marpa
\family default
.
\end_layout

\begin_layout Standard

\noun on
lemma
\noun default
: Assume 
\family typewriter
0 < s < SIZE(x)
\family default
.
\end_layout

\begin_layout Standard
Then 
\family typewriter
(S @ AH2002 == S @ Marpa) @ End of Recognizer Step s
\family default
.
\end_layout

\begin_layout Standard

\noun on
Starting the Proof
\noun default
: The proof is by induction on the Recognizer Steps.
\end_layout

\begin_layout Standard

\noun on
Induction Property
\noun default
: 
\family typewriter
(S @ AH2002 == S @ Marpa) @ End of Recognizer Step t
\family default
, where the induction is on 
\family typewriter
t
\family default
.
\end_layout

\begin_layout Standard

\noun on
Induction Base
\noun default
:
\end_layout

\begin_layout Standard

\family typewriter
(S @ AH2002 == S @ Marpa) @ Start of Recognizer Step 0
\family default
.
 By the Globals Lemmas.
\end_layout

\begin_layout Standard

\family typewriter
(S @ AH2002 == S @ Marpa) @ End of Recognizer Step 0
\family default
.
 By the just previous statement, the Globals Lemmas and the Earley Sets
 Processing Lemma with 
\family typewriter
n := 0
\family default
.
 There will always be an Earley set 0 and therefore by the definition of
 the Recognizer, a Recognizer Step 0.
\end_layout

\begin_layout Standard

\noun on
Induction Hypothesis
\noun default
:
\end_layout

\begin_layout Standard

\family typewriter
(S @ AH2002 == S @ Marpa) @ End of Recognizer Step t
\family default
.
\end_layout

\begin_layout Standard

\noun on
Inductive Step
\noun default
:
\end_layout

\begin_layout Standard

\family typewriter
0 <= t+1 < SIZE(x)
\family default
.
 From the statement of the Sublemma.
 [ 
\noun on
Induction Limit Graf ]
\end_layout

\begin_layout Standard

\family typewriter
(S @ AH2002 == S @ Marpa) @ Start of Recognizer Step t+
\family default
1.
 By the definition of the Recognizer, the Induction Hypothesis and the Induction
 Limit Graf.
\end_layout

\begin_layout Standard

\family typewriter
(S @ AH2002 == S @ Marpa) @ End of Recognizer Step t+1
\family default
.
 By the just previous statement, the Induction Limit Graf, and the Earley
 Set Processing Lemma with 
\family typewriter
(n := s+1)
\family default
.
\end_layout

\begin_layout Standard

\noun on
This completes the induction
\noun default
.
\end_layout

\begin_layout Standard

\noun on
This proves the lemma
\noun default
.
\end_layout

\begin_layout Standard

\family typewriter
(S @ AH2002 == S @ Marpa) @ End of Recognizer Loop(SIZE(x) - 1)
\family default
.
 By the lemma with 
\family typewriter
s := SIZE(x)-1
\family default
.
\end_layout

\begin_layout Standard

\family typewriter
(S @ AH2002 == S @ Marpa) @ Start of Recognizer Loop(SIZE(x))
\family default
.
 By the graf just previous and the loop definitions.
 SIZE(x) is a pass by the definition of the Recognizer.
\end_layout

\begin_layout Standard

\family typewriter
(S @ AH2002 == S @ Marpa) @ End of Recognizer Loop(SIZE(x))
\family default
.
 By the loop definitions and the previous graf.
 SIZE(x) is the final pass of Recognizer Loop, by its definition.
 The final pass is a no-op, so the values of S will not change in either
 algorithm.
\end_layout

\begin_layout Standard

\family typewriter
(S @ AH2002 == S @ Marpa) @ End of Recognizer
\family default
.
 By the definition of the Recognizer and the previous graf.
\end_layout

\begin_layout Standard

\family typewriter
S @ End of AH2002 == S @ End of Marpa
\family default
.
 By the graf just previous and by the definition of the two algorithms.
\end_layout

\begin_layout Standard

\noun on
QED
\noun default
.
\end_layout

\end_body
\end_document