Not logged in

Overview
Comment: Penultimate draft of the report. After spell checking and such, this will become the final draft. Tarball | ZIP archive | SQL archive family | ancestors | descendants | both | trunk files | file ages | folders 98924328add2f2586fae104b095ae6bdbe806d8f andy 2015-04-18 19:48:55
Context
 2015-04-18 21:14 Final draft of report. (Unless I think of something else to add...) check-in: 04f1cd0696 user: andy tags: trunk 19:48 Penultimate draft of the report. After spell checking and such, this will become the final draft. check-in: 98924328ad user: andy tags: trunk 18:47 Added a TODO comment describing the problems around applying the substitution to the body of an expanded clause, instead of appending them as unification goals. check-in: b5c5523821 user: andy tags: trunk
Changes

Changes to report/arend-report.bib.

    48     48     author={Page, Rex and Eastlund, Carl and Felleisen, Matthias},
49     49     booktitle={Proceedings of the 2008 international workshop on Functional and declarative programming in education},
50     50     pages={21--30},
51     51     year={2008},
52     52     organization={ACM}
53     53   }
54     54
55  +
56  +@inproceedings{Ford:2004:PEG:964001.964011,
57  + author = {Ford, Bryan},
58  + title = {Parsing Expression Grammars: A Recognition-based Syntactic Foundation},
59  + booktitle = {Proceedings of the 31st ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages},
60  + series = {POPL '04},
61  + year = {2004},
62  + isbn = {1-58113-729-X},
63  + location = {Venice, Italy},
64  + pages = {111--122},
65  + numpages = {12},
66  + url = {http://doi.acm.org/10.1145/964001.964011},
67  + doi = {10.1145/964001.964011},
68  + acmid = {964011},
69  + publisher = {ACM},
70  + address = {New York, NY, USA},
71  + keywords = {BNF, GTDPL, TDPL, context-free grammars, lexical analysis, packrat parsing, parsing expression grammars, regular expressions, scannerless parsing, syntactic predicates, unified grammars},
72  +}
55     73
56     74   @inproceedings{reichelt2010types,
57     75     title={Treating Sets as Types in a Proof Assistant for Ordinary Mathematics},
58     76     author={Reichelt, Sebastian},
59     77     date={September 4, 2010},
60     78     organization={Institute of Informatics, University of Warsaw, Warszawa, Poland},
61     79     url={http://hlm.sourceforge.net/types.pdf},



Changes to report/arend-report.pdf.

cannot compute difference between binary files



Changes to report/arend-report.tex.

    69     69   % All premises/conclusions in inference rules should be in math mode
70     70   \renewcommand{\predicate}[1]{$#1$}
71     71
72     72   % Setup fonts.
73     73   \setmainfont[Mapping=tex-text,Numbers=OldStyle]{Junicode}
74     74   \setsansfont[Mapping=tex-text,Numbers=OldStyle,Scale=MatchLowercase]{Gill Sans MT}
75     75   \setmonofont[Mapping=tex-text,Scale=MatchLowercase]{Fantasque Sans Mono}
76  +
77  +% The \lnum{} command displays numbers normally (not old style). The font
78  +% setting above sets them to old-style by default, and you cannot "turn off"
79  +% a font feature, only create a new font with the feature turned off.
80  +\newfontface\liningnum{Junicode}
81  +\newcommand{\lnum}[1]{{\liningnum #1}}
76     82
77     83   % Hyperlinks. We might want these inline, or in the sidebar.
79     85
80     86   % Citations. Both in the text and in the sidebar. The NoHyper makes the actual
81     87   % citation link to the bibliography at the end, rather than to the bibentry
82     88   % in the margin. It also has the unfortunate side effect of making any URLs
................................................................................
308    314       proof-trees, as the traditional paragraph-proof notation tends to obscure
309    315       the fact that a proof is composed of subproofs.
310    316
311    317   \item Because proofs are hierarchical, their construction is very similar to
312    318       that of other kinds of programming. A proof is iteratively broken down into
313    319       smaller parts, lemmas can be used, like functions, to abstract out
314    320       commonly-used subproofs, or to break down proofs that would otherwise be
315         -    too large and unweildy to understand.
321  +    too large and unwieldy to understand.
316    322
317    323   \item The lack of traditional negation eliminates some of the more-confusing
318    324       elements of traditional logic programming. In particular, Arend has nothing
319    325       corresponding to negation as failure''; it is not possible in Arend to
320    326       know, in a computational sense, when a proposition $P$ fails. Thus reasoning
321    327       in Arend is entirely about things that are known to be true, never about
322    328       things that are assumed to be false.
................................................................................
385    391   \item \emph{Proof assistants} aim to aid in the development and checking of
386    392       formal proofs. While some systems only check proofs for completeness, more
387    393       modern systems will typically also aid the user in the development of a
388    394       proof. Recent examples
389    395       include Coq \mcitep{Coq:manual}, Twelf \mcitep{pfenning1999system}, and
390    396       Abella \mcitep{gacek08ijcar} Several element of Arend's design
391    397       were influenced by Abella, most notably the use of a constructive two-level
392         -    logic for describing and reasoning about systems.
398  +    logic for describing and reasoning about systems. For a full history of the
399  +    development of computer-assisted proof systems, see
400  +    \mcitet{geuvers2009proof}.
393    401
394    402   \item More recently, several systems have emerged which try to bridge the gap
395    403       between traditional functional programming and theorem proving. Systems
396    404       such as Agda \mcitep{norell:thesis}, Idris \mcitep{brady2011idris},
397    405       and Beluga \mcitep{pientka2010beluga} are based on the
398    406       Curry-Howard isomorphism (see below) and thus represent propositions as
399    407       types, and proofs as functional programs. These systems offer varying
................................................................................
449    457       support for visualizing proof structure, its consistency, and in the
450    458       degree of assistance'' it gives to the user.
451    459
452    460   \item \mcitet{suppes1981future} offers an interesting perspective, that of an
453    461       instructor of mathematics, not computer science, at Stanford University,
454    462       and furthermore, one who had, at the time of this publication, been using
455    463       interactive theorem proving in undergraduate coursework for almost twenty
456         -    years. Suppes raises several points of concern which are still revelant,
457         -    most notably the conflict between providing a system which is intelligent''
464  +    years. Suppes raises several points of concern which are still relevant,
465  +    most notably the conflict between providing a system which is
466  +    intelligent''
458    467       and, hence, easy to use, and one which meets the pedagogical goals of
459    468       teaching \emph{introductory} material (the sort of material which a more
460    469       intelligent proof assistant would tend to elide). Thus, although Arend is
461    470       actually capable of automatically proving a reasonable large set of
462    471       propositions, during proof construction it intentionally refrains from
463    472       doing so, thus forcing the user to carry out --- and, hopefully, to
464    473       learn --- the elementary steps of proof construction.
................................................................................
529    538   \inference[Assume-A]
530    539       {}
531    540       {A}
532    541   \]
533    542   However, this assumption only applies within the scope'' of the subproof of
534    543   $A -> B$.
535    544
536         -In order to express this, we introduce the \emph{hypothetical judgements} using
545  +In order to express this, we introduce the \emph{hypothetical judgments} using
537    546   the symbol $|-$.
538         -$A_1, A_2, \ldots A_n |- C$ should be read $C$ is true \emph{assuming} $A_1$, etc.'' We call the $A_i$ the \emph{antecedants} of the judgement, and
547  +$A_1, A_2, \ldots A_n |- C$ should be read $C$ is true \emph{assuming} $A_1$, etc.'' We call the $A_i$ the \emph{antecedents} of the judgment, and
539    548   $C$ the \emph{consequent}. (Although not strictly correct, we will sometimes
540         -refer to the antecedants and consequent of a \emph{rule}, since a rule can have
541         -at most one hypothetical judgement as its conclusion.)
549  +refer to the antecedents and consequent of a \emph{rule}, since a rule can have
550  +at most one hypothetical judgment as its conclusion.)
542    551   \footnote{Note that $|-$ is \emph{not} a logical connective; in particular, it
543    552   is not valid to say, e.g., $A \land (B |- C)$.} We use $\Gamma$ to signify any
544    553   collection of $P_i$. To enable the use of $|-$ in
545    554   our derivations we add the following axiom, known as the \emph{assumption rule}:
546    555   $547 556 \inference[Assume] 548 557 {} ................................................................................ 782 791 derivations produced by queries are a restricted form of the derivations 783 792 constructed as proofs. Thus, queries against the specification can serve as an 784 793 introduction to the creation of simple proofs. 785 794 786 795 Since Arend's specification logic is a restricted form of the Horn-clause 787 796 classical logic 788 797 used in traditional Prolog, a resolution-style proof search procedure is 789 -sufficient to execute queries against a specification. The operational semantics 790 -of this procedure are presented, in an abbreviated form, in 791 -\hyperref[spec-execute]{Appendix~1}. 798 +sufficient to execute queries against a specification. 792 799 Note that in keeping with its intuitionistic foundation, Arend has no negation 793 800 operator. (Intuitionistically, \neg P can be defined as P \rightarrow \bot, 794 801 but Arend's specification logic lacks the rightward \rightarrow implication 795 802 operator. This operator \emph{is} present in the reasoning logic, which thus 796 803 requires a more sophisticated handling.) 804 +% spec-execute -- Mention operational semantics here. 797 805 798 806 \subsection{Reasoning logic} 799 807 800 808 \newthought{Arend's reasoning logic} is closer to a full 801 809 expression of first-order 802 810 intuitionistic logic, with extensions to support proofs by single-induction. 803 811 The inference rules defining the reasoning logic are 804 812 presented in Figure \ref{reasoning-rules}. It supports rightward implication 805 813 (\rightarrow), with the restriction that the premise cannot contain nested 806 814 implications, only conjunctions, disjunctions, and atomic 807 815 goals.\footnote{This restriction subsumes the \emph{stratification} 808 816 restriction in Abella; stratification in some form is required to ensure 809 -monotonicity of the logic.} This implies that antecedants cannot contain 817 +monotonicity of the logic.} This implies that antecedents cannot contain 810 818 implications, since they cannot be added by \rightarrow_R, and are not allowed 811 819 in definitions in the specification logic. 812 820 813 821 %%% ------------------- Reasoning logic rules -------------------- 814 822 \begin{figure}[!p] 815 823 816 824 % Assumption rule ................................................................................ 855 863 % Implication rule 856 864 \[ 857 865 \inference[{->}_R] 858 866 {\Gamma,P |- Q} 859 867 {\Gamma |- P -> Q} 860 868 \qquad\qquad 861 869 \text{\small\parbox{5cm}{(There is no {->}_L rule, because implications are 862 -not allowed in antecedants.)}} 870 +not allowed in antecedents.)}} 863 871$
864    872
865    873   \vspace{1em}
866    874
867    875   % Forall rules
868    876   $869 877 \inference[\forall_R] ................................................................................ 944 952 derivation (because this query has no free variables, no substitution is 945 953 displayed). The rules of the specification (the \mathbb{N} specification, 946 954 given in full in \hyperref[sec:nat-spec]{appendix II}) are displayed below 947 955 the input line. 948 956 949 957 \item The interactive \emph{proof assistant} allows the user to construct 950 958 proofs for given propositions, in the context of a particular 951 - specification. 959 + specification. Figure~\ref{fig:passist1} illustrates the proof assistant 960 + interface: the rules of the current specification are displayed in the 961 + pane on the left; since the current proof is inductive, the inductive 962 + hypothesis is included. (Lemmas, once proved, are also displayed as 963 + rules.) 964 + 965 + The right pane displays the current proof statement, and the proof, 966 + here, in progress. The user can double-click on any antecedent, or any goal, 967 + to perform the default elaboration for that element. For example, 968 + double-clicking an antecedent P \land Q would apply the \land-L rule, 969 + producing a subproof with antecedents P,Q. For situations in which 970 + there is more than one possible action (e.g., the \lor-R rules, or when 971 + backchaining against the IH), keyboard interaction is used to select the 972 + appropriate action. 952 973 953 974 \end{itemize} 954 975 955 976 956 977 \begin{figure*}[ht!] 957 978 958 979 \begin{centering} ................................................................................ 960 981 \end{centering} 961 982 962 983 \caption{The browser-based run-eval-print-loop interface} 963 984 \label{fig:repl} 964 985 \end{figure*} 965 986 966 987 967 -\begin{figure*}[ht!] 988 +\begin{figure*}[t!] 968 989 969 990 \begin{centering} 970 -\includegraphics[width=6.5in]{proof-start-sample.png} 991 +\includegraphics[width=6.5in]{proof-sample.png} 971 992 \end{centering} 972 993 973 994 \caption{The proof assistant interface, with an incomplete inductive proof} 974 995 \label{fig:passist1} 975 996 \end{figure*} 976 997 977 998 978 999 \clearpage 979 1000 \section{Implementation} 980 1001 981 -\newthought{Arend currently exists} in two semi-overlapping implementations: 1002 +\newthought{Arend is implemented as} a web-based system, with a server 1003 +component, written in Prolog and running in the 1004 +\link{http://swi-prolog.org}{SWI-Prolog} and a browser-based frontend. 1005 +Currently, the implementation of Arend consists of 1006 + 1007 +\begin{itemize} 1008 +\item \lnum{1,401} lines of Prolog 1009 +\item \lnum{6,198} lines of Javascript (of which \lnum{442} lines are test 1010 + code) 1011 +\item \lnum{493} lines of PEG grammar specification 1012 +\item \lnum{501} lines of HTML 1013 +\item \lnum{129} lines of CSS 1014 +\item \lnum{41} source code files in total 1015 +\end{itemize} 1016 + 1017 +The following libraries and applications are used in the development of Arend; 1018 +these will be described in detail in the following section. 982 1019 \begin{itemize} 983 -\item A \emph{client-based} implementation, written in JavaScript and running 984 -largely in the browser. (The core JavaScript proof-checker can also run 985 -offline, via a JavaScript engine such as \link{https://nodejs.org/}{node.js} or 986 -\link{https://iojs.org/en/index.html}{io.js}.) This implementation is currently 987 -incomplete. 988 -\item A \emph{server-based} implementation written in Prolog, running on top of 989 -the \link{http://swi-prolog.org}{SWI-Prolog} HTTP server. 1020 +\item \link{https://nodejs.org/}{Node.js} 1021 +\item \link{http://www.swi-prolog.org/}{SWI-Prolog} 1022 +\item \link{http://pengines.swi-prolog.org/docs/index.html}{Pengines} 1023 +\item \link{https://jquery.com/}{jQuery} --- General utilities for 1024 + Javascript in a browser environment 1025 +\item \link{https://lodash.com/}{Lodash} --- Collection utilities for Javascript 1026 +\item \link{http://pegjs.org/}{PEG.js} --- Parsing Expression Grammar parser 1027 + generator for Javascript. 1028 +\item \link{https://qunitjs.com/}{QUnit} --- Javascript test framework 1029 +\item \link{http://www.jscheck.org/}{JSCheck} --- Randomized testing 1030 + engine for Javascript. 990 1031 \end{itemize} 991 1032 992 -Both implementations share some front-end code, mostly related to the 993 -representation of terms as JavaScript objects, and to the rendering of rules and 994 -proofs as HTML. 995 - 996 -Details... 997 - 998 -\subsection{Client-based implementation (JavaScript)} 999 - 1000 -Lines of code (UI vs. checker), number of test cases, portability layer for 1001 -browser/offline compatibility, etc. 1002 - 1003 -Libraries used: 1033 +Arend's development is tracked using the \link{http://fossil-scm.org}{Fossil} 1034 +source control management system. As of \lnum{April 18, 2015,} the project 1035 +history consisted of \lnum{294} commits spanning eight months of development. 1036 +Arend, its source code and project history, can be found on the web at 1037 +\url{http://fossil.twicetwo.com/arend.pl}. 1038 + 1039 +\subsection{Automated testing} 1040 +Arend's Javascript code is run through a test suite consisting of \lnum{133} 1041 +automated tests, however, of these, \lnum{12} central tests are randomized, 1042 +running, by default, \lnum{20} randomly generated tests each. Thus, a total 1043 +of \lnum{361} individual tests are run. The test suite can be run in the 1044 +browser, or offline, via Node.js. Testing is handled via the 1045 +\link{https://qunitjs.com/}{QUnit} test framework; a small compatibility 1046 +layer was written to allow QUnit tests to run offline. Randomized testing is 1047 +provided by \link{http://www.jscheck.org/}{JSCheck} a port'' of the 1048 +Haskell QuickCheck framework to Javascript. We have extended JSCheck with 1049 +support for randomized generation of Arend data types: atoms, variables, 1050 +and ground and non-ground terms up to a limited depth. 1051 + 1052 +QUnit is designed for automated testing in a browser environment. Arend's 1053 +development, as far as is possible, tries to target both browser and offline 1054 +environments; this is true even for Javascript code. Thus, we wrote QNodeit'', 1055 +a small compatibility layer that allows the complete suite of QUnit tests 1056 +to run offline, via Node.js. All non-user-interface tests run successfully in 1057 +both the browser and offline environments. QNodeit also integrates JSCheck into 1058 +QUnit, allowing JSCheck's randomized tests to be used naturally within QUnit. 1059 + 1060 +\subsection{Client-side implementation} 1061 + 1062 +Although Arend's client-side code only serves to provide a user-interface to 1063 +the backend's proof-checking and manipulation engine, it contains a significant 1064 +amount of logic itself. Early in Arend's development we felt that the best 1065 +course would be to implement the entire proof checking engine in Javascript, 1066 +thus allowing proof-handling with no server at all. Although this did not 1067 +prove feasible, a significant amount of logic-handling code in Javascript was 1068 +written, and, so far from being a redundancy, this has proved to be an asset. 1069 +Arend's client is not dumb'', but in fact understands a great deal of the 1070 +structure of the proofs it is presenting. This allows for richer user-interface 1071 +possibilities, and more flexible coordination between front- and back-end. 1072 + 1073 +The \link{https://lodash.com/}{Lodash} Javascript library provides a set of 1074 +general utilities, 1075 +mostly aimed at manipulation of collections (arrays and objects) and enabling 1076 +a functional style of programming; Lodash is used extensively throughout 1077 +Arend. Lodash makes its facilities available 1078 +as methods of a global \texttt{\_} object. Thus, filtering out the odd elements 1079 +of an array in stock'' Lodash would take the form 1080 +\begin{verbatim} 1081 +_.filter([1,2,3,4], function(e) { return e % 2 == 0; }); 1082 +\end{verbatim} 1083 + 1084 +We have created a wrapper library around Lodash called Nodash'' which 1085 +integrates Lodash's utility methods into the global prototypes of the datatypes 1086 +which they operate on. Thus, for example, the \texttt{filter} method, which 1087 +can be applied to any collection'' -- array, object, or string --- would 1088 +be installed on the global \texttt{Array.prototype}, \texttt{Object.prototype}, 1089 +and \texttt{String.prototype} objects, so that it is accessible simply as 1090 +\begin{verbatim} 1091 +[1,2,3,4].filter(function(e) { return e % 2 == 0; }); 1092 +\end{verbatim} 1093 +\marginnote[-1in]{Historically, monkey-patching'' the global prototypes was 1094 +considered highly unsafe, as new properties would be enumerable'' and would 1095 +thus become visible in, for example, \texttt{for-in} loops over the properties 1096 +of \emph{any} object. However, all modern Javascript implementations support 1097 +the \texttt{defineProperty} method, which allows the creation of 1098 +\emph{non-enumerate} properties on objects which do not have this problem. 1099 +Nodash uses \texttt{defineProperty} to safely add Lodash's methods to the 1100 +global prototypes.} 1101 +Note that in Nodash the global \texttt{\_} object is still available, for 1102 +those methods which do not fit with any of the global prototypes. 1103 + 1104 +The client-side implementation includes a complete parser for the 1105 +specification language. This allows user input to be fully syntax-checked 1106 +before being sent to the server, and allows for faster feedback to the user 1107 +when syntax errors occur. The parser is written using 1108 +\link{http://pegjs.org/}{PEG.js}, a parsing expression 1109 + grammar (PEG) \mcitep{Ford:2004:PEG:964001.964011} parser-generator. The 1110 +specification grammar in PEG form consists of \lnum{28} non-terminals and 1111 +\lnum{53} terminal tokens. Of note is the fact that the grammar uses the 1112 +standard 1113 +Unicode character classes in its definition of identifier'' and operator''; 1114 +an operator, in particular, is defined as any sequence of characters from the 1115 +symbolic or punctuation classes\footnote{Classes Sc, Sm, Sk, So, Pc, Pd, and 1116 +Po.}. This allows for traditional mathematical operators such as \in or 1117 +\leq to be used directly in specifications. 1118 + 1119 +Other modules of note in the client-side implementation include: 1004 1120 \begin{itemize} 1005 -\item jQuery 1121 +\item \texttt{core} --- contains a small 1122 + amount of compatibility code that allows the other non-user-interface 1123 + modules to operate transparently in either the browser or Node. In 1124 + particular, it provides a browser implementation of the \texttt{require()} 1125 + function in Node, used to load modules. In the browser, these modules must 1126 + have already been loaded via standard \texttt{<script>} tags (i.e., dynamic 1127 + loading is not provided), but, once loaded, they are installed into a 1128 + global module repository; \texttt{require} then simply returns a reference 1129 + to the appropriate module. This allows Javascript code to transparently 1130 + access other modules, without knowing whether they are being dynamically 1131 + loaded within Node, or have already been loaded in the browser. 1006 1132 1007 -\item Lo-Dash 1133 +\item \texttt{terms} --- contains a complete representation of \emph{terms}, 1134 + the fundamental datatype of logic programming. The \texttt{terms} module 1135 + supports walking the structure of terms, converting Prolog-style 1136 + term-lists to Javascript Arrays and vice versa, rendering terms to either 1137 + specification or Prolog-compatible strings, and enumerating the variables 1138 + in a term. 1008 1139 1009 - No-dash, our Lo-dash wrapper 1140 +\item \texttt{unify} --- contains a fully-functional implementation of the 1141 + Robinson unification algorithm \mcitep{robinson1965machine}. Although 1142 + this module is exhaustively tested, it is currently minimally used. In the 1143 + future, we hope to build a pattern-matching utility library on it, to allow 1144 + for more natural examination and manipulation of term-structures on the 1145 + client-side. 1010 1146 1011 -\item jsCheck 1012 - 1013 -\item PEG.js 1014 - 1015 -\item qunit 1147 +\item \texttt{term\_render} --- contains the rendering engine for converting 1148 + terms, rules, and derivations to HTML structures. Note that derivation 1149 + rendering is extensible via a number of hook'' functions, which allow 1150 + client code to extend or replace the rendering of the various 1151 + proof components: antecedents, consequents, disjunctive consequents, 1152 + and consequents which could be the target of the inductive hypothesis. 1016 1153 \end{itemize} 1017 1154 1018 -Module structure: 1019 -\begin{itemize} 1020 -\item Browser/Node compatibiltiy layer 1021 -\item terms 1022 -\item unify 1023 -\item parser + spec grammar 1024 -\item term rendering 1025 -\end{itemize} 1155 +\subsection{Server-side implementation} 1026 1156 1157 +The core of Arend's proof-checking engine is written in Prolog, and is made 1158 +available to the client via a HTTP interface. SWI-Prolog provides both a 1159 +standard HTTP server module, as well as a specialized app engine'' module 1160 +called \textsc{Pengines}, both of which are used in Arend. HTML, CSS, and 1161 +Javascript files for the client-side interface are served via the standard 1162 +HTTP server, while two PEngine applications'', \texttt{repl} and 1163 +\texttt{passist} provide the interface to the proof-checker itself. 1027 1164 1028 -\subsection{Server-based implementation (SWI-Prolog)} 1029 - 1030 -Lines of code, for the HTTP server component and the checker component, 1031 -separately. Hopefully the LoC for the checker will be significantly less than 1032 -that for the JS implementation! If we get any tests running, number of test 1033 -cases. 1034 - 1035 -(Development details? Num. of files, SCM used, total hours, etc.) 1165 +PEngines provides a transparent interface between Javascript code running 1166 +in the browser, and Prolog code running on the server. With it, our 1167 +client-side code can directly execute queries against the exported 1168 +predicates of the two aforementioned applications. The results (success, 1169 +failure, and substitution) of those queries can then be enumerated. Terms 1170 +are encoded as JSON objects (our \texttt{terms} module can decode JSON 1171 +terms into its own types). 1036 1172 1037 1173 It should be noted that the Prolog implementation of Arend is mostly portable, 1038 1174 relying only on ISO-Prolog predicates, commonly-available libraries, and a few 1039 -SWI-Prolog-specific extensions (mostly dealing with limiting the depth of search 1040 -trees). 1175 +SWI-Prolog-specific extensions and modules (most notably the HTTP server 1176 +module). It would not be difficult to port the proof-checker itself to another 1177 +Prolog system. 1041 1178 1042 -It is interesting to directly compare the two implementations of unification 1043 -(one of the few modules which is semantically similar in both 1044 -implementations). Both systems implement the Robinson unification 1045 -algorithm \mcitep{robinson1965machine}. The Javascript implementation consists of 1046 -136 lines of non-comment code and is quite difficult to follow; the Prolog 1047 -implementation requires only 69 lines, 1048 -an almost 50\% reduction! This despite the fact that the Prolog implementation 1049 -does \emph{not} use the built-in unification to simplify the algorithm at all; 1050 -it would be largely the same if only traditional assignment were used. 1051 - 1052 -Our future goal is for these two implementations to be built off the same 1053 -underlying codebase, for example, by moving the client implementation towards 1054 -a full implementation of the Warren Abstract Machine \mcitep{warren1983abstract}. 1055 -This would allow the 1056 -Prolog code currently used in the server implementation to transparently run 1057 -in a JavaScript environment, whether the browser or one of the aforementioned 1058 -offline engines. 1059 - 1060 -Module structure: 1179 +The Prolog core proof-checker implementation consists of the following modules, 1180 +the functionality of which will be described in detail in the following 1181 +sections: 1061 1182 \begin{itemize} 1062 -\item explicit substitutions 1063 -\item specification execution 1064 -\item proof checking and elaboration 1065 -\item server, Pengines, etc. 1183 +\item \texttt{subst} --- provides support for working with \emph{explicit 1184 + substitutions}. As described below, Arend cannot use Prolog's own 1185 + substitution, as the substitutions that are applied to a proof object 1186 + may differ in subtrees; Prolog applies a substitution globally. Thus, we 1187 + re-implement both variables and substitutions for our own use. 1188 + 1189 +\item \texttt{program} --- provides support for expanding atomic goals in the 1190 + specification language, and for executing goals in the specification. 1191 + The \texttt{run/3} predicate produces proof objects compatible with those 1192 + produced by the full proof checker, but is an independent implementation. 1193 + 1194 +\item \texttt{checker} --- the heart of the proof checking system, provides 1195 + routines for dealing with proof objects, elaborating incomplete proofs, 1196 + and checking and generating proof objects corresponding to particular 1197 + statements. 1066 1198 \end{itemize} 1067 1199 1068 1200 \subsection{Proof representation} 1069 1201 1070 1202 Informally, the content of a derivation is relatively simple: a tree of 1071 1203 \textsc{-Left} and/or \textsc{-Right} rule applications, drawn from the rules 1072 1204 of the reasoning logic in figure \ref{reasoning-rules}. In practice, a more 1073 -detailed representation is required, one which, in particular, necessates the 1205 +detailed representation is required, one which, in particular, necessitates the 1074 1206 use of \emph{explicit substitution}, rather than the implicit substitution that 1075 1207 would result from naive use of (for example) the system unification in Prolog. 1076 1208 1077 -For derivations purely derived from the specifcation language, explicit 1209 +For derivations purely derived from the specification language, explicit 1078 1210 substitution is not required. This is because for any valid derivation of a 1079 1211 proposition expressed in the specification language, the substitution is 1080 1212 \emph{consistent} throughout the derivation tree; it is not possible for 1081 1213 different branches'' to have different substitutions. (Disjunction may, of 1082 1214 course, result in the generation of multiple distinct derivations, each with its 1083 1215 own unifying substitution, but within any single derivation there is always a 1084 1216 single consistent substitution.) 1085 1217 1086 -However, because the reasoning language posesses rightward implication, a new 1087 -element is introduced to derivations: the list of \emph{antecedants}. Case 1088 -analysis on a disjunction antecedant is superficially similar applying the 1218 +However, because the reasoning language possesses rightward implication, a new 1219 +element is introduced to derivations: the list of \emph{antecedents}. Case 1220 +analysis on a disjunction antecedent is superficially similar applying the 1089 1221 \land-R rule, with an important distinction: the branches produced each have 1090 1222 \emph{independent} bindings. Consider, for example, case analysis on 1091 1223 \mathsf{nat}(X) in the course of a proof: 1092 1224 \[ 1093 1225 \inference[] 1094 1226 {X = z, \mathsf{nat}(z) |- \ldots & 1095 1227 X = s(X'), \mathsf{nat}(X') |- \ldots} ................................................................................ 1102 1234 inconsistent, and actually impossible to express, because Horn clauses forbid 1103 1235 such conjunctions as conclusions. Conversely, given the conclusion 1104 1236 X = z \lor X = s(x') we have 1105 1237 the choice of which unification to use, and thus there is no inconsistency.} 1106 1238 1107 1239 Because of this, each branch of a derivation must maintain its own substitution, 1108 1240 applied to its goals as they are expanded. 1241 + 1242 +Proofs are represented as a term of the form 1243 +\begin{verbatim} 1244 +proof(Goal,Ctx,Subst,Proof) 1245 +\end{verbatim} 1246 +where \texttt{Goal} is the goal to be proved, \texttt{Ctx} is the list of 1247 +antecedents (initially empty for most goals), \texttt{Subst} is the 1248 +substitution that results if the goal is proved, and \texttt{Proof} is a 1249 +proof term for this particular goal. 1250 + 1251 +The proof terms vary, depending on the structure of the goal and the contents 1252 +of the context. Non-axiomatic proof terms include one or more 1253 +sub-proofs. The possible proof terms are 1254 +\begin{itemize} 1255 +\item \texttt{induction(N,Proof)} --- proves a \forall inductively, on 1256 + the N-th element of the context. 1257 +\item \texttt{ih(Proof)} --- Proves an atomic goal by backchaining it against 1258 + the inductive hypothesis. 1259 +\item \texttt{generic(Proof)} --- Proves a \forall generically (i.e., by 1260 + substituting a unique constant for the quantified variable). 1261 +\item \texttt{instan(V,Proof)} --- Proves a \exists by giving a value for 1262 + the quantified variable. 1263 +\item \texttt{product([Proofs...])} --- Proves a conjunction by providing a 1264 + subproof for each conjunct. 1265 +\item \texttt{choice(N,Proof)} --- Proves the N-th branch of a disjunction. 1266 +\item \texttt{implies(Proof)} --- Proves an implication, by adding the 1267 + left-hand-side to the context as an assumption and then proving the 1268 + right-hand-side. 1269 +\item \texttt{ctxmem(N)} --- Implements the assumption rule''; i.e., proves 1270 + an atomic goal by unifying it with the N-th element of the context. 1271 +\item \texttt{backchain(Proof)} --- Proves an atomic goal by backchaining it; 1272 + i.e., by expanding it into its definition, the disjunction of its clauses. 1273 +\item \texttt{unify(A,B)} --- Proves a goal of the form \texttt{A = B} 1274 + (this has the side-effect of adding the substitution produced by 1275 + A = B to the \texttt{Subst} for this proof subtree). 1276 +\item \texttt{case(T,N,Keep,[Proofs...])} --- Performs case-analysis on an 1277 + element of the context. The type \texttt{T} corresponds to the 1278 + various -Left rules of the reasoning logic. 1279 +\item \texttt{trivial} --- Proves \top. 1280 +\item \texttt{hole} --- Proves any goal, but represents an incomplete proof. 1281 +\end{itemize} 1109 1282 1110 1283 \subsection{Capture avoidance} 1111 1284 1112 1285 The use of explicit substitution means that the proof-checker must also contend 1113 1286 with another troublesome problem which Prolog conceals: avoiding \emph{accidental 1114 1287 capture} when expanding a call to a goal. Consider the goal \mathsf{nat}(X). 1115 1288 We would expect this goal to succeed twice, first with the solution X = z. ................................................................................ 1139 1312 X \mapsto Y to \forall Y, P(X,Y); clearly, one of the Y's will need to 1140 1313 be renamed. Since the Y in the substitution is most likely bound somewhere 1141 1314 else in the derivation, we choose to rename the Y bound within the \forall. 1142 1315 1143 1316 \subsection{Unification} 1144 1317 1145 1318 The Robinson unification algorithm is presented, in natural deduction style, 1146 - in figure 1147 -\ref{unification-rules}. Note that the same algorithm is used, with minor 1319 + in figure~\ref{unification-rules}. 1320 + Note that the same algorithm is used, with minor 1148 1321 modification, in both the Javascript and Prolog implementations (for a direct 1149 -comparison of these two implementations, see comments below). 1322 +comparison of these two implementations, see our comments below). 1150 1323 1151 1324 \begin{figure}[h!] 1152 1325 \[ 1153 1326 \inference[Wildcard] 1154 1327 {} 1155 1328 {\_ \equiv t \mid \varepsilon} 1156 1329$
................................................................................
1184   1357       {}
1185   1358       {\mathtt{[]} \equiv \mathtt{[]} \mid \varepsilon}
1187   1360   \inference[Cons]
1188   1361       {t \equiv s \mid \theta_1  &  \hat{t} \theta_1 \equiv \hat{s} \theta_1 \mid \theta_2}
1189   1362       {(t:\hat{t}) \equiv (s:\hat{s}) \mid \theta_2}
1190   1363   \]
1364  +\caption{Rules for the Robinson unification algorithm}
1191   1365   \label{unification-rules}
1192   1366   \end{figure}
1193   1367
1194   1368   (This presentation is based in part on that in \mcitet{pfenning2006logic}.)
1369  +
1370  +It is interesting to directly compare the two implementations of unification
1371  +(one of the few modules which is semantically similar in both
1372  +implementations). Both systems implement the same algorithm. The Javascript
1373  +implementation consists of
1374  +136 lines of non-comment code and is quite difficult to follow; the Prolog
1375  +implementation requires only 69 lines,
1376  +an almost 50\% reduction! This despite the fact that the Prolog implementation
1377  +does \emph{not} use the built-in unification to simplify the algorithm at all;
1378  +it would be largely the same if only traditional assignment were used.
1195   1379
1196   1380
1197   1381   \subsection{Proof Checking}
1198   1382
1199   1383   The proof checker is implemented in approximately 450 lines of Prolog code. It
1200   1384   essentially implements the rules of the reasoning logic presented in figure
1201   1385   \ref{reasoning-rules}, by recursively checking the proof tree. That is, it
1202   1386   first checks the root node to ensure that the proof object is consistent with
1203         -the conclusion (premises and succedant) by ensuring that it has the correct
1387  +the conclusion (antecedents and consequent) by ensuring that it has the correct
1204   1388   number and type of subproofs. The subproofs are then recursively checked,
1205   1389   working through the tree until the axiomatic rules are reached (or, in the
1206   1390   case of an incomplete proof, a \texttt{hole} is found, indicating an unproved
1207   1391   subtree).
1208   1392
1209   1393   There are two complications to a straightforward top-down recursive
1210   1394   implementation:
................................................................................
1253   1437   \subsection{Proof Elaboration}
1254   1438
1255   1439   The heart of the incremental proof checking algorithm is the predicate
1256   1440   \texttt{elaborate}  (defined in file \texttt{checker.pl}). \texttt{elaborate}
1257   1441   works in conjunction with \texttt{check}, the proof checker itself. The purpose
1258   1442   of elaborate is to take an incomplete proof, together with a reference to one
1259   1443   of its leaves (i.e., to a $?$), and to elaborate'' it with respect to either
1260         -its succedant, or one of its premises. As described above, the form of the
1444  +its consequent, or one of its antecedents. As described above, the form of the
1261   1445   selected element dictates the forms of the subproofs. \texttt{elaborate}
1262   1446   fills in'' these subproofs, to a single level only (i.e., all the subproofs
1263   1447   it constructs are themselves $?$).
1264   1448
1265   1449   In a logic which did not include unification or the ability to call defined
1266   1450   predicates (atomic goals), we could elaborate any node of the tree,
1267   1451   independent of the rest of the tree.
................................................................................
1454   1638
1455   1639   \begin{figure}[h!]
1456   1640
1457   1641   All left-rules instantiate the Proof to a term of the form
1458   1642   $1459 1643 \text{case}(\text{Type},N,\text{Keep},[\text{Proofs}\ldots]) 1460 1644$
1461         -In the rules below we leave $N$ (the index in the list of antecedants of the
1645  +In the rules below we leave $N$ (the index in the list of antecedents of the
1462   1646   judgment to be cased upon) implicit and omit Keep (a Boolean flag indicating
1463         -whether the targeted antecedant should remain in the list, or be
1647  +whether the targeted antecedent should remain in the list, or be
1464   1648   removed), and write them as
1465   1649   $1466 1650 \text{Type}(\text{Proofs}\ldots) 1467 1651$
1468   1652
1469   1653   $1470 1654 \inference[False-Left] ................................................................................ 1553 1737 it has the form P \land Q, then the output proof will be elaborated to 1554 1738 \[ 1555 1739 \proof{P \land Q}{\Gamma}{\text{product}([ 1556 1740 \proof{P}{\Gamma}{\text{hole}}, 1557 1741 \proof{Q}{\Gamma}{\text{hole}} 1558 1742 ])} 1559 1743$
1560         -The subproofs of the product will have the correct succedants and premises,
1744  +The subproofs of the product will have the correct consequents and antecedents,
1561   1745   but their own proofs will be empty, ready for further elaboration.
1562   1746
1563   1747
1564   1748   \subsection{Inductive reasoning}
1565   1749
1566   1750   Arend's reasoning logic supports proofs of a universal quantification both
1567   1751   generically and by \emph{induction}. Arend's induction is technically on the
1568   1752   height of derivations, although from the perspective of the user it supports
1569   1753   full structural induction on terms; this subsumes the usual natural number
1570   1754   induction.
1571   1755
1572   1756   Arend supports only single-induction proofs (i.e., induction on
1573         -multiple premisses is not allowed) and supports only induction global to a
1757  +multiple antecedents is not allowed) and supports only induction global to a
1574   1758   proof (i.e., nested
1575   1759   inductions are not allowed, although they can be faked'' by using lemmas).
1576   1760   These restrictions imply that the induction hypothesis can be regarded as
1577   1761   being global to a proof, thus eliminating the need to restrict the scope
1578   1762   of difference induction hypotheses to different branches of the proof tree.
1579   1763
1580   1764   Internally, induction is implemented by \emph{goal tagging}. When an inductive
1581         -proof is declared, a particular goal in the antecedants is selected, by the
1765  +proof is declared, a particular goal in the antecedents is selected, by the
1582   1766   user, to be the target of the induction.\marginnote{For example, in a proof of
1583   1767   $\mathsf{nat}(X) |- \mathsf{add}(X,0,X)$ we would induct on $\mathsf{nat}(X)$.}
1584   1768   This goal must be a user goal; it cannot be a built-in operator such as
1585   1769   conjunction, disjunction, or unification. The functor of the goal is internally
1586   1770   flagged as being big'' (indicated as $\uparrow$) and the induction
1587   1771   hypothesis is defined in terms of the same goal, but flagged as
1588         -small'' ($\downarrow$). For example, to prove that $\mathsf{nat}(X) -> \mathsf{add}(X,0,X)$ our proof would proceed by induction on $\mathsf{nat}(X)$, and
1772  +small'' ($\downarrow$). For example, to prove that
1773  +$\mathsf{nat}(X) -> \mathsf{add}(X,0,X)$ our proof would proceed
1774  +by induction on $\mathsf{nat}(X)$, and
1589   1775   thus we would have the inductive hypothesis
1590   1776   $\mathsf{nat}^\downarrow(X) -> \mathsf{add}(X,0,X)$.
1591   1777
1592   1778   When a $\uparrow$ goal is expanded by case analysis or backchaining,
1593   1779    any sub-goals in its expansion
1594   1780   are flagged as $\downarrow$, indicating that they are smaller'' than the original
1595   1781   goal. The induction hypothesis can only be applied to goals which are small'',
................................................................................
1673   1859   being a flaw in our system, this surprising result reflects the fact that one
1674   1860   computational interpretation of $\bot$ is that of a non-terminating computation.
1675   1861   In fact, this result is simply the computational analogue of the well-known
1676   1862   \emph{ex falso quodlibet} principle of classical logic.
1677   1863
1678   1864   \clearpage
1679   1865   \section{Future work}
1866  +
1867  +There are many areas in which Arend could be enhanced and extended. We
1868  +examine ten possibilities here, including one, equational reasoning, in some
1869  +depth.
1680   1870
1681   1871   \begin{itemize}
1682   1872   \item Arend's specification logic is simple enough that formalization of
1683   1873       its properties should not be difficult, nonetheless, for formal correctness
1684   1874       and consistency
1685   1875       work needs to be done to show that it is both \emph{sound} (everything it
1686   1876       proves true is true) and \emph{non-deterministically complete} (when it
................................................................................
1758   1948       conclusion. Since one of the difficulties of the derivation-tree proof
1759   1949       format is its often-extravagant use of horizontal space, it should be
1760   1950       possible to automatically \emph{extract} a sub-proof as a lemma. This
1761   1951       could be done even when the lemma is not generally useful, but simply
1762   1952       as a way of naming'' a portion of the proof or commenting on its
1763   1953       purpose.
1764   1954
1765         -\item The implementaiton of Arend is primitively minimal, suitable only for
1955  +\item The implementation of Arend is primitively minimal, suitable only for
1766   1956       small-scale usage. Real-world usage would require integration with existing
1767   1957       grading systems and automatic checks of student-submitted proofs. On a higher
1768   1958       level we could easily imagine a set of features aimed at
1769   1959       creating an ecosystem around student-created specifications, proofs, and
1770   1960       lemmas, perhaps with collaboration features for multi-student use.
1771   1961
1772   1962   \item Although we believe Arend's derivation-tree presentation of proofs has
................................................................................
1801   1991   \subsection{Equational reasoning}
1802   1992
1803   1993   One very large extension to the logic which we would like to investigate would
1804   1994   be the incorporation of \emph{equational reasoning}. Arend's current concept
1805   1995   of equality is based entirely on unification: two terms are equal if they can
1806   1996   be made equal by some substitution. Equational reasoning allows equality to
1807   1997   be defined via rules, for example \marginnote{For computational purposes equational rules are often regarded as
1808         -unidirectional \emph{rewite rules}: $a = b$ becomes $a \mapsto b$.}
1998  +unidirectional \emph{rewrite rules}: $a = b$ becomes $a \mapsto b$.}
1809   1999
1810   2000   $1811 2001 \inference[=-Refl] 1812 2002 {} 1813 2003 {x = x} 1814 2004 \qquad 1815 2005 \inference[=-Symm] ................................................................................ 1830 2020 \inference[\times-dist-+] 1831 2021 {} 1832 2022 {a \times (b + c) = a \times b + a \times c} 1833 2023$
1834   2024
1835   2025   Equational reasoning is a particularly powerful mechanism for reasoning about
1836   2026   \emph{functional programs} and programming languages; we can state
1837         -equivalencies between expressions and
2027  +equivalences between expressions and
1838   2028   then prove that two programs, or two classes of programs, are equivalent under
1839   2029   those rules. Even more powerfully, if we wish to prove some property of an
1840         -entire language, it is sufficient to show that the equivalencies preserve that
2030  +entire language, it is sufficient to show that the equivalences preserve that
1841   2031   property.
1842   2032
1843   2033   Although the additional of equational reasoning to Arend would give the system
1844   2034   a marked increase in logical power, it would also correspondingly increase the
1845   2035   complexity of its implementation and presentation. Equivalence rules require
1846   2036   a different treatment of terms and variables from unification, and it is not
1847   2037   entirely clear how the two will interact, logically. Equivalences over programs
................................................................................
1850   2040   logic variables, but can both contain, and be contained by, them. Although
1851   2041   logics exist which provide support for reasoning about binding structures
1852   2042   \mcitep{miller05tocl} their usage in proofs involves additional complexity that
1853   2043   may be at odd with our goal of use in \emph{introductory} curriculum.
1854   2044
1855   2045   \clearpage
1856   2046   \section{Appendix I: Syntax  of the Specification Logic}
1857         -\label{spec-syntax}
2047  +\label{sec:spec-syntax}
1858   2048
1859   2049   \newcommand{\nterm}[1]{\mathop{\langle\text{#1}\rangle}}
1860   2050   \newcommand{\sterm}[1]{\mathop{\text{\emph{#1}}}}
1861   2051   \newcommand{\term}[1]{\mathop{\text{\texttt{#1''}}}}
1862   2052
1863   2053   \newthought{The following is a simplified presentation} of the grammar of the
1864   2054   specification
1865   2055   language. In particular, precedence rules for infix operators have been
1866   2056   omitted, but are consistent with normal usage. Note that while the grammar
1867   2057   includes rules defining infix arithmetic and comparison operators, these
1868   2058   operators have no \emph{semantic} significance within the specification logic.
1869   2059   They are present under the assumption that specifications may want to use them,
1870         -and will expect them to have their normal precendence ordering.
2060  +and will expect them to have their normal precedence ordering.
1871   2061
1872   2062   \begin{align*}
1873   2063   \nterm{start} &<- \nterm{definitions}? \\
1874   2064   \\
1875   2065   \nterm{definitions} &<- \nterm{definition} \nterm{definitions}* \\
1876   2066   \nterm{definition}  &<- \sterm{Rulename}? \nterm{predicate} \\
1877   2067                       &\mid \nterm{infix-decl} \\
................................................................................
1900   2090                   &\mid \term{(} \nterm{term} \term{)} \\
1901   2091                   &\mid \sterm{Atom} \\
1902   2092                   &\mid \sterm{Variable} \\
1903   2093   \\
1904   2094   \nterm{compound} &<- \sterm{Atom} \term{(} \nterm{termlist} \term{)} \\
1905   2095   \end{align*}
1906   2096
1907         -\subsection{Operational Semantics}\label{spec-execute}
2097  +%\subsection{Operational Semantics}\label{spec-execute}
1908   2098
1909         -This may or may not get written. If it does, it will be based on the
1910         -stack-based semantics of \mcitet{pfenning2006logic}.
2099  +%This may or may not get written. If it does, it will be based on the
2100  +%stack-based semantics of \mcitet{pfenning2006logic}.
1911   2101
1912   2102   \clearpage
1913   2103   \section[Appendix II: Specification for N, lists]{Appendix II: Specification for $\mathbb{N}$, lists}
1914   2104   \label{sec:nat-spec}
1915   2105
1916   2106   \newthought{The following is an example specification} included with Arend,
1917   2107   one which