chapter\<open>Outer syntax --- the theory language \label{ch:outer-syntax}\<close>
text\<open>
The rather generic framework of Isabelle/Isar syntax emerges from three main
syntactic categories: \<^emph>\<open>commands\<close> of the top-level Isar engine (covering theoryandproof elements), \<^emph>\<open>methods\<close> for general goal refinements
(analogous to traditional ``tactics''), and\<^emph>\<open>attributes\<close> for operations on
facts (within a certain context). Subsequently we give a reference of basic
syntactic entities underlying Isabelle/Isar syntaxin a bottom-up manner.
Concrete theoryandproof language elements will be introduced later on.
\<^medskip> In order to get started with writing well-formed Isabelle/Isar documents,
the most important aspect to be noted is the difference of \<^emph>\<open>inner\<close> versus \<^emph>\<open>outer\<close> syntax. Inner syntax is that of Isabelle types and terms of the
logic, while outer syntaxis that of Isabelle/Isar theory sources
(specifications and proofs). As a general rule, inner syntax entities may
occur only as \<^emph>\<open>atomic entities\<close> within outer syntax. For example, the
string \<^verbatim>\<open>"x + y"\<close> and identifier \<^verbatim>\<open>z\<close> are legal term specifications within a theory, while \<^verbatim>\<open>x + y\<close> without quotes is not.
Printed theory documents usually omit quotes to gain readability (this is a
matter of {\LaTeX} macro setup, say via \<^verbatim>\<open>\isabellestyle\<close>, see also \<^cite>\<open>"isabelle-system"\<close>). Experienced users of Isabelle/Isar may easily
reconstruct the lost technical information, while mere readers need not care
about quotes at all. \<close>
text\<open>
The outer lexical syntax consists of three main categories of syntax tokens:
\<^enum> \<^emph>\<open>major keywords\<close> --- the command names that are available in the present logic session;
\<^enum> \<^emph>\<open>minor keywords\<close> --- additional literal tokens required by the syntax of commands;
\<^enum> \<^emph>\<open>named tokens\<close> --- various categories of identifiers etc.
Major keywords and minor keywords are guaranteed to be disjoint. This helps
user-interfaces to determine the overall structure of a theorytext, without
knowing the full details of command syntax. Internally, there is some
additional information about the kind of major keywords, which approximates
the command type (theory command, proof command etc.).
Keywords override named tokens. For example, the presence of a command
called \<^verbatim>\<open>term\<close> inhibits the identifier \<^verbatim>\<open>term\<close>, but the string \<^verbatim>\<open>"term"\<close> can
be used instead. By convention, the outer syntax always allows quoted
strings in addition to identifiers, wherever a named entity is expected.
When tokenizing a given input sequence, the lexer repeatedly takes the
longest prefix of the input that forms a valid token. Spaces, tabs, newlines and formfeeds between tokens serve as explicit separators.
\<^medskip>
The categories for named tokens are defined once andfor all as follows.
A @{syntax_ref term_var} or @{syntax_ref type_var} describes an unknown,
which is internally a pair of base name and index (ML type \<^ML_type>\<open>indexname\<close>). These components are either separated by a dot as in \<open>?x.1\<close> or \<open>?x7.3\<close> or run together as in \<open>?x1\<close>. The latter form is possible if the base
name does not endwith digits. If the index is 0, it may be dropped
altogether: \<open>?x\<close> and \<open>?x0\<close> and \<open>?x.0\<close> all refer to the same unknown, with
basename \<open>x\<close> and index 0.
The syntax of @{syntax_ref string} admits any characters, including
newlines; ``\<^verbatim>\<open>"\<close>'' (double-quote) and ``\<^verbatim>\<open>\\<close>'' (backslash) need to be
escaped by a backslash; arbitrary character codes may be specified as
``\<^verbatim>\<open>\\<close>\<open>ddd\<close>'', with three decimal digits. Alternative strings according to
@{syntax_ref altstring} are analogous, using single back-quotes instead.
The body of @{syntax_ref verbatim} may consist of any text not containing
``\<^verbatim>\<open>*}\<close>''; this allows to include quotes without further escapes, but there is no way to escape ``\<^verbatim>\<open>*}\<close>''. Cartouches do not have this limitation.
A @{syntax_ref cartouche} consists of arbitrary text, with properly balanced
blocks of ``@{verbatim "\"}~\\\~@{verbatim "\"}''. Note that the rendering
of cartouche delimiters is usually like this: ``\<open>\<open> \<dots> \<close>\<close>''.
Source comments take the form \<^verbatim>\<open>(*\<close>~\<open>\<dots>\<close>~\<^verbatim>\<open>*)\<close> and may be nested: the text is
removed after lexical analysis of the input andthus not suitable for
documentation. The Isar syntaxalso provides proper \<^emph>\<open>document comments\<close>
that are considered as part of the text (see \secref{sec:comments}).
Common mathematical symbols such as \<open>\<forall>\<close> are represented in Isabelle as \<^verbatim>\<open>\<forall>\<close>.
There are infinitely many Isabelle symbols like this, although proper
presentation is left to front-end tools such as {\LaTeX} or Isabelle/jEdit.
A list of predefined Isabelle symbols that work well with these tools is
given in\appref{app:symbols}. Note that \<^verbatim>\<open>\<lambda>\<close> does not belong to the \<open>letter\<close> category, since it is already used differently in the Pure term
language. \<close>
section \<open>Common syntax entities\<close>
text\<open>
We now introduce several basic syntactic entities, such as names, terms, and theorem specifications, which are factored out of the actual Isar language
elements to be described later. \<close>
subsection \<open>Names\<close>
text\<open>
Entity @{syntax name} usually refers to any name of types, constants, theorems etc.\ Quoted strings provide an escape for non-identifier names or
those ruled out by outer syntax keywords (e.g.\ quoted \<^verbatim>\<open>"let"\<close>).
A @{syntax_def system_name} is like @{syntax name}, but it excludes
white-space characters and needs to conform to file-name notation. Name
components that are special on Windows (e.g.\ \<^verbatim>\<open>CON\<close>, \<^verbatim>\<open>PRN\<close>, \<^verbatim>\<open>AUX\<close>) are
excluded on all platforms. \<close>
subsection \<open>Numbers\<close>
text\<open>
The outer lexical syntax (\secref{sec:outer-lex}) admits natural numbers and
floating point numbers. These are combined as @{syntax int} and @{syntax
real} as follows.
Note that there is an overlap with the category @{syntax name}, which also includes @{syntax nat}. \<close>
subsection \<open>Embedded content\<close>
text\<open>
Entity @{syntax embedded} refers to content of other languages: cartouches
allow arbitrary nesting of sub-languages that respect the recursive
balancing of cartouche delimiters. Quoted strings are possible as well, but
require escaped quotes when nested. As a shortcut, tokens that appear as
plain identifiers in the outer language may be used as inner language
content without delimiters.
text\<open>
A chunk of document @{syntaxtext} is usually given as @{syntax cartouche} \<open>\<open>\<dots>\<close>\<close>. For convenience, any of the smaller text unit that conforms to
@{syntax name} is admitted as well.
Typical uses are document markup commands, like \<^theory_text>\<open>chapter\<close>, \<^theory_text>\<open>section\<close> etc.
(\secref{sec:markup}). \<close>
text\<open>
Formal comments are an integral part of the document, but are logically void and removed from the resulting theory or term content. The output of
document preparation (\chref{ch:document-prep}) supports various styles,
according to the following kinds of comments.
\<^item> Marginal comment of the form \<^verbatim>\<open>\<comment>\<close>~\<open>\<open>text\<close>\<close> or \<open>\<comment>\<close>~\<open>\<open>text\<close>\<close>, usually with
a single space between the comment symbol and the argument cartouche. The
given argument is typeset as regular text, with formal antiquotations
(\secref{sec:antiq}).
\<^item> Canceled text of the form \<^verbatim>\<open>\<^cancel>\<close>\<open>\<open>text\<close>\<close> (no white space between the
control symbol and the argument cartouche). The argument is typeset as
formal Isabelle source and overlaid with a ``strike-through'' pattern,
e.g. \<^theory_text>\<open>\<^cancel>\<open>bad\<close>\<close>.
\<^item> Raw {\LaTeX} source of the form \<^verbatim>\<open>\<^latex>\<close>\<open>\<open>argument\<close>\<close> (no white space
between the control symbol and the argument cartouche). This allows to
augment the generated {\TeX} source arbitrarily, without any sanity
checks!
These formal comments work uniformly in outer syntax, inner syntax (term
language), Isabelle/ML, and some other embedded languages of Isabelle. \<close>
subsection \<open>Type classes, sorts and arities\<close>
text\<open> Classes are specified by plain names. Sorts have a very simple inner syntax,
which is either a single class name \<open>c\<close> or a list \<open>{c\<^sub>1, \<dots>, c\<^sub>n}\<close> referring to the intersection of these classes. The syntax of type aritiesis given
directly at the outer level.
subsection \<open>Types and terms \label{sec:types-terms}\<close>
text\<open>
The actual inner Isabelle syntax, that of typesand terms of the logic, is
far too sophisticated in order to be modelled explicitly at the outer theory
level. Basically, any such entity has to be quoted to turn it into a single
token (the parsing and type-checking is performed internally later). For
convenience, a slightly more liberal convention is adopted: quotes may be
omitted for any type or term that is already atomic at the outer level. For
example, one may just write \<^verbatim>\<open>x\<close> instead of quoted \<^verbatim>\<open>"x"\<close>. Note that
symbolic identifiers (e.g.\ \<^verbatim>\<open>++\<close> or \<open>\<forall>\<close> are available as well, provided
these have not been superseded by commands or other keywords already (such
as \<^verbatim>\<open>=\<close> or \<^verbatim>\<open>+\<close>).
Named instantiations are specified as pairs of assignments \<open>v = t\<close>, which
refer to schematic variables in some theorem that is instantiated. Both type and terms instantiations are admitted, and distinguished by the usual syntax
of variable names.
Type declarations and definitions usually refer to @{syntax typespec} on the
left-hand side. This models basic type constructor application at the outer syntax level. Note that only plain postfix notationis available here, but
no infixes.
subsection \<open>Term patterns and declarations \label{sec:term-decls}\<close>
text\<open>
Wherever explicit propositions (or term fragments) occur in a prooftext,
casual binding of schematic term variables may be given specified via
patterns of the form ``\<^theory_text>\<open>(is p\<^sub>1 \<dots> p\<^sub>n)\<close>''. This works both for @{syntax term} and @{syntaxprop}.
\<^medskip>
Declarations of local variables \<open>x :: \<tau>\<close> and logical propositions \<open>a : \<phi>\<close>
represent different views on the same principle of introducing a local
scope. In practice, one may usually omit the typing of @{syntax vars} (due to type-inference), and the naming of propositions (due to implicit
references of current facts). In any case, Isar proof elements usually admit to introduce multiple such items simultaneously.
The treatment of multiple declarations corresponds to the complementary
focus of @{syntax vars} versus @{syntax props}. In ``\<open>x\<^sub>1 \<dots> x\<^sub>n :: \<tau>\<close>'' the
typing refers to all variables, while in\<open>a: \<phi>\<^sub>1 \<dots> \<phi>\<^sub>n\<close> the naming refers to
all propositions collectively. Isar language elements that refer to @{syntax
vars} or @{syntax props} typically admit separate typings or namings via
another level of iteration, with explicit @{keyword_ref "and"} separators;
e.g.\ see @{command "fix"} and @{command "assume"} in \secref{sec:proof-context}. \<close>
subsection \<open>Attributes and theorems \label{sec:syn-att}\<close>
text\<open>
Attributes have their own ``semi-inner''syntax, in the sense that input
conforming to @{syntax args} below is parsed by the attribute a second time.
The attribute argument specifications may be any sequence of atomic entities
(identifiers, strings etc.), or properly bracketed argument lists. Below
@{syntax atom} refers to any atomic entity, including any @{syntax keyword}
conforming to @{syntax sym_ident}.
Theorem specifications come in several flavors: @{syntax axmdecl} and
@{syntax thmdecl} usually refer toaxioms, assumptions or results of goal
statements, while @{syntax thmdef} collects lists of existing theorems.
Existing theorems are given by @{syntaxthm} and @{syntax thms}, the
former requires an actual singleton result.
There are three forms of theorem references:
\<^enum> named facts \<open>a\<close>,
\<^enum> selections from named facts \<open>a(i)\<close> or \<open>a(j - k)\<close>,
\<^enum> literal fact propositions using token syntax @{syntax_ref altstring} \<^verbatim>\<open>`\<close>\<open>\<phi>\<close>\<^verbatim>\<open>`\<close> or @{syntax_ref cartouche} \<open>\<open>\<phi>\<close>\<close> (see also method @{method_ref fact}).
Any kind of theoremspecification may include lists of attributes both on
the left and right hand sides; attributes are applied to any immediately
preceding fact. If names are omitted, the theorems are not stored within the theorem database of the theory or proofcontext, but any given attributes
are applied nonetheless.
An extra pair of brackets around attributes (like ``\<open>[[simproc a]]\<close>'')
abbreviates a theorem reference involving an internal dummy fact, which will
be ignored later on. So only the effect of the attribute on the background context will persist. This form of in-place declarations is particularly
useful with commands like @{command "declare"} and @{command "using"}.
text\<open>
Structured specifications use propositions with explicit notationfor the
``eigen-context''to describe rule structure: \<open>\<And>x. A x \<Longrightarrow> \<dots> \<Longrightarrow> B x\<close> is
expressed as \<^theory_text>\<open>B x if A x and \<dots> for x\<close>. It is also possible to use dummy
terms ``\<open>_\<close>'' (underscore) to refer to locally fixed variables anonymously.
Multiple specifications are delimited by ``\<open>|\<close>'' to emphasize separate
cases: each with its own scope of inferred typesfor free variables.
These commands print certain parts of the theoryandproofcontext. Note
that there are some further ones available, such as for the set of rules
declared for simplifications.
\<^descr> @{command "print_theory"} prints the main logical content of the
background theory; the ``\<open>!\<close>'' option indicates extra verbosity.
\<^descr> @{command "print_definitions"} prints dependencies of definitional
specifications within the background theory, which may be constants
(\secref{sec:term-definitions}, \secref{sec:overloading}) or types
(\secref{sec:types-pure}, \secref{sec:hol-typedef}); the ``\<open>!\<close>'' option
indicates extra verbosity.
\<^descr> @{command "print_methods"} prints all proof methods available in the
current theorycontext; the ``\<open>!\<close>'' option indicates extra verbosity.
\<^descr> @{command "print_attributes"} prints all attributes available in the
current theorycontext; the ``\<open>!\<close>'' option indicates extra verbosity.
\<^descr> @{command "print_theorems"} prints theorems of the background theory
resulting from the last command; the ``\<open>!\<close>'' option indicates extra
verbosity.
\<^descr> @{command "print_facts"} prints all local facts of the current context,
both named and unnamed ones; the ``\<open>!\<close>'' option indicates extra verbosity.
\<^descr> @{command "print_term_bindings"} prints all term bindings that are present in the context.
\<^descr> @{command "find_theorems"}~\<open>criteria\<close> retrieves facts from the theory or proofcontext matching all of given search criteria. The criterion \<open>name: p\<close>
selects all theorems whose fully qualified name matches pattern \<open>p\<close>, which
may contain ``\<open>*\<close>'' wildcards. The criteria \<open>intro\<close>, \<open>elim\<close>, and \<open>dest\<close>
select theorems that match the current goal as introduction, elimination or
destruction rules, respectively. The criterion \<open>solves\<close> returns all rules
that would directly solve the current goal. The criterion \<open>simp: t\<close> selects
all rewrite rules whose left-hand side matches the given term. The criterion term\<open>t\<close> selects all theorems that contain the pattern \<open>t\<close> -- as usual,
patterns may contain occurrences of the dummy ``\<open>_\<close>'', schematic variables, and type constraints.
Criteria can be preceded by ``\<open>-\<close>'' to select theorems that do \<^emph>\<open>not\<close> match. Note that giving the empty list of criteria yields \<^emph>\<open>all\<close> currently known
facts. An optional limit for the number of printed facts may be given; the
default is 40. By default, duplicates are removed from the search result. Use\<open>with_dups\<close> to display duplicates.
\<^descr> @{command "find_consts"}~\<open>criteria\<close> prints all constants whose type meets
all of the given criteria. The criterion \<open>strict: ty\<close> is met by any type
that matches the type pattern \<open>ty\<close>. Patterns may contain both the dummy type
``\<open>_\<close>'' and sort constraints. The criterion \<open>ty\<close> is similar, but it also
matches against subtypes. The criterion \<open>name: p\<close> and the prefix ``\<open>-\<close>'' function as described for @{command "find_theorems"}.
\<^descr> @{command "thm_deps"}~\<open>thms\<close> prints immediate theorem dependencies, i.e.\
the union of all theorems that are used directly to prove the argument
facts, without going deeper into the dependency graph.
\<^descr> @{command "unused_thms"}~\<open>A\<^sub>1 \<dots> A\<^sub>m - B\<^sub>1 \<dots> B\<^sub>n\<close> displays all theorems
that are proved in theories \<open>B\<^sub>1 \<dots> B\<^sub>n\<close> or their parents but not in \<open>A\<^sub>1 \<dots>
A\<^sub>m\<close> or their parents and that are never used. If \<open>n\<close> is \<open>0\<close>, the end of the
range of theories defaults to the current theory. If no range is specified,
only the unused theoremsin the current theory are displayed. \<close>
Die Informationen auf dieser Webseite wurden
nach bestem Wissen sorgfältig zusammengestellt. Es wird jedoch weder Vollständigkeit, noch Richtigkeit,
noch Qualität der bereit gestellten Informationen zugesichert.
Bemerkung:
Die farbliche Syntaxdarstellung ist noch experimentell.