text‹ Isabelle/ML is best understood as a certain culture based on Standard ML. Thus it is not a new programming language, but a certain way to use SML at an advanced level within the Isabelle environment. This covers a variety of aspects that are geared towards an efficient and robust platform for applications of formal logic with fully foundational proof construction --- according to the well-known 🪙‹LCF principle›.
infrastructure with library modules to address the needs of this difficult
task. For example, the raw parallel programming model of Poly/ML is
presented as considerably more abstract concept of 🪙‹futures›, which isthen
used to augment the inference kernel, Isar theoryandproof interpreter, and
PIDE document management.
The main aspects of Isabelle/ML are introduced below. These first-hand
explanations should helpto understand how proper Isabelle/ML isto be read and written, andto get access to the wealth of experience that is expressed in the source textand its history of changes.🪙‹See 🪙‹https://isabelle.in.tum.de/repos/isabelle›for the full Mercurial history. There are symbolic tags to refer to official Isabelle releases, as opposed to arbitrary 🪙‹tip›versions that merely reflect snapshots that are never really up-to-date.› ›
section‹Style and orthography›
text‹ The sources of Isabelle/Isar are optimized for 🪙‹readability› a 🪙‹maintainability›. The main purpose isto tell an informed reader what is
really going on and how things really work. This is a non-trivial aim, but
it is supported by a certain style of writing Isabelle/ML that has emerged from long years of system development.🪙‹See also the interesting style guide for OCaml 🪙‹https://caml.inria.fr/resources/doc/guides/guidelines.en.html› which shares many of our means and ends.›
The main principle behind any coding style is🪙‹consistency›. For a single
author of a small program this merely means ``choose your style and stick to
it''. A complex project like Isabelle, with long years of development and
different contributors, requires more standardization. A coding style that is changed every few years or with every new contributor is no style at all,
because consistency is quickly lost. Global consistency is hard to achieve,
though. Nonetheless, one should always strive at least forlocal consistency
of modules and sub-systems, without deviating from some general principles
how to write Isabelle/ML.
In a sense, good coding style is like an 🪙‹orthography›for the sources: it
helps to read quickly over the textand see through the main points, without
getting distracted by accidental presentation of free-style code. ›
subsection‹Header and sectioning›
text‹ Isabelle source files have a certain standardized header format (with precise spacing) that follows ancient traditions reaching back to the earliest versions of the system by Larry Paulson. See 🍋‹~~/src/Pure/thm.ML›,
The headerincludes at least 🍋‹Title›and🍋‹Author› entries, followed by a
prose description of the purpose of the module. The latter can range from a
single line to several paragraphs of explanations.
The rest of the fileis divided into chapters, sections, subsections,
subsubsections, paragraphs etc.\ using a simple layout via ML comments as
follows.
As in regular typography, there is some extra space 🪙‹before›section
headings that are adjacent to plain text, but not other headings as in the
example above.
🪙
The precise wording of the prose text given in these headings is chosen
carefully to introduce the main theme of the subsequent formal ML text. ›
subsection‹Naming conventions›
text‹ Since ML is the primary medium to express the meaning of the source text, naming of ML entities requires special care. ›
paragraph ‹Notation.› text‹ A name consists of 1--3 🪙‹words› ( by underscore. There are three variants concerning upper or lower case
letters, which are used for certain ML categories as follows:
🪙 \begin{tabular}{lll}
variant & example & ML categories \\hline
lower-case & 🪙‹foo_bar› & values, types, record fields java.lang.NullPointerException
capitalized & 🪙‹Foo_Bar› & datatype constructors, structures, functors java.lang.NullPointerException
upper-case & 🪙‹FOO_BAR› & special values, exception constructors, signatures java.lang.NullPointerException \end{tabular} 🪙
For historical reasons, many capitalized names omit underscores, e.g.java.lang.NullPointerException
old-style 🪙‹FooBar› instead of 🪙‹Foo_Bar›. Genuine
mixed-case names are 🪙‹not› used, because clear division of words is
essential for readability.🪙‹Camel-case was invented to workaround the lack of underscore in some early non-ASCII character sets. Later it became habitual in some language communities that are now strong in numbers.›
A single (capital) character does not count as ``word'' in this respect:
some Isabelle/ML names are suffixed by extra markers like this: 🪙‹foo_barT›.
Name variants are produced by adding 1--3 primes, e.g.\ 🪙‹foo'›, 🪙‹foo''›, or 🪙‹foo'''›, but not 🪙‹foo''''› or more.
Decimal digits scale better to larger numbers, e.g.\ 🪙‹foo0›, 🪙‹foo1›, 🪙‹foo42›. ›
paragraph ‹Scopes.› text‹ Apart from very basic library modules, ML structures are not ``opened'', but names are referenced with explicit qualification, as in 🪙‹Syntax.string_of_term› f
structures and their components it is important to aim at eye-catching
compositions of both parts, because this is how they are seen in the sources and documentation. For the same reasons, aliases of well-known library
functions should be avoided.
Local names of function abstraction or case/let bindings are typically
shorter, sometimes using only rudiments of ``words'', while still avoiding
cryptic shorthands. An auxiliary function called 🪙‹helper›, 🪙‹aux›, or 🪙‹f›is considered bad style.
Example:
@{verbatim [display] ‹ (* RIGHT *) fun print_foo ctxt foo = let fun print t = ... Syntax.string_of_term ctxt t ... in ... end; (* RIGHT *)
fun print_foo ctxt foo = let
val string_of_term = Syntax.string_of_term ctxt; fun print t = ... string_of_term t ... in ... end;
(* WRONG *)
val string_of_term = Syntax.string_of_term;
fun print_foo ctxt foo = let fun aux t = ... string_of_term ctxt t ... in ... end;›} ›
paragraph ‹Specific conventions.› text‹ Here are some specific name forms that occur frequently in the sources. 🪙 A function that maps 🪙‹foo› t 🪙‹foo_to_bar› or 🪙‹bar_of_foo› (never 🪙‹foo2bar›, nor 🪙‹bar_from_foo›, nor 🪙‹bar_for_foo›, nor 🪙‹bar4foo›).
🪙 The name component 🪙‹legacy› means that the operation is about to
be discontinued soon.
🪙 The name component 🪙‹global› means that this works with the
background theory instead of the regular localcontext
(\secref{sec:context}), sometimes for historical reasons, sometimes due a
genuine lack of locality of the concept involved, sometimes as a fall-back for the lack of a proper contextin the application code. Whenever there is
a non-global variant available, the application should be migrated touse it with a proper localcontext.
🪙 Variables of the main contexttypes of the Isabelle/Isar framework
(\secref{sec:context} and\chref{ch:local-theory}) have firm naming
conventions as follows:
🪙 theories are called 🪙‹thy›, rarely 🪙‹theory›
(never 🪙‹thry›)
🪙proof contexts are called 🪙‹ctxt›, rarely 🪙‹context›
(never 🪙‹ctx›)
🪙 generic contexts are called 🪙‹context›
🪙local theories are called 🪙‹lthy›, except forlocal
theories that are treated as proofcontext (which is a semantic
super-type)
Variations with primed or decimal numbers are always possible, as well as
semantic prefixes like 🪙‹foo_thy› or 🪙‹bar_ctxt›, but the
base conventions above need to be preserved. This allows to emphasize their
data flow via plain regular expressions in the text editor.
🪙 The main logical entities (\secref{ch:logic}) have established naming
convention as follows:
🪙 sorts are called 🪙‹S›
🪙types are called 🪙‹T›, 🪙‹U›, or 🪙‹ty› (never 🪙‹t›)
🪙 terms are called 🪙‹t›, 🪙‹u›, or 🪙‹tm› (never 🪙‹trm›)
🪙 certified types are called 🪙‹cT›, rarely 🪙‹T›, with
variants as fortypes
🪙 certified terms are called 🪙‹ct›, rarely 🪙‹t›, with
variants as for terms (never 🪙‹ctrm›)
🪙theorems are called 🪙‹th›, or 🪙‹thm›
Proper semantic names override these conventions completely. For example,
the left-hand side of an equation (as a term) can be called 🪙‹lhs›
(not 🪙‹lhs_tm›). Or a term that is known to be a variable can be
called 🪙‹v› or 🪙‹x›.
🪙 Tactics (\secref{sec:tactics}) are sufficiently important tohave specific
naming conventions. The name of a basic tactic definition always has a 🪙‹_tac› suffix, the subgoal index (if applicable) is always called 🪙‹i›, and the goal state (if made explicit) is usually called 🪙‹st› instead of the somewhat misleading 🪙‹thm›. Any other
arguments are given before the latter two, and the general contextis given
first. Example:
@{verbatim [display] ‹ fun my_tac ctxt arg1 arg2 i st = ...›}
Note that the goal state 🪙‹st› above is rarely made explicit, if
tactic combinators (tacticals) are used as usual.
A tactic that requires a proofcontext needs to make that explicit as seen in the 🍋‹ctxt› argument above. Do not refer to the background theory of 🍋‹st› -- it is not a proper context, but merely a formal certificate. ›
subsection‹General source layout›
text‹ The general Isabelle/ML source layout imitates regular type-setting conventions, augmented by the requirements for deeply nested expressions that are commonplace in functional programming. ›
paragraph ‹Line length› text‹ is limited to 80 characters according to ancient standards, but we allow as much as 100 characters (not more).🪙‹Readability requires to keep the beginning of a line in view while watching its end. Modern wide-screen displays do not change the way how the human brain works. Sources also need to be printable on plain paper with reasonable font-size.›
characters acknowledge the space requirements due to qualified library
references in Isabelle/ML. ›
paragraph ‹White-space› text‹ is used to emphasize the structure of expressions, following mostly standard conventions for mathematical typesetting, as can be seen in plain {\TeX} or {\LaTeX}. This defines positioning of spaces for parentheses, punctuation, and infixes as illustrated here: @{verbatim [display] ‹ val x = y + z * (a + b); val pair = (a, b); val record = {foo = 1, bar = 2};›}
Lines are normally broken 🪙‹after› an infix operator or punctuation
character. For example:
@{verbatim [display] ‹ val x = a + b + c; val tuple = (a, b, c); ›}
Some special infixes (e.g.\ 🪙‹|>›) work better at the start of the
line, but punctuation is always at the end.
Function application follows the tradition of ‹λ›-calculus, not informal
mathematics. For example: 🪙‹f a b›for a curried function, or 🪙‹g (a, b)›for a tupled function. Note that the space between 🪙‹g›and the pair 🪙‹(a, b)› follows the important
principle of 🪙‹compositionality›: the layout of 🪙‹g p› does not
change when 🪙‹p›is refined to the concrete pair 🪙‹(a, b)›. ›
paragraph ‹Indentation› text‹ uses plain spaces, never hard tabulators.🪙‹Tabulators were invented to move the carriage of a type-writer to certain predefined positions. In software they could be used as a primitive run-length compression of consecutive spaces, but the precise result would depend on non-standardized text editor configuration.›
Each level of nesting is indented by 2 spaces, sometimes 1, very rarely 4,
never 8 or any other odd number.
Indentation follows a simple logical format that only depends on the nesting
depth, not the accidental length of the text that initiates a level of
nesting. Example:
@{verbatim [display] ‹ (* RIGHT *) if b then expr1_part1 expr1_part2 else expr2_part1 expr2_part2 (* WRONG *)
if b then expr1_part1
expr1_part2
else expr2_part1
expr2_part2›}
The second form has many problems: it assumes a fixed-width font when
viewing the sources, it uses more space on the line andthus makes it hard to observe its strict length limit (working against 🪙‹readability›), it
requires extra editing to adapt the layout to changes of the initial text
(working against 🪙‹maintainability›) etc.
🪙 For similar reasons, any kind of two-dimensional or tabular layouts,
ASCII-art with lines or boxes of asterisks etc.\ should be avoided. ›
paragraph ‹Complex expressions› text‹ that consist of multi-clausal function definitions, 🪙‹handle›, 🪙‹case›, 🪙‹let› (and combinations) require special
attention. The syntax of Standard ML is quite ambitious and admits a lot of
variance that can distort the meaning of the text.
Multiple clauses of 🪙‹fun›, 🪙‹fn›, 🪙‹handle›, 🪙‹case› get extra indentation to indicate the nesting clearly.
Example:
@{verbatim [display] ‹ (* RIGHT *) fun foo p1 = expr1 | foo p2 = expr2 (* WRONG *)
fun foo p1 =
expr1
| foo p2 =
expr2›}
Body expressions consisting of 🪙‹case› or 🪙‹let› require
care to maintain compositionality, to prevent loss of logical indentation where it is especially important to see the structure of the text. Example:
@{verbatim [display] ‹ (* RIGHT *) fun foo p1 = (case e of q1 => ... | q2 => ...) | foo p2 = let ... in ... end (* WRONG *)
fun foo p1 = case e of
q1 => ...
| q2 => ...
| foo p2 = let
... in
... end›}
Extra parentheses around 🪙‹case› expressions are optional, but help to analyse the nesting based on character matching in the text editor.
🪙
There are two main exceptions to the overall principle of compositionality in the layout of complex expressions.
🪙🪙‹if› expressions are iterated as if ML had multi-branch
conditionals, e.g.
@{verbatim [display] ‹ (* RIGHT *) if b1 then e1 else if b2 then e2 else e3›}
🪙🪙‹fn› abstractions are often layed-out as if they would lack any structureby themselves. This traditional form is motivated by the
possibility to shift function arguments backand forth wrt.\ additional
combinators. Example:
@{verbatim [display] ‹ (* RIGHT *) fun foo x y = fold (fn z => expr)›}
Here the visual appearance is that of three arguments 🪙‹x›, 🪙‹y›, 🪙‹z›in a row.
Such weakly structured layout should be usewith great care. Here are some
counter-examples involving 🪙‹let› expressions:
@{verbatim [display] ‹ (* WRONG *) fun foo x = let val y = ... in ... end (* WRONG *)
fun foo x = let
val y = ... in ... end
(* WRONG *)
fun foo x = let
val y = ... in ... end
(* WRONG *)
fun foo x = let
val y = ... in
... end›}
🪙 In general the source layout is meant to emphasize the structure of complex
language expressions, not to pretend that SML had a completely different syntax (say that of Haskell, Scala, Java). ›
section‹ML embedded into Isabelle/Isar›
text‹ ML and Isar are intertwined via an open-ended bootstrap process that provides more and more programming facilities and logical content in an alternating manner. Bootstrapping starts from the raw environment of existing implementations of Standard ML (mainly Poly/ML). Isabelle/Pure marks the point where the raw ML toplevel is superseded by Isabelle/ML within the Isar theory and proof language, with a uniform context for arbitrary ML values (see also \secref{sec:context}). This formal environment holds ML compiler bindings, logical entities, and many other things. Object-logics like Isabelle/HOL are built within the Isabelle/ML/Isar environment by introducing suitable theories with associated ML modules, either inlined within 🍋‹.thy› f
loading from some theory. Thus Isabelle/HOL is defined as a regular
user-space application within the Isabelle framework. Further add-on tools
can be implemented in ML within the Isar contextin the same manner: ML is
part of the standard repertoire of Isabelle, and there is no distinction
between ``users'' and ``developers'' in this respect. ›
subsection‹Isar ML commands›
text‹ The primary Isar source language provides facilities to ``open a window'' to the underlying ML compiler. Especially see the Isar commands @{command_ref "ML_file"} and @{command_ref "ML"}: both work the same way, but the source text is provided differently, via a file vs.\ inlined, respectively. Apart from embedding ML into the main theory definition like that, there are many more commands that refer to ML source, such as @{command_ref setup} or @{command_ref declaration}. Even more fine-grained embedding of ML into Isar is encountered in the proof method @{method_ref tactic}, which refines the pending goal state via a given expression of type 🪙‹tactic›. ›
text %mlex ‹ The following artificial example demonstrates some ML toplevel declarations within the implicit Isar theory context. This is regular functional programming without referring to logical entities yet. ›
ML ‹ fun factorial 0 = 1 | factorial n = n * factorial (n - 1) ›
text‹ Here the ML environment is already managed by Isabelle, i.e.\ the 🪙‹factorial› f
nor in a different theory that is independent from the current one in the
import hierarchy.
Removing the above ML declarationfrom the source text will remove any trace
of this definition, as expected. The Isabelle/ML toplevel environment is
managed in a 🪙‹stateless› way: in contrast to the raw ML toplevel, there are
no global side-effects involved here.🪙‹Such a stateless compilation environment is also a prerequisite for robust parallel compilation within independent nodes of the implicit theory development graph.›
🪙
The next example shows how to embed ML into Isar proofs, using @{command_ref "ML_prf"} instead of @{command_ref "ML"}. As illustrated below, the effect
on the ML environment islocalto the whole proof body, but ignoring the
block structure. ›
notepad begin
ML_prf %"ML"‹val a = 1›
{
ML_prf %"ML"‹val b = a + 1›
} 🍋‹Isar block structure ignored by ML environment›
ML_prf %"ML"‹val c = b + 1› end
text‹ By side-stepping the normal scoping rules for Isar proof blocks, embedded ML code can refer to the different contexts and manipulate corresponding entities, e.g.\ export a fact from a block context. 🪙 Two further ML commands are useful in certain situations: @{command_ref ML_val} and @{command_ref ML_command} are 🪙‹diagnostic› i
there is no effect on the underlying environment, and can thus be used
anywhere. The examples below produce long strings of digits by invoking 🪙‹factorial›: @{command ML_val} takes care of printing the ML toplevel
result, but @{command ML_command} is silent so we produce an explicit output
message. ›
notepad begin
ML_val ‹factorial 100›
ML_command ‹writeln (string_of_int (factorial 100))› end
subsection‹Compile-time context›
text‹ Whenever the ML compiler is invoked within Isabelle/Isar, the formal context is passed as a thread-local reference variable. Thus ML code may access the theory context during compilation, by reading or writing the (local) theory under construction. Note that such direct access to the compile-time context is rare. In practice it is typically done via some derived ML functions instead. ›
text %mlref ‹ \begin{mldecls} @{define_ML Context.the_generic_context: "unit -> Context.generic"} \\ @{define_ML "Context.>>": "(Context.generic -> Context.generic) -> unit"} \\ @{define_ML ML_Thms.bind_thms: "string * thm list -> unit"} \\ @{define_ML ML_Thms.bind_thm: "string * thm -> unit"} \\ \end{mldecls} 🪙🪙‹Context.the_generic_context ()› r
the ML toplevel --- at compile time. ML code needs to take care to refer to 🪙‹Context.the_generic_context ()› correctly. Recall that evaluation
of a function body is delayed until actual run-time.
🪙🪙‹Context.>>›~‹f› applies context transformation ‹f›to the implicit context of the ML toplevel.
🪙🪙‹ML_Thms.bind_thms›~‹(name, thms)› stores a list of theorems produced in ML both in the (global) theorycontextand the ML toplevel, associating
it with the provided name.
🪙🪙‹ML_Thms.bind_thm›is similar to🪙‹ML_Thms.bind_thms› but refers to
a singleton fact.
It is important tonote that the above functions are really restricted to
the compile time, even though the ML compiler is invoked at run-time. The
majority of ML code either uses static antiquotations
(\secref{sec:ML-antiq}) or refers to the theory or proofcontext at
run-time, by explicit functional abstraction. ›
subsection‹Antiquotations \label{sec:ML-antiq}›
text‹ A very important consequence of embedding ML into Isar is the concept of 🪙‹ML antiquotation›.
special syntactic entities of the following form:
🪙‹ @{syntax_def antiquote}: '@{' name args '}' ›
Here @{syntax name} and @{syntax args} are outer syntax categories, as
defined in🍋‹"isabelle-isar-ref"›.
🪙
A regular antiquotation ‹@{name args}› processes its arguments by the usual
means of the Isar source language, and produces corresponding ML source text, either as literal 🪙‹inline›text (e.g.\ ‹@{term t}›) or abstract 🪙‹value› (e.g. ‹@{thm th}›). This pre-compilation scheme allows to refer to
formal entities in a robust manner, with proper static scoping andwith some
degree of logical checking of small portions of the code. ›
subsection‹Printing ML values›
text‹ The ML compiler knows about the structure of values according to their static type, and can print them in the manner of its toplevel, although the details are non-portable. The antiquotations @{ML_antiquotation_def "make_string"} and @{ML_antiquotation_def "print"} provide a quasi-portable way to refer to this potential capability of the underlying ML system in generic Isabelle/ML sources. This is occasionally useful for diagnostic or demonstration purposes. Note that production-quality tools require proper user-level error messages, avoiding raw ML values in the output. ›
🪙‹@{make_string}› inlines a functionto print arbitrary values similar to
the ML toplevel. The result is compiler dependent and may fall back on "?" in certain situations. The value of configuration option @{attribute_ref
ML_print_depth} determines further details of output.
🪙‹@{print f}›uses the ML function‹f: string -> unit›tooutput the result
of ‹@{make_string}› above, together with the source position of the
antiquotation. The default outputfunctionis🪙‹writeln›. ›
text %mlex ‹ The following artificial examples show how to produce adhoc output of ML values for debugging purposes. ›
ML_val ‹ val x = 42; val y = true; writeln (🍋 {x = x, y = y}); 🍋 {x = x, y = y}; 🍋‹tracing› { ›
section‹Canonical argument order \label{sec:canonical-argument-order}›
text‹ Standard ML is a language in the tradition of ‹λ›-c 🪙‹higher-order functional programming›, similar to OCaml, Haskell, or
Isabelle/Pure and HOL as logical languages. Getting acquainted with the
native style of representing functions in that setting can save a lot of
extra boiler-plate of redundant shuffling of arguments, auxiliary
abstractions etc.
Functions are usually 🪙‹curried›: the idea of turning arguments of type ‹τ🪙i› (for‹i ∈ {1, … n}›) into a result of type ‹τ›is represented by the
iterated function space ‹τ🪙1 →…→ τ🪙n → τ›. This is isomorphic to the
well-known encoding via tuples ‹τ🪙1 ×…× τ🪙n → τ›, but the curried version
fits more smoothly into the basic calculus.🪙‹The difference is even more significant in HOL, because the redundant tuple structure needs to be accommodated extraneous proof steps.›
Currying gives some flexibility due to🪙‹partial application›. A function ‹f: τ🪙1 → τ🪙2 → τ› can be applied to‹x: τ🪙1›and the remaining ‹(f x): τ🪙2 → τ›
on the order of arguments. In the worst case, arguments are arranged
erratically, andusing a functionin a certain situation always requires
some glue code. Thus we would get exponentially many opportunities to
decorate the code with meaningless permutations of arguments.
This can be avoided by🪙‹canonical argument order›, which observes certain
standard patterns and minimizes adhoc permutations in their application. In
Isabelle/ML, large portions of text can be written without auxiliary
operations like ‹swap: α × β → β × α› or ‹C: (α → β → γ) → (β → α → γ)› (the
latter is not present in the Isabelle/ML library).
🪙
The main idea is that arguments that vary less are moved further to the left
than those that vary more. Two particularly important categories of
functions are 🪙‹selectors›and🪙‹updates›.
The subsequent scheme is based on a hypothetical set-like container of type ‹β› that manages elements of type ‹α›. Both the names andtypes of the
associated operations are canonical for Isabelle/ML.
\begin{center} \begin{tabular}{ll}
kind & canonical name and type \\hline
selector & ‹member: β → α → bool› java.lang.NullPointerException
update & ‹insert: α → β → β› java.lang.NullPointerException \end{tabular} \end{center}
Given a container ‹B: β›, the partially applied ‹member B›is a predicate
over elements ‹α → bool›, andthus represents the intended denotation
directly. It is customary to pass the abstract predicate to further
operations, not the concrete container. The argument order makes it easy to use other combinators: ‹forall (member B) list› will check a list of
elements for membership in‹B› etc. Often the explicit ‹list›is pointless and can be contracted to‹forall (member B)›to get directly a predicate
again.
In contrast, an update operation varies the container, so it moves to the
right: ‹insert a›is a function‹β → β›to insert a value‹a›. These can be
composed naturally as ‹insert c ∘ insert b ∘ insert a›. The slightly awkward
inversion of the composition order is due to conventional mathematical notation, which can be easily amended as explained below. ›
subsection‹Forward application and composition›
text‹ Regular function application and infix notation works best for relatively deeply structured expressions, e.g.\ ‹h (f x y + g z)›.
special case of 🪙‹linear transformation› applies a cascade of functions ‹f🪙n (… (f🪙1 x))›.
themselves given as complex expressions. The notation can be significantly
improved by introducing 🪙‹forward› versions of application and composition
as follows:
text‹ As explained above, a function ‹f: α → β → β› c
a configuration of type ‹β›, parameterized by an argument of type ‹α›. Given ‹a: α› the partial application ‹(f a): β → β› operates homogeneously on ‹β›.
This can be iterated naturally over a list of parameters ‹[a🪙1, …, a🪙n]› as ‹f a🪙1 #> … #> f a🪙n›. The latter expression is again a function‹β → β›. It
can be applied to an initial configuration ‹b: β›to start the iteration
over the given list of arguments: each ‹a›in‹a🪙1, …, a🪙n›is applied
consecutively by updating a cumulative configuration.
The ‹fold› combinator in Isabelle/ML lifts a function‹f› as above to its
iterated version over a list of arguments. Lifting can be repeated, e.g.java.lang.NullPointerException ‹(fold ∘ fold) f› iterates over a list of lists as expected.
The variant ‹fold_rev› works inside-out over the list of arguments, such
that ‹fold_rev f ≡ fold f ∘ rev› holds.
The ‹fold_map› combinator essentially performs ‹fold›and‹map›
simultaneously: each application of ‹f› produces an updated configuration
together with a side-result; the iteration collects all such side-results as
a separate list. ›
text %mlref ‹ \begin{mldecls} @{define_ML fold: "('a -> 'b -> 'b) -> 'a list -> 'b -> 'b"} \\ @{define_ML fold_rev: "('a -> 'b -> 'b) -> 'a list -> 'b -> 'b"} \\ @{define_ML fold_map: "('a -> 'b -> 'c * 'b) -> 'a list -> 'b -> 'c list * 'b"} \\ \end{mldecls} 🪙🪙‹fold›~\
parameters.
🪙🪙‹fold_rev›~‹f›is similar to🪙‹fold›~‹f›, but works inside-out, as if the list would be reversed.
🪙🪙‹fold_map›~‹f› lifts the parametrized update function‹f› (with
side-result) to a list of parameters and cumulative side-results.
\begin{warn}
The literature on functional programming provides a confusing multitude of
combinators called ‹foldl›, ‹foldr› etc. SML97 provides its own variations
as 🪙‹List.foldl›and🪙‹List.foldr›, while the classic Isabelle library also has the historic 🪙‹Library.foldl›and🪙‹Library.foldr›. To avoid
unnecessary complication, all these historical versions should be ignored, and the canonical 🪙‹fold› (or 🪙‹fold_rev›) used exclusively. \end{warn} ›
text %mlex ‹ The following example shows how to fill a text buffer incrementally by adding strings, either individually or from a given list. ›
ML_val ‹ val s = Buffer.empty |> Buffer.add "digits: " |> fold (Buffer.add o string_of_int) (0 upto 9) |> Buffer.content; 🍋 (s = "digits: 0123456789"); ›
text‹ Note how 🪙‹fold (Buffer.add o string_of_int)› a 🪙‹map› over the given list. This kind of peephole optimization reduces
both the code size and the tree structures in memory (``deforestation''),
but it requires some practice to read and write fluently.
🪙
The next example elaborates the idea of canonical iteration, demonstrating
fast accumulation of tree content using a text buffer. ›
ML ‹ datatype tree = Text of string | Elem of string * tree list; fun slow_content (Text txt) = txt | slow_content (Elem (name, ts)) = "🚫^ name ^ ">" ^ implode (map slow_content ts) ^ "🚫 ^ name ^ ">" fun add_content (Text txt) = Buffer.add txt | add_content (Elem (name, ts)) = Buffer.add ("🚫^ name ^ ">") #> fold add_content ts #> Buffer.add ("🚫 ^ name ^ ">"); fun fast_content tree = Buffer.empty |> add_content tree |> Buffer.content; ›
text‹ The slowness of 🪙‹slow_content› i
recursive results, because it copies previously produced strings again and
again.
The incremental 🪙‹add_content›avoids this by operating on a buffer that is passed through in a linear fashion. Using🪙‹#>›and contraction
over the actual buffer argument saves some additional boiler-plate. Of
course, the two 🪙‹Buffer.add› invocations with concatenated strings
could have been split into smaller parts, but this would have obfuscated the
source without making a big difference in performance. Here we havedone
some peephole-optimization for the sake of readability.
Another benefit of 🪙‹add_content›is its ``open'' form as a function on
buffers that can be continued in further linear transformations, folding
etc. Thus it is more compositional than the naive 🪙‹slow_content›. As
realistic example, compare the old-style 🪙‹Term.maxidx_of_term: term -> int›with the newer 🪙‹Term.maxidx_term: term -> int -> int›in Isabelle/Pure.
Note that 🪙‹fast_content› above is only defined as example. In many
practical situations, it is customary to provide the incremental 🪙‹add_content› only and leave the initialization andterminationto the
concrete application to the user. ›
text‹ Isabelle provides output channels for different kinds of messages: regular output, high-volume tracing information, warnings, and errors. Depending on the user interface involved, these messages may appear in different text styles or colours. The standard output for batch sessions prefixes each line of warnings by 🍋‹###› a
anything else unchanged. The message body may contain further markup and
formatting, which is routinely used in the Prover IDE 🍋‹"isabelle-jedit"›.
Messages are associated with the transaction context of the running Isar
command. This enables the front-end to manage commands and resulting
messages together. For example, after deleting a command from a given theory
document version, the corresponding message output can be retracted from the
display. ›
text %mlref ‹ \begin{mldecls} @{define_ML writeln: "string -> unit"} \\ @{define_ML tracing: "string -> unit"} \\ @{define_ML warning: "string -> unit"} \\ @{define_ML error: "string -> 'a"} % FIXME Output.error_message (!?) \\ \end{mldecls} 🪙🪙‹writeln›~\
primary message output operation of Isabelle and should be used by default.
🪙🪙‹tracing›~‹text›outputs‹text› as special tracing message, indicating
potential high-volume outputto the front-end (hundreds or thousands of
messages issued by a single command). The idea isto allow the
user-interface to downgrade the quality of message display to achieve higher
throughput.
Note that the user might haveto take special actionsto see tracing output,
e.g.\ switch to a different output window. So this channel should not be
used for regular output.
🪙🪙‹warning›~‹text›outputs‹text› as warning, which typically means some
extra emphasis on the front-end side (color highlighting, icons, etc.).
🪙🪙‹error›~‹text› raises exception 🪙‹ERROR›~‹text›andthus lets the
Isar toplevel print ‹text› on the error channel, which typically means some
extra emphasis on the front-end side (color highlighting, icons, etc.).
This assumes that the exception is not handled before the command
terminates. Handling exception 🪙‹ERROR›~‹text›is a perfectly legal
alternative: it means that the error is absorbed without any message output.
\begin{warn}
The actual error channel is accessed via 🪙‹Output.error_message›, but
this is normally not used directly in user code. \end{warn}
\begin{warn}
Regular Isabelle/ML code should output messages exclusively by the official
channels. Using raw I/O on 🪙‹stdout› or 🪙‹stderr› instead (e.g.\ via 🪙‹TextIO.output›) is apt to cause problems in the presence of parallel and asynchronous processing of Isabelle theories. Such raw output might be
displayed by the front-end in some system console log, with a low chance
that the user will ever see it. Moreover, as a genuine side-effect on global
process channels, there is no proper way to retract output when Isar command
transactions are reset by the system. \end{warn}
\begin{warn}
The message channels should be used in a message-oriented manner. This means
that multi-line output that logically belongs together is issued by a single
invocation of 🪙‹writeln› etc.\ with the functional concatenation of all
message constituents. \end{warn} ›
text %mlex ‹ The following example demonstrates a multi-line warning. Note that in some situations the user sees only the first line, so the most important point should be made first. ›
ML_command ‹ warning (cat_lines ["Beware the Jabberwock, my son!", "The jaws that bite, the claws that catch!", "Beware the Jubjub Bird, and shun", "The frumious Bandersnatch!"]); ›
text‹ 🪙 An alternative is to make a paragraph of freely-floating words as follows. ›
text‹ This has advantages with variable window / popup sizes, but might make it harder to search for message content systematically, e.g.\ by other tools or by humans expecting the ``verse'' of a formal message in a fixed layout. ›
section‹Exceptions \label{sec:exceptions}›
text‹ The Standard ML semantics of strict functional evaluation together with exceptions is rather well defined, but some fine points need to be observed to avoid that ML programs go wrong despite static type-checking. Unlike official Standard ML, Isabelle/ML rejects catch-all patterns in 🪙‹handle› c
especially against arbitrary physical events (interrupts).
Exceptions in Isabelle/ML are subsequently categorized as follows. ›
paragraph ‹Regular user errors.› text‹ These are meant to provide informative feedback about malformed input etc. The 🪙‹error› f
plain text message as argument. 🪙‹ERROR› exceptions can be handled
internally, in order to be ignored, turned into other exceptions, or
cascaded by appending messages. If the corresponding Isabelle/Isar command
terminates with an 🪙‹ERROR› exception state, the system will print the
result on the error channel (see \secref{sec:message-channels}).
It is considered bad style to refer to internal function names or values in
ML source notationin user error messages. Do not use‹@{make_string}› nor ‹@{here}›!
Grammatical correctness of error messages can be improved by🪙‹omitting›
final punctuation: messages are often concatenated or put into a larger context (e.g.\ augmented with source position). Note that punctuation after
formal entities (types, terms, theorems) is particularly prone to user
confusion. ›
paragraph ‹Program failures.› text‹ There is a handful of standard exceptions that indicate general failure situations (e.g.\ 🪙‹Fail›),
entities (types, terms, theorems, theories, see \chref{ch:logic}).
These exceptions indicate a genuine breakdown of the program, so the main
purpose isto determine quickly what has happened in the ML program.
Traditionally, the (short) exception message would include the name of an ML function, although this is not strictly necessary, because the ML runtime
system attaches detailed source position stemming from the corresponding 🪙‹raise› keyword.
🪙
User modules can always introduce their own custom exceptions locally, e.g.java.lang.NullPointerException to organize internal failures robustly without overlapping with existing
exceptions. Exceptions that are exposed in module signatures require extra
care, though, and should 🪙‹not› be introduced by default. Surprise by users
of a module can be often minimized byusing plain user errors instead. ›
paragraph ‹Interrupts.› text‹ These indicate arbitrary system events: both the ML runtime system and the Isabelle/ML infrastructure may signal various exceptional situations by raising special exceptions user code, satisfying the predicate 🪙‹Exn.is_interrupt›.
This is the one and only way that physical events can intrude an Isabelle/ML
program. Such an interrupt can mean out-of-memory, stack overflow, timeout,
internal signaling of threads, or a POSIX process signal. An Isabelle/ML
program that intercepts interrupts becomes dependent on physical effects of
the environment (e.g.\ via 🪙‹Exn.capture› without subsequent 🪙‹Exn.release›).
Note that the original SML90 language had an 🍋‹Interrupt› exception, but
that was excluded from SML97 to simplify ML the mathematical semantics.
Isabelle/ML does support physical interrupts thanks to special features of
the underlying Poly/ML compiler and runtime system. This works robustly,
because the old 🪙‹Interrupt› constructor has been removed from the
ML environment, and catch-all patterns 🪙‹handle› are rejected. Thus user code becomes strictly transparent wrt.\ interrupts: physical
events are exposed to the toplevel, and the mathematical meaning of the
program becomes a partial function that isotherwise unchanged.
The Isabelle/ML antiquotation @{ML_antiquotation try}, with its syntactic
variants for🪙‹catch› or 🪙‹finally›, supports
intermediate handling of interrupts and subsequent cleanup-operations,
without swallowing such physical event. ›
text %mlref ‹ \begin{mldecls} @{define_ML try: "('a -> 'b) -> 'a -> 'b option"} \\ @{define_ML can: "('a -> 'b) -> 'a -> bool"} \\ @{define_ML_exception ERROR of string} \\ @{define_ML_exception Fail of string} \\ @{define_ML Exn.is_interrupt: "exn -> bool"} \\ @{define_ML Exn.reraise: "exn -> 'a"} \\ @{define_ML Runtime.exn_trace: "(unit -> 'a) -> 'a"} \\ \end{mldecls} 🪙🪙‹try›~\
the option datatype. Interrupts are 🪙‹not› handled here, i.e.\ this form
serves as safe replacement for the 🪙‹fragile› version 🪙‹(SOME›~‹f x›~
SML97.
🪙🪙‹can›is similar to🪙‹try›with more abstract result.
🪙🪙‹ERROR›~‹msg› represents user errors; this exception is normally
raised indirectly via the 🪙‹error›function (see \secref{sec:message-channels}).
🪙🪙‹Fail›~‹msg› represents general program failures, but not user errors.
🪙🪙‹Exn.is_interrupt› identifies interrupts, without mentioning
concrete exception constructors in user code. Since 🪙‹handle›with
catch-all patterns is rejected, it cannot handle interrupts at all. In the
rare situations where this is really required, use🪙‹Exn.capture›and 🪙‹Exn.release› (\secref{sec:managed-eval}).
🪙🪙‹Exn.reraise›~‹exn› raises exception ‹exn› while preserving its implicit
position information (if possible, depending on the ML platform).
🪙🪙‹Runtime.exn_trace›~🪙‹(fn () =>›~‹e›\<^ML_text>‹)› evaluates
expression ‹e› while printing a full trace of its stack of nested exceptions
(if possible, depending on the ML platform).
Inserting 🪙‹Runtime.exn_trace› into ML code temporarily is useful for
debugging, but not suitable for production code. ›
🪙‹@{try}›and‹{can}› take embedded ML source as arguments, and modify the
evaluation analogously to the combinators 🪙‹try›and🪙‹can› above,
but with special treatment of the interrupt state of the underlying ML
thread. There are also variants to support 🍋‹try_catch›and🍋‹try_finally›
blocks similar to Scala.
The substructure of the embedded argument supports the following syntax
variants:
The @{syntax handler} of 🍋‹try_catch› follows the syntax of 🪙‹fn›
patterns, so it is similar to🪙‹handle›: the key difference is
that interrupts are always passed-through via 🪙‹Exn.reraise›.
The @{syntax cleanup} expression of 🍋‹try_finally›is always invoked,
regardless of the overall exception result, and afterwards exceptions are
passed-through via 🪙‹Exn.reraise›.
Both the @{syntax handler} and @{syntax cleanup} are evaluated with further
interrupts disabled! These expressions should terminate promptly; timeouts
don't work here.
🪙
Implementation details can be seen in🪙‹Isabelle_Thread.try_catch›, 🪙‹Isabelle_Thread.try_finally›, 🪙‹Isabelle_Thread.try›, and 🪙‹Isabelle_Thread.can›, respectively. The ML antiquotations add
functional abstractions as required for these ``special forms'' of
Isabelle/ML.
🪙‹@{assert}› inlines a function🪙‹bool -> unit› that raises 🪙‹Fail›if the argument is🪙‹false›. Due to inlining the source
position of failed assertions is included in the error output.
🪙‹@{undefined}› inlines 🍋‹raise›~🪙‹Match›, i.e.\ the ML program
behaves as in some function application of an undefined case. ›
text %mlex ‹ We define total versions of division: any failures are swept under the carpet and mapped to a default value. Thus division-by-zero becomes 0, but there could be other exceptions like overflow that produce the same result. For unbounded integers such side-errors do not happen, but it might still be better to be explicit about exception patterns (second version below). ›
ML ‹ fun div_total1 x y = 🍋‹x div y catch _ => 0›; fun div_total2 x y = 🍋‹x div y catch Div => 0›;
🍋 (div_total1 1 0 = 0); 🍋 (div_total2 1 0 = 0); ›
text‹ The ML function 🪙‹undefined› i
as follows: ›
ML ‹fun undefined _ = raise Match›
text‹ 🪙 The following variant uses the antiquotation @{ML_antiquotation undefined} instead: ›
ML ‹fun undefined _ = @{undefined}›
text‹ 🪙 Here is the same, using control-symbol notation for the antiquotation, with special rendering of 🍋‹🍋›: ›
ML ‹fun undefined _ = 🍋›
text‹ 🪙 Semantically, all forms are equivalent to a function definition without any clauses, but that is syntactically not allowed in ML. ›
section‹Strings of symbols \label{sec:symbols}›
text‹ A 🪙‹symbol› c
characters are normally not encountered at all. Isabelle strings consist of
a sequence of symbols, represented as a packed string or an exploded list of
strings. Each symbol isin itself a small string, which has either one of
the following forms:
🪙 a single ASCII character ``‹c›'', for example ``🍋‹a›'',
🪙 a codepoint according to UTF-8 (non-ASCII byte sequence),
🪙 a regular symbol ``🍋‹🪙›'', for example ``🍋‹α›'',
🪙 a control symbol ``🍋‹🪙›'', for example ``🍋‹🪙›'',
The ‹ident›syntaxfor symbol names is‹letter (letter | digit)🪙*›, where ‹letter = A..Za..z›and‹digit = 0..9›. There are infinitely many regular
symbols and control symbols, but a fixed collection of standard symbols is
treated specifically. For example, ``🍋‹α›'' is classified as a letter, which
means it may occur within regular Isabelle identifiers.
The character set underlying Isabelle symbols is 7-bit ASCII, but 8-bit
character sequences are passed-through unchanged. Unicode/UCS data in UTF-8
encoding is processed in a non-strict fashion, such that well-formed code
sequences are recognized accordingly. Unicode provides its own collection of
mathematical symbols, but within the core Isabelle/ML world there is no link to the standard collection of Isabelle regular symbols.
🪙 Output of Isabelle symbols depends on the print mode. For example, the
standard {\LaTeX} setup of the Isabelle document preparation system would
present ``🍋‹α›'' as ‹α›, and ``🍋‹🪙α›'' as ‹🪙α›. On-screen rendering usually
works by mapping a finite subset of Isabelle symbols to suitable Unicode
characters. ›
🪙🪙‹Symbol.explode›~‹str› produces a symbol list from the packed form.
This function supersedes 🪙‹String.explode›for virtually all purposes
of manipulating textin Isabelle!🪙‹The runtime overhead for exploded strings is mainly that of the list structure: individual symbols that happen to be a singleton string do not require extra memory in Poly/ML.›
🪙🪙‹Symbol.is_letter›, 🪙‹Symbol.is_digit›, 🪙‹Symbol.is_quasi›, 🪙‹Symbol.is_blank› classify standard symbols
according to fixed syntactic conventions of Isabelle, cf.\ 🍋‹"isabelle-isar-ref"›.
🪙 Type 🪙‹Symbol.sym›is a concrete datatype that represents the
different kinds of symbols explicitly, with constructors 🪙‹Symbol.Char›, 🪙‹Symbol.UTF8›, 🪙‹Symbol.Sym›, 🪙‹Symbol.Control›, 🪙‹Symbol.Malformed›.
🪙🪙‹Symbol.decode› converts the string representation of a symbol into
the datatype version. ›
paragraph ‹Historical note.› text‹ In the original SML90 standard the primitive ML type 🪙‹char› d
exists, and🪙‹explode: string -> string list› produced a list of
singleton strings like 🪙‹raw_explode: string -> string list›in
Isabelle/ML today. When SML97 came out, Isabelle did not adopt its somewhat
anachronistic 8-bit or 16-bit characters, but the idea of exploding a string
into a list of small strings was extended to ``symbols'' as explained above. Thus Isabelle sources can refer to an infinite store of user-defined
symbols, without having to worry about the multitude of Unicode encodings
that have emerged over the years. ›
section‹Basic data types›
text‹ The basis library proposal of SML97 needs to be treated with caution. Many of its operations simply do not fit with important Isabelle/ML conventions (like ``canonical argument order'', see \secref{sec:canonical-argument-order}), others cause problems with the parallel evaluation model of Isabelle/ML (such as 🪙‹TextIO.print› o 🪙‹OS.Process.system›).
Subsequently we give a brief overview of important operations on basic ML
data types. ›
subsection‹Characters›
text %mlref ‹ \begin{mldecls} @{define_ML_type char} \\ \end{mldecls} 🪙 Type 🪙‹char› i is represented as a ``symbol'' (see \secref{sec:symbols}). ›
subsection‹Strings›
text %mlref ‹ \begin{mldecls} @{define_ML_type string} \\ \end{mldecls} 🪙 Type 🪙‹string› r
There are operations in SML to convert backand forth to actual byte
vectors, which are seldom used.
This historically important raw text representation is used for
Isabelle-specific purposes with the following implicit substructures packed
into the string content:
🪙 sequence of Isabelle symbols (see also\secref{sec:symbols}), with 🪙‹Symbol.explode› as key operation;
🪙 XML tree structure via YXML (see also🍋‹"isabelle-system"›), with 🪙‹YXML.parse_body› as key operation.
Note that Isabelle/ML string literals may refer Isabelle symbols like
``🍋‹α›'' natively, 🪙‹without› escaping the backslash. This is a consequence
of Isabelle treating all source text as strings of symbols, instead of raw
characters.
\begin{warn}
The regular 🍋‹64_32› platform of Poly/ML has a size limit of 64\,MB for 🪙‹string› values. This is usually sufficient fortext
applications, with a little bit of YXML markup. Very large XML trees or
binary blobs are better stored as scalable byte strings, see type 🪙‹Bytes.T›and corresponding operations in 🍋‹~~/src/Pure/General/bytes.ML›. \end{warn} ›
text %mlex ‹ The subsequent example illustrates the difference of physical addressing of bytes versus logical addressing of symbols in Isabelle strings. ›
ML_val ‹ val s = "A"; 🍋 (length (Symbol.explode s) = 1); 🍋 (size s = 4); ›
text‹ Note that in Unicode renderings of the symbol ‹A›,
like UTF-8 or UTF-16 pose delicate questions about the multi-byte
representations of its codepoint, which is outside of the 16-bit address
space of the original Unicode standard from the 1990-ies. In Isabelle/ML it is just ``🍋‹A›'' literally, using plain ASCII characters beyond any
doubts. ›
subsection‹Integers›
text %mlref ‹ \begin{mldecls} @{define_ML_type int} \\ \end{mldecls} 🪙 Type 🪙‹int› r 🪙‹unbounded›. Overflow is treated properly, but should never happen in
practice.🪙‹The size limit for integer bit patterns in memory is 64\,MB for the regular 🍋‹64_32› p
architecture.›
Structure🪙‹IntInf› of SML97 is obsolete and superseded by 🪙‹Int›. Structure🪙‹Integer›in 🍋‹~~/src/Pure/General/integer.ML› provides some additional operations. ›
subsection‹Rational numbers›
text %mlref ‹ \begin{mldecls} @{define_ML_type Rat.rat} \\ \end{mldecls} 🪙 Type 🪙‹Rat.rat› r
unbounded integers of Poly/ML.
Literal rationals may be written with special antiquotation syntax 🍋‹@›\<open>int›🍋‹/›\<open>nat› or 🍋‹@›\<open>int› (without any white space). For example 🍋‹@~1/4› or 🍋‹@10›. The ML toplevel pretty printer uses the same format.
Standard operations are provided via ad-hoc overloading of 🍋‹+›, 🍋‹-›, 🍋‹*›, 🍋‹/›, etc. ›
subsection‹Time›
text %mlref ‹ \begin{mldecls} @{define_ML_type Time.time} \\ @{define_ML seconds: "real -> Time.time"} \\ \end{mldecls} 🪙 Type 🪙‹Time.time› r
SML97 basis library definition. This is adequate for internal ML operations,
but awkward in concrete time specifications.
🪙🪙‹seconds›~‹s› turns the concrete scalar ‹s› (measured in seconds) into
an abstract time value. Floating point numbers are easy touse as
configuration options in the context (see \secref{sec:config-options}) or
system options that are maintained externally. ›
text‹ Apart from 🪙‹Option.map› m 🪙‹Option› are alien to Isabelle/ML and never used. The
operations shown above are defined in🍋‹~~/src/Pure/General/basics.ML›.
Note that the function🪙‹the_default›is strict in all of its
arguments, the default valueis evaluated beforehand, even if not required
later. In contrast, the antiquotation @{ML_antiquotation "if_none"} is
non-strict: the given expression is only evaluated for an application to 🪙‹NONE›. This allows to work with exceptions like this: ›
ML ‹ fun div_total x y = 🍋‹x div y› |
fun div_error x y = 🍋‹x div y› |> 🍋‹error "Division by zero"›; ›
text‹ Of course, it is also possible to handle exceptions directly, without an intermediate option construction: ›
ML ‹ fun div_total x y = x div y handle Div => 0; fun div_error x y = x div y handle Div => error "Division by zero"; ›
text‹ The first form works better in longer chains of functional composition, with combinators like 🪙‹|>› o
adequate in elementary expressions: there is no need to pretend that
Isabelle/ML is actually a version of Haskell. ›
subsection‹Lists›
text‹ Lists are ubiquitous in ML as simple and light-weight ``collections'' for many everyday programming tasks. Isabelle/ML provides important additions and improvements over operations that are predefined in the SML97 library. ›
text %mlref ‹ \begin{mldecls} @{define_ML cons: "'a -> 'a list -> 'a list"} \\ @{define_ML member: "('b * 'a -> bool) -> 'a list -> 'b -> bool"} \\ @{define_ML insert: "('a * 'a -> bool) -> 'a -> 'a list -> 'a list"} \\ @{define_ML remove: "('b * 'a -> bool) -> 'b -> 'a list -> 'a list"} \\ @{define_ML update: "('a * 'a -> bool) -> 'a -> 'a list -> 'a list"} \\ \end{mldecls} 🪙🪙‹cons›~\
Tupled infix operators are a historical accident in Standard ML. The curried 🪙‹cons› amends this, but it should be only used when partial application is required.
🪙🪙‹member›, 🪙‹insert›, 🪙‹remove›, 🪙‹update› treat lists as a
set-like container that maintains the order of elements. See 🍋‹~~/src/Pure/library.ML›for the full specifications (written in ML).
There are some further derived operations like 🪙‹union› or 🪙‹inter›.
Note that 🪙‹insert›is conservative about elements that are already a 🪙‹member› of the list, while 🪙‹update› ensures that the latest entry is always put in front. The latter discipline is often more appropriate in
declarations of context data (\secref{sec:context-data}) that are issued by
the user in Isar source: later declarations take precedence over earlier
ones. ›
text %mlex ‹ Using canonical 🪙‹fold› t
operations) alternates the orientation of data. The is quite natural and
should not be altered forcible by inserting extra applications of 🪙‹rev›.
The alternative 🪙‹fold_rev› can be used in the few situations, where
alternation should be prevented. ›
text‹ Here the first list is treated conservatively: only the new elements from the second list are inserted. The inside-out order of insertion via 🪙‹fold_rev› a
This way of merging lists is typical forcontext data
(\secref{sec:context-data}). See also🪙‹merge› as defined in 🍋‹~~/src/Pure/library.ML›. ›
subsection‹Association lists›
text‹ The operations for association lists interpret a concrete list of pairs as a finite function from keys to values. Redundant representations with multiple occurrences of the same key are implicitly normalized: lookup and update only take the first occurrence into account. ›
text‹ \begin{mldecls} @{define_ML AList.lookup: "('a * 'b -> bool) -> ('b * 'c) list -> 'a -> 'c option"} \\ @{define_ML AList.defined: "('a * 'b -> bool) -> ('b * 'c) list -> 'a -> bool"} \\ @{define_ML AList.update: "('a * 'a -> bool) -> 'a * 'b -> ('a * 'b) list -> ('a * 'b) list"} \\ \end{mldecls} 🪙🪙‹AList.lookup›,
main ``framework operations'' for mappings in Isabelle/ML, following
standard conventions for their names andtypes.
Note that a function called 🍋‹lookup›is obliged to express its partiality
via an explicit option element. There is no choice to raise an exception,
without changing the name to something like ‹the_element› or ‹get›.
The ‹defined› operation is essentially a contraction of 🪙‹is_some›and 🍋‹lookup›, but this is sufficiently frequent to justify its independent
existence. This also gives the implementation some opportunity for peep-hole
optimization.
Association lists are adequate as simple implementation of finite mappings in many practical situations. A more advanced table structureis defined in 🍋‹~~/src/Pure/General/table.ML›; that version scales easily to thousands or
millions of elements. ›
subsection‹Unsynchronized references›
text %mlref ‹ \begin{mldecls} @{define_ML_type 'a "Unsynchronized.ref"} \\ @{define_ML Unsynchronized.ref: "'a -> 'a Unsynchronized.ref"} \\ @{define_ML "!": "'a Unsynchronized.ref -> 'a"} \\ @{define_ML_infix ":=" : "'a Unsynchronized.ref * 'a -> unit"} \\ \end{mldecls} ›
text‹ Due to ubiquitous parallelism in Isabelle/ML (see also \secref{sec:multi-threading}), the mutable reference cells of Standard ML are notorious for causing problems. In a highly parallel system, both correctness 🪙‹and› p
The unwieldy name of 🪙‹Unsynchronized.ref›for the constructor for
references in Isabelle/ML emphasizes the inconveniences caused by
mutability. Existing operations 🪙‹!›and🪙‹:=› are unchanged,
but should be used with special precautions, say in a strictly local
situation that is guaranteed to be restricted to sequential evaluation ---
now andin the future.
\begin{warn}
Never 🪙‹open Unsynchronized›, not even in a local scope!
Pretending that mutable state is no problem is a very bad idea. \end{warn} ›
text‹ Multi-threaded execution has become an everyday reality in Isabelle since Poly/ML 5.2.1 and Isabelle2008. Isabelle/ML provides implicit and explicit parallelism by default, and there is no way for user-space tools to ``opt out''. ML programs that are purely functional, output messages only via the official channels (\secref{sec:message-channels}), and do not intercept interrupts (\secref{sec:exceptions}) can participate in the multi-threaded environment immediately without further ado. More ambitious tools with more fine-grained interaction with the environment need to observe the principles explained below. ›
subsection‹Multi-threading with shared memory›
text‹ Multiple threads help to organize advanced operations of the system, such as real-time conditions on command transactions, sub-components with explicit communication, general asynchronous interaction etc. Moreover, parallel evaluation is a prerequisite to make adequate use of the CPU resources that are available on multi-core systems.🪙‹Multi-core computing does not mean that there are ``spare cycles'' to be wasted. It means that the continued exponential speedup of CPU performance due to ``Moore's Law'' follows different rules: clock frequency has reached its peak around 2005, and applications need to be parallelized in order to avoid a perceived loss of performance. See also 🍋‹"Sutter:2005"›.\
Isabelle/Isar exploits the inherent structure of theories and proofs to
support 🪙‹implicit parallelism›to a large extent. LCF-style theorem proving
provides almost ideal conditions for that, see also🍋‹"Wenzel:2009"›.
This means, significant parts of theoryandproof checking is parallelized by default. In Isabelle2013, a maximum speedup-factor of 3.5 on 4 cores and
6.5 on 8 cores can be expected 🍋‹"Wenzel:2013:ITP"›.
🪙
ML threads lack the memory protection of separate processes, and operate
concurrently on shared heap memory. This has the advantage that results of
independent computations are directly available to other threads: abstract
values can be passed without copying or awkward serialization that is
typically required for separate processes.
To make shared-memory multi-threading work robustly and efficiently, some
programming guidelines need to be observed. While the ML system is
responsible to maintain basic integrity of the representation of ML values in memory, the application programmer needs to ensure that multi-threaded
execution does not break the intended semantics.
\begin{warn} To participate in implicit parallelism, tools need to be thread-safe. A
single ill-behaved tool can affect the stability and performance of the
whole system. \end{warn}
Apart from observing the principles of thread-safeness passively, advanced
tools may also exploit parallelism actively, e.g.\ byusing library
functions for parallel list operations (\secref{sec:parlist}).
\begin{warn}
Parallel computing resources are managed centrally by the Isabelle/ML
infrastructure. User programs should not fork their own ML threads to
perform heavy computations. \end{warn} ›
subsection‹Critical shared resources›
text‹ Thread-safeness is mainly concerned about concurrent read/write access to shared resources, which are outside the purely functional world of ML. This covers the following in particular. 🪙 Global references (or arrays), i.e.\ mutable memory cells that persist over several invocations of associated operations.🪙‹This is independent of the visibility of such mutable values in the toplevel scope.›
🪙Global state of the running Isabelle/ML process, i.e.\ raw I/O channels,
environment variables, current working directory.
🪙 Writable resources in the file-system that are shared among different
threads or external processes.
Isabelle/ML provides various mechanisms to avoid critical shared resources in most situations. As last resort there are some mechanisms for explicit
synchronization. The following guidelines helpto make Isabelle/ML programs
work smoothly in a concurrent environment.
🪙 Avoid global references altogether. Isabelle/Isar maintains a uniform context that incorporates arbitrary data declared by user programs
(\secref{sec:context-data}). This contextis passed as plain valueand user
tools can get/map their own data in a purely functional manner.
Configuration options within the context (\secref{sec:config-options})
provide simple drop-in replacements for historic reference variables.
🪙 Keep components withlocal state information re-entrant. Instead of poking
initial values into (private) global references, a new state record can be
created on each invocation, and passed through any auxiliary functions of
the component. The state record contain mutable references in special
situations, without requiring any synchronization, as long as each
invocation gets its own copy and the tool itself is single-threaded.
🪙 Avoid raw output on ‹stdout› or ‹stderr›. The Poly/ML library is
thread-safe for each individual output operation, but the ordering of
parallel invocations is arbitrary. This means raw output will appear on some
system console with unpredictable interleaving of atomic chunks.
Note that this does not affect regular message output channels
(\secref{sec:message-channels}). An official message id is associated with
the command transaction fromwhere it originates, independently of other
transactions. This means each running Isar command has effectively its own
set of message channels, and interleaving can only happen when commands use
parallelism internally (and only at message boundaries).
🪙 Treat environment variables and the current working directory of the
running process as read-only.
🪙Restrict writing to the file-system to unique temporary files. Isabelle
already provides a temporary directory that is unique for the running
process, and there is a centralized source of unique serial numbers in
Isabelle/ML. Thus temporary files that are passed toto some external
process will be always disjoint, andthus thread-safe. ›
text %mlref ‹ \begin{mldecls} @{define_ML File.tmp_path: "Path.T -> Path.T"} \\ @{define_ML serial_string: "unit -> string"} \\ \end{mldecls} 🪙🪙‹File.tmp_path›~\
unique temporary directory of the running Isabelle/ML process.
🪙🪙‹serial_string›~‹()› creates a new serial number that is unique over
the runtime of the Isabelle/ML process. ›
text %mlex ‹ The following example shows how to create unique temporary file names. ›
text‹ Isabelle/ML provides explicit synchronization for mutable variables over immutable data, which may be updated atomically and exclusively. This addresses the rare situations where mutable shared resources are really required. Synchronization in Isabelle/ML is based on primitives of Poly/ML, which have been adapted to the specific assumptions of the concurrent Isabelle environment. User code should not break this abstraction, but stay within the confines of concurrent Isabelle/ML. A 🪙‹synchronized variable› i
mechanisms for locking and signaling. There are operations to await a
condition, change the state, and signal the change to all other waiting
threads. Synchronized access to the state variable is🪙‹not› re-entrant:
direct or indirect nesting within the same thread will cause a deadlock! ›
text %mlref ‹ \begin{mldecls} @{define_ML_type 'a "Synchronized.var"} \\ @{define_ML Synchronized.var: "string -> 'a -> 'a Synchronized.var"} \\ @{define_ML Synchronized.guarded_access: "'a Synchronized.var -> ('a -> ('b * 'a) option) -> 'b"} \\ \end{mldecls} 🪙 Type 🪙‹'a Synchronized.var› r with state of type 🪙‹'a›.
🪙🪙‹Synchronized.var›~‹name x› creates a synchronized variable that is
initialized withvalue‹x›. The ‹name›is used for tracing.
🪙🪙‹Synchronized.guarded_access›~‹var f› lets the function‹f› operate
within a critical section on the state ‹x› as follows: if‹f x› produces 🪙‹NONE›, it continues to wait on the internal condition variable,
expecting that some other thread will eventually change the content in a
suitable manner; if‹f x› produces 🪙‹SOME›~‹(y, x')› it is satisfied and
assigns the new state value‹x'›, broadcasts a signal to all waiting threads
on the associated condition variable, and returns the result ‹y›.
There are some further variants of the 🪙‹Synchronized.guarded_access›
combinator, see 🍋‹~~/src/Pure/Concurrent/synchronized.ML›for details. ›
text %mlex ‹ The following example implements a counter that produces positive integers that are unique over the runtime of the Isabelle process: ›
ML_val ‹ local val counter = Synchronized.var "counter" 0; in fun next () = Synchronized.guarded_access counter (fn i => let val j = i + 1 in SOME (j, j) end); end; val a = next (); val b = next (); 🍋 (a 🚫b); ›
text‹ 🪙 See 🍋‹~~/src/Pure/Concurrent/mailbox.ML› h
synchronized variable over a purely functional list. ›
text‹ Execution of Standard ML follows the model of strict functional evaluation with optional exceptions. Evaluation happens whenever some function is applied to (sufficiently many) arguments. The result is either an explicit value or an implicit exception. 🪙‹Managed evaluation› i
control certain physical side-conditions, to say more specifically when and
how evaluation happens. For example, the Isabelle/ML library supports lazy
evaluation with memoing, parallel evaluation via futures, asynchronous
evaluation via promises, evaluation with time limit etc.
🪙
An 🪙‹unevaluated expression›is represented either as unit abstraction 🍋‹fn () => a› of type 🍋‹unit -> 'a› or as regular function🍋‹fn a => b› of
type 🍋‹'a -> 'b›. Both forms occur routinely, and special care is required to tell them apart --- the static type-system of SML is only of limited help
here.
The first form is more intuitive: some combinator 🍋‹(unit -> 'a) -> 'a›
applies the given functionto🍋‹()›to initiate the postponed evaluation
process. The second form is more flexible: some combinator 🍋‹('a -> 'b) -> 'a -> 'b› acts like a modified form of function application;
several such combinators may be cascaded to modify a given function, before
it isultimately applied to some argument.
🪙 🪙‹Reified results› make the disjoint sum of regular values versions
exceptional situations explicit as ML datatype: ‹'a result = Res of 'a | Exn of exn›.
overall outcome of an evaluation process.
🪙‹Parallel exceptions› aggregate reified results, such that multiple
exceptions are digested as a collection in canonical form that identifies
exceptions according to their original occurrence. This is particular
important for parallel evaluation via futures \secref{sec:futures}, which
are organized as acyclic graph of evaluations that depend on other
evaluations: exceptions stemming from shared sub-graphs are exposed exactly
once andin the order of their original occurrence (e.g.\ when printed at
the toplevel). Interrupt counts as neutral element here: it is treated as
minimal information about some canceled evaluation process, andis absorbed by the presence of regular program exceptions. ›
text %mlref ‹ \begin{mldecls} @{define_ML_type 'a "Exn.result"} \\ @{define_ML Exn.capture: "('a -> 'b) -> 'a -> 'b Exn.result"} \\ @{define_ML Exn.result: "('a -> 'b) -> 'a -> 'b Exn.result"} \\ @{define_ML Exn.release: "'a Exn.result -> 'a"} \\ @{define_ML Par_Exn.release_all: "'a Exn.result list -> 'a list"} \\ @{define_ML Par_Exn.release_first: "'a Exn.result list -> 'a list"} \\ \end{mldecls} 🪙 Type 🪙‹'a Exn.result› r
explicitly, with constructor 🪙‹Exn.Res›for regular values and🪙‹Exn.Exn›for exceptions.
🪙🪙‹Exn.capture›~‹f x› manages the evaluation of ‹f x› such that
exceptions are made explicit as 🪙‹Exn.Exn›. Note that this includes
physical interrupts (see also\secref{sec:exceptions}), so the same
precautions applyto user code: interrupts must not be absorbed
accidentally!
🪙🪙‹Exn.result›is similar to🪙‹Exn.capture›, but interrupts are
immediately re-raised as required for user code.
🪙🪙‹Exn.release›~‹result› releases the original runtime result, exposing
its regular value or raising the reified exception.
🪙🪙‹Par_Exn.release_all›~‹results› combines results that were produced
independently (e.g.\ by parallel evaluation). If all results are regular
values, that list is returned. Otherwise, the collection of all exceptions is raised, wrapped-up as collective parallel exception. Note that the latter
prevents access to individual exceptions by conventional 🍋‹handle› of ML.
🪙🪙‹Par_Exn.release_first›is similar to🪙‹Par_Exn.release_all›, but
only the first (meaningful) exception that has occurred in the original
evaluation process is raised again, the others are ignored. That single
exception may get handled by conventional means in ML. ›
text‹ Algorithmic skeletons are combinators that operate on lists in parallel, in the manner of well-known ‹map›,
futures (\secref{sec:futures}) and their results as reified exceptions is
wrapped up into simple programming interfaces that resemble the sequential
versions.
What remains is the application-specific problem to present expressions with
suitable 🪙‹granularity›: each list element corresponds to one evaluation
task. If the granularity is too coarse, the available CPUs are not
saturated. If it is too fine-grained, CPU cycles are wasted due to the
overhead of organizing parallel processing. In the worst case, parallel
performance will be less than the sequential counterpart! ›
text %mlref ‹ \begin{mldecls} @{define_ML Par_List.map: "('a -> 'b) -> 'a list -> 'b list"} \\ @{define_ML Par_List.get_some: "('a -> 'b option) -> 'a list -> 'b option"} \\ \end{mldecls} 🪙🪙‹Par_List.map›~\
x🪙n]›, but the evaluation of ‹f x🪙i›for‹i = 1, …, n›is performed in
parallel.
An exception in any ‹f x🪙i› cancels the overall evaluation process. The
final result is produced via 🪙‹Par_Exn.release_first› as explained above,
which means the first program exception that happened to occur in the
parallel evaluation is propagated, and all other failures are ignored.
🪙🪙‹Par_List.get_some›~‹f [x🪙1, …, x🪙n]› produces some ‹f x🪙i› that is of
the form ‹SOME y🪙i›, if that exists, otherwise‹NONE›. Thus it is similar to 🪙‹Library.get_first›, but subject to a non-deterministic parallel choice
process. The first successful result cancels the overall evaluation process;
other exceptions are propagated as for🪙‹Par_List.map›.
This generic parallel choice combinator is the basis for derived forms, such
as 🪙‹Par_List.find_some›, 🪙‹Par_List.exists›, 🪙‹Par_List.forall›. ›
text %mlex ‹ Subsequently, the Ackermann function is evaluated in parallel for some ranges of arguments. ›
ML_val ‹ fun ackermann 0 n = n + 1 | ackermann m 0 = ackermann (m - 1) 1 | ackermann m n = ackermann (m - 1) (ackermann m (n - 1)); Par_List.map (ackermann 2) (500 upto 1000); Par_List.map (ackermann 3) (5 upto 10); ›
subsection‹Lazy evaluation›
text‹ Classic lazy evaluation works via the ‹lazy›~/ ‹lazy›to wrap an unevaluated expression, and‹force›to evaluate it once and store its result persistently. Later invocations of ‹force› retrieve the
stored result without another evaluation. Isabelle/ML refines this idea to
accommodate the aspects of multi-threading, synchronous program exceptions and asynchronous interrupts.
The first thread that invokes ‹force› on an unfinished lazyvalue changes
its state into a 🪙‹promise› of the eventual result and starts evaluating it.
Any other threads that ‹force› the same lazyvaluein the meantime need to
wait for it to finish, by producing a regular result or program exception. If the evaluation attempt is interrupted, this event is propagated to all
waiting threads and the lazyvalueis reset to its original state.
This means a lazyvalueis completely evaluated at most once, in a
thread-safe manner. There might be multiple interrupted evaluation attempts, and multiple receivers of intermediate interrupt events. Interrupts are 🪙‹not› made persistent: later evaluation attempts start again from the
original expression. ›
text %mlref ‹ \begin{mldecls} @{define_ML_type 'a "lazy"} \\ @{define_ML Lazy.lazy: "(unit -> 'a) -> 'a lazy"} \\ @{define_ML Lazy.value: "'a -> 'a lazy"} \\ @{define_ML Lazy.force: "'a lazy -> 'a"} \\ \end{mldecls} 🪙 Type 🪙‹'a lazy› r
🪙🪙‹Lazy.lazy›~‹(fn () => e)› wraps the unevaluated expression ‹e› as
unfinished lazyvalue.
🪙🪙‹Lazy.value›~‹a› wraps the value‹a› as finished lazyvalue. When
forced, it returns ‹a› without any further evaluation.
There is very low overhead for this proforma wrapping of strict values as lazy values.
🪙🪙‹Lazy.force›~‹x› produces the result of the lazyvaluein a
thread-safe manner as explained above. Thus it may cause the current thread to wait on a pending evaluation attempt by another thread. ›
subsection‹Futures \label{sec:futures}›
text‹ Futures help to organize parallel execution in a value-oriented manner, with ‹fork›~/
see also🍋‹"Wenzel:2009" and "Wenzel:2013:ITP"›. Unlike lazy values,
futures are evaluated strictly and spontaneously on separate worker threads.
Futures may be canceled, which leads to interrupts on running evaluation
attempts, and forces structurally related futures to fail for all time;
already finished futures remain unchanged. Exceptions between related
futures are propagated as well, and turned into parallel exceptions (see
above).
Technically, a future is a single-assignment variable together with a 🪙‹task› that serves administrative purposes, notably within the 🪙‹task queue›
worker threads retrieve their work.
The pool of worker threads is limited, in correlation with the number of
physical cores on the machine. Note that allocation of runtime resources may
be distorted either if workers yield CPU time (e.g.\ via system sleep or
wait operations), or if non-worker threads contend for significant runtime
resources independently. There is a limited number of replacement worker
threads that get activated in certain explicit wait conditions, after a
timeout.
🪙
Each future task belongs to some 🪙‹task group›, which represents the
hierarchic structure of related tasks, together with the exception status a
that point. By default, the task group of a newly created future is a new
sub-group of the presently running one, but it isalso possible to indicate
different group layouts under program control.
Cancellation of futures actually refers to the corresponding task group and
all its sub-groups. Thus interrupts are propagated down the group hierarchy.
Regular program exceptions are treated likewise: failure of the evaluation
of some future task affects its own group and all sub-groups. Given a
particular task group, its 🪙‹group status› cumulates all relevant exceptions
according to its position within the group hierarchy. Interrupted tasks that
lack regular result information, will pick up parallel exceptions from the
cumulative group status.
🪙
A 🪙‹passive future› or 🪙‹promise›is a future with slightly different
evaluation policies: there is only a single-assignment variable and some
expression to evaluate for the 🪙‹failed›case (e.g.\ to clean up resources
when canceled). A regular result is produced by external means, using a
separate 🪙‹fulfill› operation.
Promises are managed in the same task queue, so regular futures may depend
on them. This allows a form of reactive programming, where some promises are
used as minimal elements (or guards) within the future dependency graph:
when these promises are fulfilled the evaluation of subsequent futures
starts spontaneously, according to their own inter-dependencies. ›
text %mlref ‹ \begin{mldecls} @{define_ML_type 'a "future"} \\ @{define_ML Future.fork: "(unit -> 'a) -> 'a future"} \\ @{define_ML Future.forks: "Future.params -> (unit -> 'a) list -> 'a future list"} \\ @{define_ML Future.join: "'a future -> 'a"} \\ @{define_ML Future.joins: "'a future list -> 'a list"} \\ @{define_ML Future.value: "'a -> 'a future"} \\ @{define_ML Future.map: "('a -> 'b) -> 'a future -> 'b future"} \\ @{define_ML Future.cancel: "'a future -> unit"} \\ @{define_ML Future.cancel_group: "Future.group -> unit"} \\[0.5ex] @{define_ML Future.promise: "(unit -> unit) -> 'a future"} \\ @{define_ML Future.fulfill: "'a future -> 'a -> unit"} \\ \end{mldecls} 🪙 Type 🪙‹'a future› r
🪙🪙‹Future.fork›~‹(fn () => e)› registers the unevaluated expression ‹e›
as unfinished future value, to be evaluated eventually on the parallel
worker-thread farm. This is a shorthand for🪙‹Future.forks› below, with
default parameters and a single expression.
🪙🪙‹Future.forks›~‹params exprs›is the general interface to fork several
futures simultaneously. The ‹params› consist of the following fields:
🪙‹name : string› (default 🪙‹""›) specifies a common name for the
tasks of the forked futures, which serves diagnostic purposes.
🪙‹group : Future.group option› (default 🪙‹NONE›) specifies an optional
task group for the forked futures. 🪙‹NONE› means that a new sub-group
of the current worker-thread task contextis created. If this is not a
worker thread, the group will be a new root in the group hierarchy.
🪙‹deps : Future.task list› (default 🪙‹[]›) specifies dependencies on
other future tasks, i.e.\ the adjacency relation in the global task queue.
Dependencies on already finished tasks are ignored.
🪙‹pri : int› (default 🪙‹0›) specifies a priority within the task
queue.
Typically there is only little deviation from the default priority 🪙‹0›. As a rule of thumb, 🪙‹~1› means ``low priority" and 🪙‹1› means ``high priority''.
Note that the task priority only affects the position in the queue, not the thread priority. When a worker thread picks up a task for processing, it runs with the normal thread priority to the end (or until canceled). Higher priority tasks that are queued later need to wait until this (or another) worker thread becomes free again.
🪙‹interrupts : bool› (default 🪙‹true›) tells whether the worker thread that processes the corresponding task is initially put into interruptible state. This state may change again while running, by modifying the thread attributes.
With interrupts disabled, a running future task cannot be canceled. It is the responsibility of the programmer that this special state is retained only briefly.
🪙🪙‹Future.join›~‹x› retrieves the value of an already finished future, which may lead to an exception, according to the result of its previous evaluation.
For an unfinished future there are several cases depending on the role of the current thread and the status of the future. A non-worker thread waits passively until the future is eventually evaluated. A worker thread temporarily changes its task context and takes over the responsibility to evaluate the future expression on the spot. The latter is done in a thread-safe manner: other threads that intend to join the same future need to wait until the ongoing evaluation is finished.
Note that excessive use of dynamic dependencies of futures by adhoc joining may lead to bad utilization of CPU cores, due to threads waiting on other threads to finish required futures. The future task farm has a limited amount of replacement threads that continue working on unrelated tasks after some timeout.
Whenever possible, static dependencies of futures should be specified explicitly when forked (see ‹deps› above). Thus the evaluation can work from the bottom up, without join conflicts and wait states.
🪙🪙‹Future.joins›~‹xs› joins the given list of futures simultaneously, which is more efficient than 🪙‹map Future.join›~‹xs›.
Based on the dependency graph of tasks, the current thread takes over the responsibility to evaluate future expressions that are required for the main result, working from the bottom up. Waiting on future results that are presently evaluated on other threads only happens as last resort, when no other unfinished futures are left over.
🪙🪙‹Future.value›~‹a› wraps the value ‹a› as finished future value, bypassing the worker-thread farm. When joined, it returns ‹a› without any further evaluation.
There is very low overhead for this proforma wrapping of strict values as futures.
🪙🪙‹Future.map›~‹f x› is a fast-path implementation of 🪙‹Future.fork›~‹(fn () => f (›\<^ML>‹Future.join›~‹x))›, which avoids the full overhead of the task queue and worker-thread farm as far as possible. The function ‹f› is supposed to be some trivial post-processing or projection of the future result.
🪙🪙‹Future.cancel›~‹x› cancels the task group of the given future, using 🪙‹Future.cancel_group› below.
🪙🪙‹Future.cancel_group›~‹group› cancels all tasks of the given task group for all time. Threads that are presently processing a task of the given group are interrupted: it may take some time until they are actually terminated. Tasks that are queued but not yet processed are dequeued and forced into interrupted state. Since the task group is itself invalidated, any further attempt to fork a future that belongs to it will yield a canceled result as well.
🪙🪙‹Future.promise›~‹abort› registers a passive future with the given ‹abort› operation: it is invoked when the future task group is canceled.
🪙🪙‹Future.fulfill›~‹x a› finishes the passive future ‹x› by the given value ‹a›. If the promise has already been canceled, the attempt to fulfill it causes an exception. \
end
Messung V0.5 in Prozent
¤ Dauer der Verarbeitung: 0.103 Sekunden
(vorverarbeitet am 2026-05-05)
¤
Die Informationen auf dieser Webseite wurden
nach bestem Wissen sorgfältig zusammengestellt. Es wird jedoch weder Vollständigkeit, noch Richtigkeit,
noch Qualität der bereit gestellten Informationen zugesichert.
Bemerkung:
Die farbliche Syntaxdarstellung und die Messung sind noch experimentell.