section‹Adaptation to target languages \label{sec:adaptation}›
subsection‹Adapting code generation›
text‹ The aspects of code generation introduced so far have two aspects in common: \begin{itemize} \item They act uniformly, without reference to a specific target language. \item They are \emph{safe} in the sense that as long as you trust the code generator meta theory and implementation, you cannot produce programs that yield results which are not derivable in the logic. \end{itemize} \noindent In this section we will introduce means to \emph{adapt} the serialiser to a specific target language, i.e.~to print program fragments in a way which accommodates \qt{already existing} ingredients of a target language environment, for three reasons: \begin{itemize} \item improving readability and aesthetics of generated code \item gaining efficiency \item interface with language parts which have no direct counterpart in ‹HOL›(say, imperative data structures) \end{itemize} \noindent Generally, you should avoid using those features yourself \emph{at any cost}: \begin{itemize} \item The safe configuration methods act uniformly on every target language, whereas for adaptation you have to treat each target language separately. \item Application is extremely tedious since there is no abstraction which would allow for a static check, making it easy to produce garbage. \item Subtle errors can be introduced unconsciously. \end{itemize} \noindent However, even if you ought refrain from setting up adaptation yourself, already ‹HOL›comes with some reasonable default adaptations (say, using target language list syntax). There also some common adaptation cases which you can setup by importing particular library theories. In order to understand these, we provide some clues here; these however are not supposed to replace a careful study of the sources. ›
subsection‹The adaptation principle›
text‹ Figure \ref{fig:adaptation} illustrates what \qt{adaptation} is conceptually supposed to be: \begin{figure}[h] \begin{tikzpicture}[scale = 0.5] \tikzstyle water=[color = blue, thick] \tikzstyle ice=[color = black, very thick, cap = round, join = round, fill = white] \tikzstyle process=[color = green, semithick, ->] \tikzstyle adaptation=[color = red, semithick, ->] \tikzstyle target=[color = black] \foreach\x in {0, ..., 24} \draw[style=water] (\x, 0.25) sin + (0.25, 0.25) cos + (0.25, -0.25) sin + (0.25, -0.25) cos + (0.25, 0.25); \draw[style=ice] (1, 0) -- (3, 6) node[above, fill=white] {logic} -- (5, 0) -- cycle; \draw[style=ice] (9, 0) -- (11, 6) node[above, fill=white] {intermediate language} -- (13, 0) -- cycle; \draw[style=ice] (15, -6) -- (19, 6) node[above, fill=white] {target language} -- (23, -6) -- cycle; \draw[style=process] (3.5, 3) .. controls (7, 5) .. node[fill=white] {translation} (10.5, 3); \draw[style=process] (11.5, 3) .. controls (15, 5) .. node[fill=white] (serialisation) {serialisation} (18.5, 3); \node (adaptation) at (11, -2) [style=adaptation] {adaptation}; \node at (19, 3) [rotate=90] {generated}; \node at (19.5, -5) {language}; \node at (19.5, -3) {library}; \node (includes) at (19.5, -1) {includes}; \node (reserved) at (16.5, -3) [rotate=72] {reserved}; % proper 71.57 \draw[style=process] (includes) -- (serialisation); \draw[style=process] (reserved) -- (serialisation); \draw[style=adaptation] (adaptation) -- (serialisation); \draw[style=adaptation] (adaptation) -- (includes); \draw[style=adaptation] (adaptation) -- (reserved); \end{tikzpicture} \caption{The adaptation principle} \label{fig:adaptation} \end{figure} \noindent In the tame view, code generation acts as broker between ‹logic›,‹intermediate language› and ‹target language› structure of the ‹language›itself plus some ‹reserved› keywords which have to be avoided for generated code. However, if you consider ‹adaptation›mechanisms, the code generated by the serializer is just the tip of the iceberg: \begin{itemize} \item‹serialisation›can be \emph{parametrised} such that logical entities are mapped to target-specific ones (e.g. target-specific list syntax, see also \secref{sec:adaptation_mechanisms}) \item Such parametrisations can involve references to a target-specific standard ‹library›(e.g. using the ‹Haskell›🍋‹Maybe› type instead of the ‹HOL› 🪙‹option›type); if such are used, the corresponding identifiers (in our example, 🍋‹Maybe›,🍋‹Nothing› and 🍋‹Just›) also have to be considered ‹reserved›. \item Even more, the user can enrich the library of the target-language by providing code snippets (\qt{‹includes›}) which are prepended to any generated code (see \secref{sec:include}); this typically also involves further ‹reserved›identifiers. \end{itemize} \noindent As figure \ref{fig:adaptation} illustrates, all these adaptation mechanisms have to act consistently; it is at the discretion of the user to take care for this. ›
text‹ The 🍋‹Main›theory of Isabelle/HOL already provides a code generator setup which should be suitable for most applications. Common extensions and modifications are available by certain theories in 🍋‹~~/src/HOL/Library›; beside being useful in applications, they may serve as a tutorial for customising the code generator setup (see below \secref{sec:adaptation_mechanisms}). \begin{description} \item[🍋‹HOL.Code_Numeral›] provides additional numeric types 🍋‹integer›and 🍋‹natural› isomorphic to types 🍋‹int›and 🍋‹nat› respectively. Type 🍋‹integer› is mapped to target-language built-in integers; 🍋‹natural› is implemented as abstract type over 🍋‹integer›. Useful for code setups which involve e.g.~indexing of target-language arrays. Part of ‹HOL-Main›. \item[🍋‹HOL.String›] provides an additional datatype 🍋‹String.literal› which is isomorphic to lists of 7-bit (ASCII) characters; 🍋‹String.literal›s are mapped to target-language strings. Literal values of type 🍋‹String.literal›can be written as ‹STR ''…''›for sequences of printable characters and ‹STR 0x…›for one single ASCII code point given as hexadecimal numeral; 🍋‹String.literal›supports concatenation ‹… + …›for all standard target languages. Note that the particular notion of \qt{string} is target-language specific (sequence of 8-bit units, sequence of unicode code points, \ldots); hence ASCII is the only reliable common base e.g.~for printing (error) messages; more sophisticated applications like verifying parsing algorithms require a dedicated target-language specific model. Nevertheless 🍋‹String.literal›s can be analyzed; the core operations for this are 🪙‹String.asciis_of_literal›and 🪙‹String.literal_of_asciis›which are implemented in a target-language-specific way; particularly 🍋‹String.asciis_of_literal› checks its argument at runtime to make sure that it does not contain non-ASCII-characters, to safeguard consistency. On top of these, more abstract conversions like 🪙‹String.explode›and 🪙‹String.implode› are implemented. Part of ‹HOL-Main›. \item[🍋‹HOL-Library.IArray›] provides a type 🍋‹'a iarray› isomorphic to lists but implemented by (effectively immutable) arrays \emph{in SML only}. \end{description} \noindent Using these adaptation setups the following extensions are provided: \begin{description} \item[‹Code_Target_Int›] implements type 🍋‹int› by 🍋‹integer›and thus by target-language built-in integers. \item[‹Code_Binary_Nat›] implements type 🍋‹nat›using a binary rather than a linear representation, which yields a considerable speedup for computations. Pattern matching with 🍋‹0::nat›/ 🍋‹Suc› is eliminated by a preprocessor.\label{abstract_nat} \item[‹Code_Target_Nat›] implements type 🍋‹nat› by 🍋‹integer›and thus by target-language built-in integers. Pattern matching with 🍋‹0::nat›/ 🍋‹Suc› is eliminated by a preprocessor. \item[‹Code_Target_Numeral›] is a convenience theory containing ‹Code_Target_Nat›,‹Code_Target_Int› and ‹Code_Target_Bit_Shifts›- \item[‹Code_Bit_Shifts_for_Arithmetic›] uses the preprocessor to replace arithmetic operations on numeric types by target-language built-in bit shifts whenever feasible. \item[‹Code_Abstract_Char›] implements type 🍋‹char› by target language integers, sacrificing pattern patching in exchange for dramatically increased performance for comparisons. \end{description} ›
text‹ Consider the following function and its corresponding SML code: ›
primrec %quote in_interval :: "nat × nat ==> nat ==> bool"where "in_interval (k, l) n ⟷ k ≤ n ∧ n ≤ l" (*<*)
code_printing %invisible
type_constructor bool ⇀ (SML)
| constant True ⇀ (SML)
| constant False ⇀ (SML)
| constant HOL.conj ⇀ (SML)
| constant Not ⇀ (SML) (*>*) text %quote ‹ @{code_stmts in_interval (SML)} ›
text‹ \noindent Though this is correct code, it is a little bit unsatisfactory: boolean values and operators are materialised as distinguished entities with have nothing to do with the SML-built-in notion of \qt{bool}. This results in less readable code; additionally, eager evaluation may cause programs to loop or break which would perfectly terminate when the existing SML 🍋‹bool› w ›
text‹ \noindent The @{command_def code_printing} command takes a series of symbols (contants, type constructor, \ldots) together with target-specific custom serialisations. Each custom serialisation starts with a target language identifier followed by an expression, which during code serialisation is inserted whenever the type constructor would occur. Each ``🍋‹_›''
placeholder for the constant's or the type constructor's arguments. ›
text %quote ‹ @{code_stmts in_interval (SML)} ›
text‹ \noindent This still is not perfect: the parentheses around the \qt{andalso} expression are superfluous. Though the serialiser by no means attempts to imitate the rich Isabelle syntax framework, it provides some common idioms, notably associative infixes with precedences which may be used here: ›
text‹ \noindent The attentive reader may ask how we assert that no generated code will accidentally overwrite. For this reason the serialiser has an internal table of identifiers which have to be avoided to be used for new declarations. Initially, this table typically contains the keywords of the target language. It can be extended manually, thus avoiding accidental overwrites, using the @{command_def "code_reserved"} command: ›
text‹ \noindent Next, we try to map HOL pairs to SML pairs, using the infix ``🍋‹*›'' › (*<*)
code_printing %invisible
type_constructor prod ⇀ (SML)
| constant Pair ⇀ (SML) (*>*)
code_printing %quotett
type_constructor prod ⇀ (SML) infix 2 "*"
| constant Pair ⇀ (SML) "!((_),/ (_))"
text‹ \noindent The initial bang ``🍋‹!›''
never to put parentheses around the whole expression (they are
already present), while the parentheses around argument place
holders tell not to put parentheses around the arguments. The slash
``🍋‹/›'' (followed by arbitrary white space) inserts a
space which may be used as a break if necessary during pretty
printing.
These examples give a glimpse what mechanisms custom serialisations
provide; however their usage requires careful thinking in order not to introduce inconsistencies -- or, in other words: custom
serialisations are completely axiomatic.
A further noteworthy detail is that any special character in a
custom serialisation may be quoted using ``🍋‹'›''; thus, in ``🍋‹fn '_ => _›'' the first ``🍋‹_›'' is a
proper underscore while the second ``🍋‹_›'' is a
placeholder. ›
subsection‹‹Haskell› serialisation›
text‹ For convenience, the default ‹HOL› s
maps the 🍋‹equal›classto its counterpart in‹Haskell›,
giving custom serialisations for the class🍋‹equal› and its operation @{const [source] HOL.equal}. ›
text‹ \noindent A problem now occurs whenever a type which is an instance of 🍋‹equal› i ‹Eq›: ›
typedecl %quote bar
instantiation %quote bar :: equal begin
definition %quote "HOL.equal (x::bar) y ⟷ x = y"
instance %quote by standard (simp add: equal_bar_def)
end %quote (*<*)
(*>*) code_printing %quotett
type_constructor bar ⇀ (Haskell) "Integer"
text‹ \noindent The code generator would produce an additional instance, which of course is rejected by the ‹Haskell› c
suppress this additional instance: ›
code_printing %quotett
class_instance bar :: "HOL.equal"⇀ (Haskell) -
subsection‹Enhancing the target language context \label{sec:include}›
text‹ In rare cases it is necessary to \emph{enrich} the context of a target language; this can also be accomplished using the @{command "code_printing"} command: ›
code_printing %quotett code_module"Errno"⇀ (Haskell) ‹module Errno(errno) where errno i = error ("Error number: " ++ show i)›
code_reserved %quotett (Haskell) Errno
text‹ \noindent Such named modules are then prepended to every generated code. Inspect such code in order to find out how this behaves with respect to a particular target language. ›
end
Messung V0.5 in Prozent
¤ Dauer der Verarbeitung: 0.16 Sekunden
(vorverarbeitet am 2026-05-01)
¤
Die Informationen auf dieser Webseite wurden
nach bestem Wissen sorgfältig zusammengestellt. Es wird jedoch weder Vollständigkeit, noch Richtigkeit,
noch Qualität der bereit gestellten Informationen zugesichert.
Bemerkung:
Die farbliche Syntaxdarstellung und die Messung sind noch experimentell.