theory Model imports
ConcurrentIMP.CIMP "HOL-Library.Sublist" begin
(* From 40e16c534243 by Makarius. Doesn't seem to have a huge impact on run time now (2021-01-07) *) declare subst_all [simp del] [[simproc del: defined_all]]
(*>*) section‹A model of a Schism garbage collector \label{sec:gc-model}›
text‹
following formalises Figures~2.8 (‹mark_object_fn›),
.9 (load and store but not alloc), and 2.15 (garbage collector) of 🍋‹"Pizlo201xPhd"›; see also 🍋‹"Pizlo+2010PLDI"›.
additionally need to model TSO memory, the handshakes and
-and-swap (\texttt{CAS}). We closely model things where
is possible and abstract everything else.
{bold ‹@{emph ‹NOTE›}: this model is for TSO
emph{only}. We elide any details irrelevant for that memory
.›}
begin by defining the types of the various parts. Our program
are labelled with strings for readability. We enumerate the
of the processes in our system. The safety proof treats an
(unbounded) number of mutators.
garbage collector instructs mutators to perform certain actions,
blocks until the mutators signal these actions are done. The
always respond with their work list (a set of
). The handshake can be of one of the specified types.
object consists of a garbage collection mark and two partial
. Firstly the types:
▪ @{typ "'field"} is the abstract type of fields. ▪ @{typ "'ref"} is the abstract type of object references. ▪ @{typ "'mut"} is the abstract type of the mutators' names.
maps:
▪‹obj_fields› maps @{typ "'field"}s to object
references (or @{const "None"} signifying \texttt{NULL} or type
error). ▪‹obj_payload› maps a @{typ "'field"} to non-reference
data. For convenience we similarly allow that to be \texttt{NULL}.
abbreviation"LoadfM ≡ ro_Load mr_fM" abbreviation"LoadMark r ≡ ro_Load (mr_Mark r)" abbreviation"LoadPayload r f ≡ ro_Load (mr_Payload r f)" abbreviation"LoadPhase ≡ ro_Load mr_Phase" abbreviation"LoadRef r f ≡ ro_Load (mr_Ref r f)"
abbreviation"StorefA m ≡ ro_Store (mw_fA m)" abbreviation"StorefM m ≡ ro_Store (mw_fM m)" abbreviation"StoreMark r m ≡ ro_Store (mw_Mark r m)" abbreviation"StorePayload r f pl ≡ ro_Store (mw_Mutate_Payload r f pl)" abbreviation"StorePhase ph ≡ ro_Store (mw_Phase ph)" abbreviation"StoreRef r f r' ≡ ro_Store (mw_Mutate r f r')"
use one locale per process to define a namespace for definitions
to these processes. Mutator definitions are parametrised by the
's identifier ‹m›. We never interpret these locales; we
use their contents by prefixing identifiers with the locale
. This might be considered an abuse. The attributes depend on
scoping somewhat, which is a mixed blessing.
we have more than one mutator then we need to show that mutators do
mutually interfere. To that end we define an extra locale that
these proofs.
the mutators and the garbage collector mark references, which
that a reference is live in the current round of
. This operation is defined in 🍋‹‹Figure~2.8› in "Pizlo201xPhd"›. These definitions are
by the name of the process.
abbreviation
add_to_W_syn :: "location ==> (('field, 'mut, 'payload, 'ref) local_state ==> 'ref) ==> ('field, 'mut, 'payload, 'ref) gc_com" where "add_to_W_syn l r ≡{l}⌊λs. s( W := W s ∪ {r s}, ghost_honorary_grey := {} )⌋" notation add_to_W_syn (‹{_} add'_to'_W›)
text‹
reference we're marking is given in @{const "ref"}. If the current
wins the \texttt{CAS} race then the reference is marked and
to the local work list @{const "W"}.
means we cannot avoid having the mark store pending in a store
; in other words, we cannot have objects atomically transition
white to grey. The following scheme blackens a white object, and
reverts it to grey. The @{const "ghost_honorary_grey"} variable
used to track objects undergoing this transition.
CIMP provides no support for function calls, we prefix each
's label with a string from its callsite.
›
definition
mark_object_fn :: "location ==> ('field, 'mut, 'payload, 'ref) gc_com" where "mark_object_fn l = {l @ ''_mo_null''} IF \<not> (NULL ref) THEN {l @ ''_mo_mark''} load_mark (the ∘ ref) mark_update ;; {l @ ''_mo_fM''} load_fM ;; {l @ ''_mo_mtest''} IF mark \<noteq> Some ∘ fM THEN {l @ ''_mo_phase''} load_phase ;; {l @ ''_mo_ptest''} IF phase \<noteq> ⟨ph_Idle⟩ THEN ―‹CAS: claim object› {l @ ''_mo_co_lock''} lock ;; {l @ ''_mo_co_cmark''} load_mark (the ∘ ref) cas_mark_update ;; {l @ ''_mo_co_ctest''} IF cas_mark = mark THEN {l @ ''_mo_co_mark''} store_mark (the ∘ ref) fM FI ;; {l @ ''_mo_co_unlock''} unlock ;; {l @ ''_mo_co_won''} IF cas_mark = mark THEN {l @ ''_mo_co_W''} add_to_W (the ∘ ref) FI FI FI FI"
end
text‹
worklists (field @{term "W"}) are not subject to TSO. As we later
(\S\ref{def:valid_W_inv}), these are disjoint and hence
on these are private to each process, with the sole
of when the GC requests them from the mutators. We describe
mechanism next.
›
subsection‹Handshakes \label{sec:gc_handshakes}›
text‹
garbage collector needs to synchronise with the mutators.
we do so by having the GC busy-wait: it sets a ‹pending› flag for each mutator
then waits for each to respond.
system side of the interface collects the responses from the
into a single worklist, which acts as a proxy for the garbage
's local worklist during ‹get_roots› and ‹get_work› handshakes.
carefully model the effect these handshakes have on the processes' TSO buffers.
system and mutators track handshake phases using ghost state; see
S\ref{sec:phase-invariants}.
handshake type and handshake pending bit are not subject to TSO as we expect
realistic implementation of handshakes would involve synchronisation.
definition
handshake :: "('field, 'mut, 'payload, 'ref) gc_com" where "handshake = {''sys_hs_gc_set_type''} Response (λreq s. { (s( hs_type := ht, ghost_hs_in_sync := ⟨False⟩, ghost_hs_phase := hp_step ht (ghost_hs_phase s) ), mv_Void) |ht. req = (gc, ro_hs_gc_store_type ht) }) ⊕{''sys_hs_gc_mut_reqs''} Response (λreq s. { (s( hs_pending := (hs_pending s)(m := True) ), mv_Void) |m. req = (gc, ro_hs_gc_store_pending m) }) ⊕{''sys_hs_gc_done''} Response (λreq s. { (s, mv_Bool (¬hs_pending s m)) |m. req = (gc, ro_hs_gc_load_pending m) }) ⊕{''sys_hs_gc_load_W''} Response (λreq s. { (s( W := {} ), mv_Refs (W s)) |_::unit. req = (gc, ro_hs_gc_load_W) }) ⊕{''sys_hs_mut_pending''} Response (λreq s. { (s, mv_Bool (hs_pending s m)) |m. req = (mutator m, ro_hs_mut_load_pending) }) ⊕{''sys_hs_mut''} Response (λreq s. { (s, mv_hs_type (hs_type s)) |m. req = (mutator m, ro_hs_mut_load_type) }) ⊕{''sys_hs_mut_done''} Response (λreq s. { (s( hs_pending := (hs_pending s)(m := False), W := W s ∪ W', ghost_hs_in_sync := (ghost_hs_in_sync s)(m := True) ), mv_Void) |m W'. req = (mutator m, ro_hs_mut_done W') })"
end
text‹
mutators' side of the interface. Also updates the ghost state
the handshake state for @{const "ht_NOOP"} and @{const
ht_GetRoots"} but not @{const "ht_GetWork"}.
we could make these subject to TSO, but that would be over specification.
abbreviation hs_get_work_done_syn :: "location ==> (('field, 'mut, 'payload, 'ref) local_state ==> 'ref set) ==> ('field, 'mut, 'payload, 'ref) gc_com" (‹{_} hs'_get'_work'_done›) where "{l} hs_get_work_done wl ≡{l} Request (λs. (mutator m, ro_hs_mut_done (wl s))) (λ_ s. {s( W := {} )})"
definition
handshake :: "('field, 'mut, 'payload, 'ref) gc_com" where "handshake = {''hs_load_pending''} hs_load_pending_ ;; {''hs_pending''} IF mutator_hs_pending THEN {''hs_mfence''} MFENCE ;; {''hs_load_ht''} hs_load_type ;; {''hs_noop''} IF hs_type =⟨ht_NOOP⟩ THEN {''hs_noop_done''} hs_noop_done_ ELSE {''hs_get_roots''} IF hs_type =⟨ht_GetRoots⟩ THEN {''hs_get_roots_refs''}🍋refs := 🍋roots ;; {''hs_get_roots_loop''} WHILE \<not>EMPTY refs DO {''hs_get_roots_loop_choose_ref''}🍋ref :∈ Some ` 🍋refs ;; {''hs_get_roots_loop''} mark_object ;; {''hs_get_roots_loop_done''}🍋refs := (🍋refs - {the 🍋ref}) OD ;; {''hs_get_roots_done''} hs_get_roots_done_ W ELSE {''hs_get_work''} IF hs_type =⟨ht_GetWork⟩ THEN {''hs_get_work_done''} hs_get_work_done W FI FI FI FI"
system process models the environment in which the garbage
and mutators execute. We translate the x86-TSO memory model
to 🍋‹"DBLP:journals/cacm/SewellSONM10"›
a CIMP process. It is a reactive system: it receives requests and
values, but initiates no communication itself. It can,
, autonomously commit a store pending in a TSO store buffer.
memory bus can be locked by atomic compare-and-swap (\texttt{CAS})
(and others in general). A processor is not blocked
i.e., it can read from memory) when it holds the lock, or no-one
.
›
definition
not_blocked :: "('field, 'mut, 'payload, 'ref) local_state ==> 'mut process_name ==> bool" where "not_blocked s p = (case mem_lock s of None ==> True | Some p' ==> p = p')"
text‹
compute the view a processor has of memory by applying all its
stores.
semantics of TSO memory following 🍋‹‹\S3› in "DBLP:journals/cacm/SewellSONM10"›. This differs
the earlier 🍋‹"DBLP:conf/tphol/OwensSS09"› by allowing the TSO lock to be taken by a
with a non-empty store buffer. We omit their treatment of
; these are handled by the local states of the other
. The system can autonomously take the oldest store in the
buffer for processor ‹p› and commit it to memory, ‹p› either holds the lock or no processor does.
›
definition
mem_TSO :: "('field, 'mut, 'payload, 'ref) gc_com" where "mem_TSO = {''tso_load''} Response (λreq s. { (s, sys_load p mr s) |p mr. req = (p, ro_Load mr) ∧ not_blocked s p }) ⊕{''tso_store''} Response (λreq s. { (s( mem_store_buffers := (mem_store_buffers s)(p := mem_store_buffers s p @ [w]) ), mv_Void) |p w. req = (p, ro_Store w) }) ⊕{''tso_mfence''} Response (λreq s. { (s, mv_Void) |p. req = (p, ro_MFENCE) ∧ mem_store_buffers s p = [] }) ⊕{''tso_lock''} Response (λreq s. { (s( mem_lock := Some p ), mv_Void) |p. req = (p, ro_Lock) ∧ mem_lock s = None }) ⊕{''tso_unlock''} Response (λreq s. { (s( mem_lock := None ), mv_Void) |p. req = (p, ro_Unlock) ∧ mem_lock s = Some p ∧ mem_store_buffers s p = [] }) ⊕{''tso_dequeue_store_buffer''} LocalOp (λs. { (do_store_action w s)( mem_store_buffers := (mem_store_buffers s)(p := ws) ) | p w ws. mem_store_buffers s p = w # ws ∧ not_blocked s p ∧ p ≠ sys })"
text‹
track which references are allocated using the domain of @{const
heap"}.
label{sec:sys_alloc}
now we assume that the system process magically allocates and
references.
also arrange for the object to be marked atomically (see
S\ref{sec:mut_alloc}) which morally should be done by the mutator. In
allocation pools enable this kind of atomicity (wrt the sweep
in the GC described in \S\ref{sec:gc-model-gc}).
that the \texttt{abort} in 🍋‹‹Figure~2.9: Alloc› in "Pizlo201xPhd"› means the atomic
and the mutator can revert to activity outside of
texttt{Alloc}, avoiding deadlock. We instead signal the exhaustion of
heap explicitly, i.e., the @{const "ro_Alloc"} action cannot fail.
›
definition
alloc :: "('field, 'mut, 'payload, 'ref) gc_com" where "alloc = {''alloc''} Response (λreq s. if dom (heap s) = UNIV then {(s, mv_Ref None) |_::unit. snd req = ro_Alloc } else { ( s( heap := (heap s)(r := Some ( obj_mark = fA s, obj_fields = Map.empty, obj_payload = Map.empty )) ), mv_Ref (Some r) ) |r. r ∉ dom (heap s) ∧ snd req = ro_Alloc })"
text‹
are freed by removing them from @{const "heap"}.
›
definition
free :: "('field, 'mut, 'payload, 'ref) gc_com" where "free = {''sys_free''} Response (λreq s. { (s(heap := (heap s)(r := None)), mv_Void) |r. snd req = ro_Free r })"
text‹
top-level system process.
›
definition
com :: "('field, 'mut, 'payload, 'ref) gc_com" where "com = LOOP DO mem_TSO ⊕ alloc ⊕ free ⊕ handshake OD"
end
subsection‹Mutators›
text‹
mutators need to cooperate with the garbage collector. In
, when the garbage collector is not idle the mutators use a
emph{write barrier} (see \S\ref{sec:gc-marking}).
local state for each mutator tracks a working set of references,
abstracts from how the process's registers and stack are
to discover roots.
›
context mut_m begin
text‹
label{sec:mut_alloc}
is defined in 🍋‹‹Figure~2.9› in "Pizlo201xPhd"›. See \S\ref{sec:sys_alloc}
how we abstract it.
and store are defined in 🍋‹‹Figure~2.9› in "Pizlo201xPhd"›.
a reference can increase the set of mutator roots.
›
abbreviation load :: "('field, 'mut, 'payload, 'ref) gc_com"where "load ≡ {''mut_load_choose''} LocalOp (λs. { s( tmp_ref := r, field := f ) |r f. r ∈ roots s }) ;; {''mut_load''} Request (λs. (mutator m, LoadRef (tmp_ref s) (field s))) (λmv s. { s( roots := roots s ∪ set_option r ) |r. mv = mv_Ref r })"
text‹
label{sec:write-barriers}
a reference involves marking both the old and new references,
.e., both \emph{insertion} and \emph{deletion} barriers are
. The deletion barrier preserves the \emph{weak tricolour
}, and the insertion barrier preserves the \emph{strong
invariant}; see \S\ref{sec:strong-tricolour-invariant} for
discussion.
that the the mutator reads the overwritten reference but does not
it in its roots.
definition
store :: "('field, 'mut, 'payload, 'ref) gc_com" where "store = ―‹Choose vars for ‹ref→field := new_ref›› {''store_choose''} LocalOp (λs. { s( tmp_ref := r, field := f, new_ref := r' ) |r f r'. r ∈ roots s ∧ r' ∈ Some ` roots s ∪ {None} }) ;; ―‹Mark the reference we're about to overwrite. Does not update roots.› {''deref_del''} deref tmp_ref field ref_update ;; {''store_del''} mark_object ;; ―‹Mark the reference we're about to insert.› {''lop_store_ins''}🍋ref := 🍋new_ref ;; {''store_ins''} mark_object ;; {''store_ins''} store_ref tmp_ref field new_ref"
text‹
and store payload data.
›
abbreviation load_payload :: "('field, 'mut, 'payload, 'ref) gc_com"where "load_payload ≡ {''mut_load_payload_choose''} LocalOp (λs. { s( tmp_ref := r, field := f ) |r f. r ∈ roots s }) ;; {''mut_load_payload''} Request (λs. (mutator m, LoadPayload (tmp_ref s) (field s))) (λmv s. { s( mutator_data := (mutator_data s)(var := pl) ) |var pl. mv = mv_Payload pl })"
abbreviation store_payload :: "('field, 'mut, 'payload, 'ref) gc_com"where "store_payload ≡ {''mut_store_payload_choose''} LocalOp (λs. { s( tmp_ref := r, field := f, payload_value := pl s ) |r f pl. r ∈ roots s }) ;; {''mut_store_payload''} Request (λs. (mutator m, StorePayload (tmp_ref s) (field s) (payload_value s))) (λmv s. { s( mutator_data := (mutator_data s)(f := pl) ) |f pl. mv = mv_Payload pl })"
text‹
mutator makes a non-deterministic choice amongst its possible
. For completeness we allow mutators to issue \texttt{MFENCE}
. We leave \texttt{CAS} (etc) to future work. Neither has
significant impact on the rest of the development.
{''mark_loop''} WHILE \<not>EMPTY W DO {''mark_loop_inner''} WHILE \<not>EMPTY W DO {''mark_loop_choose_ref''}🍋tmp_ref :∈🍋W ;; {''mark_loop_fields''}🍋field_set := UNIV ;; {''mark_loop_mark_object_loop''} WHILE \<not>EMPTY field_set DO {''mark_loop_mark_choose_field''}🍋field :∈🍋field_set ;; {''mark_loop_mark_deref''}🍋ref := 🍋tmp_ref→🍋field ;; {''mark_loop''} mark_object ;; {''mark_loop_mark_field_done''}🍋field_set := (🍋field_set - {🍋field}) OD ;; {''mark_loop_blacken''}🍋W := (🍋W - {🍋tmp_ref}) OD ;; {''mark_loop_get_work''} handshake_get_work OD ;;
―‹sweep›
{''mark_end''} store_phase ph_Sweep ;; {''sweep_load_fM''} load_fM ;; {''sweep_refs''}🍋refs := UNIV ;; {''sweep_loop''} WHILE \<not>EMPTY refs DO {''sweep_loop_choose_ref''}🍋tmp_ref :∈🍋refs ;; {''sweep_loop_load_mark''}🍋mark := 🍋tmp_ref→flag ;; {''sweep_loop_check''} IF \<not>NULL mark \<and> the ∘ mark \<noteq> fM THEN {''sweep_loop_free''} free tmp_ref FI ;; {''sweep_loop_ref_done''}🍋refs := (🍋refs - {🍋tmp_ref}) OD ;; {''sweep_idle''} store_phase ph_Idle OD"
Die Informationen auf dieser Webseite wurden
nach bestem Wissen sorgfältig zusammengestellt. Es wird jedoch weder Vollständigkeit, noch Richtigkeit,
noch Qualität der bereit gestellten Informationen zugesichert.
Bemerkung:
Die farbliche Syntaxdarstellung und die Messung sind noch experimentell.