/**************************************************************************** ** ** This file is part of GAP, a system for computational discrete algebra. ** ** Copyright of GAP belongs to its developers, whose names are too numerous ** to list here. Please refer to the COPYRIGHT file for details. ** ** SPDX-License-Identifier: GPL-2.0-or-later ** ** This file contains the functions of Gasman, the GAP storage manager. ** ** {\Gasman} is a storage manager for applications written in C. That means ** that an application using {\Gasman} requests blocks of storage from ** {\Gasman}. Those blocks of storage are called *bags*. Then the ** application writes data into and reads data from the bags. Finally a bag ** is no longer needed and the application simply forgets it. We say that ** such a bag that is no longer needed is *dead*. {\Gasman} cares about the ** allocation of bags and deallocation of dead bags. Thus these operations ** are transparent to the application, enabling the programmer to ** concentrate on algorithms instead of caring about storage allocation and ** deallocation. ** ** {\Gasman} implements an automatic, cooperating, compacting, generational, ** conservative storage management ** ** *Automatic* means that the application only allocates bags. It need not ** explicitly deallocate them. {\Gasman} automatically determines which ** bags are dead and deallocates them. This is done by a process called ** *garbage collection*. Garbage refers to the bags that are dead, and ** collection refers to the process of deallocating them. ** ** *Cooperating* means that the application must cooperate with {\Gasman}, ** that is it must follow two rules. One rule is that it must not remember ** the addresses of the data area of a bag for too long. The other rule is ** that it must inform {\Gasman} when it has changed a bag. ** ** *Compacting* means that after a garbage collection {\Gasman} compacts the ** bags that are still live, so that the storage made available by ** deallocating the dead bags becomes one large contiguous block. This ** helps to avoid *fragmentation* of the free storage. The downside is that ** the address of the data area of a bag may change during a garbage ** collection, which is the reason why the application must not remember ** this address for too long, i.e., must not keep pointers to or into the ** data area of a bag over a garbage collection. ** ** *Generational* means that {\Gasman} distinguishes between old and young ** bags. Old bags have been allocated some time ago, i.e., they survived at ** least one garbage collection. During a garbage collection {\Gasman} will ** first find and deallocate the dead young bags. Only if that does not ** produce enough free storage, {\Gasman} will find and deallocate the dead ** old bags. The idea behind this is that usually most bags have a very ** short life, so that they will die young. The downside is that this ** requires {\Gasman} to quickly find the young bags that are referenced ** from old bags, which is the reason why an application must inform ** {\Gasman} when it has changed a bag. ** ** *Conservative* means that there are situations in which {\Gasman} cannot ** decide with absolute certainty whether a bag is still live or already ** dead. In these situations {\Gasman} has to assume that the bag is still ** live, and may thus keep a bag longer than it is necessary. ** ** What follows describes the reasons for this design, and at the same time ** the assumptions that were made about the application. This is given so ** you can make an educated guess as to whether {\Gasman} is an appropriate ** storage manager for your application. ** ** {\Gasman} is automatic, because this makes it easier to use in large or ** complex applications. Namely in with a non-automatic storage manager the ** application must decide when to deallocate a bag. This requires in ** general global knowledge, i.e., it is not sufficient to know whether the ** current function may still need the bag, one also needs to know whether ** any other function may still need the bag. With growing size or ** complexity of the application it gets harder to obtain this knowledge. ** ** {\Gasman} is cooperating, because this is a requirement for compaction ** and generations (at least without cooperation, compaction and generations ** are very difficult). As described below, the former is important for ** storage efficiency, the latter for time efficiency. Note that the ** cooperation requires only local knowledge, i.e., whether or not a certain ** function of the application follows the two rules can be decided by just ** looking at the function without any knowledge about the rest of the ** application. ** ** {\Gasman} is compacting, because this allows the efficient usage of the ** available storage by applications where the ratio between the size of the ** smallest and the largest bag is large. Namely with a non-compacting ** storage manager, a part of the free storage may become unavailable ** because it is fragmented into many small pieces, each of which is too ** small to serve an allocation. ** ** {\Gasman} is generational, because this makes it very much faster, at ** least for those applications where most bags will indeed die young. ** Namely a non-generational storage manager must test for each bag whether ** or not it is still live during each garbage collection. However with ** many applications the probability that an old bag, i.e., one that ** survived at least one garbage collection, will also survive the next ** garbage collection is high. A generational storage manager simply ** assumes that each old bag is still live during most garbage collections. ** Thereby it avoids the expensive tests for most bags during most garbage ** collections. ** ** {\Gasman} is conservative, because for most applications only few bags ** are incorrectly assumed to be still live and the additional cooperation ** required to make {\Gasman} (more) precise would slow down the ** application. Note that the problem appears since the C compiler provides ** not enough information to distinguish between true references to bags and ** other values that just happen to look like references. Thus {\Gasman} ** has to assume that everything that could be interpreted as a reference to ** a bag is indeed such a reference, and that this bag is still live. ** Therefore some bags may be kept by {\Gasman}, even though they are ** already dead.
*/
// fill in the data "behind" bags struct OpaqueBag { void * body;
};
GAP_STATIC_ASSERT(sizeof(void *) == sizeof(struct OpaqueBag), "sizeof(OpaqueBag) is wrong");
/**************************************************************************** ** *F WORDS_BAG( <size> ) . . . . . . . . . . words used by a bag of given size ** ** The structure of a bag is a follows{\:} ** ** <identifier> ** __/ ** / ** V ** +---------+ ** |<masterp>| ** +---------+ ** \________________________ ** \ ** V ** +------+-------+------+---------+--------------------------------+----+ ** |<size>|<flags>|<type>| <link> | . . ...| pad| ** +------+-------+------+---------+--------------------------------+----+ ** ** A bag consists of a masterpointer, and a body. ** ** The *masterpointer* is a pointer to the data area of the bag. During a ** garbage collection the masterpointer is the only active pointer to the ** data area of the bag, because of the rule that no pointers to or into the ** data area of a bag may be remembered over calls to functions that may ** cause a garbage collection. It is the job of the garbage collection to ** update the masterpointer of a bag when it moves the bag. ** ** The *identifier* of the bag is a pointer to (the address of) the ** masterpointer of the bag. Thus 'PTR_BAG(<bag>)' is simply '\*<bag>' ** plus a cast. ** ** The *body* of a bag consists of a header, the data area, and the padding. ** ** The header in turn consists of the *type byte*, *flags byte* and the ** *size field*, which is either 32 bits or 48 bits (on 32 resp. 64 bit systems), ** followed by a link word. The 'BagHeader' struct describes the exact ** structure of the header. ** ** The *link word* usually contains the identifier of the bag, i.e., a ** pointer to the masterpointer of the bag. Thus the garbage collection can ** find the masterpointer of a bag through the link word if it knows the ** address of the data area of the bag. The link word is also used by ** {\Gasman} to keep bags on two linked lists (see "ChangedBags" and ** "MarkedBags"). ** ** The *data area* of a bag is the area that contains the data stored by ** the application in this bag. ** ** The *padding* consists of up to 'sizeof(Bag)-1' bytes and pads the body ** so that the size of a body is always a multiple of 'sizeof(Bag)'. This ** is to ensure that bags are always aligned. The macro 'WORDS_BAG(<size>)' ** returns the number of words occupied by the data area and padding of a ** bag of size <size>. ** ** A body in the workspace whose type byte contains the value T_DUMMY is the ** remainder of a 'ResizeBag'. That is it consists either of the unused words ** after a bag has been shrunk, or of the old body of the bag after the ** contents of the body have been copied elsewhere for an extension. The ** size field in the bag header contains the number of bytes in ** this area excluding the first word itself. Note that such a body has no ** link word, because such a remainder does not correspond to a bag (see ** "Implementation of ResizeBag"). ** ** A masterpointer with a value congruent to 1 mod 4 is the relic of an ** object that was weakly but not strongly marked in a recent garbage ** collection. These persist until after the next full garbage collection ** by which time all references to them should have been removed. **
*/
enum {
T_DUMMY = NUM_TYPES - 1,
};
// BAG_SLACK is used to define a block of empty space at the end of each // bag, which can then be marked as "not accessible" in the memory checker // Valgrind
enum { BAG_SLACK = 0 };
// TIGHT_WORDS_BAG defines the actual amount of space a Bag requires, // without BAG_SLACK. staticinline UInt TIGHT_WORDS_BAG(UInt size)
{ return (size + sizeof(Bag) - 1) / sizeof(Bag);
}
/**************************************************************************** ** *V MptrBags . . . . . . . . . . . . . . beginning of the masterpointer area *V MptrEndBags . . . . . . . . . . . . . . . end of the masterpointer area *V OldBags . . . . . . . . . . . . . . . . . beginning of the old bags area *V YoungBags . . . . . . . . . . . . . . . beginning of the young bags area *V AllocBags . . . . . . . . . . . . . . . beginning of the allocation area *V AllocSizeBags . . . . . . . . . . . . . . . . size of the allocation area *V EndBags . . . . . . . . . . . . . . . . . . . . . . end of the workspace ** ** {\Gasman} manages one large block of storage called the *workspace*. The ** layout of the workspace is as follows{\:} ** ** +----------------+----------+----------+-----------------+--------------+ ** | masterpointer | unused | old bags | young bags | allocation | ** | area | area | area | area | area | ** +----------------+----------+----------+-----------------+--------------+ ** ^ ^ ^ ^ ^ ^ ** MptrBags MptrEndBags OldBags YoungBags AllocBags EndBags ** ** The *masterpointer area* contains all the masterpointers of the bags. ** 'MptrBags' points to the beginning of this area and 'MptrEndBags' to the ** end. ** ** Between MptrEndBags and OldBags is an *unused area*. This exists so the ** master points, and bags area, can be moved independently. MptrEndBags ** will always come earlier in memory than OldBags. GASMAN should not touch ** this memory, as it may be used for other purposes. ** ** The *old bags area* contains the bodies of all the bags that survived at ** least one garbage collection. This area is only scanned for dead bags ** during a full garbage collection. 'OldBags' points to the beginning of ** this area and 'YoungBags' to the end. ** ** The *young bags area* contains the bodies of all the bags that have been ** allocated since the last garbage collection. This area is scanned for ** dead bags during each garbage collection. 'YoungBags' points to the ** beginning of this area and 'AllocBags' to the end. ** ** The *allocation area* is the storage that is available for allocation of ** new bags. When a new bag is allocated the storage for the body is taken ** from the beginning of this area, and this area is correspondingly ** reduced. If the body does not fit in the allocation area a garbage ** collection is performed. 'AllocBags' points to the beginning of this ** area and 'EndBags' to the end. ** ** Note that the borders between the areas are not static. In particular ** each allocation increases the size of the young bags area and reduces the ** size of the allocation area. On the other hand each garbage collection ** empties the young bags area.
*/ static Bag MptrBags; static Bag MptrEndBags; static Bag * OldBags;
Bag * YoungBags; // exported for CHANGED_BAG static Bag * AllocBags; static UInt AllocSizeBags; static Bag * EndBags;
UInt MASTER_POINTER_NUMBER(Bag bag)
{ if (bag >= MptrBags && bag < MptrEndBags) { return bag - MptrBags + 1;
} return 0;
}
/* These macros, are (a) for more readable code, but more importantly (b) to ensure that unsigned subtracts and divides are used (since we know the ordering of the pointers. This is needed on > 2GB workspaces on 32 but systems. The Size****Area functions return an answer in units of a word (ie sizeof(UInt) bytes), which should
therefore be small enough not to cause problems. */
staticinline UInt SpaceBetweenPointers(constvoid * a, constvoid * b)
{
GAP_ASSERT(b <= a);
UInt res = (((UInt)((UInt)(a) - (UInt)(b))) / sizeof(Bag)); return res;
}
/**************************************************************************** ** *V FreeMptrBags . . . . . . . . . . . . . . . list of free bag identifiers ** ** 'FreeMptrBags' is the first free bag identifier, i.e., it points to the ** first available masterpointer. If 'FreeMptrBags' is 0 there are no ** available masterpointers. The available masterpointers are managed in a ** forward linked list, i.e., each available masterpointer points to the ** next available masterpointer, except for the last, which contains 0. ** ** When a new bag is allocated it gets the identifier 'FreeMptrBags', and ** 'FreeMptrBags' is set to the value stored in this masterpointer, which is ** the next available masterpointer. When a bag is deallocated by a garbage ** collection its masterpointer is added to the list of available ** masterpointers again.
*/ static Bag FreeMptrBags;
/**************************************************************************** ** *V ChangedBags . . . . . . . . . . . . . . . . . . list of changed old bags ** ** 'ChangedBags' holds a list of old bags that have been changed since the ** last garbage collection, i.e., for which either 'CHANGED_BAG' was called ** or which have been resized. ** ** This list starts with the bag whose identifier is 'ChangedBags', and the ** link word of each bag on the list contains the identifier of the next bag ** on the list. The link word of the last bag on the list contains 0. If ** 'ChangedBags' has the value 0, the list is empty. ** ** The garbage collection needs to know which young bags are subbags of old ** bags, since it must not throw those away in a partial garbage ** collection. Only those old bags that have been changed since the last ** garbage collection can contain references to young bags, which have been ** allocated since the last garbage collection. The application cooperates ** by informing {\Gasman} with 'CHANGED_BAG' which bags it has changed. The ** list of changed old bags is scanned by a partial garbage collection and ** the young subbags of the old bags on this list are marked with 'MarkBag' ** (see "MarkedBags"). Without this list 'CollectBags' would have to scan ** all old bags for references to young bags, which would take too much time ** (see "Implementation of CollectBags"). ** ** 'CHANGED_BAG' puts a bag on the list of changed old bags. 'CHANGED_BAG' ** first checks that <bag> is an old bag by checking that 'PTR_BAG( <bag> )' ** is smaller than 'YoungBags'. Then it checks that the bag is not already ** on the list of changed bags by checking that the link word still contains ** the identifier of <bag>. If <bag> is an old bag that is not already on ** the list of changed bags, 'CHANGED_BAG' puts <bag> on the list of changed ** bags, by setting the link word of <bag> to the current value of ** 'ChangedBags' and then setting 'ChangedBags' to <bag>.
*/
Bag ChangedBags;
/**************************************************************************** ** *V MarkedBags . . . . . . . . . . . . . . . . . . . . . list of marked bags ** ** 'MarkedBags' holds a list of bags that have already been marked during a ** garbage collection by 'MarkBag'. This list is only used during garbage ** collections, so it is always empty outside of garbage collections (see ** "Implementation of CollectBags"). ** ** This list starts with the bag whose identifier is 'MarkedBags', and the ** link word of each bag on the list contains the identifier of the next bag ** on the list. The link word of the last bag on the list contains 0. If ** 'MarkedBags' has the value 0, the list is empty. ** ** Note that some other storage managers do not use such a list during the ** mark phase. Instead they simply let the marking functions call each ** other. While this is somewhat simpler it may use an unbound amount of ** space on the stack. This is particularly bad on systems where the stack ** is not in a separate segment of the address space, and thus may grow into ** the workspace, causing disaster. ** ** 'MarkBag' puts a bag <bag> onto this list. 'MarkBag' has to be ** careful, because it can be called with an argument that is not really a ** bag identifier, and may point outside the programs address space. So ** 'MarkBag' first checks that <bag> points to a properly aligned location ** between 'MptrBags' and 'OldBags'. Then 'MarkBag' checks that <bag> is ** the identifier of a young bag by checking that the masterpointer points ** to a location between 'YoungBags' and 'AllocBags' (if <bag> is the ** identifier of an old bag, the masterpointer will point to a location ** between 'OldBags' and 'YoungBags', and if <bag> only appears to be an ** identifier, the masterpointer could be on the free list of masterpointers ** and point to a location between 'MptrBags' and 'OldBags'). Then ** 'MarkBag' checks that <bag> is not already marked by checking that the ** link word of <bag> contains the identifier of the bag. If any of the ** checks fails, 'MarkBag' does nothing. If all checks succeed, 'MarkBag' ** puts <bag> onto the list of marked bags by putting the current value of ** 'ChangedBags' into the link word of <bag> and setting 'ChangedBags' to ** <bag>. Note that since bags are always placed at the front of the list, ** 'CollectBags' will mark the bags in a depth-first order. This is ** probably good to improve the locality of reference.
*/ static Bag MarkedBags;
/**************************************************************************** ** *V NrAllBags . . . . . . . . . . . . . . . . . number of all bags allocated ** ** 'NrAllBags' is the number of bags allocated since Gasman was initialized. ** It is incremented for each 'NewBag' call.
*/ static UInt NrAllBags;
/**************************************************************************** ** *V NrLiveBags . . . . . . . . . . number of bags that survived the last gc ** ** 'NrLiveBags' is the number of bags that were live after the last garbage ** collection. So after a full garbage collection it is simply the number ** of bags that have been found to be still live by this garbage collection. ** After a partial garbage collection it is the sum of the previous value of ** 'NrLiveBags', which is the number of live old bags, and the number of ** bags that have been found to be still live by this garbage collection, ** which is the number of live young bags. This value is used in the ** information messages, and to find out how many free identifiers are ** available.
*/ static UInt NrLiveBags;
/**************************************************************************** ** *V SizeLiveBags . . . . . . . total size of bags that survived the last gc ** ** 'SizeLiveBags' is the total size of bags that were live after the last ** garbage collection. So after a full garbage collection it is simply the ** total size of bags that have been found to be still live by this garbage ** collection. After a partial garbage collection it is the sum of the ** previous value of 'SizeLiveBags', which is the total size of live old ** bags, and the total size of bags that have been found to be still live by ** this garbage collection, which is the total size of live young bags. ** This value is used in the information messages.
*/ static UInt SizeLiveBags;
/**************************************************************************** ** *V NrDeadBags . . . . . . . number of bags that died since the last full gc ** ** 'NrDeadBags' is the number of bags that died since the last full garbage ** collection. So after a full garbage collection this is zero. After a ** partial garbage collection it is the sum of the previous value of ** 'NrDeadBags' and the number of bags that have been found to be dead by ** this garbage collection. This value is used in the information messages.
*/ static UInt NrDeadBags;
/**************************************************************************** ** *V SizeDeadBags . . . . total size of bags that died since the last full gc ** ** 'SizeDeadBags' is the total size of bags that died since the last full ** garbage collection. So after a full garbage collection this is zero. ** After a partial garbage collection it is the sum of the previous value of ** 'SizeDeadBags' and the total size of bags that have been found to be dead ** by this garbage collection. This value is used in the information ** message.
*/ static UInt8 SizeDeadBags;
/**************************************************************************** ** *V NrDeadBags . . . . . . . . . . number of bags only reachable by weak ptrs ** ** 'NrHalfDeadBags' is the number of bags that have been found to be ** reachable only by way of weak pointers since the last garbage collection. ** The bodies of these bags are deleted, but their identifiers are marked so ** that weak pointer objects can recognize this situation.
*/ static UInt NrHalfDeadBags;
/**************************************************************************** ** *F IS_BAG_ID -- check if a value looks like a masterpointer id
*/ staticinlineBOOL IS_BAG_ID(void * ptr)
{ return (((void *)MptrBags <= ptr) && (ptr < (void *)MptrEndBags) &&
((UInt)ptr & (sizeof(Bag) - 1)) == 0);
}
/**************************************************************************** ** *F IS_BAG_BODY -- check if value like a pointer to a bag body
*/ staticinlineBOOL IS_BAG_BODY(void * ptr)
{ return (((void *)OldBags <= ptr) && (ptr < (void *)AllocBags) &&
((UInt)ptr & (sizeof(Bag) - 1)) == 0);
}
// tell valgrind that the masterpointer, bag contents and bag header of Bag // should all be accessible staticvoid CANARY_ALLOW_ACCESS_BAG(Bag bag)
{
VALGRIND_MAKE_MEM_DEFINED(bag, sizeof(Bag)); char * ptr = (char *)PTR_BAG(bag); Int bagLength = SIZE_BAG(bag);
VALGRIND_MAKE_MEM_DEFINED(ptr, bagLength);
// Reverse CANARY_ALL_ACCESS_BAG, making the masterpointer, bag contents and // bag header all inaccessible staticvoid CANARY_FORBID_ACCESS_BAG(Bag bag)
{
VALGRIND_MAKE_MEM_NOACCESS(bag, sizeof(Bag)); char * ptr = (char *)PTR_BAG(bag); Int bagLength = SIZE_BAG(bag);
VALGRIND_MAKE_MEM_NOACCESS(ptr, bagLength);
// Mark all bags as accessible staticvoid CANARY_ALLOW_ACCESS_ALL_BAGS(void)
{
CallbackForAllBags(CANARY_ALLOW_ACCESS_BAG);
}
// Mark all bags as inaccessible staticvoid CANARY_FORBID_ACCESS_ALL_BAGS(void)
{
VALGRIND_MAKE_MEM_NOACCESS(MptrBags, SizeWorkspace * sizeof(Bag));
}
// Temporarily disable valgrind checking. This is used while creating bags or // adjusting any internal GASMAN structures #define CANARY_DISABLE_VALGRIND() VALGRIND_DISABLE_ERROR_REPORTING
void InitSweepFuncBags (
UInt type,
TNumSweepFuncBags sweep_func )
{ if ( TabSweepFuncBags[type] != 0 ) {
Pr("warning: sweep function for type %d already installed\n", type, 0);
}
TabSweepFuncBags[type] = sweep_func;
}
/**************************************************************************** ** *F MarkAllSubBagsDefault(<bag>) . . . marking function that marks everything ** ** 'MarkAllSubBagsDefault' is the same as 'MarkAllSubBags' but is used as ** the initial default marking function. This allows to catch cases where ** 'InitMarkFuncBags' is called twice for the same type: the first time is ** accepted because the marking function is still 'MarkAllSubBagsDefault'; ** the second time raises a warning, because a non-default marking function ** is being replaced.
*/ staticvoid MarkAllSubBagsDefault(Bag bag, void * ref)
{
MarkArrayOfBags(CONST_PTR_BAG(bag), SIZE_BAG(bag) / sizeof(Bag), ref);
}
// We define MarkBag as a inline function here so that // the compiler can optimize the marking functions using it in the // "current translation unit", i.e. inside gasman.c. // Other marking functions don't get to inline MarkBag calls anymore, // but luckily these are rare (and usually not performance critical // to start with). inlinevoid MarkBag(Bag bag, void * ref)
{ if ( IS_BAG_ID(bag)
&& YoungBags < CONST_PTR_BAG(bag) // points to a young bag
&& CONST_PTR_BAG(bag) <= AllocBags // " " " " "
&& (IS_MARKED_DEAD(bag) || IS_MARKED_HALFDEAD(bag)) )
{
LINK_BAG(bag) = MarkedBags;
MarkedBags = bag;
} #ifdef DEBUG_GASMAN_MARKING elseif (!DisableMarkBagValidation) { if (bag != 0 && !((UInt)bag & 3) && !IS_BAG_ID(bag)) {
BadMarksCounter++;
}
} #endif
}
void MarkBagWeakly(Bag bag)
{ if ( IS_BAG_ID(bag)
&& YoungBags < CONST_PTR_BAG(bag) // points to a young bag
&& CONST_PTR_BAG(bag) <= AllocBags // " " " " "
&& IS_MARKED_DEAD(bag) ) // and not marked already
{ // mark it now as we don't have to recurse
LINK_BAG(bag) = MARKED_HALFDEAD(bag);
}
}
/**************************************************************************** ** *F CallbackForAllBags( <func> ) call a C function on all non-zero mptrs ** ** This calls a C function on every bag, including garbage ones, by simply ** walking the masterpointer area. Not terribly safe. **
*/ void CallbackForAllBags(void (*func)(Bag))
{ for (Bag bag = MptrBags; bag < MptrEndBags; bag++) {
CANARY_DISABLE_VALGRIND(); BOOL is_bag = IS_BAG_BODY(bag->body);
CANARY_ENABLE_VALGRIND(); if (is_bag) {
(*func)(bag);
}
}
}
/**************************************************************************** ** *F InitGlobalBag(<addr>, <cookie>) inform Gasman about global bag identifier ** ** 'InitGlobalBag' simply leaves the address <addr> in a global array, where ** it is used by 'CollectBags'. <cookie> is also recorded to allow things to ** be matched up after loading a saved workspace.
*/ static UInt GlobalsAreSorted;
staticvoid ClearGlobalBags(void)
{
UInt i; for (i = 0; i < GlobalBags.nr; i++)
{
GlobalBags.addr[i] = 0;
GlobalBags.cookie[i] = 0;
}
GlobalBags.nr = 0;
GlobalsAreSorted = 0;
}
if ( GlobalBags.nr == NR_GLOBAL_BAGS ) {
Panic("Gasman cannot handle so many global variables");
}
if (cookie == 0) {
Panic("Gasman got a NULL cookie");
}
for (UInt i = 0; i < GlobalBags.nr; i++) { if (streq(GlobalBags.cookie[i], cookie)) { if (GlobalBags.addr[i] == addr)
Pr("Duplicate global bag entry %s\n", (Int)cookie, 0); else
Pr("Duplicate global bag cookie %s\n", (Int)cookie, 0);
}
}
int RegisterBeforeCollectFuncBags(TNumCollectFuncBags func)
{ if (CollectFuncBags.nrBefore >= ARRAY_SIZE(CollectFuncBags.before)) return 1;
CollectFuncBags.before[CollectFuncBags.nrBefore++] = func; return 0;
}
int RegisterAfterCollectFuncBags(TNumCollectFuncBags func)
{ if (CollectFuncBags.nrAfter >= ARRAY_SIZE(CollectFuncBags.after)) return 1;
CollectFuncBags.after[CollectFuncBags.nrAfter++] = func; return 0;
}
/*************************************************************** * GAP_MEM_CHECK * * One of the hardest categories of bugs to fix in GAP are where * a reference to the internals of a GAP object are kept across * a garbage collection (which moves GAP objects around). * * GAP_MEM_CHECK provides a method of detecting such problems, at * the cost of GREATLY decreased performance (Starting GAP in * --enableMemCheck mode takes days, rather than seconds). * * The fundamental idea behind GAP_MEM_CHECK is, whenever NewBag * or ResizeBag is called, then the contents of every Bag in * GAP is moved, and the memory previously being used is marked * as not readable or writable using 'mprotect'. * * Actually copying all GAP's memory space would be extremely * expensive, so instead we use 'mmap' to set up a set of copies * of the GAP memory space, which are represented by the same * underlying physical memory. * * The 0th such copy (which we also ensure is the one at the * lowest memory address) is special -- this is where we * reference the master pointers (which can't move). We do not * 'mprotect' any of this memory. * * Every time we call 'NewBag' or 'ResizeBag', we change which * copy of the GASMAN memory space the master pointers point * to, disabling access to the previous copy, and enabling access * to the new one. * * We never use the master pointers in any copy other than the * 0th, and we never refer to the Bag area in the 0th copy. However, * it simplifies things to not try to separate the master pointer * and Bag areas, because the master pointer area can grow as GAP * runs. * * Because this code is VERY slow, it can be turned on and off. * At run time, call GASMAN_MEM_CHECK(1) to enable, and * GASMAN_MEM_CHECK(0) to disable. Start GAP with --enableMemCheck * to enable from when GAP starts.
*/
#ifdef GAP_MEM_CHECK
Int EnableMemCheck = 0;
int enableMemCheck(constchar * argv[], void * dummy)
{
fputs("# Warning: --enableMemCheck causes SEVERE slowdowns. Starting GAP may take several days!\n", stderr);
EnableMemCheck = 1; return 1;
}
staticvoid MoveBagMemory(char * oldbase, char * newbase)
{ Int moveSize = (newbase - oldbase) / sizeof(Bag); // update the masterpointers for (Bag bag = MptrBags; bag < MptrEndBags; bag++) { if (MptrEndBags <= (Bag)bag->body)
bag->body = (Bag)bag->body + moveSize;
}
Int newBase = oldBase + 1; // Memory buffer 0 is special, as we use that // copy for the master pointers. Therefore never // block access to it, and skip it when cycling. if (newBase >= GetMembufCount())
newBase = 1;
// call the before functions (if any)
UInt i; for (i = 0; i < CollectFuncBags.nrBefore; ++i)
CollectFuncBags.before[i]();
// Enable access to new memory
mprotect(GetMembuf(newBase), GetMembufSize(), PROT_READ | PROT_WRITE); // Block access to old memory (except block 0, which will only occur // on the first call). if (oldBase != 0) {
mprotect(GetMembuf(oldBase), GetMembufSize(), PROT_NONE);
}
// call the after functions (if any) for (i = 0; i < CollectFuncBags.nrAfter; ++i)
CollectFuncBags.after[i]();
oldBase = newBase;
}
#endif
/**************************************************************************** ** *F ReportOutOfMemory_ ** ** The '1 << 27' check ensures we only try to recover when a large ** allocation fails.
*/ staticvoid ReportOutOfMemory_(int line, UInt size, constchar * reason)
{ // if a small allocation failed, we are in serious trouble if (size <= (1 << 27))
Panic_(__FILE__, line, "%s", reason);
// If a SyStorOverrun message is scheduled, mark it as already printed if (SyStorOverrun == SY_STOR_OVERRUN_TO_REPORT)
SyStorOverrun = SY_STOR_OVERRUN_REPORTED;
ErrorMayQuit("Cannot allocate %d bytes: %s", size, (Int)reason);
}
/**************************************************************************** ** *F ReportOutOfMemory(size, reason) . . . . . . handle running out of memory ** ** `ReportOutOfMemory' is called by gasman when a too large allocation ** occurs. It is implemented a macro so we can get the line of the caller. ** The actual work is done by `ReportOutOfMemory_`.
*/ #define ReportOutOfMemory(size, reason) \
ReportOutOfMemory_(__LINE__, size, reason)
/**************************************************************************** ** *F FinishBags() . . . . . . . . . . . . . . . . . . . . . . .finalize GASMAN ** ** `FinishBags()' ends GASMAN and returns all memory to the OS pool **
*/
void FinishBags( void )
{
SyFreeAllBags();
}
/**************************************************************************** ** *F InitBags(...) . . . . . . . . . . . . . . . . . . . . . initialize Gasman ** ** 'InitBags' stores <stack-bottom> in a global variable. It also allocates ** the initial workspace, and sets up the linked list of available ** masterpointers.
*/ static Bag * StackBottomBags;
void InitBags(UInt initial_size, Bag * stack_bottom)
{ // fix max if it is lower than min if (SyStorMax != 0 && SyStorMax < SyStorMin) {
SyStorMax = SyStorMin;
}
// fix pool size if larger than SyStorKill if (SyStorKill != 0 && SyAllocPool != 0 &&
SyAllocPool > 1024 * SyStorKill) {
SyAllocPool = SyStorKill * 1024;
}
ClearGlobalBags();
// install the allocator and the abort function
ExtraMarkFuncBags = 0;
// install the stack values
StackBottomBags = stack_bottom;
// first get some storage from the operating system
initial_size = (initial_size + 511) & ~(511);
MptrBags = SyAllocBags( initial_size, 1 );
GAP_ASSERT(MptrBags);
EndBags = (Bag *)MptrBags + 1024 * (initial_size / sizeof(Bag *));
// In GAP_MEM_CHECK we want as few master pointers as possible, as we // have to loop over them very frequently. #ifdef GAP_MEM_CHECK
UInt initialBagCount = 100000; #else
UInt initialBagCount = 1024*initial_size/8/sizeof(Bag*); #endif // 1/8th of the storage goes into the masterpointer area
FreeMptrBags = MptrBags; for (Bag bag = MptrBags; bag + 1 < MptrBags + initialBagCount; bag++) {
bag->body = bag + 1;
}
// the rest is for bags
MptrEndBags = MptrBags + initialBagCount; // Add a small gap between the end of the master pointers and OldBags // This is mainly here to ensure we do not break allowing OldBags and // MptrEndBags to differ.
OldBags = (Bag *)MptrEndBags + 10;
YoungBags = OldBags;
AllocBags = OldBags;
AllocSizeBags = 256;
// install the marking functions for (UInt i = 0; i < NUM_TYPES; i++)
TabMarkFuncBags[i] = MarkAllSubBagsDefault;
// Set ChangedBags to a proper initial value
ChangedBags = 0;
/**************************************************************************** ** *F NewBag( <type>, <size> ) . . . . . . . . . . . . . . allocate a new bag ** ** 'NewBag' is actually quite simple. ** ** It first tests whether enough storage is available in the allocation area ** and whether a free masterpointer is available. If not, it starts a ** garbage collection by calling 'CollectBags' passing <size> as the size of ** the bag it is currently allocating and 0 to indicate that only a partial ** garbage collection is called for. If 'CollectBags' fails and returns 0, ** 'NewBag' also fails and also returns 0. ** ** Then it takes the first free masterpointer from the linked list of free ** masterpointers (see "FreeMptrBags"). ** ** Then it writes the size and the type into the word pointed to by ** 'AllocBags'. Then it writes the identifier, i.e., the location of the ** masterpointer, into the next word. ** ** Then it advances 'AllocBags' by '2 + WORDS_BAG(<size>)'. ** ** Finally it returns the identifier of the new bag. ** ** All entries of the new bag will be initialized to 0. ** ** If {\Gasman} was compiled with the option 'COUNT_BAGS' then 'NewBag' also ** updates the information in 'InfoBags' (see "InfoBags"). ** ** 'NewBag' is implemented as a function instead of a macro for three ** reasons. It reduces the size of the program, improving the instruction ** cache hit ratio. The compiler can do anti-aliasing analysis for the ** local variables of the function. To enable statistics only {\Gasman} ** needs to be recompiled.
*/
Bag NewBag (
UInt type,
UInt size )
{
Bag bag; // identifier of the new bag
#ifdef GAP_MEM_CHECK
MaybeMoveBags(); #endif
#ifdef TREMBLE_HEAP
CollectBags(0,0); #endif
GAP_ASSERT(!InWorkspaceRestore());
CANARY_DISABLE_VALGRIND();
// check that a masterpointer and enough storage are available if ( (FreeMptrBags == 0 || SizeAllocationArea < WORDS_BAG(sizeof(BagHeader)+size))
&& CollectBags( size, 0 ) == 0 )
{
ReportOutOfMemory(size, "cannot extend the workspace any more!!");
}
// get the identifier of the bag and set 'FreeMptrBags' to the next
bag = FreeMptrBags;
FreeMptrBags = *(Bag*)bag;
// allocate the storage for the bag
BagHeader * header = (BagHeader *)AllocBags;
AllocBags = DATA(header) + WORDS_BAG(size);
// enter bag header
header->type = type;
header->flags = 0;
header->size = size;
// enter link word
header->link = bag;
// set the masterpointer
SET_PTR_BAG(bag, DATA(header));
CANARY_ALLOW_ACCESS_BAG(bag);
GAP_ASSERT(SanityCheckGasmanPointers());
CANARY_ENABLE_VALGRIND();
// return the identifier of the new bag return bag;
}
/**************************************************************************** ** *F RetypeBag(<bag>,<new>) . . . . . . . . . . . . change the type of a bag ** ** 'RetypeBag' is very simple. ** ** All it has to do is to change type word of the bag. ** ** If {\Gasman} was compiled with the option 'COUNT_BAGS' then 'RetypeBag' ** also updates the information in 'InfoBags' (see "InfoBags").
*/ void RetypeBagIntern(Bag bag, UInt new_type)
{
BagHeader * header = BAG_HEADER(bag);
UInt old_type = header->type;
// exit early if nothing is to be done if (old_type == new_type) return;
#ifdef COUNT_BAGS // update the statistics
{
UInt size;
/**************************************************************************** ** *F ResizeBag(<bag>,<new>) . . . . . . . . . . . . change the size of a bag ** ** Basically 'ResizeBag' is rather simple, but there are a few traps that ** must be avoided. ** ** If the size of the bag changes only a little bit, so that the number of ** words needed for the data area does not change, 'ResizeBag' only changes ** the size word of the bag. ** ** If the bag is to be shrunk and at least one word becomes free, ** 'ResizeBag' changes the size word of the bag, and stores a magic ** size-type word in the first free word. This magic size-type word has ** type T_DUMMY and the size is the number of following free bytes, which is ** always divisible by 'sizeof(Bag)'. The type T_DUMMY allows 'CollectBags' ** to detect that this body is the remainder of a resize operation, and the ** size allows it to know how many bytes there are in this body (see ** "Implementation of CollectBags"). ** ** So for example if 'ResizeBag' shrinks a bag of type 7 from 18 bytes to 10 ** bytes the situation before 'ResizeBag' is as follows{\:} ** ** +---------+ ** |<masterp>| ** +---------+ ** \_____________ ** \ ** V ** +---------+---------+--------------------------------------------+----+ ** | 18 . 7 | <link> | . . . . | pad| ** +---------+---------+--------------------------------------------+----+ ** ** And after 'ResizeBag' the situation is as follows{\:} ** ** +---------+ ** |<masterp>| ** +---------+ ** \_____________ ** \ ** V ** +---------+---------+------------------------+----+-------------+-----+ ** | 10 . 7 | <link> | . . | pad| 4 . T_DUMMY | | ** +---------+---------+------------------------+----+-------------+-----+ ** ** If the bag is to be extended and it is that last allocated bag, so that ** it is immediately adjacent to the allocation area, 'ResizeBag' simply ** increments 'AllocBags' after making sure that enough space is available ** in the allocation area (see "Layout of the Workspace"). ** ** If the bag is to be extended and it is not the last allocated bag, ** 'ResizeBag' first allocates a new bag similar to 'NewBag', but without ** using a new masterpointer. Then it copies the old contents to the new ** bag. Finally it resets the masterpointer of the bag to point to the new ** address. Then it changes the type of the old body to T_DUMMY, so that the ** garbage collection can detect that this body is the remainder of a resize ** (see "Implementation of NewBag" and "Implementation of CollectBags"). ** ** When an old bag is extended, it will now reside in the young bags area, ** and thus appear to be young. Since old bags are supposed to survive ** partial garbage collections 'ResizeBag' must somehow protect this bag ** from partial garbage collections. This is done by putting this bag onto ** the linked list of changed bags (see "ChangedBags"). When a partial ** garbage collection sees a young bag on the list of changed bags, it knows ** that it is the result of 'ResizeBag' of an old bag, and does not throw it ** away (see "Implementation of CollectBags"). Note that when 'ResizeBag' ** tries this, the bag may already be on the linked list, either because it ** has been resized earlier, or because it has been changed. In this case ** 'ResizeBag' simply keeps the bag on this linked list. ** ** If {\Gasman} was compiled with the option 'COUNT_BAGS' then 'ResizeBag' ** also updates the information in 'InfoBags' (see "InfoBags").
*/
UInt ResizeBag (
Bag bag,
UInt new_size )
{
// if the real size of the bag doesn't change, not much needs to be done if ( diff == 0 ) {
header->size = new_size;
}
// if the bag is shrunk we insert a magic marker into the heap // Note: if the bag is the last bag, we could in theory also shrink it // by moving 'AllocBags', however this is not correct as the "freed" // memory may not be zero filled, and zeroing it out would cost us elseif ( diff < 0 ) {
// leave magic size-type word for the sweeper, type must be T_DUMMY
BagHeader * freeHeader = (BagHeader *)(DATA(header) + WORDS_BAG(new_size));
freeHeader->type = T_DUMMY; if ( diff == -1 ) { // if there is only one free word, avoid setting the size in // the header: there is no space for it on 32bit systems; // instead set flags to 1 to inform the sweeper.
freeHeader->flags = 1;
} else {
freeHeader->flags = 0;
freeHeader->size = (-diff-1)*sizeof(Bag);
}
header->size = new_size;
}
// if the last bag is enlarged ... elseif (CONST_PTR_BAG(bag) + WORDS_BAG(old_size) == AllocBags) { // check that enough storage for the new bag is available if (SpaceBetweenPointers(EndBags, CONST_PTR_BAG(bag)) < WORDS_BAG(new_size)
&& CollectBags( new_size-old_size, 0 ) == 0 ) {
ReportOutOfMemory(new_size, "cannot extend the workspace any more!!!");
}
// update header pointer in case bag moved
header = BAG_HEADER(bag);
// simply increase the free pointer if ( YoungBags == AllocBags )
YoungBags += diff;
AllocBags += diff;
// and increase the total amount allocated by the difference #ifdef COUNT_BAGS
InfoBags[type].sizeAll += new_size - old_size; #endif
SizeAllBags += new_size - old_size;
header->size = new_size;
}
// if the bag is enlarged ... else {
// check that enough storage for the new bag is available if ( SizeAllocationArea < WORDS_BAG(sizeof(BagHeader)+new_size)
&& CollectBags( new_size, 0 ) == 0 ) {
ReportOutOfMemory(new_size, "cannot extend the workspace any more!!!!");
}
// update header pointer in case bag moved
header = BAG_HEADER(bag);
// leave magic size-type word for the sweeper, type must be T_DUMMY
header->type = T_DUMMY;
header->flags = 0;
header->size = sizeof(BagHeader) + (TIGHT_WORDS_BAG(old_size) - 1) * sizeof(Bag);
// allocate the storage for the bag
BagHeader * newHeader = (BagHeader *)AllocBags;
AllocBags = DATA(newHeader) + WORDS_BAG(new_size);
CANARY_DISABLE_VALGRIND(); // if the bag is already on the changed bags list, keep it there if ( header->link != bag ) {
newHeader->link = header->link;
}
// if the bag is old, put it onto the changed bags list elseif (CONST_PTR_BAG(bag) <= YoungBags) {
newHeader->link = ChangedBags;
ChangedBags = bag;
}
// if the bag is young, enter the normal link word else {
newHeader->link = bag;
}
CANARY_ENABLE_VALGRIND();
// set the masterpointer
Bag * dst = DATA(newHeader);
SET_PTR_BAG(bag, dst);
// copy the contents of the bag
SyMemmove((void *)dst, (void *)DATA(header), sizeof(Obj) * WORDS_BAG(old_size));
}
/**************************************************************************** ** *F CollectBags( <size>, <full> ) . . . . . . . . . . . . . collect dead bags ** ** 'CollectBags' is the function that does most of the work of {\Gasman}. ** ** A partial garbage collection where every bag is young is clearly a full ** garbage collection. So to perform a full garbage collection, ** 'CollectBags' first sets 'YoungBags' to 'OldBags', making every bag ** young, and empties the list of changed old bags, since there are no old ** bags anymore, there can be no changed old bags anymore. So from now on ** we can assume that 'CollectBags' is doing a partial garbage ** collection. In addition, the values 'NewWeakDeadBagMarker' and ** 'OldWeakDeadBagMarker' are exchanged, so that bag identifiers that have ** been halfdead since before this full garbage collection can be ** distinguished from those which have died on this pass. ** ** Garbage collection is performed in three phases. The mark phase, the ** sweep phase, and the check phase. ** ** In the *mark phase*, 'CollectBags' finds all young bags that are still ** live and builds a linked list of those bags (see "MarkedBags"). A bag is ** put on this list of marked bags by applying 'MarkBag' to its ** identifier. Note that 'MarkBag' checks that a bag is not already on the ** list of marked bags, before it puts it on the list, so no bag can be put ** twice on this list. ** ** First, 'CollectBags' marks all young bags that are directly accessible ** through global variables, i.e., it marks those young bags whose ** identifiers appear in global variables. It does this by applying ** 'MarkBag' to the values at the addresses of global variables that may ** hold bag identifiers provided by 'InitGlobalBag' (see "InitGlobalBag"). ** ** Next, 'CollectBags' marks all young bags that are directly accessible ** through local variables, i.e., it marks those young bags whose ** identifiers appear in the stack. It does this by calling the stack ** marking function <stack-func> (see "InitBags"). The generic stack ** marking function, which is called if <stack-func> (see "InitBags") was 0, ** is described below. The problem is that there is usually not sufficient ** information available to decide if a value on the stack is really the ** identifier of a bag, or is a value of another type that only appears to ** be the identifier of a bag. The position usually taken by the stack ** marking function is that everything on the stack that could possibly be ** interpreted as the identifier of a bag is an identifier of a bag, and ** that this bag is therefore live. This position is what makes {\Gasman} a ** conservative storage manager. ** ** The generic stack marking function 'GenStackFuncBags', which is called if ** <stack-func> (see "InitBags") was 0, works by applying 'MarkBag' to all ** the values on the stack, which is supposed to extend from <stack-start> ** (see "InitBags") to the address of a local variable of the function. ** Note that some local variables may not be stored on the stack, because ** they are still in the processors registers. 'GenStackFuncBags' uses a ** jump buffer 'RegsBags', filled by the C library function 'setjmp', marking ** all bags whose identifiers appear in 'RegsBags'. This is a dirty hack, ** that need not work, but actually works on a surprisingly large number of ** machines. But it will not work on Sun Sparc machines, which have larger ** register files, of which only the part visible to the current function ** will be saved by 'setjmp'. For those machines 'GenStackFuncBags' first ** calls the operating system to flush the whole register file. Note that a ** compiler may save a register somewhere else if it wants to use this ** register for something else. Usually this register is saved further up ** the stack, i.e., beyond the address of the local variable, and ** 'GenStackFuncBags' would not see this value any more. To deal with this ** problem, 'setjmp' must be called *before* 'GenStackFuncBags' is entered, ** i.e., before the registers may have been saved elsewhere. Thus it is ** called from 'CollectBags'. ** ** Next 'CollectBags' marks all young bags that are directly accessible from ** old bags, i.e., it marks all young bags whose identifiers appear in the ** data areas of old bags. It does this by applying 'MarkBag' to each ** identifier appearing in changed old bags, i.e., in those bags that appear ** on the list of changed old bags (see "ChangedBags"). To be more precise ** it calls the marking function for the appropriate type to each changed ** old bag (see "InitMarkFuncBags"). It need not apply the marking function ** to each old bag, because old bags that have not been changed since the ** last garbage collection cannot contain identifiers of young bags, which
--> --------------------
--> maximum size reached
--> --------------------
¤ Dauer der Verarbeitung: 0.33 Sekunden
(vorverarbeitet)
¤
Die Informationen auf dieser Webseite wurden
nach bestem Wissen sorgfältig zusammengestellt. Es wird jedoch weder Vollständigkeit, noch Richtigkeit,
noch Qualität der bereit gestellten Informationen zugesichert.
Bemerkung:
Die farbliche Syntaxdarstellung ist noch experimentell.