// SPDX-License-Identifier: GPL-2.0
/*
 * SLUB: A slab allocator that limits cache line use instead of queuing
 * objects in per cpu and per node lists.
 *
 * The allocator synchronizes using per slab locks or atomic operations
 * and only uses a centralized lock to manage a pool of partial slabs.
 *
 * (C) 2007 SGI, Christoph Lameter
 * (C) 2011 Linux Foundation, Christoph Lameter
 */
/*
 * Lock order:
 *   1. slab_mutex (Global Mutex)
 *   2. node->list_lock (Spinlock)
 *   3. kmem_cache->cpu_slab->lock (Local lock)
 *   4. slab_lock(slab) (Only on some arches)
 *   5. object_map_lock (Only for debugging)
 *
 *   slab_mutex
 *
 *   The role of the slab_mutex is to protect the list of all the slabs
 *   and to synchronize major metadata changes to slab cache structures.
 *   Also synchronizes memory hotplug callbacks.
 *
 *   slab_lock
 *
 *   The slab_lock is a wrapper around the page lock, thus it is a bit
 *   spinlock.
 *
 *   The slab_lock is only used on arches that do not have the ability
 *   to do a cmpxchg_double. It only protects:
 *
 *	A. slab->freelist	-> List of free objects in a slab
 *	B. slab->inuse		-> Number of objects in use
 *	C. slab->objects	-> Number of objects in slab
 *	D. slab->frozen		-> frozen state
 *
 *   Frozen slabs
 *
 *   If a slab is frozen then it is exempt from list management. It is
 *   the cpu slab which is actively allocated from by the processor that
 *   froze it and it is not on any list. The processor that froze the
 *   slab is the one who can perform list operations on the slab. Other
 *   processors may put objects onto the freelist but the processor that
 *   froze the slab is the only one that can retrieve the objects from the
 *   slab's freelist.
 *
 *   CPU partial slabs
 *
 *   The partially empty slabs cached on the CPU partial list are used
 *   for performance reasons, which speeds up the allocation process.
 *   These slabs are not frozen, but are also exempt from list management,
 *   by clearing the SL_partial flag when moving out of the node
 *   partial list. Please see __slab_free() for more details.
 *
 *   To sum up, the current scheme is:
 *   - node partial slab: SL_partial && !frozen
 *   - cpu partial slab: !SL_partial && !frozen
 *   - cpu slab: !SL_partial && frozen
 *   - full slab: !SL_partial && !frozen
 *
 *   list_lock
 *
 *   The list_lock protects the partial and full list on each node and
 *   the partial slab counter. If taken then no new slabs may be added or
 *   removed from the lists nor may the number of partial slabs be
 *   modified. (Note that the total number of slabs is an atomic value
 *   that may be modified without taking the list lock).
 *
 *   The list_lock is a centralized lock and thus we avoid taking it as
 *   much as possible. As long as SLUB does not have to handle partial
 *   slabs, operations can continue without any centralized lock. F.e.
 *   allocating a long series of objects that fill up slabs does not require
 *   the list lock.
 *
 *   For debug caches, all allocations are forced to go through a list_lock
 *   protected region to serialize against concurrent validation.
 *
 *   cpu_slab->lock local lock
 *
 *   This lock protects slowpath manipulation of all kmem_cache_cpu fields
 *   except the stat counters. This is a percpu structure manipulated only by
 *   the local cpu, so the lock protects against being preempted or interrupted
 *   by an irq. Fast path operations rely on lockless operations instead.
 *
 *   On PREEMPT_RT, the local lock neither disables interrupts nor preemption,
 *   which means the lockless fastpath cannot be used as it might interfere with
 *   an in-progress slow path operation. In this case the local lock is always
 *   taken but it still utilizes the freelist for the common operations.
 *
 *   lockless fastpaths
 *
 *   The fast path allocation (slab_alloc_node()) and freeing (do_slab_free())
 *   are fully lockless when satisfied from the percpu slab (and when
 *   cmpxchg_double is possible to use, otherwise slab_lock is taken).
 *   They also don't disable preemption or migration or irqs. They rely on
 *   the transaction id (tid) field to detect being preempted or moved to
 *   another cpu.
 *
 *   irq, preemption, migration considerations
 *
 *   Interrupts are disabled as part of list_lock or local_lock operations, or
 *   around the slab_lock operation, in order to make the slab allocator safe
 *   to use in the context of an irq.
 *
 *   In addition, preemption (or migration on PREEMPT_RT) is disabled in the
 *   allocation slowpath, bulk allocation, and put_cpu_partial(), so that the
 *   local cpu doesn't change in the process and e.g. the kmem_cache_cpu pointer
 *   doesn't have to be revalidated in each section protected by the local lock.
 *
 * SLUB assigns one slab for allocation to each processor.
 * Allocations only occur from these slabs called cpu slabs.
 *
 * Slabs with free elements are kept on a partial list and during regular
 * operations no list for full slabs is used. If an object in a full slab is
 * freed then the slab will show up again on the partial lists.
 * We track full slabs for debugging purposes though because otherwise we
 * cannot scan all objects.
 *
 * Slabs are freed when they become empty. Teardown and setup is
 * minimal so we rely on the page allocator's per cpu caches for
 * fast frees and allocs.
 *
 * slab->frozen		The slab is frozen and exempt from list processing.
 * 			This means that the slab is dedicated to a purpose
 * 			such as satisfying allocations for a specific
 * 			processor. Objects may be freed in the slab while
 * 			it is frozen but slab_free will then skip the usual
 * 			list operations. It is up to the processor holding
 * 			the slab to integrate the slab into the slab lists
 * 			when the slab is no longer needed.
 *
 * 			One use of this flag is to mark slabs that are
 * 			used for allocations. Then such a slab becomes a cpu
 * 			slab. The cpu slab may be equipped with an additional
 * 			freelist that allows lockless access to
 * 			free objects in addition to the regular freelist
 * 			that requires the slab lock.
 *
 * SLAB_DEBUG_FLAGS	Slab requires special handling due to debug
 * 			options set. This moves slab handling out of
 * 			the fast path and disables lockless freelists.
 */
/**
 * enum slab_flags - How the slab flags bits are used.
 * @SL_locked: Is locked with slab_lock()
 * @SL_partial: On the per-node partial list
 * @SL_pfmemalloc: Was allocated from PF_MEMALLOC reserves
 *
 * The slab flags share space with the page flags but some bits have
 * different interpretations. The high bits are used for information
 * like zone/node/section.
 */
enum slab_flags {
SL_locked = PG_locked,
SL_partial = PG_workingset, /* Historical reasons for this bit */
SL_pfmemalloc = PG_active, /* Historical reasons for this bit */
};
/* * We could simply use migrate_disable()/enable() but as long as it's a * function call even on !PREEMPT_RT, use inline preempt_disable() there.
*/ #ifndef CONFIG_PREEMPT_RT #define slub_get_cpu_ptr(var) get_cpu_ptr(var) #define slub_put_cpu_ptr(var) put_cpu_ptr(var) #define USE_LOCKLESS_FAST_PATH() (true) #else #define slub_get_cpu_ptr(var) \
({ \
migrate_disable(); \
this_cpu_ptr(var); \
}) #define slub_put_cpu_ptr(var) \ do { \
(void)(var); \
migrate_enable(); \
} while (0) #define USE_LOCKLESS_FAST_PATH() (false) #endif
/*
 * Issues still to be resolved:
 *
 * - Support PAGE_ALLOC_DEBUG. Should be easy to do.
 *
 * - Variable sizing of the per node arrays
 */

/* Enable to log cmpxchg failures */
#undef SLUB_DEBUG_CMPXCHG
#ifndef CONFIG_SLUB_TINY
/*
 * Minimum number of partial slabs. These will be left on the partial
 * lists even if they are empty. kmem_cache_shrink may reclaim them.
 */
#define MIN_PARTIAL 5

/*
 * Maximum number of desirable partial slabs.
 * The existence of more partial slabs makes kmem_cache_shrink
 * sort the partial list by the number of objects in use.
 */
#define MAX_PARTIAL 10
#else
#define MIN_PARTIAL 0
#define MAX_PARTIAL 0
#endif
/*
 * These debug flags cannot use CMPXCHG because there might be consistency
 * issues when checking or reading debug information
 */
#define SLAB_NO_CMPXCHG (SLAB_CONSISTENCY_CHECKS | SLAB_STORE_USER | \
			 SLAB_TRACE)
/*
 * Debugging flags that require metadata to be stored in the slab. These get
 * disabled when slab_debug=O is used and a cache's min order increases with
 * metadata.
 */
#define DEBUG_METADATA_FLAGS (SLAB_RED_ZONE | SLAB_POISON | SLAB_STORE_USER)
#define OO_SHIFT	16
#define OO_MASK		((1 << OO_SHIFT) - 1)
#define MAX_OBJS_PER_PAGE	32767 /* since slab.objects is u15 */
/*
 * Tracking user of a slab.
 */
#define TRACK_ADDRS_COUNT 16
struct track {
	unsigned long addr;	/* Called from address */
#ifdef CONFIG_STACKDEPOT
	depot_stack_handle_t handle;
#endif
	int cpu;		/* Was running on cpu */
	int pid;		/* Pid context */
	unsigned long when;	/* When did the operation occur */
};
enum stat_item {
ALLOC_FASTPATH, /* Allocation from cpu slab */
ALLOC_SLOWPATH, /* Allocation by getting a new cpu slab */
FREE_FASTPATH, /* Free to cpu slab */
FREE_SLOWPATH, /* Freeing not to cpu slab */
FREE_FROZEN, /* Freeing to frozen slab */
FREE_ADD_PARTIAL, /* Freeing moves slab to partial list */
FREE_REMOVE_PARTIAL, /* Freeing removes last object */
ALLOC_FROM_PARTIAL, /* Cpu slab acquired from node partial list */
ALLOC_SLAB, /* Cpu slab acquired from page allocator */
ALLOC_REFILL, /* Refill cpu slab from slab freelist */
ALLOC_NODE_MISMATCH, /* Switching cpu slab */
FREE_SLAB, /* Slab freed to the page allocator */
CPUSLAB_FLUSH, /* Abandoning of the cpu slab */
DEACTIVATE_FULL, /* Cpu slab was full when deactivated */
DEACTIVATE_EMPTY, /* Cpu slab was empty when deactivated */
DEACTIVATE_TO_HEAD, /* Cpu slab was moved to the head of partials */
DEACTIVATE_TO_TAIL, /* Cpu slab was moved to the tail of partials */
DEACTIVATE_REMOTE_FREES,/* Slab contained remotely freed objects */
DEACTIVATE_BYPASS, /* Implicit deactivation */
ORDER_FALLBACK, /* Number of times fallback was necessary */
CMPXCHG_DOUBLE_CPU_FAIL,/* Failures of this_cpu_cmpxchg_double */
CMPXCHG_DOUBLE_FAIL, /* Failures of slab freelist update */
CPU_PARTIAL_ALLOC, /* Used cpu partial on alloc */
CPU_PARTIAL_FREE, /* Refill cpu partial on free */
CPU_PARTIAL_NODE, /* Refill cpu partial from node partial */
CPU_PARTIAL_DRAIN, /* Drain cpu partial to node partial */
NR_SLUB_STAT_ITEMS
};
#ifndef CONFIG_SLUB_TINY
/*
 * When changing the layout, make sure freelist and tid are still compatible
 * with this_cpu_cmpxchg_double() alignment requirements.
 */
struct kmem_cache_cpu {
	union {
		struct {
			void **freelist;	/* Pointer to next available object */
			unsigned long tid;	/* Globally unique transaction id */
		};
		freelist_aba_t freelist_tid;
	};
	struct slab *slab;	/* The slab from which we are allocating */
#ifdef CONFIG_SLUB_CPU_PARTIAL
	struct slab *partial;	/* Partially allocated slabs */
#endif
	local_lock_t lock;	/* Protects the fields above */
#ifdef CONFIG_SLUB_STATS
	unsigned int stat[NR_SLUB_STAT_ITEMS];
#endif
};
#endif /* CONFIG_SLUB_TINY */
static inline void stat(const struct kmem_cache *s, enum stat_item si)
{
#ifdef CONFIG_SLUB_STATS
	/*
	 * The rmw is racy on a preemptible kernel but this is acceptable, so
	 * avoid this_cpu_add()'s irq-disable overhead.
	 */
	raw_cpu_inc(s->cpu_slab->stat[si]);
#endif
}
/*
 * Iterator over all nodes. The body will be executed for each node that has
 * a kmem_cache_node structure allocated (which is true for all online nodes)
 */
#define for_each_kmem_cache_node(__s, __node, __n) \
	for (__node = 0; __node < nr_node_ids; __node++) \
		 if ((__n = get_node(__s, __node)))
/*
 * Tracks for which NUMA nodes we have kmem_cache_nodes allocated.
 * Corresponds to node_state[N_MEMORY], but can temporarily
 * differ during memory hotplug/hotremove operations.
 * Protected by slab_mutex.
 */
static nodemask_t slab_nodes;
#ifndef CONFIG_SLUB_TINY
/*
 * Workqueue used for flush_cpu_slab().
 */
static struct workqueue_struct *flushwq;
#endif
/*
 * Returns freelist pointer (ptr). With hardening, this is obfuscated
 * with an XOR of the address where the pointer is held and a per-cache
 * random number.
 */
static inline freeptr_t freelist_ptr_encode(const struct kmem_cache *s,
					    void *ptr, unsigned long ptr_addr)
{
	unsigned long encoded;

/*
 * When running under KMSAN, get_freepointer_safe() may return an uninitialized
 * pointer value in the case the current thread loses the race for the next
 * memory chunk in the freelist. In that case this_cpu_cmpxchg_double() in
 * slab_alloc_node() will fail, so the uninitialized value won't be used, but
 * KMSAN will still check all arguments of cmpxchg because of imperfect
 * handling of inline assembly.
 * To work around this problem, we apply __no_kmsan_checks to ensure that
 * get_freepointer_safe() returns initialized memory.
 */
__no_kmsan_checks
static inline void *get_freepointer_safe(struct kmem_cache *s, void *object)
{
	unsigned long freepointer_addr;
	freeptr_t p;

	if (!debug_pagealloc_enabled_static())
		return get_freepointer(s, object);
/*
 * See comment in calculate_sizes().
 */
static inline bool freeptr_outside_object(struct kmem_cache *s)
{
	return s->offset >= s->inuse;
}

/*
 * Return offset of the end of info block which is inuse + free pointer if
 * not overlapping with object.
 */
static inline unsigned int get_info_end(struct kmem_cache *s)
{
	if (freeptr_outside_object(s))
		return s->inuse + sizeof(void *);
	else
		return s->inuse;
}
/* Loop over all objects in a slab */
#define for_each_object(__p, __s, __addr, __objects) \
	for (__p = fixup_red_left(__s, __addr); \
		__p < (__addr) + (__objects) * (__s)->size; \
		__p += (__s)->size)
/*
 * We take the number of objects but actually limit the number of
 * slabs on the per cpu partial list, in order to limit excessive
 * growth of the list. For simplicity we assume that the slabs will
 * be half-full.
 */
	nr_slabs = DIV_ROUND_UP(nr_objects * 2, oo_objects(s->oo));
	s->cpu_partial_slabs = nr_slabs;
}
/*
 * If network-based swap is enabled, slub must keep track of whether memory
 * was allocated from pfmemalloc reserves.
 */
static inline bool slab_test_pfmemalloc(const struct slab *slab)
{
	return test_bit(SL_pfmemalloc, &slab->flags);
}
/*
 * Interrupts must be disabled (for the fallback code to work right), typically
 * by an _irqsave() lock variant. On PREEMPT_RT the preempt_disable(), which is
 * part of bit_spin_lock(), is sufficient because the policy is not to allow any
 * allocation/free operation in hardirq context. Therefore nothing can
 * interrupt the operation.
 */
static inline bool __slab_update_freelist(struct kmem_cache *s, struct slab *slab,
					  void *freelist_old, unsigned long counters_old,
					  void *freelist_new, unsigned long counters_new,
					  const char *n)
{
	bool ret;

	if (USE_LOCKLESS_FAST_PATH())
		lockdep_assert_irqs_disabled();

	if (s->flags & __CMPXCHG_DOUBLE) {
		ret = __update_freelist_fast(slab, freelist_old, counters_old,
					     freelist_new, counters_new);
	} else {
		ret = __update_freelist_slow(slab, freelist_old, counters_old,
					     freelist_new, counters_new);
	}
	if (likely(ret))
		return true;
/*
 * kmalloc caches have fixed sizes (mostly powers of 2), and the kmalloc() API
 * family will round up the real request size to these fixed ones, so
 * there could be an extra area beyond what is requested. Save the original
 * request size in the metadata area, for better debug and sanity checks.
 */
static inline void set_orig_size(struct kmem_cache *s,
				 void *object, unsigned int orig_size)
{
	void *p = kasan_reset_tag(object);

	if (!slub_debug_orig_size(s))
		return;

	p += get_info_end(s);
	p += sizeof(struct track) * 2;
/*
 * slub is about to manipulate internal object metadata. This memory lies
 * outside the range of the allocated object, so accessing it would normally
 * be reported by kasan as a bounds error. metadata_access_enable() is used
 * to tell kasan that these accesses are OK.
 */
static inline void metadata_access_enable(void)
{
	kasan_disable_current();
	kmsan_disable_current();
}
/* Verify that a pointer has an address that is valid within a slab page */
static inline int check_valid_pointer(struct kmem_cache *s,
				      struct slab *slab, void *object)
{
	void *base;

	if (!object)
		return 1;

	base = slab_address(slab);
	object = kasan_reset_tag(object);
	object = restore_red_left(s, object);
	if (object < base || object >= base + slab->objects * s->size ||
	    (object - base) % s->size) {
		return 0;
	}
	pr_err("Object 0x%p @offset=%tu fp=0x%p\n\n",
	       p, p - addr, get_freepointer(s, p));

	if (s->flags & SLAB_RED_ZONE)
		print_section(KERN_ERR, "Redzone  ", p - s->red_left_pad,
			      s->red_left_pad);
	else if (p > addr + 16)
		print_section(KERN_ERR, "Bytes b4 ", p - 16, 16);

	print_section(KERN_ERR,         "Object   ", p,
		      min_t(unsigned int, s->object_size, PAGE_SIZE));
	if (s->flags & SLAB_RED_ZONE)
		print_section(KERN_ERR, "Redzone  ", p + s->object_size,
			      s->inuse - s->object_size);

	off = get_info_end(s);

	if (s->flags & SLAB_STORE_USER)
		off += 2 * sizeof(struct track);

	if (slub_debug_orig_size(s))
		off += sizeof(unsigned int);

	off += kasan_metadata_size(s, false);

	if (off != size_from_object(s))
		/* Beginning of the filler is the free pointer */
		print_section(KERN_ERR, "Padding  ", p + off,
			      size_from_object(s) - off);
}
	if (s->flags & SLAB_RED_ZONE) {
		/*
		 * Here and below, avoid overwriting the KMSAN shadow. Keeping
		 * the shadow makes it possible to distinguish uninit-value
		 * from use-after-free.
		 */
		memset_no_sanitize_memory(p - s->red_left_pad, val,
					  s->red_left_pad);

		if (slub_debug_orig_size(s) && val == SLUB_RED_ACTIVE) {
			/*
			 * Redzone the extra allocated space by kmalloc than
			 * requested, and the poison size will be limited to
			 * the original request size accordingly.
			 */
			poison_size = get_orig_size(s, object);
		}
	}
/*
 * Object layout:
 *
 * object address
 * 	Bytes of the object to be managed.
 * 	If the freepointer may overlay the object then the free
 *	pointer is at the middle of the object.
 *
 * 	Poisoning uses 0x6b (POISON_FREE) and the last byte is
 * 	0xa5 (POISON_END)
 *
 * object + s->object_size
 * 	Padding to reach word boundary. This is also used for Redzoning.
 * 	Padding is extended by another word if Redzoning is enabled and
 * 	object_size == inuse.
 *
 * 	We fill with 0xbb (SLUB_RED_INACTIVE) for inactive objects and with
 * 	0xcc (SLUB_RED_ACTIVE) for objects in use.
 *
 * object + s->inuse
 * 	Meta data starts here.
 *
 * 	A. Free pointer (if we cannot overwrite object on free)
 * 	B. Tracking data for SLAB_STORE_USER
 *	C. Original request size for kmalloc object (SLAB_STORE_USER enabled)
 *	D. Padding to reach required alignment boundary or at minimum
 * 		one word if debugging is on to be able to detect writes
 * 		before the word boundary.
 *
 *	Padding is done using 0x5a (POISON_INUSE)
 *
 * object + s->size
 * 	Nothing is used beyond s->size.
 *
 * If slabcaches are merged then the object_size and inuse boundaries are mostly
 * ignored. And therefore no slab options that rely on these boundaries
 * may be used with merged slabcaches.
 */
static int check_pad_bytes(struct kmem_cache *s, struct slab *slab, u8 *p)
{
	unsigned long off = get_info_end(s);	/* The end of info */

	if (s->flags & SLAB_STORE_USER) {
		/* We also have user information there */
		off += 2 * sizeof(struct track);

		if (s->flags & SLAB_KMALLOC)
			off += sizeof(unsigned int);
	}
/* Check the pad bytes at the end of a slab page */
static pad_check_attributes void
slab_pad_check(struct kmem_cache *s, struct slab *slab)
{
	u8 *start;
	u8 *fault;
	u8 *end;
	u8 *pad;
	int length;
	int remainder;

	if (!(s->flags & SLAB_POISON))
		return;

	start = slab_address(slab);
	length = slab_size(slab);
	end = start + length;
	remainder = length % s->size;
	if (!remainder)
		return;

	pad = end - remainder;
	metadata_access_enable();
	fault = memchr_inv(kasan_reset_tag(pad), POISON_INUSE, remainder);
	metadata_access_disable();
	if (!fault)
		return;
	while (end > fault && end[-1] == POISON_INUSE)
		end--;
	if (s->flags & SLAB_RED_ZONE) {
		if (!check_bytes_and_report(s, slab, object, "Left Redzone",
			object - s->red_left_pad, val, s->red_left_pad, ret))
			ret = 0;

		if (!check_bytes_and_report(s, slab, object, "Right Redzone",
			endobject, val, s->inuse - s->object_size, ret))
			ret = 0;

		if (slub_debug_orig_size(s) && val == SLUB_RED_ACTIVE) {
			orig_size = get_orig_size(s, object);

			if (s->object_size > orig_size &&
			    !check_bytes_and_report(s, slab, object,
					"kmalloc Redzone", p + orig_size,
					val, s->object_size - orig_size, ret)) {
				ret = 0;
			}
		}
	} else {
		if ((s->flags & SLAB_POISON) && s->object_size < s->inuse) {
			if (!check_bytes_and_report(s, slab, p, "Alignment padding",
				endobject, POISON_INUSE,
				s->inuse - s->object_size, ret))
				ret = 0;
		}
	}

	if (s->flags & SLAB_POISON) {
		if (val != SLUB_RED_ACTIVE && (s->flags & __OBJECT_POISON)) {
			/*
			 * KASAN can save its free meta data inside of the
			 * object at offset 0. Thus, skip checking the part of
			 * the redzone that overlaps with the meta data.
			 */
			kasan_meta_size = kasan_metadata_size(s, true);
			if (kasan_meta_size < s->object_size - 1 &&
			    !check_bytes_and_report(s, slab, p, "Poison",
					p + kasan_meta_size, POISON_FREE,
					s->object_size - kasan_meta_size - 1, ret))
				ret = 0;
			if (kasan_meta_size < s->object_size &&
			    !check_bytes_and_report(s, slab, p, "End Poison",
					p + s->object_size - 1, POISON_END, 1, ret))
				ret = 0;
		}
		/*
		 * check_pad_bytes cleans up on its own.
		 */
		if (!check_pad_bytes(s, slab, p))
			ret = 0;
	}

	/*
	 * Cannot check freepointer while object is allocated if
	 * object and freepointer overlap.
	 */
	if ((freeptr_outside_object(s) || val != SLUB_RED_ACTIVE) &&
	    !check_valid_pointer(s, slab, get_freepointer(s, p))) {
		object_err(s, slab, p, "Freepointer corrupt");
		/*
		 * No choice but to zap it and thus lose the remainder
		 * of the free objects in this slab. May cause
		 * another error because the object count is now wrong.
		 */
		set_freepointer(s, p, NULL);
		ret = 0;
	}

	return ret;
}
static int check_slab(struct kmem_cache *s, struct slab *slab)
{
	int maxobj;

	if (!folio_test_slab(slab_folio(slab))) {
		slab_err(s, slab, "Not a valid slab page");
		return 0;
	}

	maxobj = order_objects(slab_order(slab), s->size);
	if (slab->objects > maxobj) {
		slab_err(s, slab, "objects %u > max %u",
			 slab->objects, maxobj);
		return 0;
	}
	if (slab->inuse > slab->objects) {
		slab_err(s, slab, "inuse %u > max %u",
			 slab->inuse, slab->objects);
		return 0;
	}
	if (slab->frozen) {
		slab_err(s, slab, "Slab disabled since SLUB metadata consistency check failed");
		return 0;
	}

	/* Slab_pad_check fixes things up after itself */
	slab_pad_check(s, slab);
	return 1;
}
/*
 * Determine if a certain object in a slab is on the freelist. Must hold the
 * slab lock to guarantee that the chains are in a consistent state.
 */
static bool on_freelist(struct kmem_cache *s, struct slab *slab, void *search)
{
	int nr = 0;
	void *fp;
	void *object = NULL;
	int max_objects;

	if (!check_object(s, slab, object, SLUB_RED_INACTIVE))
		return 0;

	return 1;
}
static noinline bool alloc_debug_processing(struct kmem_cache *s,
			struct slab *slab, void *object, int orig_size)
{
	if (s->flags & SLAB_CONSISTENCY_CHECKS) {
		if (!alloc_consistency_checks(s, slab, object))
			goto bad;
	}

	/* Success. Perform special debug activities for allocs */
	trace(s, slab, object, 1);
	set_orig_size(s, object, orig_size);
	init_object(s, object, SLUB_RED_ACTIVE);
	return true;

bad:
	if (folio_test_slab(slab_folio(slab))) {
		/*
		 * If this is a slab page then lets do the best we can
		 * to avoid issues in the future. Marking all objects
		 * as used avoids touching the remaining objects.
		 */
		slab_fix(s, "Marking all objects used");
		slab->inuse = slab->objects;
		slab->freelist = NULL;
		slab->frozen = 1; /* mark consistency-failed slab as frozen */
	}
	return false;
}
	if (!check_object(s, slab, object, SLUB_RED_ACTIVE))
		return 0;

	if (unlikely(s != slab->slab_cache)) {
		if (!folio_test_slab(slab_folio(slab))) {
			slab_err(s, slab, "Attempt to free object(0x%p) outside of slab",
				 object);
		} else if (!slab->slab_cache) {
			slab_err(NULL, slab, "No slab cache for object 0x%p",
				 object);
		} else {
			object_err(s, slab, object,
				   "page slab pointer corrupt.");
		}
		return 0;
	}
	return 1;
}
/*
 * Parse a block of slab_debug options. Blocks are delimited by ';'
 *
 * @str:    start of block
 * @flags:  returns parsed flags, or DEBUG_DEFAULT_FLAGS if none specified
 * @slabs:  return start of list of slabs, or NULL when there's no list
 * @init:   assume this is initial parsing and not per-kmem-create parsing
 *
 * returns the start of next block if there's any, or NULL
 */
static char *
parse_slub_debug_flags(char *str, slab_flags_t *flags, char **slabs, bool init)
{
	bool higher_order_disable = false;

	/* Skip any completely empty blocks */
	while (*str && *str == ';')
		str++;

	if (*str == ',') {
		/*
		 * No options but restriction on slabs. This means full
		 * debugging for slabs matching a pattern.
		 */
		*flags = DEBUG_DEFAULT_FLAGS;
		goto check_slabs;
	}
	*flags = 0;

	/* Determine which debug features should be switched on */
	for (; *str && *str != ',' && *str != ';'; str++) {
		switch (tolower(*str)) {
		case '-':
			*flags = 0;
			break;
		case 'f':
			*flags |= SLAB_CONSISTENCY_CHECKS;
			break;
		case 'z':
			*flags |= SLAB_RED_ZONE;
			break;
		case 'p':
			*flags |= SLAB_POISON;
			break;
		case 'u':
			*flags |= SLAB_STORE_USER;
			break;
		case 't':
			*flags |= SLAB_TRACE;
			break;
		case 'a':
			*flags |= SLAB_FAILSLAB;
			break;
		case 'o':
			/*
			 * Avoid enabling debugging on caches if its minimum
			 * order would increase as a result.
			 */
			higher_order_disable = true;
			break;
		default:
			if (init)
				pr_err("slab_debug option '%c' unknown. skipped\n", *str);
		}
	}
check_slabs:
	if (*str == ',')
		*slabs = ++str;
	else
		*slabs = NULL;

	/* Skip over the slab list */
	while (*str && *str != ';')
		str++;

	/* Skip any completely empty blocks */
	while (*str && *str == ';')
		str++;

	if (init && higher_order_disable)
		disable_higher_order_debug = 1;
	/*
	 * For backwards compatibility, a single list of flags with list of
	 * slabs means debugging is only changed for those slabs, so the global
	 * slab_debug should be unchanged (0 or DEBUG_DEFAULT_FLAGS, depending
	 * on CONFIG_SLUB_DEBUG_ON). We can extend that to multiple lists as
	 * long as there is no option specifying flags without a slab list.
	 */
	if (slab_list_specified) {
		if (!global_slub_debug_changed)
			global_flags = slub_debug;
		slub_debug_string = saved_str;
	}
out:
	slub_debug = global_flags;
	if (slub_debug & SLAB_STORE_USER)
		stack_depot_request_early_init();
	if (slub_debug != 0 || slub_debug_string)
		static_branch_enable(&slub_debug_enabled);
	else
		static_branch_disable(&slub_debug_enabled);
	if ((static_branch_unlikely(&init_on_alloc) ||
	     static_branch_unlikely(&init_on_free)) &&
	    (slub_debug & SLAB_POISON))
		pr_info("mem auto-init: SLAB_POISON will take precedence over init_on_alloc/init_on_free\n");
	return 1;
}
/*
 * kmem_cache_flags - apply debugging options to the cache
 * @flags:	flags to set
 * @name:	name of the cache
 *
 * Debug option(s) are applied to @flags. In addition to the debug
 * option(s), if a slab name (or multiple) is specified i.e.
 * slab_debug=<Debug-Options>,<slab name1>,<slab name2> ...
 * then only the select slabs will receive the debug option(s).
 */
slab_flags_t kmem_cache_flags(slab_flags_t flags, const char *name)
{
	char *iter;
	size_t len;
	char *next_block;
	slab_flags_t block_flags;
	slab_flags_t slub_debug_local = slub_debug;

	if (flags & SLAB_NO_USER_FLAGS)
		return flags;

	/*
	 * If the slab cache is for debugging (e.g. kmemleak) then
	 * don't store user (stack trace) information by default,
	 * but let the user enable it via the command line below.
	 */
	if (flags & SLAB_NOLEAKTRACE)
		slub_debug_local &= ~SLAB_STORE_USER;

	len = strlen(name);
	next_block = slub_debug_string;
	/* Go through all blocks of debug options, see if any matches our slab's name */
	while (next_block) {
		next_block = parse_slub_debug_flags(next_block, &block_flags, &iter, false);
		if (!iter)
			continue;
		/* Found a block that has a slab list, search it */
		while (*iter) {
			char *end, *glob;
			size_t cmplen;

			end = strchrnul(iter, ',');
			if (next_block && next_block < end)
				end = next_block - 1;
static inline void handle_failed_objexts_alloc(unsigned long obj_exts,
				struct slabobj_ext *vec, unsigned int objects)
{
	/*
	 * If vector previously failed to allocate then we have live
	 * objects with no tag reference. Mark all references in this
	 * vector as empty to avoid warnings later on.
	 */
	if (obj_exts & OBJEXTS_ALLOC_FAIL) {
		unsigned int i;

		for (i = 0; i < objects; i++)
			set_codetag_empty(&vec[i].ref);
	}
}
/*
 * The allocated objcg pointers array is not accounted directly.
 * Moreover, it should not come from DMA buffer and is not readily
 * reclaimable. So those GFP bits should be masked off.
 */
#define OBJCGS_CLEAR_MASK	(__GFP_DMA | __GFP_RECLAIMABLE | \
				__GFP_ACCOUNT | __GFP_NOFAIL)

	gfp &= ~OBJCGS_CLEAR_MASK;
	/* Prevent recursive extension vector allocation */
	gfp |= __GFP_NO_OBJ_EXT;
	vec = kcalloc_node(objects, sizeof(struct slabobj_ext), gfp,
			   slab_nid(slab));
	if (!vec) {
		/*
		 * Try to mark vectors which failed to allocate.
		 * If this operation fails, there may be a racing process
		 * that has already completed the allocation.
		 */
		if (!mark_failed_objexts_alloc(slab) &&
		    slab_obj_exts(slab))
			return 0;

		return -ENOMEM;
	}

	new_exts = (unsigned long)vec;
#ifdef CONFIG_MEMCG
	new_exts |= MEMCG_DATA_OBJEXTS;
#endif
retry:
	old_exts = READ_ONCE(slab->obj_exts);
	handle_failed_objexts_alloc(old_exts, vec, objects);
	if (new_slab) {
		/*
		 * If the slab is brand new and nobody can yet access its
		 * obj_exts, no synchronization is required and obj_exts can
		 * be simply assigned.
		 */
		slab->obj_exts = new_exts;
	} else if (old_exts & ~OBJEXTS_FLAGS_MASK) {
		/*
		 * If the slab is already in use, somebody can allocate and
		 * assign slabobj_exts in parallel. In this case the existing
		 * objcg vector should be reused.
		 */
		mark_objexts_empty(vec);
		kfree(vec);
		return 0;
	} else if (cmpxchg(&slab->obj_exts, old_exts, new_exts) != old_exts) {
		/* Retry if a racing thread changed slab->obj_exts from under us. */
		goto retry;
	}

	obj_exts = slab_obj_exts(slab);
	if (!obj_exts) {
		/*
		 * If obj_exts allocation failed, slab->obj_exts is set to
		 * OBJEXTS_ALLOC_FAIL. In this case, we end up here and should
		 * clear the flag.
		 */
		slab->obj_exts = 0;
		return;
	}

	/*
	 * obj_exts was created with __GFP_NO_OBJ_EXT flag, therefore its
	 * corresponding extension will be NULL. alloc_tag_sub() will throw a
	 * warning if slab has extensions but the extension of an object is
	 * NULL, therefore replace NULL with CODETAG_EMPTY to indicate that
	 * the extension for obj_exts is expected to be NULL.
	 */
	mark_objexts_empty(obj_exts);
	kfree(obj_exts);
	slab->obj_exts = 0;
}
/* Should be called only if mem_alloc_profiling_enabled() */
static noinline void
__alloc_tagging_slab_alloc_hook(struct kmem_cache *s, void *object, gfp_t flags)
{
	struct slabobj_ext *obj_exts;

	obj_exts = prepare_slab_obj_exts_hook(s, flags, object);
	/*
	 * Currently obj_exts is used only for allocation profiling.
	 * If other users appear then mem_alloc_profiling_enabled()
	 * check should be added before alloc_tag_add().
	 */
	if (likely(obj_exts))
		alloc_tag_add(&obj_exts->ref, current->alloc_tag, s->size);
}

/* Should be called only if mem_alloc_profiling_enabled() */
static noinline void
__alloc_tagging_slab_free_hook(struct kmem_cache *s, struct slab *slab, void **p,
			       int objects)
{
	struct slabobj_ext *obj_exts;
	int i;

	/* slab->obj_exts might not be NULL if it was created for MEMCG accounting. */
	if (s->flags & (SLAB_NO_OBJ_EXT | SLAB_NOLEAKTRACE))
		return;

	obj_exts = slab_obj_exts(slab);
	if (!obj_exts)
		return;

	for (i = 0; i < objects; i++) {
		unsigned int off = obj_to_index(s, slab, p[i]);

		alloc_tag_sub(&obj_exts[off].ref, s->size);
	}
}

static inline void
alloc_tagging_slab_free_hook(struct kmem_cache *s, struct slab *slab, void **p,
			     int objects)
{
	if (mem_alloc_profiling_enabled())
		__alloc_tagging_slab_free_hook(s, slab, p, objects);
}
	folio = virt_to_folio(p);
	if (!folio_test_slab(folio)) {
		int size;

		if (folio_memcg_kmem(folio))
			return true;

		if (__memcg_kmem_charge_page(folio_page(folio, 0), flags,
					     folio_order(folio)))
			return false;

		/*
		 * This folio has already been accounted in the global stats but
		 * not in the memcg stats. So, subtract from the global and use
		 * the interface which adds to both global and memcg stats.
		 */
		size = folio_size(folio);
		node_stat_mod_folio(folio, NR_SLAB_UNRECLAIMABLE_B, -size);
		lruvec_stat_mod_folio(folio, NR_SLAB_UNRECLAIMABLE_B, size);
		return true;
	}

	slab = folio_slab(folio);
	s = slab->slab_cache;

	/*
	 * Ignore KMALLOC_NORMAL cache to avoid possible circular dependency
	 * of slab_obj_exts being allocated from the same slab and thus the slab
	 * becoming effectively unfreeable.
	 */
	if (is_kmalloc_normal(s))
		return true;

	/* Ignore already charged objects. */
	slab_exts = slab_obj_exts(slab);
	if (slab_exts) {
		off = obj_to_index(s, slab, p);
		if (unlikely(slab_exts[off].objcg))
			return true;
	}
/*
 * Hooks for other subsystems that check memory allocations. In a typical
 * production configuration these hooks all should produce no code at all.
 *
 * Returns true if freeing of the object can proceed, false if its reuse
 * was delayed by CONFIG_SLUB_RCU_DEBUG or KASAN quarantine, or it was returned
 * to KFENCE.
 */
static __always_inline
bool slab_free_hook(struct kmem_cache *s, void *x, bool init,
		    bool after_rcu_delay)
{
	/* Are the object contents still accessible? */
	bool still_accessible = (s->flags & SLAB_TYPESAFE_BY_RCU) && !after_rcu_delay;

	if (!(s->flags & SLAB_DEBUG_OBJECTS))
		debug_check_no_obj_freed(x, s->object_size);

	/* Use KCSAN to help debug racy use-after-free. */
	if (!still_accessible)
		__kcsan_check_access(x, s->object_size,
				     KCSAN_ACCESS_WRITE | KCSAN_ACCESS_ASSERT);

	if (kfence_free(x))
		return false;

	/*
	 * Give KASAN a chance to notice an invalid free operation before we
	 * modify the object.
	 */
	if (kasan_slab_pre_free(s, x))
		return false;

#ifdef CONFIG_SLUB_RCU_DEBUG
	if (still_accessible) {
		struct rcu_delayed_free *delayed_free;

		delayed_free = kmalloc(sizeof(*delayed_free), GFP_NOWAIT);
		if (delayed_free) {
			/*
			 * Let KASAN track our call stack as a "related work
			 * creation", just like if the object had been freed
			 * normally via kfree_rcu().
			 * We have to do this manually because the rcu_head is
			 * not located inside the object.
			 */
			kasan_record_aux_stack(x);

	/*
	 * As memory initialization might be integrated into KASAN,
	 * kasan_slab_free and initialization memset's must be
	 * kept together to avoid discrepancies in behavior.
	 *
	 * The initialization memset's clear the object and the metadata,
	 * but don't touch the SLAB redzone.
	 *
	 * The object's freepointer is also avoided if stored outside the
	 * object.
	 */
	if (unlikely(init)) {
		int rsize;