MySQL, Source Code Analysis, Memory Allocation Mechanism

Posted by jonorr1 on Sat, 01 Jun 2019 23:42:09 +0200


Memory is managed by the operating system, and allocating or releasing it may involve system calls. With malloc, for example, large requests are typically served by mmap, while small chunks are usually not returned to the operating system after free. Frequent system calls inevitably hurt performance, but releasing memory promptly leaves as much as possible available to other processes. Conversely, holding on to memory for a long time reduces the number of system calls, but if memory runs short the operating system will page frequently and the overall performance of the server drops.

Databases are heavy users of memory, so a sound memory allocation mechanism is particularly important. An earlier article covered the memory contexts of PostgreSQL; this article introduces how memory is managed in MySQL.

Basic interface wrappers

MySQL wraps the basic memory operations in a thin layer that adds a control parameter, my_flags:

void *my_malloc(size_t size, myf my_flags)
void *my_realloc(void *oldpoint, size_t size, myf my_flags)
void my_free(void *ptr)

my_flags currently accepts the following values:

MY_FAE 		/* Fatal if any error */
MY_WME			/* Write message on error */
MY_ZEROFILL	/* Fill array with zero */

MY_FAE means the whole process exits if the allocation fails. MY_WME means an allocation failure is reported with an error message. MY_ZEROFILL means the allocated memory is initialized to 0.
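The effect of the three flags can be sketched with a small stand-in for my_malloc. This is only an illustration under stated assumptions: demo_malloc and the DEMO_* flag values are hypothetical names chosen for this example, not MySQL's actual implementation.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Illustrative flag bits (assumed values; the real ones live in my_sys.h). */
#define DEMO_FAE      8   /* fatal if any error       */
#define DEMO_WME      16  /* write message on error   */
#define DEMO_ZEROFILL 32  /* zero the returned memory */

typedef unsigned long demo_flags;

/* Hypothetical stand-in for my_malloc(): malloc() plus flag handling. */
void *demo_malloc(size_t size, demo_flags flags)
{
  void *ptr;

  if (size == 0)
    size = 1;                 /* avoid implementation-defined malloc(0) */

  ptr = malloc(size);
  if (ptr == NULL)
  {
    /* Both FAE and WME ask for a message; only FAE ends the process. */
    if (flags & (DEMO_FAE | DEMO_WME))
      fprintf(stderr, "demo_malloc: out of memory (%lu bytes)\n",
              (unsigned long) size);
    if (flags & DEMO_FAE)
      exit(1);
    return NULL;
  }
  if (flags & DEMO_ZEROFILL)
    memset(ptr, 0, size);
  return ptr;
}
```

A caller that wants zeroed memory and a logged failure, but wants to handle the NULL itself, would pass DEMO_WME | DEMO_ZEROFILL.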


Basic structure

The MEM_ROOT structure is widely used in the MySQL Server layer to manage memory. It avoids frequent calls to the wrapped basic interfaces, and because memory is allocated and released as a unit it also helps prevent leaks. Different MEM_ROOTs are independent of each other, unlike PG, where memory contexts are related to one another. This is probably because the MySQL Server layer is object-oriented code: a MEM_ROOT is a member variable of a class and lives for the whole life cycle of the object. Typical classes are THD, String, TABLE, TABLE_SHARE, Query_arena, st_transactions, etc.

The unit of allocation in a MEM_ROOT is the block, described by the USED_MEM structure. The structure is simple: blocks are linked to one another to form a list, and left and size give the space still available in the block and the total size of the block, respectively.

typedef struct st_used_mem
{				   /* struct for once_alloc (block) */
  struct st_used_mem *next;	   /* Next block in use */
  unsigned int	left;		   /* memory left in block  */
  unsigned int	size;		   /* size of block */
} USED_MEM;

The MEM_ROOT structure manages the Block list:

typedef struct st_mem_root
{
  USED_MEM *free;                  /* blocks with free memory in it */
  USED_MEM *used;                  /* blocks almost without free memory */
  USED_MEM *pre_alloc;             /* preallocated block */
  /* if block have less memory it will be put in 'used' list */
  size_t min_malloc;
  size_t block_size;               /* initial block size */
  unsigned int block_num;          /* allocated blocks counter */
  /*
     first free block in queue test counter (if it exceed
     MAX_BLOCK_USAGE_BEFORE_DROP block will be dropped in 'used' list)
  */
  unsigned int first_block_usage;

  void (*error_handler)(void);
} MEM_ROOT;

Overall, a MEM_ROOT consists of two block lists. The free list holds all blocks that still have allocatable space; the used list holds all blocks that have none. pre_alloc is similar to the keeper in PG memory contexts: when a MEM_ROOT is initialized, a block can be preallocated and placed on the free list, and when the whole MEM_ROOT is freed, a parameter controls whether the block pointed to by pre_alloc is kept. min_malloc is the threshold of remaining space below which a block is removed from the free list and added to the used list. block_size is the initial block size. block_num is the counter of blocks managed by the MEM_ROOT. first_block_usage counts how many times the first block in the free list has failed to satisfy a request, and serves as an optimization parameter. error_handler is the error handling function.

Allocation process

Before a MEM_ROOT can be used it must be initialized by calling init_alloc_root. The initial block size and pre_alloc_size are controlled by parameters. One interesting point is that min_malloc is hard-coded to 32, which in my opinion is not very flexible and can leave relatively large fragments when requests are small. Another is that block_num is initialized to 4, which is tied to the policy for sizing newly allocated blocks.

void init_alloc_root(MEM_ROOT *mem_root, size_t block_size,
                     size_t pre_alloc_size __attribute__((unused)))
{
  mem_root->free= mem_root->used= mem_root->pre_alloc= 0;
  mem_root->min_malloc= 32;
  mem_root->block_size= block_size - ALLOC_ROOT_MIN_BLOCK_SIZE;
  mem_root->error_handler= 0;
  mem_root->block_num= 4;                       /* We shift this with >>2 */
  mem_root->first_block_usage= 0;

  if (pre_alloc_size)
  {
    /* Preallocate one block and make it the head of the free list */
    if ((mem_root->free= mem_root->pre_alloc=
         (USED_MEM*) my_malloc(pre_alloc_size+ ALIGN_SIZE(sizeof(USED_MEM)),
                               MYF(0))))
    {
      mem_root->free->size= pre_alloc_size+ALIGN_SIZE(sizeof(USED_MEM));
      mem_root->free->left= pre_alloc_size;
      mem_root->free->next= 0;
      rds_update_query_size(mem_root, mem_root->free->size, 0);
    }
  }
}

After initialization, alloc_root can be called to request memory. The allocation process is not complicated and the code is not long, so it is posted here for reading; you can also skip it and go straight to the analysis.

void *alloc_root(MEM_ROOT *mem_root, size_t length)
{
    size_t get_size, block_size;
    uchar *point;
    reg1 USED_MEM *next = 0;
    reg2 USED_MEM **prev;

    length = ALIGN_SIZE(length);
    if ((*(prev = &mem_root->free)) != NULL) // Is the free list non-empty?
    {
        if ((*prev)->left < length &&
            mem_root->first_block_usage++ >= ALLOC_MAX_BLOCK_USAGE_BEFORE_DROP &&
            (*prev)->left < ALLOC_MAX_BLOCK_TO_DROP) // Optimization strategy
        {
            next = *prev;
            *prev = next->next; /* Remove block from list */
            next->next = mem_root->used;
            mem_root->used = next;
            mem_root->first_block_usage = 0;
        }
        // Find a block with at least as much free space as requested
        for (next = *prev; next && next->left < length; next = next->next)
            prev = &next->next;
    }
    if (!next) // Free list is empty, or no block can satisfy the request
    {   /* Time to alloc new block */
        block_size = mem_root->block_size * (mem_root->block_num >> 2);
        get_size = length + ALIGN_SIZE(sizeof(USED_MEM));
        get_size = MY_MAX(get_size, block_size);

        if (!(next = (USED_MEM *) my_malloc(get_size, MYF(MY_WME | ME_FATALERROR))))
        {
            if (mem_root->error_handler)
                (*mem_root->error_handler)();
            DBUG_RETURN((void *) 0); /* purecov: inspected */
        }
        mem_root->block_num++;
        next->next = *prev;
        next->size = get_size;
        next->left = get_size - ALIGN_SIZE(sizeof(USED_MEM));
        *prev = next; // The new block is placed at the end of the free list
    }

    point = (uchar *) ((char *) next + (next->size - next->left));
    if ((next->left -= length) < mem_root->min_malloc) // Too little space left to stay in the free list?
    {   /* Full block */
        *prev = next->next; /* Remove block from list */
        next->next = mem_root->used;
        mem_root->used = next;
        mem_root->first_block_usage = 0;
    }
    DBUG_RETURN((void *) point);
}

The function first checks whether the free list is empty. If it is not, the logical next step would be to traverse the list looking for a block with enough free space, but the code first runs an extra check. This is a space-for-time optimization: the free list is non-empty in most cases, so almost every allocation starts at its first block, and ideally that first block satisfies the request immediately without any scan of the list. Two constants, ALLOC_MAX_BLOCK_USAGE_BEFORE_DROP and ALLOC_MAX_BLOCK_TO_DROP, tune this to the caller's request pattern. When the first block of the free list has already failed to satisfy ALLOC_MAX_BLOCK_USAGE_BEFORE_DROP requests and its remaining free space is less than ALLOC_MAX_BLOCK_TO_DROP, the block is moved to the used list, because it has been unable to meet the caller's needs for some time.
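The first-block check can be isolated into a small sketch. The drop_* types and function are hypothetical names for this example, and the two DROP_* constants are assumed values for illustration; check my_alloc.c in your MySQL version for the real ones.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed values for illustration only. */
#define DROP_MAX_BLOCK_TO_DROP           4096
#define DROP_MAX_BLOCK_USAGE_BEFORE_DROP 10

typedef struct drop_block
{
  struct drop_block *next;
  size_t left;                      /* free bytes remaining in this block */
} drop_block;

typedef struct drop_root
{
  drop_block *free;                 /* blocks that can still serve requests */
  drop_block *used;                 /* blocks considered full */
  unsigned int first_block_usage;   /* failed attempts on the first free block */
} drop_root;

/*
  Apply the first-block check for a request of `length` bytes.
  Returns 1 if the head of the free list was moved to the used list.
*/
int drop_first_if_stale(drop_root *root, size_t length)
{
  drop_block *first = root->free;

  /* The counter is only incremented when the first block is too small. */
  if (first != NULL &&
      first->left < length &&
      root->first_block_usage++ >= DROP_MAX_BLOCK_USAGE_BEFORE_DROP &&
      first->left < DROP_MAX_BLOCK_TO_DROP)
  {
    root->free = first->next;       /* unlink from the free list */
    first->next = root->used;       /* push onto the used list   */
    root->used = first;
    root->first_block_usage = 0;
    return 1;
  }
  return 0;
}
```

With these assumed values, a first block with 100 bytes left that keeps receiving 200-byte requests stays in the free list for ten failed requests and is moved to the used list on the eleventh.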

If no suitable block is found in the free list, the base interface must be called to obtain new memory. The new block must be at least large enough for the current request. In addition, the estimated size of a new block is mem_root->block_size * (mem_root->block_num >> 2), that is, the initial block size multiplied by one quarter of block_num (rounded down). This is why block_num is initialized to 4: the multiplier is always at least 1.
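The sizing rule is easier to see with numbers. The sketch below assumes that block_num grows by one for each newly allocated block, and it ignores the case where a single request is larger than the estimate; the growth_* names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Estimated size of the next block for a given block_size and block_num. */
size_t growth_next_block_size(size_t block_size, unsigned int block_num)
{
  return block_size * (block_num >> 2);
}

/*
  Simulate `count` consecutive block allocations, starting from
  block_num = 4 and incrementing it after each new block (an assumption
  of this sketch). The size of each block is stored in `sizes`.
*/
void growth_simulate(size_t block_size, unsigned int count, size_t *sizes)
{
  unsigned int block_num = 4;
  unsigned int i;

  for (i = 0; i < count; i++)
  {
    sizes[i] = growth_next_block_size(block_size, block_num);
    block_num++;
  }
}
```

Under that assumption and an initial block size of 1000 bytes, the first four new blocks are 1000 bytes each, the next four are 2000 bytes each, and the ninth is 3000 bytes, so block sizes grow in steps as the MEM_ROOT keeps being used.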

Once a suitable block has been found, the allocator computes the address of its free space. Before returning, it checks whether the block has to be moved to the used list after this allocation.
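This last step amounts to bump allocation inside one block. A minimal sketch, assuming a simplified header without alignment padding; carve_block and carve_from_block are hypothetical names for this example:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef struct carve_block
{
  struct carve_block *next;
  size_t left;      /* free bytes remaining */
  size_t size;      /* total bytes, header included */
} carve_block;

/*
  Carve `length` bytes out of `block`. The caller must already have
  checked that block->left >= length. *now_full is set to 1 when the
  remainder drops below `min_malloc`, i.e. when the block should be
  moved to the used list, and to 0 otherwise.
*/
void *carve_from_block(carve_block *block, size_t length, size_t min_malloc,
                       int *now_full)
{
  /* Free space starts right after the bytes already handed out. */
  void *point = (char *) block + (block->size - block->left);

  block->left -= length;
  *now_full = block->left < min_malloc;
  return point;
}
```

Because size counts the header and left counts only free bytes, size - left is always the offset of the first unused byte, so no separate cursor field is needed.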

There are two interfaces for returning memory: mark_blocks_free(MEM_ROOT *root) and free_root(MEM_ROOT *root, myf MyFlags). Unlike the wrapped basic interfaces, neither takes a pointer to the memory being returned; both take a pointer to the MEM_ROOT itself, which shows that memory allocated from a MEM_ROOT is returned all at once. mark_blocks_free does not actually release the blocks; it marks them as available and puts them all on the free list. free_root really returns the memory to the operating system, and MyFlags controls whether the block pointed to by pre_alloc is kept (MY_KEEP_PREALLOC), as well as whether the call should only mark the blocks free, like mark_blocks_free (MY_MARK_BLOCKS_FREE).
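The difference between marking blocks free and releasing them can be sketched on a reduced two-list structure. The arena_* names are hypothetical, the header is simplified (no alignment padding, no pre_alloc handling), and this is an illustration of the idea rather than MySQL's code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef struct arena_block
{
  struct arena_block *next;
  size_t left;                  /* free bytes remaining */
  size_t size;                  /* total bytes, header included */
} arena_block;

typedef struct arena
{
  arena_block *free;            /* blocks with space left */
  arena_block *used;            /* blocks treated as full */
} arena;

/* Make every block empty and reusable; no memory is released. */
void arena_mark_free(arena *a)
{
  arena_block *b;
  arena_block **last = &a->free;

  for (b = a->free; b != NULL; b = b->next)
  {
    b->left = b->size - sizeof(arena_block);
    last = &b->next;
  }
  *last = a->used;              /* append the used list to the free list */
  for (b = a->used; b != NULL; b = b->next)
    b->left = b->size - sizeof(arena_block);
  a->used = NULL;
}

/* Release every block with free(); the arena is empty afterwards. */
void arena_release(arena *a)
{
  arena_block *b, *next;

  for (b = a->free; b != NULL; b = next)
  {
    next = b->next;
    free(b);
  }
  for (b = a->used; b != NULL; b = next)
  {
    next = b->next;
    free(b);
  }
  a->free = a->used = NULL;
}

/* Demo helper: push a new, completely full block onto the used list. */
int arena_push_used(arena *a, size_t payload)
{
  arena_block *b = (arena_block *) malloc(sizeof(arena_block) + payload);

  if (b == NULL)
    return 0;
  b->size = sizeof(arena_block) + payload;
  b->left = 0;
  b->next = a->used;
  a->used = b;
  return 1;
}
```

Marking blocks free suits an object that is reused for many short tasks, since the next task can allocate without touching malloc; releasing suits the end of the object's life.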


  • In terms of space utilization, MEM_ROOT allocates memory contiguously within each block, so internal fragmentation sits almost entirely at the tail of each block. Its extent is controlled by the min_malloc member and by the constants ALLOC_MAX_BLOCK_USAGE_BEFORE_DROP and ALLOC_MAX_BLOCK_TO_DROP. However, min_malloc is hard-coded, which is not flexible enough; making it configurable is worth considering. Also, if a caller writes more bytes than it requested, it will probably overwrite the data that follows, which is dangerous. Still, compared with the memory contexts of PG, space utilization should be considerably higher.
  • In terms of time utilization, MEM_ROOT provides no operation for freeing an individual block; the memory is returned to the operating system only when the entire MEM_ROOT is no longer needed, which shows that MySQL is still greedy with memory.
  • In terms of usage, MySQL has multiple storage engines, and the Server layer above them is object-oriented C++ code. MEM_ROOT is often a member variable of an object: memory is allocated during the object's life cycle and reclaimed when the object is destroyed, while the engines request memory through the wrapped basic interfaces. By comparison, MySQL's approach is more diverse, while PG's is more unified and integrated.
