CSAPP (Computer Systems: A Programmer's Perspective) Lab 6 - Malloc Lab

Posted by anonymouse on Sat, 25 Dec 2021 10:47:58 +0100

This lab is really, really hard! I could not write it on my own, and only figured it out after reading two write-ups by two experts, so I am pasting their articles here directly. One implements the allocator with an implicit free list, and the other with an explicit free list.

Implicit free list

The lab is compiled as a 32-bit program. I use WSL, the Windows Subsystem for Linux, which does not support 32-bit programs. Please refer to here for the required changes.

I worked on this for two days and ran into quite a few errors. Debugging is very hard, but I got some results. I am not after full marks, just a better understanding of the dynamic allocator.

The lab asks us to implement a dynamic memory allocator by writing the mm_init, mm_malloc, mm_free and mm_realloc functions. Two simple trace files, short1-bal.rep and short2-bal.rep, are provided to test the memory utilization and throughput of our algorithm. We can run ./mdriver -f short1-bal.rep -V to see the result for a single file. Someone on GitHub has also uploaded the other test traces of the course, which can be downloaded from here. This gives a trace folder, and we can then run ./mdriver -t ./trace -V to see the results.

First, we use blocks with a header and a footer as the data structure, as shown below. The block pointer bp is set to point to the payload, so that the payload of a block can be accessed directly through bp.
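To make the layout concrete, here is a minimal sketch that lays out one block in a local byte buffer instead of the real heap. Byte offsets stand in for pointers, and the helper names (put_word, get_word, write_block) are illustrative assumptions, not part of the lab code.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define WSIZE 4   /* word size in bytes */
#define DSIZE 8   /* double word size in bytes */

/* Store one 4-byte word at byte offset off of the model heap. */
static void put_word(unsigned char *heap, size_t off, uint32_t val) {
    memcpy(heap + off, &val, WSIZE);
}

/* Load one 4-byte word from byte offset off of the model heap. */
static uint32_t get_word(const unsigned char *heap, size_t off) {
    uint32_t val;
    memcpy(&val, heap + off, WSIZE);
    return val;
}

/* Write the header and footer of a block whose payload starts at offset bp.
 * size is the full block size (header + payload + footer), a multiple of 8. */
static void write_block(unsigned char *heap, size_t bp, uint32_t size, uint32_t alloc) {
    put_word(heap, bp - WSIZE, size | alloc);         /* header: the word just before the payload */
    put_word(heap, bp + size - DSIZE, size | alloc);  /* footer: the last word of the block */
}
```

For a 24-byte allocated block with its payload at offset 8, write_block(heap, 8, 24, 1) stores the value 25 (24 | 1) at offsets 4 and 24, so the size and the allocated bit can be read back from either end of the block.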

Based on this, we can determine some macros

//Word size and double word size (bytes)
#define WSIZE 4
#define DSIZE 8
//Amount by which the heap is extended when it runs out of space
#define CHUNKSIZE (1<<12)
//Write val into the 4 bytes starting at p
#define PUT(p,val) (*(unsigned int*)(p) = (val))
//Pack a block size and an allocated bit into one header/footer word
#define PACK(size, alloc) ((size) | (alloc))
//Read the block size and the allocated bit from a header or footer
#define GET_SIZE(p) (*(unsigned int*)(p) & ~0x7)
#define GET_ALLO(p) (*(unsigned int*)(p) & 0x1)
//Get the header and footer addresses of a block
#define HDRP(bp) ((char*)(bp) - WSIZE)
#define FTRP(bp) ((char*)(bp) + GET_SIZE(HDRP(bp)) - DSIZE)
//Get the next block and the previous block
#define NEXT_BLKP(bp) ((char*)(bp) + GET_SIZE(HDRP(bp)))
#define PREV_BLKP(bp) ((char*)(bp) - GET_SIZE((char*)(bp) - DSIZE))

#define MAX(x,y) ((x)>(y)?(x):(y))

Note: the bp pointer passed in may be of type void *. Before doing arithmetic on bp, cast it to char * so that additions and subtractions are counted in bytes.
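A small sketch of why the cast matters: pointer arithmetic is scaled by the size of the pointed-to type, so only a char * moves in single bytes. The helper names below are hypothetical and exist only for this illustration.

```c
#include <stddef.h>

/* Advance p by n bytes, the way HDRP/FTRP/NEXT_BLKP do after casting to char *. */
static void *advance_bytes(void *p, size_t n) {
    return (char *)p + n;
}

/* Advance p by n elements of type unsigned int: the step is scaled by sizeof(unsigned int). */
static void *advance_uints(void *p, size_t n) {
    return (unsigned int *)p + n;
}

/* Distance in bytes between two pointers into the same buffer. */
static ptrdiff_t byte_distance(const void *from, const void *to) {
    return (const char *)to - (const char *)from;
}
```

Adding 4 through advance_bytes moves 4 bytes, while adding 4 through advance_uints moves 4 * sizeof(unsigned int) bytes, which is not what macros such as HDRP intend.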

//Points to the payload of the prologue block of the implicit free list
static char *heap_listp;
/* 
 * mm_init - initialize the malloc package.
 */
int mm_init(void){
	if((heap_listp = mem_sbrk(4*WSIZE)) == (void*)-1)	//Request space for 4 words
		return -1;
	PUT(heap_listp, 0);	//Alignment padding word
	PUT(heap_listp+1*WSIZE, PACK(DSIZE, 1));	//Prologue header
	PUT(heap_listp+2*WSIZE, PACK(DSIZE, 1));	//Prologue footer
	PUT(heap_listp+3*WSIZE, PACK(0, 1));		//Epilogue header
	
	heap_listp += DSIZE;	//Point to the prologue payload
	
	if(expend_heap(CHUNKSIZE/WSIZE) == NULL)	//Request more heap space
		return -1;
	return 0;
}

This part creates the initial implicit free list, which has the following structure

First, there is an allocated prologue block consisting of only a header and a footer. It is 8 bytes in size, is never freed, and marks the beginning of the implicit free list. Then come the ordinary blocks, both allocated and free. Finally, there is an allocated epilogue block of size 0 that consists of only a 4-byte header and marks the end of the implicit free list. We will see later why the epilogue block is set up this way.

The prologue block plus the header of the first ordinary block add up to three words. To keep the payload of every block double word aligned, one padding word is placed at the very beginning of the heap.

Then we let the pointer heap_listp point to the payload of the prologue block, as the starting pointer of the implicit free list. At this point the list has no block that can hold data, so expend_heap is called to request more heap space. Here, a fixed amount of space, defined by CHUNKSIZE, is requested at once.
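The alignment argument can be checked with a small sketch. It models the first four heap words as an array and computes where the first ordinary payload lands, assuming the heap itself starts at a double word aligned address. The names init_layout and first_payload_offset are hypothetical.

```c
#include <stdint.h>

#define WSIZE 4
#define DSIZE 8
#define PACK(size, alloc) ((uint32_t)((size) | (alloc)))

/* Fill the first four words of a model heap the way mm_init does.
 * words[] stands in for the memory returned by mem_sbrk(4*WSIZE). */
static void init_layout(uint32_t words[4]) {
    words[0] = 0;               /* alignment padding */
    words[1] = PACK(DSIZE, 1);  /* prologue header */
    words[2] = PACK(DSIZE, 1);  /* prologue footer */
    words[3] = PACK(0, 1);      /* epilogue header */
}

/* Byte offset of the first ordinary payload, given the number of padding words.
 * Before it come: the padding, the prologue header, the prologue footer,
 * and the block's own header. */
static unsigned first_payload_offset(unsigned padding_words) {
    return (padding_words + 3) * WSIZE;
}
```

With one padding word the first payload sits at offset 16, a multiple of 8. Without padding it would sit at offset 12, which breaks double word alignment.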

static void *expend_heap(size_t words){
	size_t size;
	void *bp;
	
	size = words%2 ? (words+1)*WSIZE : words*WSIZE;	//Round up to an even number of words (double word alignment)
	if((bp = mem_sbrk(size)) == (void*)-1)	//Request the space
		return NULL;
	
	PUT(HDRP(bp), PACK(size, 0));	//Set header
	PUT(FTRP(bp), PACK(size, 0));	//Set footer
	PUT(HDRP(NEXT_BLKP(bp)), PACK(0, 1));	//Set the new epilogue block
	
	//Merge immediately
	return imme_coalesce(bp);
	//return bp;
}

This function takes a number of words. It first rounds the word count up to an even number so that the size is double word aligned, and then requests that much heap space. Next, it treats the new space as one free block and sets its header and footer. Note the relationship between the bp pointer and the implicit free list, shown below

At this point, PUT(HDRP(bp),PACK(size,0)); sets the header of the new free block. Since bp is the old end of the heap, HDRP(bp) is exactly the old epilogue block, so the old epilogue word becomes the header of the new free block. PUT(HDRP(NEXT_BLKP(bp)),PACK(0,1)); then turns the last word of the new space into the new epilogue block. This makes full use of the space of the old epilogue block.

The block just before the new free block may also be free, so imme_coalesce(bp) is called to merge them immediately.
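The reuse of the old epilogue word can be traced on a toy heap. The sketch below is a model under stated assumptions, not the lab code: the heap is a fixed array of 4-byte words, "pointers" are word indices, and struct model_heap, model_init and model_extend are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

#define WSIZE 4
#define HEAP_WORDS 64

/* Model heap (hypothetical): an array of 4-byte words plus a break index. */
struct model_heap {
    uint32_t words[HEAP_WORDS];
    size_t brk;   /* number of words handed out so far */
};

/* Set up padding, prologue and epilogue, as mm_init does. */
static void model_init(struct model_heap *h) {
    h->words[0] = 0;        /* padding */
    h->words[1] = 8 | 1;    /* prologue header */
    h->words[2] = 8 | 1;    /* prologue footer */
    h->words[3] = 0 | 1;    /* epilogue header */
    h->brk = 4;
}

/* Grow the heap by nwords (rounded up to even) and format the new area as
 * one free block. Returns the word index of its payload, or 0 on failure. */
static size_t model_extend(struct model_heap *h, size_t nwords) {
    size_t even = (nwords % 2) ? nwords + 1 : nwords;
    size_t bp = h->brk;                    /* old break = payload of the new block */
    uint32_t size = (uint32_t)(even * WSIZE);

    if (even == 0 || h->brk + even > HEAP_WORDS)
        return 0;
    h->brk += even;
    h->words[bp - 1] = size;               /* header reuses the old epilogue word */
    h->words[bp + even - 2] = size;        /* footer */
    h->words[bp + even - 1] = 0 | 1;       /* new epilogue: last word of the new area */
    return bp;
}
```

After model_init, asking for 7 words rounds up to 8 words (32 bytes). Word 3, the old epilogue, becomes the header with value 32, word 10 is the footer, and word 11 is the new epilogue.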

static void *imme_coalesce(void *bp){
	size_t prev_alloc = GET_ALLO(FTRP(PREV_BLKP(bp)));	//Gets the allocated bit of the previous block
	size_t next_alloc = GET_ALLO(HDRP(NEXT_BLKP(bp)));	//Gets the allocated bit of the following block
	size_t size = GET_SIZE(HDRP(bp));	//Gets the size of the current block
	
	if(prev_alloc && next_alloc){
		return bp;
	}else if(prev_alloc && !next_alloc){
		size += GET_SIZE(HDRP(NEXT_BLKP(bp)));
		PUT(HDRP(bp), PACK(size, 0));
		PUT(FTRP(bp), PACK(size, 0));
	}else if(!prev_alloc && next_alloc){
		size += GET_SIZE(FTRP(PREV_BLKP(bp)));
		PUT(HDRP(PREV_BLKP(bp)), PACK(size, 0));
		PUT(FTRP(bp), PACK(size, 0));
		bp = PREV_BLKP(bp);
	}else{
		size += GET_SIZE(HDRP(NEXT_BLKP(bp))) +
				GET_SIZE(FTRP(PREV_BLKP(bp)));
		PUT(HDRP(PREV_BLKP(bp)), PACK(size, 0));
		PUT(FTRP(NEXT_BLKP(bp)), PACK(size, 0));
		bp = PREV_BLKP(bp);
	}
	return bp;
}

This function decides how to merge based on the allocated bits of the blocks before and after bp, as shown below

In fact, we only need to update the size fields in the header and footer of the affected blocks, and then adjust bp as needed so that it points to the merged free block.
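The four cases can also be expressed as two independent checks, one for each neighbor. The following sketch is a reformulation on a byte-buffer model (offsets instead of pointers), not the code used above. All helper names are assumptions made for the example.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define WSIZE 4
#define DSIZE 8

static uint32_t get_word(const unsigned char *h, size_t off) {
    uint32_t v;
    memcpy(&v, h + off, WSIZE);
    return v;
}

static void put_word(unsigned char *h, size_t off, uint32_t v) {
    memcpy(h + off, &v, WSIZE);
}

/* Write header and footer of the block whose payload is at offset bp. */
static void set_block(unsigned char *h, size_t bp, uint32_t size, uint32_t alloc) {
    put_word(h, bp - WSIZE, size | alloc);
    put_word(h, bp + size - DSIZE, size | alloc);
}

/* Merge the free block at bp with its free neighbors.
 * Returns the payload offset of the merged block. */
static size_t coalesce(unsigned char *h, size_t bp) {
    uint32_t size = get_word(h, bp - WSIZE) & ~0x7u;     /* own header */
    uint32_t prev_ftr = get_word(h, bp - DSIZE);         /* footer of the previous block */
    uint32_t next_hdr = get_word(h, bp + size - WSIZE);  /* header of the next block */

    if (!(next_hdr & 0x1u))          /* next block is free: absorb it */
        size += next_hdr & ~0x7u;
    if (!(prev_ftr & 0x1u)) {        /* previous block is free: grow backwards */
        size += prev_ftr & ~0x7u;
        bp -= prev_ftr & ~0x7u;
    }
    set_block(h, bp, size, 0);
    return bp;
}
```

With three adjacent free 16-byte blocks at payload offsets 16, 32 and 48, coalescing the middle one yields a single 48-byte block whose payload starts at offset 16.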

Then we can see our mm_malloc function

 void *mm_malloc(size_t size){
	size_t asize;
	void *bp;
	
	if(size == 0)
		return NULL;
	//Satisfy the minimum block size and alignment requirements; size is the payload size
	asize = size<=DSIZE ? 2*DSIZE : DSIZE * ((size + (DSIZE) + (DSIZE-1)) / DSIZE);
	//First fit
	if((bp = first_fit(asize)) != NULL){
		place(bp, asize);
		return bp;
	}
	//Best fit
	/*if((bp = best_fit(asize)) != NULL){
		place(bp, asize);
		return bp;
	}*/
	//Delayed merging
	//delay_coalesce();
	//Best fit
	/*if((bp = best_fit(asize)) != NULL){
		place(bp, asize);
		return bp;
	}*/
	//First fit
	/*if((bp = first_fit(asize)) != NULL){
		place(bp, asize);
		return bp;
	}*/
	if((bp = expend_heap(MAX(CHUNKSIZE, asize)/WSIZE)) == NULL)
		return NULL;
	place(bp, asize);
	return bp;
}

First, the size parameter passed to mm_malloc is the payload size of the block, while the size compared during the search is the full block size, including header, payload and footer. We therefore add the header and footer sizes and round up to a double word multiple to get asize, and use asize to search for a suitable free block. There are two search strategies: first fit and best fit. If free blocks are merged lazily (delayed merging) and no suitable free block is found, we need to run the delayed merge and search again. If there is still no suitable block, the heap does not have enough space, so we request more heap space and place the requested block in the new free block.
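As a worked example of the size adjustment, here is the same formula wrapped in a standalone function. The name adjust_size is an assumption for illustration.

```c
#include <stddef.h>

#define DSIZE 8

/* Turn a payload size into a block size: add 8 bytes for header and footer,
 * round up to a multiple of 8, and never go below the 16-byte minimum. */
static size_t adjust_size(size_t size) {
    if (size <= DSIZE)
        return 2 * DSIZE;
    return DSIZE * ((size + DSIZE + (DSIZE - 1)) / DSIZE);
}
```

A request of 1 or 8 bytes becomes a 16-byte block, 9 or 16 bytes becomes 24, 17 bytes becomes 32, and 100 bytes becomes 112.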

First, look at first fit

static void *first_fit(size_t asize){
	void *bp = heap_listp;
	size_t size;
	while((size = GET_SIZE(HDRP(bp))) != 0){	//Traverse all blocks
		if(size >= asize && !GET_ALLO(HDRP(bp)))	//Look for free blocks larger than asize
			return bp;
		bp = NEXT_BLKP(bp);
	}
	return NULL;
} 

Using the epilogue block of the implicit free list as the terminator, we check each block in turn and return the first free block whose size is at least asize.

We can also look at best fit

static void *best_fit(size_t asize){
	void *bp = heap_listp;
	size_t size;
	void *best = NULL;
	size_t min_size = 0;
	
	while((size = GET_SIZE(HDRP(bp))) != 0){
		if(size >= asize && !GET_ALLO(HDRP(bp)) && (min_size == 0 || min_size>size)){	//Record the smallest suitable free block
			min_size = size;
			best = bp;
		}
		bp = NEXT_BLKP(bp);
	}
	return best;
} 

It searches for the smallest free block that is large enough, which reduces fragmentation and improves memory utilization.
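The difference between the two policies is easiest to see on a plain array of free block sizes, listed in heap order. This is a simplified sketch, assuming every entry is a free block. The function names are hypothetical.

```c
#include <stddef.h>

/* Index of the first free block that can hold asize bytes, or -1. */
static int first_fit_index(const size_t *free_sizes, int n, size_t asize) {
    int i;
    for (i = 0; i < n; i++)
        if (free_sizes[i] >= asize)
            return i;
    return -1;
}

/* Index of the smallest free block that can hold asize bytes, or -1. */
static int best_fit_index(const size_t *free_sizes, int n, size_t asize) {
    int i, best = -1;
    for (i = 0; i < n; i++)
        if (free_sizes[i] >= asize && (best < 0 || free_sizes[i] < free_sizes[best]))
            best = i;
    return best;
}
```

For free blocks of 64, 24, 32 and 16 bytes and a request of 24 bytes, first fit picks the 64-byte block at index 0 and has to split it, while best fit picks the exact 24-byte block at index 1.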

Once we find a suitable free block, we need to place the requested block in it

static void place(void *bp, size_t asize){
	size_t remain_size;
	remain_size = GET_SIZE(HDRP(bp)) - asize;	//Calculate the remaining space after asize is removed from the free block
	if(remain_size >= DSIZE){	//If the remaining space meets the minimum block size, it is regarded as a new free block
		PUT(HDRP(bp), PACK(asize, 1));
		PUT(FTRP(bp), PACK(asize, 1));
		PUT(HDRP(NEXT_BLKP(bp)), PACK(remain_size, 0));
		PUT(FTRP(NEXT_BLKP(bp)), PACK(remain_size, 0));
	}else{
		PUT(HDRP(bp), PACK(GET_SIZE(HDRP(bp)), 1));
		PUT(FTRP(bp), PACK(GET_SIZE(HDRP(bp)), 1));
	}
}

First, we compute the space left in the free block after asize is taken out. If the remainder is large enough to hold a header and a footer and so form a new free block, the free block is split. Otherwise, the whole free block is used, and only its allocated bit is set.
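The split decision on its own can be sketched as a pure function. The struct and function names are hypothetical, and the minimum block size is passed in as a parameter because it depends on the block format (8 bytes is the threshold used in place above). The sketch assumes block_size >= asize.

```c
#include <stddef.h>

/* Outcome of placing asize bytes into a free block (hypothetical type). */
struct split_plan {
    size_t alloc_size;   /* size of the allocated block */
    size_t remainder;    /* size of the new free block, 0 if no split */
};

/* Decide whether to split. Assumes block_size >= asize. */
static struct split_plan plan_place(size_t block_size, size_t asize, size_t min_block) {
    struct split_plan p;
    size_t remain = block_size - asize;

    if (remain >= min_block) {   /* enough left over for a free block: split */
        p.alloc_size = asize;
        p.remainder = remain;
    } else {                     /* too small: hand out the whole block */
        p.alloc_size = block_size;
        p.remainder = 0;
    }
    return p;
}
```

Placing 24 bytes into a 48-byte block leaves a 24-byte free block. Placing 24 bytes into a 32-byte block with a 16-byte minimum uses all 32 bytes, and the extra 8 bytes become internal fragmentation.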

Then we can look at the code for delayed merging

static void delay_coalesce(void){
	void *bp = heap_listp;
	while(GET_SIZE(HDRP(bp)) != 0){
		if(!GET_ALLO(HDRP(bp)))
			bp = imme_coalesce(bp);
		bp = NEXT_BLKP(bp);
	}
}

Traverse all blocks in the list. If a block is free, merge it with its neighboring free blocks.

Next, let's take a look at our mm_free

/*
 * mm_free - Mark the block as free and merge it with its free neighbors.
 */
void mm_free(void *ptr){
	size_t size = GET_SIZE(HDRP(ptr));
	PUT(HDRP(ptr), PACK(size, 0));
	PUT(FTRP(ptr), PACK(size, 0));
	
	//Merge now
	imme_coalesce(ptr);
} 

We first mark the block as free, and then merge it immediately.

Finally, here is our mm_realloc

 /*
 * mm_realloc - Implemented simply in terms of mm_malloc and mm_free
 */
void *mm_realloc(void *ptr, size_t size){
	size_t asize, ptr_size;
	void *new_bp;
	
	if(ptr == NULL)
		return mm_malloc(size);
	if(size == 0){
		mm_free(ptr);
		return NULL;
	}
	
	asize = size<=DSIZE ? 2*DSIZE : DSIZE * ((size + (DSIZE) + (DSIZE-1)) / DSIZE);
	new_bp = imme_coalesce(ptr);	//Try to merge with neighboring free blocks
	ptr_size = GET_SIZE(HDRP(new_bp));
	PUT(HDRP(new_bp), PACK(ptr_size, 1));
	PUT(FTRP(new_bp), PACK(ptr_size, 1));
	if(new_bp != ptr)	//If the previous free block was merged, move the original content forward
		memmove(new_bp, ptr, GET_SIZE(HDRP(ptr)) - DSIZE);	//The regions may overlap, so use memmove
	
	if(ptr_size == asize)
		return new_bp;
	else if(ptr_size > asize){
		place(new_bp, asize);
		return new_bp;
	}else{
		ptr = mm_malloc(asize);
		if(ptr == NULL)
			return NULL;
		memcpy(ptr, new_bp, ptr_size - DSIZE);
		mm_free(new_bp);
		return ptr;
	}
}

First, if ptr is NULL, we simply allocate a block of the requested size. If size is 0, we simply free the block pointed to by ptr. Otherwise, the actual mm_realloc work is required. When mm_realloc runs, ptr points to an allocated block that may have free blocks around it, and together with those free blocks it may already satisfy asize, so we first try to merge the block at ptr with its neighboring free blocks.
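One detail worth spelling out: when the block is merged with a free block in front of it, the payload moves to a lower address inside the same region, so the source and destination can overlap. The C standard only defines overlapping copies for memmove, not memcpy. A minimal sketch on a plain buffer, with a hypothetical helper name:

```c
#include <stddef.h>
#include <string.h>

/* Move len payload bytes from offset old_off down to offset new_off
 * inside the same buffer. The two ranges may overlap, so memmove is used. */
static void slide_payload(unsigned char *buf, size_t old_off, size_t new_off, size_t len) {
    memmove(buf + new_off, buf + old_off, len);
}
```

Sliding a 16-byte payload from offset 8 to offset 4 overlaps by 12 bytes, and the content still arrives intact.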

Then here are the experimental results

First fit + immediate merge:

Results for mm malloc:
trace  valid  util     ops      secs  Kops
 0       yes   99%    5694  0.014197   401
 1       yes   99%    5848  0.013547   432
 2       yes   99%    6648  0.021787   305
 3       yes  100%    5380  0.016372   329
 4       yes   66%   14400  0.001061 13570
 5       yes   92%    4800  0.014609   329
 6       yes   92%    4800  0.013454   357
 7       yes   55%   12000  0.203199    59
 8       yes   51%   24000  0.572411    42
 9       yes   44%   14401  0.152757    94
10       yes   45%   14401  0.029385   490
Total          77%  112372  1.052780   107

Perf index = 46 (util) + 7 (thru) = 53/100
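For reading these score lines, here is a sketch of how the performance index appears to be put together. It assumes utilization is weighted at 60% and throughput at 40%, with throughput capped at a reference rate. Both the weights and the 600 Kops reference are assumptions here: they reproduce the numbers reported in this post, but the actual values are set in the lab handout's configuration and may differ between versions.

```c
/* Assumed constants: they match the scores reported in this post. */
#define UTIL_WEIGHT 0.60
#define REF_KOPS 600.0

/* Points earned from average utilization, given in percent (0..100). */
static double util_points(double util_percent) {
    return UTIL_WEIGHT * util_percent;
}

/* Points earned from throughput in Kops, capped at the reference rate. */
static double thru_points(double kops) {
    double ratio = kops / REF_KOPS;
    if (ratio > 1.0)
        ratio = 1.0;
    return (1.0 - UTIL_WEIGHT) * 100.0 * ratio;
}

/* Round a non-negative value to the nearest integer, as the printed scores are. */
static int round_nearest(double x) {
    return (int)(x + 0.5);
}
```

With 77% utilization and 107 Kops this gives about 46.2 + 7.1, which prints as 46 (util) + 7 (thru) = 53/100. Since the throughput part is capped, anything well above the reference rate earns the full 40 points.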

First fit + delayed merge:

Results for mm malloc:
trace  valid  util     ops      secs  Kops
 0       yes   99%    5694  0.033440   170
 1       yes   99%    5848  0.028946   202
 2       yes   99%    6648  0.062765   106
 3       yes   99%    5380  0.056375    95
 4       yes   66%   14400  0.001050 13721
 5       yes   92%    4800  0.029646   162
 6       yes   90%    4800  0.027486   175
 7       yes   60%   12000  0.214382    56
 8       yes   53%   24000  0.584505    41
 9       yes   35%   14401  0.813350    18
10       yes   45%   14401  0.019537   737
Total          76%  112372  1.871480    60

Perf index = 46 (util) + 4 (thru) = 50/100

Best fit + immediate merge:

Results for mm malloc:
trace  valid  util     ops      secs  Kops
 0       yes   99%    5694  0.015399   370
 1       yes   99%    5848  0.014503   403
 2       yes   99%    6648  0.023172   287
 3       yes  100%    5380  0.017809   302
 4       yes   66%   14400  0.001038 13878
 5       yes   96%    4800  0.030279   159
 6       yes   95%    4800  0.028814   167
 7       yes   55%   12000  0.201558    60
 8       yes   51%   24000  0.601187    40
 9       yes   40%   14401  0.003348  4302
10       yes   45%   14401  0.002100  6856
Total          77%  112372  0.939206   120

Perf index = 46 (util) + 8 (thru) = 54/100

Best fit + delayed merge:

Results for mm malloc:
trace  valid  util     ops      secs  Kops
 0       yes   99%    5694  0.038064   150
 1       yes   99%    5848  0.036293   161
 2       yes   99%    6648  0.073700    90
 3       yes   99%    5380  0.072891    74
 4       yes   66%   14400  0.001064 13535
 5       yes   95%    4800  0.048523    99
 6       yes   94%    4800  0.046162   104
 7       yes   60%   12000  0.223058    54
 8       yes   53%   24000  0.592458    41
 9       yes   65%   14401  0.004285  3361
10       yes   76%   14401  0.001220 11800
Total          82%  112372  1.137718    99

Perf index = 49 (util) + 7 (thru) = 56/100

Segregated free list

For the segregated free list, we first need to decide on the data structure of the block

Here, a free block stores pointers to its successor and predecessor free blocks in the second and third words of the block. All free blocks can therefore be linked explicitly through pointers and traversed directly by following them.

We define the following macros

//Word size and double word size (bytes)
#define WSIZE 4
#define DSIZE 8
//Amount by which the heap is extended when it runs out of space
#define CHUNKSIZE (1<<12)
//Write val into / read the 4 bytes starting at p
#define PUT(p,val) (*(unsigned int*)(p) = (val))
#define GET(p) (*(unsigned int*)(p))
//Pack a block size and an allocated bit into one header/footer word
#define PACK(size, alloc) ((size) | (alloc))
//Read the block size and the allocated bit from a header or footer
#define GET_SIZE(p) (GET(p) & ~0x7)
#define GET_ALLO(p) (GET(p) & 0x1)
//Get the header and footer addresses of a block
#define HDRP(bp) ((char*)(bp) - WSIZE)
#define FTRP(bp) ((char*)(bp) + GET_SIZE(HDRP(bp)) - DSIZE)
//Get the next block and the previous block
#define NEXT_BLKP(bp) ((char*)(bp) + GET_SIZE(HDRP(bp)))
#define PREV_BLKP(bp) ((char*)(bp) - GET_SIZE((char*)(bp) - DSIZE))

//Addresses of the words inside a free block that store its predecessor and successor
#define PRED(bp) ((char*)(bp) + WSIZE)
#define SUCC(bp) ((char*)(bp))
//Addresses of the predecessor and successor blocks themselves
#define PRED_BLKP(bp) (GET(PRED(bp)))
#define SUCC_BLKP(bp) (GET(SUCC(bp)))

#define MAX(x,y) ((x)>(y)?(x):(y))

Here bp points to the word right after the header. We use the first word to record the address of the successor free block and the second word to record the address of the predecessor. Why? It will be explained later.

Next we need to decide the size classes of the segregated free list. A free block contains a header, a successor pointer, a predecessor pointer and a footer, so it needs at least 16 bytes, which makes 16 bytes the minimum block size. Anything smaller cannot hold the complete free block record. The size classes are therefore set to

{16-31},{32-63},{64-127},{128-255},{256-511},{512-1023},{1024-2047},{2048-4095},{4096-inf}
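As a cross-check of these ranges, the sketch below maps a block size to its class index by comparing against the upper bound of each class. The name size_class is hypothetical, and this is a different formulation from the shift-based Index function used by the allocator.

```c
#include <stddef.h>

/* Class 0 holds 16-31 bytes, class 1 holds 32-63, ..., class 8 holds 4096 and up. */
static int size_class(size_t size) {
    size_t upper = 32;   /* exclusive upper bound of class 0 */
    int idx = 0;

    while (idx < 8 && size >= upper) {
        upper <<= 1;     /* each class covers twice the range of the previous one */
        idx++;
    }
    return idx;
}
```

For example, 16 and 31 map to class 0, 32 maps to class 1, 4095 maps to class 7, and everything from 4096 upward maps to class 8.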

The roots of these size classes are stored at the beginning of the heap, each pointing to its free list, so each root needs one word to hold an address. In addition, we still need the prologue block and the epilogue block as sentinels for moving between blocks. So mm_init is as follows

static char *heap_listp;
static char *listp;

/* 
 * mm_init - initialize the malloc package.
 */
int mm_init(void){
	if((heap_listp = mem_sbrk(12*WSIZE)) == (void*)-1)
		return -1;
	//The smallest free block holds a header, successor, predecessor and footer: 16 bytes
	PUT(heap_listp+0*WSIZE, NULL);	//{16~31}
	PUT(heap_listp+1*WSIZE, NULL);	//{32~63}
	PUT(heap_listp+2*WSIZE, NULL);	//{64~127}
	PUT(heap_listp+3*WSIZE, NULL);	//{128~255}
	PUT(heap_listp+4*WSIZE, NULL);	//{256~511}
	PUT(heap_listp+5*WSIZE, NULL);	//{512~1023}
	PUT(heap_listp+6*WSIZE, NULL);	//{1024~2047}
	PUT(heap_listp+7*WSIZE, NULL);	//{2048~4095}
	PUT(heap_listp+8*WSIZE, NULL);	//{4096~inf}
	
	//A prologue block and an epilogue block are still needed
	PUT(heap_listp+9*WSIZE, PACK(DSIZE, 1));
	PUT(heap_listp+10*WSIZE, PACK(DSIZE, 1));
	PUT(heap_listp+11*WSIZE, PACK(0, 1));
	
	listp = heap_listp;
	heap_listp += 10*WSIZE;
	
	
	if(expend_heap(CHUNKSIZE/WSIZE) == NULL)
		return -1;
	return 0;
}

Here, we first request 12 words. The first 9 words hold the root pointers of the size classes in order, initially NULL, and the last three words hold the prologue block and the epilogue block. We let listp point to the start of the size class array and heap_listp point to the payload of the prologue block, and then call expend_heap to request heap space. Note: a root pointer is equivalent to a block that has only a successor field, so the block it points to can be read through the SUCC macro. This is why the successor is stored in the first word.

static void *expend_heap(size_t words){
	size_t size;
	void *bp;
	
	size = words%2 ? (words+1)*WSIZE : words*WSIZE;
	if((bp = mem_sbrk(size)) == (void*)-1)
		return NULL;
	PUT(HDRP(bp), PACK(size, 0));
	PUT(FTRP(bp), PACK(size, 0));
	PUT(HDRP(NEXT_BLKP(bp)), PACK(0, 1));
	
	PUT(PRED(bp), NULL);
	PUT(SUCC(bp), NULL);
	
	//Merge now
	bp = imme_coalesce(bp);
	bp = add_block(bp);
	return bp;
}

First, compute the aligned size, and then set the header and footer of the free block and the epilogue block, just as in the implicit free list. Since the free block has not yet been inserted into a free list, its predecessor and successor pointers are first set to NULL. Then the imme_coalesce function is called to merge free blocks immediately, and the add_block function is called to insert the free block into the free list of the appropriate size class.

static void *imme_coalesce(void *bp){
	size_t prev_alloc = GET_ALLO(FTRP(PREV_BLKP(bp)));
	size_t next_alloc = GET_ALLO(HDRP(NEXT_BLKP(bp)));
	size_t size = GET_SIZE(HDRP(bp));
	
	if(prev_alloc && next_alloc){
		return bp;
	}else if(prev_alloc && !next_alloc){
		size += GET_SIZE(HDRP(NEXT_BLKP(bp)));
		delete_block(NEXT_BLKP(bp));
		PUT(HDRP(bp), PACK(size, 0));
		PUT(FTRP(bp), PACK(size, 0));
	}else if(!prev_alloc && next_alloc){
		size += GET_SIZE(FTRP(PREV_BLKP(bp)));
		delete_block(PREV_BLKP(bp));
		PUT(HDRP(PREV_BLKP(bp)), PACK(size, 0));
		PUT(FTRP(bp), PACK(size, 0));
		bp = PREV_BLKP(bp);
	}else{
		size += GET_SIZE(HDRP(NEXT_BLKP(bp))) +
				GET_SIZE(FTRP(PREV_BLKP(bp)));
		delete_block(NEXT_BLKP(bp));
		delete_block(PREV_BLKP(bp));
		PUT(HDRP(PREV_BLKP(bp)), PACK(size, 0));
		PUT(FTRP(NEXT_BLKP(bp)), PACK(size, 0));
		bp = PREV_BLKP(bp);
	}
	return bp;
}

First, the allocated bits of the blocks adjacent to the free block bp are read, which gives four cases according to their combination.

As can be seen, if a neighboring block is free, the delete_block function must first be called to remove that block from its explicit free list and fix up the affected pointers. Then the corresponding header and footer are set, so that the free blocks are merged and bp points to the payload of the new free block.

 static void delete_block(void *bp){
	PUT(SUCC(PRED_BLKP(bp)), SUCC_BLKP(bp));
	if(SUCC_BLKP(bp)!=NULL)
		PUT(PRED(SUCC_BLKP(bp)), PRED_BLKP(bp));
}

Removing a given free block from its explicit free list is actually very simple: just adjust the pointers of its predecessor and successor so that they skip the current free block.
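The reason no special case is needed for the first node of a list can be shown on a small model. In this sketch, which is an illustration under stated assumptions rather than the allocator's code, a "pointer" is a word index into an array, index 0 plays the role of NULL, and a root is a single word that is treated as a node whose successor field is the word itself. The name unlink_node is hypothetical.

```c
#include <stdint.h>

#define NIL 0u   /* plays the role of NULL; index 0 is never used as a node */

/* For a free block at index bp, mem[bp] holds its successor and
 * mem[bp + 1] its predecessor. For a root at index r, mem[r] is its successor. */
static void unlink_node(uint32_t *mem, uint32_t bp) {
    uint32_t succ = mem[bp];
    uint32_t pred = mem[bp + 1];

    mem[pred] = succ;            /* pred's successor skips bp (pred may be the root) */
    if (succ != NIL)
        mem[succ + 1] = pred;    /* succ's predecessor skips bp */
}
```

Because the successor lives in the first word, writing to mem[pred] works the same whether pred is an ordinary free block or the one-word root, which mirrors what PUT(SUCC(PRED_BLKP(bp)), ...) does.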

After merging free blocks, we need to insert the result into the explicit free list of the appropriate size class

static void *add_block(void *bp){
	size_t size = GET_SIZE(HDRP(bp));
	int index = Index(size);
	void *root = listp+index*WSIZE;
	
	//LIFO
	return LIFO(bp, root);
	//AddressOrder
	//return AddressOrder(bp, root);
}

When inserting a free block into an explicit free list, we first need to determine the size class of the free block

static int Index(size_t size){
	int ind = 0;
	if(size >= 4096)
		return 8;
	
	size = size>>5;
	while(size){
		size = size>>1;
		ind++;
	}
	return ind;
}

This gives the root pointer of the explicit free list for that size class. Two strategies for inserting a free block into the list are provided: LIFO and address order.

static void *LIFO(void *bp, void *root){
	if(SUCC_BLKP(root)!=NULL){
		PUT(PRED(SUCC_BLKP(root)), bp);	//SUCC->BP
		PUT(SUCC(bp), SUCC_BLKP(root));	//BP->SUCC
	}else{
		PUT(SUCC(bp), NULL);	//Do not forget this: bp is the only node, so its successor must be NULL
	}
	PUT(SUCC(root), bp);	//ROOT->BP
	PUT(PRED(bp), root);	//BP->ROOT
	return bp;
}

The LIFO strategy inserts the free block directly at the head of the list. Note that when root has no successor, bp becomes the only node after root, and we must remember to set the successor of bp to NULL.
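The same insertion can be traced on a small model where a "pointer" is a word index into an array, index 0 stands for NULL, and the root is a single word. This is a sketch with hypothetical names, not the allocator's code.

```c
#include <stdint.h>

#define NIL 0u   /* plays the role of NULL; index 0 is never used as a node */

/* mem[bp] is the successor of bp, mem[bp + 1] its predecessor.
 * mem[root] is the first node of the list. */
static void push_front(uint32_t *mem, uint32_t root, uint32_t bp) {
    uint32_t first = mem[root];

    mem[bp] = first;             /* bp's successor is the old first node (NIL if the list was empty) */
    mem[bp + 1] = root;          /* bp's predecessor is the root */
    if (first != NIL)
        mem[first + 1] = bp;     /* the old first node now follows bp */
    mem[root] = bp;              /* the root points to bp */
}
```

Pushing node 8 and then node 16 onto an empty list rooted at index 1 gives root, 16, 8. Writing the successor of bp unconditionally covers the empty-list case without a separate branch.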

 static void *AddressOrder(void *bp, void *root){
	void *succ = root;
	while(SUCC_BLKP(succ) != NULL){
		succ = SUCC_BLKP(succ);
		if(succ >= bp){
			break;
		}
	}
	if(succ == root){
		return LIFO(bp, root);
	}else if(succ < bp){	//Reached the tail and bp has the highest address: append
		PUT(SUCC(succ), bp);
		PUT(PRED(bp), succ);
		PUT(SUCC(bp), NULL);
	}else{
		PUT(SUCC(PRED_BLKP(succ)), bp);
		PUT(PRED(bp), PRED_BLKP(succ));
		PUT(SUCC(bp), succ);
		PUT(PRED(succ), bp);
	}
	return bp;
}

The address order strategy keeps the free block addresses in each explicit free list in increasing order. First fit under this strategy tends to give higher memory utilization than first fit under LIFO.
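An address-ordered insert can be written with a single walk that keeps both the predecessor and the successor of the insertion point. The sketch below uses a model where a "pointer" is a word index into an array and index 0 stands for NULL. It is an alternative formulation with hypothetical names, not the AddressOrder function above.

```c
#include <stdint.h>

#define NIL 0u   /* plays the role of NULL; index 0 is never used as a node */

/* mem[bp] is the successor of bp, mem[bp + 1] its predecessor.
 * Nodes are kept in increasing index ("address") order after the root. */
static void insert_sorted(uint32_t *mem, uint32_t root, uint32_t bp) {
    uint32_t pred = root;
    uint32_t succ = mem[root];

    while (succ != NIL && succ < bp) {   /* find the first node located after bp */
        pred = succ;
        succ = mem[succ];
    }
    mem[bp] = succ;                      /* link bp between pred and succ */
    mem[bp + 1] = pred;
    mem[pred] = bp;
    if (succ != NIL)
        mem[succ + 1] = bp;
}
```

Inserting nodes 16, 8 and 24 in that order into an empty list rooted at index 1 produces the order 8, 16, 24. The empty list, the head, the middle and the tail are all handled by the same four assignments.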

Then we can look at our mm_malloc function

void *mm_malloc(size_t size){
	size_t asize;
	void *bp;
	
	if(size == 0)
		return NULL;
	//Satisfy the minimum block size and alignment requirements; size is the payload size
	asize = size<=DSIZE ? 2*DSIZE : DSIZE * ((size + (DSIZE) + (DSIZE-1)) / DSIZE);
	//First fit
	if((bp = first_fit(asize)) != NULL){
		place(bp, asize);
		return bp;
	}
	//Best fit
	/*if((bp = best_fit(asize)) != NULL){
		place(bp, asize);
		return bp;
	}*/
	if((bp = expend_heap(MAX(CHUNKSIZE, asize)/WSIZE)) == NULL)
		return NULL;
	place(bp, asize);
	return bp;
} 

First, compute the block size that satisfies the minimum block size and alignment requirements. Note: a free block needs two extra words to store the predecessor and successor pointers, but an allocated block does not, so asize here is computed the same way as for the implicit free list.

Two strategies, first fit and best fit, are provided here

static void *first_fit(size_t asize){
	int ind = Index(asize);
	void *succ;
	while(ind <= 8){
		succ = listp+ind*WSIZE;
		while((succ = SUCC_BLKP(succ)) != NULL){
			if(GET_SIZE(HDRP(succ)) >= asize && !GET_ALLO(HDRP(succ))){
				return succ;
			}
		}
		ind+=1;
	}
	return NULL;
}

First fit determines the size class for a block of size asize and then searches the explicit free list of that class. If a free block of sufficient size is found, it is returned directly. If that list has no suitable free block, the search moves on to the explicit free list of the next size class, because every free block in a larger size class is larger than those in the current class.
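The upward search through the size classes can be illustrated on a simplified model in which each class just holds the sizes of a few free blocks. The struct and function names below are assumptions for the example, and real lists are linked through the heap rather than stored in arrays.

```c
#include <stddef.h>

#define NCLASS 9
#define MAX_PER_CLASS 4

/* Model (hypothetical): each size class holds the sizes of a few free blocks. */
struct seg_lists {
    size_t sizes[NCLASS][MAX_PER_CLASS];
    int count[NCLASS];
};

/* Size class of a block: 0 for 16-31 bytes, 1 for 32-63, ..., 8 for 4096 and up. */
static int class_of(size_t size) {
    size_t upper = 32;
    int idx = 0;

    while (idx < NCLASS - 1 && size >= upper) {
        upper <<= 1;
        idx++;
    }
    return idx;
}

/* Size of the first block that fits, searching the request's own class first
 * and then every larger class. Returns 0 if nothing fits. */
static size_t seg_first_fit(const struct seg_lists *s, size_t asize) {
    int c, i;

    for (c = class_of(asize); c < NCLASS; c++)
        for (i = 0; i < s->count[c]; i++)
            if (s->sizes[c][i] >= asize)
                return s->sizes[c][i];
    return 0;
}
```

With a 40-byte block in class 1 and a 72-byte block in class 2, a 48-byte request skips the 40-byte block in its own class and is served from class 2. Within a class the size check is still needed, since a class covers a range of sizes.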

 static void *best_fit(size_t asize){
	int ind = Index(asize);
	void *best = NULL;
	int min_size = 0, size;
	void *succ;
	while(ind <= 8){
		succ = listp+ind*WSIZE;
		while((succ = SUCC_BLKP(succ)) != NULL){
			size = GET_SIZE(HDRP(succ));
			if(size >= asize && !GET_ALLO(HDRP(succ)) && (size<min_size||min_size==0)){
				best = succ;
				min_size = size;
			}
		}
		if(best != NULL)
			return best;
		ind+=1;
	}
	return NULL;
}

The best fit is to find the smallest free block that meets the size requirements.

When a suitable free block is found, we need to call the place function to use the free block

static void place(void *bp, size_t asize){
	size_t remain_size;
	remain_size = GET_SIZE(HDRP(bp)) - asize;
	delete_block(bp);
	if(remain_size >= DSIZE*2){	//split
		PUT(HDRP(bp), PACK(asize, 1));
		PUT(FTRP(bp), PACK(asize, 1));
		PUT(HDRP(NEXT_BLKP(bp)), PACK(remain_size, 0));
		PUT(FTRP(NEXT_BLKP(bp)), PACK(remain_size, 0));
		add_block(NEXT_BLKP(bp));
	}else{
		PUT(HDRP(bp), PACK(GET_SIZE(HDRP(bp)), 1));
		PUT(FTRP(bp), PACK(GET_SIZE(HDRP(bp)), 1));
	}
}

In the place function, we first remove the free block from its explicit free list and then check whether the remaining space meets the minimum free block size. If so, we split the free block and call the add_block function on the remainder to put it into the explicit free list of the appropriate size class. If the remaining space is not enough to form a free block, the whole free block is used.

Next, let's take a look at our mm_free function

/*
 * mm_free - Mark the block as free, merge it and add it to a free list.
 */
void mm_free(void *ptr){
	size_t size = GET_SIZE(HDRP(ptr));

	PUT(HDRP(ptr), PACK(size, 0));
	PUT(FTRP(ptr), PACK(size, 0));
	
	//Merge now
	ptr = imme_coalesce(ptr);
	add_block(ptr);
}
 

The function first updates the header and footer of the allocated block to mark it as free, then calls the imme_coalesce function to merge it immediately, and finally calls the add_block function to insert it at the appropriate position in the explicit free list of the appropriate size class.

Next, let's look at our mm_realloc function

/*
 * mm_realloc - Implemented simply in terms of mm_malloc and mm_free
 */
void *mm_realloc(void *ptr, size_t size){
	size_t asize, ptr_size, remain_size;
	void *new_bp;
	
	if(ptr == NULL){
		return mm_malloc(size);
	}
	if(size == 0){
		mm_free(ptr);
		return NULL;
	}
	
	asize = size<=DSIZE ? 2*DSIZE : DSIZE * ((size + (DSIZE) + (DSIZE-1)) / DSIZE);
	new_bp = imme_coalesce(ptr);	//Try to merge with neighboring free blocks
	ptr_size = GET_SIZE(HDRP(new_bp));
	PUT(HDRP(new_bp), PACK(ptr_size, 1));
	PUT(FTRP(new_bp), PACK(ptr_size, 1));
	if(new_bp != ptr)
		memmove(new_bp, ptr, GET_SIZE(HDRP(ptr)) - DSIZE);	//The regions may overlap, so use memmove
	
	if(ptr_size == asize){
		return new_bp;
	}else if(ptr_size > asize){
		remain_size = ptr_size - asize;
		if(remain_size >= DSIZE*2){	//split
			PUT(HDRP(new_bp), PACK(asize, 1));
			PUT(FTRP(new_bp), PACK(asize, 1));
			PUT(HDRP(NEXT_BLKP(new_bp)), PACK(remain_size, 0));
			PUT(FTRP(NEXT_BLKP(new_bp)), PACK(remain_size, 0));
			add_block(NEXT_BLKP(new_bp));
		}
		return new_bp;
	}else{
		if((ptr = mm_malloc(asize)) == NULL)
			return NULL;
		memcpy(ptr, new_bp, ptr_size - DSIZE);
		mm_free(new_bp);
		return ptr;
	}
} 

Here is the same idea as before.

Finally, pointer errors are easy to make with segregated free lists. Here is a function that prints the explicit free list of every size class, which helps detect pointer errors

static void print_listp(){
	int ind;
	void *node, *root;
	printf("print listp\n");
	for(ind=0;ind<=8;ind++){	//Start at class 0 ({16~31}) so that no list is skipped
		node = listp+ind*WSIZE;
		root = listp+ind*WSIZE;
		printf("%d:\n",ind);
		while(SUCC_BLKP(node)){
			node = SUCC_BLKP(node);
			printf("-->%p,%d",node, GET_SIZE(HDRP(node)));
		}
		printf("-->%p\n",SUCC_BLKP(node));
		while(node!=root){
			printf("<--%p,%d",node, GET_SIZE(HDRP(node)));
			node = PRED_BLKP(node);
		}
		printf("<--%p\n",node);
	}
}

Then here are the experimental results

Immediate merge + first fit + LIFO:

trace  valid  util     ops      secs  Kops
 0       yes   98%    5694  0.000925  6158
 1       yes   94%    5848  0.001066  5487
 2       yes   98%    6648  0.001315  5054
 3       yes   99%    5380  0.000914  5887
 4       yes   66%   14400  0.001441  9991
 5       yes   89%    4800  0.001068  4495
 6       yes   85%    4800  0.001189  4037
 7       yes   55%   12000  0.001854  6474
 8       yes   51%   24000  0.003982  6027
 9       yes   48%   14401  0.153353    94
10       yes   45%   14401  0.034466   418
Total          75%  112372  0.201572   557

Perf index = 45 (util) + 37 (thru) = 82/100

Immediate merge + best fit + LIFO:

Results for mm malloc:
trace  valid  util     ops      secs  Kops
 0       yes   99%    5694  0.001026  5548
 1       yes   99%    5848  0.001052  5558
 2       yes   99%    6648  0.001170  5682
 3       yes  100%    5380  0.001103  4878
 4       yes   66%   14400  0.001601  8993
 5       yes   96%    4800  0.002048  2344
 6       yes   95%    4800  0.001930  2487
 7       yes   55%   12000  0.001934  6206
 8       yes   51%   24000  0.004590  5229
 9       yes   40%   14401  0.004290  3357
10       yes   45%   14401  0.003162  4555
Total          77%  112372  0.023907  4700

Perf index = 46 (util) + 40 (thru) = 86/100

Immediate merge + best fit + AddressOrder:

Results for mm malloc:
trace  valid  util     ops      secs  Kops
 0       yes   99%    5694  0.001086  5244
 1       yes   99%    5848  0.001069  5473
 2       yes   99%    6648  0.001304  5097
 3       yes  100%    5380  0.001135  4738
 4       yes   66%   14400  0.001509  9544
 5       yes   96%    4800  0.003104  1546
 6       yes   95%    4800  0.003175  1512
 7       yes   55%   12000  0.020766   578
 8       yes   51%   24000  0.071639   335
 9       yes   40%   14401  0.004262  3379
10       yes   45%   14401  0.003016  4775
Total          77%  112372  0.112065  1003

Perf index = 46 (util) + 40 (thru) = 86/100

Immediate merge + first fit + AddressOrder:

Results for mm malloc:
trace  valid  util     ops      secs  Kops
 0       yes   99%    5694  0.000945  6023
 1       yes   99%    5848  0.001001  5843
 2       yes   99%    6648  0.001177  5649
 3       yes  100%    5380  0.000948  5676
 4       yes   66%   14400  0.001481  9720
 5       yes   93%    4800  0.002984  1608
 6       yes   91%    4800  0.002907  1651
 7       yes   55%   12000  0.020213   594
 8       yes   51%   24000  0.070960   338
 9       yes   40%   14401  0.004211  3420
10       yes   45%   14401  0.002934  4908
Total          76%  112372  0.109762  1024

Perf index = 46 (util) + 40 (thru) = 86/100
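
For reference, the "Perf index" line under each table can be reproduced from the Total row. The sketch below assumes the weighting normally found in the handout's config.h (utilization weight 0.6, reference throughput 600 Kops/s). Those two constants are not shown in this article, so treat them as assumptions, although they agree with every index printed above. The function names are made up for illustration.

```c
#include <assert.h>

/* Assumed constants (not shown in this article; typical handout values). */
#define UTIL_WEIGHT      0.60   /* share of the score given to memory utilization */
#define REF_THRUPUT_KOPS 600.0  /* throughput at which the throughput score saturates */

/* Utilization points (out of 60) for an average utilization in [0, 1]. */
double perf_util(double util)
{
    assert(util >= 0.0 && util <= 1.0);
    return 100.0 * UTIL_WEIGHT * util;
}

/* Throughput points (out of 40); capped once the reference throughput is reached. */
double perf_thru(double kops)
{
    double ratio = kops / REF_THRUPUT_KOPS;
    if (ratio > 1.0)
        ratio = 1.0;
    return 100.0 * (1.0 - UTIL_WEIGHT) * ratio;
}
```

Under these assumptions, 77% utilization at 4700 Kops gives 46 + 40 = 86, and 75% at 557 Kops gives 45 + 37 = 82. Once throughput is above the cap, only utilization can raise the score further.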

Explicit free linked list

This time, in malloc lab, we write a memory allocator ourselves.

1. Experimental purpose

In short, malloclab asks us to implement our own malloc, free and realloc functions. After this experiment you will have a deeper understanding of pointers and of some core concepts in memory allocation: how to organize the heap, how to find a usable free block (first fit, next fit or best fit), and how to trade off throughput against memory utilization.

In my personal opinion, the basic knowledge behind malloclab is not difficult, but the code is full of pointer operations. To avoid hard-coding them we define macros, and macros in turn make debugging harder (of course, great gods such as Linus would ask why you need to debug at all if the code is written correctly). Debugging can only rely on gdb and print statements, so the lab is still difficult as a whole.

2. Background knowledge

Here is a brief introduction to the background knowledge needed to do this experiment.

First, what problems must be solved in order to write an allocator? The CSAPP slides for this chapter list some key issues:

The first question: how does a routine like free(ptr) know the size of the block being released?

Obviously, we need to store some metadata with each block. Concretely, one word is added in front of the block to record its size and whether it is allocated.

Note: this is only the simplest case. In practice, this is not the only metadata that is stored.
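
As a minimal sketch of this idea (the helper names are illustrative and not part of the lab handout): because block sizes are multiples of 8, the low three bits of the size are always zero, so the allocated flag can share the same 4-byte word.

```c
#include <assert.h>
#include <stdint.h>

/* Pack a block size (multiple of 8) and an allocated flag into one header word. */
uint32_t hdr_pack(uint32_t size, uint32_t alloc)
{
    assert(size % 8 == 0);   /* low 3 bits must be free for flags */
    assert(alloc <= 1);
    return size | alloc;
}

/* Recover the size by masking off the three flag bits. */
uint32_t hdr_size(uint32_t hdr)  { return hdr & ~(uint32_t)0x7; }

/* Recover the allocated flag from the lowest bit. */
uint32_t hdr_alloc(uint32_t hdr) { return hdr & 0x1; }
```

With this layout, free(ptr) only has to read the word just before ptr: hdr_size tells it how many bytes the block spans.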

The second question: how do we keep track of free blocks?

CSAPP gives four schemes in total. The source code of the implicit list is given in the book. I personally implemented the implicit list and the explicit list. The segregated free list, I feel, can be built by wrapping the explicit list in an object-oriented way, and the same goes for red-black trees.

The third question is the splitting strategy (see the place function in the code for details).

The fourth question is the placement strategy: generally there are first fit, next fit and best fit. I use the simplest one, first fit. (This is really a trade-off: it depends on whether you want high throughput or high memory utilization.)
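
To make the trade-off concrete, here is a small sketch that models the free list as a plain array of block sizes and compares first fit with best fit (next fit is left out). The array model and the function names are simplifications for illustration, not the allocator's real data structure.

```c
#include <assert.h>
#include <stddef.h>

/* First fit: index of the first free block large enough, or -1 if none. */
int first_fit(const size_t *free_sizes, int n, size_t asize)
{
    assert(n >= 0);
    for (int i = 0; i < n; i++)
        if (free_sizes[i] >= asize)
            return i;            /* stop at the first match: fast */
    return -1;
}

/* Best fit: index of the smallest free block that is still large enough. */
int best_fit(const size_t *free_sizes, int n, size_t asize)
{
    assert(n >= 0);
    int best = -1;
    for (int i = 0; i < n; i++)
        if (free_sizes[i] >= asize &&
            (best < 0 || free_sizes[i] < free_sizes[best]))
            best = i;            /* must scan the whole list: slower, tighter */
    return best;
}
```

For free sizes {64, 16, 32, 24} and a 24-byte request, first fit takes the 64-byte block at index 0 and would split it, while best fit scans everything and picks the exact 24-byte block at index 3.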

ok, let's take a look at how the implicit list and explicit list are implemented.

3. Implicit list

The following shows how an implicit list is organized and what a single block looks like. Each block carries boundary tags on both sides (header and footer), so the list can be walked both forward and backward.

Advantage of this scheme: simple to implement. Disadvantage: finding a free block is expensive, because allocated blocks have to be scanned as well.

Now let's talk about some of the code given in the lab:

  1. memlib: this file provides the heap extension routine. It also lets us obtain the first byte, the last byte, the size, etc. of the currently available heap. Internally it works by moving an sbrk-style break pointer.
  2. mdriver: this file does not need to be read. We only use the executable compiled from it to test whether our allocator is correct.
  3. mm.c: this is the file we work on. It mainly implements three functions, mm_malloc, mm_free and mm_realloc, plus whatever additional helper functions we need.

OK, on to the actual code. Because it involves many pointer operations, we encapsulate them in macro definitions:

#define WSIZE       4       /* Word and header/footer size (bytes) */ 
#define DSIZE       8       /* Double word size (bytes) */
#define CHUNKSIZE  (1<<12)  /* Extend heap by this amount (bytes) */

#define MAX(x, y) ((x) > (y)? (x) : (y))  

/* Pack a size and allocated bit into a word */
#define PACK(size, alloc)  ((size) | (alloc)) 

/* Read and write a word at address p */
#define GET(p)       (*(unsigned int *)(p))           
#define PUT(p, val)  (*(unsigned int *)(p) = (val))   

/* Read the size and allocated fields from address p */
#define GET_SIZE(p)  (GET(p) & ~0x7)                   
#define GET_ALLOC(p) (GET(p) & 0x1)                    

/* Given block ptr bp, compute address of its header and footer */
#define HDRP(bp)       ((char *)(bp) - WSIZE)                     
#define FTRP(bp)       ((char *)(bp) + GET_SIZE(HDRP(bp)) - DSIZE) 

/* Given block ptr bp, compute address of next and previous blocks */
#define NEXT_BLKP(bp)  ((char *)(bp) + GET_SIZE(((char *)(bp) - WSIZE)))
#define PREV_BLKP(bp)  ((char *)(bp) - GET_SIZE(((char *)(bp) - DSIZE)))

Comments give the meaning of each macro.
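
To make the macros concrete, here is a small sketch that lays out two blocks by hand in a local buffer and walks between them. The buffer, the block sizes and the helper name demo_heap_walk are made up for illustration; the macros themselves are copied from above.

```c
#include <assert.h>
#include <stdint.h>

#define WSIZE 4
#define DSIZE 8
#define PACK(size, alloc)  ((size) | (alloc))
#define GET(p)             (*(unsigned int *)(p))
#define PUT(p, val)        (*(unsigned int *)(p) = (val))
#define GET_SIZE(p)        (GET(p) & ~0x7)
#define GET_ALLOC(p)       (GET(p) & 0x1)
#define HDRP(bp)           ((char *)(bp) - WSIZE)
#define FTRP(bp)           ((char *)(bp) + GET_SIZE(HDRP(bp)) - DSIZE)
#define NEXT_BLKP(bp)      ((char *)(bp) + GET_SIZE(((char *)(bp) - WSIZE)))
#define PREV_BLKP(bp)      ((char *)(bp) - GET_SIZE(((char *)(bp) - DSIZE)))

/* Build [16-byte allocated block][24-byte free block] and check the walk.
 * Returns 1 when every macro lands where the layout says it should. */
int demo_heap_walk(void)
{
    static uint32_t words[10];          /* 40 bytes, word aligned */
    char *base = (char *)words;
    char *a = base + WSIZE;             /* payload of block A */

    PUT(HDRP(a), PACK(16, 1));          /* header must be written first: */
    PUT(FTRP(a), PACK(16, 1));          /* FTRP reads the size from it   */

    char *b = NEXT_BLKP(a);             /* payload of block B */
    PUT(HDRP(b), PACK(24, 0));
    PUT(FTRP(b), PACK(24, 0));

    assert(b == a + 16);                /* next payload is one block size later */
    return PREV_BLKP(b) == a            /* footer of A leads back to A */
        && FTRP(b) == base + 36         /* footer of B is the last word */
        && GET_SIZE(HDRP(b)) == 24
        && GET_ALLOC(HDRP(a)) == 1
        && GET_ALLOC(HDRP(b)) == 0;
}
```

Note the order of the two PUT calls for each block: FTRP looks up the size in the header, so the header has to be valid before the footer address is computed.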

Some additional definitions:

static void *free_list_head = NULL;	 // The head of the whole list

static void *extend_heap(size_t words);	// Grow the heap when no free block is large enough
static void *coalesce(void *bp);	// When a block is freed, the blocks before and/or after it may also be free; they must then be merged, because two adjacent free blocks are never allowed in the list
static void *find_fit(size_t size);	// Find a block in the list that can satisfy the malloc request
static void place(void *bp, size_t size); // Place the request in the block; if the remainder is at least the minimum block size, split the block

mm_init

This function initializes the allocator:

  1. Four words are allocated. Word 0 is padding, so that the payload of every subsequently allocated block is 8-byte aligned.
  2. Words 1-2 are the prologue block. free_list_head points here, which gives the list a defined starting point; otherwise, where would a search begin?
  3. Word 3 is the epilogue block, which marks the end of the heap.
  4. extend_heap then allocates a large chunk for subsequent malloc requests.
/* 
 * mm_init - initialize the malloc package.
 */
int mm_init(void)
{
   // Create the initial empty heap
    if( (free_list_head = mem_sbrk(4 * WSIZE)) == (void *)-1 ){
        return -1;
    }

    PUT(free_list_head, 0);
    PUT(free_list_head + (1 * WSIZE), PACK(DSIZE, 1));
    PUT(free_list_head + (2 * WSIZE), PACK(DSIZE, 1));
    PUT(free_list_head + (3 * WSIZE), PACK(0, 1));
    free_list_head += (2 * WSIZE);

    // Extend the empty heap with a free block of CHUNKSIZE bytes
    if(extend_heap(CHUNKSIZE/WSIZE) == NULL){
        return -1;
    }
    return 0;
}

extend_heap

What it does:

  1. Round the size up to an even number of words
  2. Write the metadata of the new free block, i.e. its header and footer
  3. Write the new epilogue header
static void *extend_heap(size_t words)
{
    char *bp;
    size_t size;

    /* Allocate an even number of words to maintain alignment */
    size = (words % 2) ? (words + 1) * WSIZE : words * WSIZE;
    if( (long)(bp = mem_sbrk(size)) == -1 ){
        return NULL;
    }

    // Initialize header/footer and epilogue header of free block
    PUT(HDRP(bp), PACK(size, 0));
    PUT(FTRP(bp), PACK(size, 0));
    PUT(HDRP(NEXT_BLKP(bp)), PACK(0, 1));

    // Coalesce if the previous block was free
    return coalesce(bp);
}

mm_malloc

mm_malloc is also relatively simple. First, adjust the requested size to cover the metadata overhead and the 8-byte alignment requirement. Then look for a block in the current heap that can satisfy the request. If there is one, place the request directly. If not, extend the heap and then place it.

/* 
 * mm_malloc - Allocate a block from the free list, extending the heap if no fit is found.
 *     Always allocate a block whose size is a multiple of the alignment.
 */
void *mm_malloc(size_t size)
{
    size_t asize;      // Adjusted block size
    size_t extendsize; // Amount to extend heap if no fit
    char *bp;

    if (size == 0)
        return NULL;

    // Adjust block size to include overhead and alignment reqs
    if (size <= DSIZE)
    {
        asize = 2 * DSIZE;
    }
    else
    {
        asize = DSIZE * ((size + (DSIZE) + (DSIZE - 1)) / DSIZE); // Above 8 bytes: add the header/footer overhead and round up to a multiple of 8
    }

    // Search the free list for a fit
    if ((bp = find_fit(asize)) != NULL)
    {
        place(bp, asize);
    }
    else
    {
        // No fit found. Get more memory and place the block
        extendsize = MAX(asize, CHUNKSIZE);
        if ((bp = extend_heap(extendsize / WSIZE)) == NULL)
        {
            return NULL;
        }
        place(bp, asize);
    }

#ifdef DEBUG
    printf("malloc\n");
    print_allocated_info();
#endif
    return bp;
}
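
The size adjustment at the top of mm_malloc is easy to get wrong by one word, so here is the same arithmetic pulled out into a standalone function with a few worked values. The function name adjust_size is only for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define DSIZE 8

/* Adjusted block size for a request of `size` payload bytes:
 * add DSIZE for header + footer, then round up to a multiple of DSIZE. */
size_t adjust_size(size_t size)
{
    assert(size > 0);                       /* mm_malloc returns NULL for 0 */
    if (size <= DSIZE)
        return 2 * DSIZE;                   /* minimum block: 16 bytes */
    return DSIZE * ((size + DSIZE + (DSIZE - 1)) / DSIZE);
}
```

A request of 1 or 8 bytes becomes a 16-byte block, 9 or 16 bytes become 24, and 17 bytes become 32. The overhead is always one double word, plus whatever padding alignment requires.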

find_fit

Traverse the whole list, find the block that has not been allocated and meets the current request size, and then return the first address of the block.

/**
 * @brief Use first fit policy
 * 
 * @param size 
 * @return void* If successful, the first address of the available block is returned
 *               Failed, NULL returned
 */
static void *find_fit(size_t size)
{
    void *bp ;      

    for (bp = NEXT_BLKP(free_list_head); GET_SIZE(HDRP(bp)) > 0; bp = NEXT_BLKP(bp))
    {
        if(GET_ALLOC(HDRP(bp)) == 0 && size <= GET_SIZE(HDRP(bp)))
        {
            return bp;
        }
    }
    return NULL;
}

place

The work of place is also simple:

  1. If current block size - requested block size >= minimum block size (2 * DSIZE), split the current block
  2. Otherwise, place directly.

The code:

/**
 * @brief place block
 * 
 * @param bp 
 * @param size 
 */
static void place(void *bp, size_t size)
{
    size_t remain_size;
    size_t origin_size;

    origin_size = GET_SIZE(HDRP(bp));
    remain_size = origin_size - size;
    if(remain_size >= 2*DSIZE)  // The remainder must be at least the minimum block size (header/footer take one double word, the payload is not empty, plus the alignment requirement)
    {
        PUT(HDRP(bp), PACK(size, 1));
        PUT(FTRP(bp), PACK(size, 1));
        PUT(HDRP(NEXT_BLKP(bp)), PACK(remain_size, 0));
        PUT(FTRP(NEXT_BLKP(bp)), PACK(remain_size, 0));
    }else{
        // Remainder too small to form a block of its own: keep it as internal fragmentation
        PUT(HDRP(bp), PACK(origin_size, 1));
        PUT(FTRP(bp), PACK(origin_size, 1));
    }
}
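
The split rule in place boils down to one comparison. A minimal sketch, with the hypothetical helper split_remainder standing in for the decision made inside place:

```c
#include <assert.h>
#include <stddef.h>

#define DSIZE     8
#define MIN_SPLIT (2 * DSIZE)   /* smallest block that can stand on its own */

/* Size of the free remainder created by placing `asize` bytes in a free
 * block of `block_size` bytes, or 0 when the whole block is handed out. */
size_t split_remainder(size_t block_size, size_t asize)
{
    assert(asize <= block_size);
    size_t remain = block_size - asize;
    return remain >= MIN_SPLIT ? remain : 0;
}
```

Placing 24 bytes into a 4096-byte block leaves a 4072-byte free block. Placing 24 bytes into a 32-byte block leaves only 8 bytes, which cannot hold a header, a footer and a payload, so the caller receives all 32 bytes.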

mm_free

Now let's look at free. It is also quite simple: update the allocation status of the current block from 1 to 0, then run coalesce:

/*
 * mm_free - Mark the block as free and coalesce it with any free neighbours.
 */
void mm_free(void *ptr)
{
    size_t size = GET_SIZE(HDRP(ptr));

    PUT(HDRP(ptr), PACK(size, 0));
    PUT(FTRP(ptr), PACK(size, 0));
    coalesce(ptr);

#ifdef DEBUG
    printf("free\n");
    print_allocated_info();
#endif
}

coalesce

After freeing a block, check whether the blocks before and after it are free. If so, they must be merged. There are four cases:

// Thanks to the prologue and epilogue blocks, no boundary checks are needed here.
static void *coalesce(void *bp)
{
    size_t pre_alloc = GET_ALLOC(HDRP(PREV_BLKP(bp)));
    size_t next_alloc = GET_ALLOC(HDRP(NEXT_BLKP(bp)));
    size_t size = GET_SIZE(HDRP(bp));

    if(pre_alloc && next_alloc){    // case 1: previous and next both allocated
        return bp;
    }
    else if(pre_alloc && !next_alloc){  // case 2: previous allocated, next free
        void *next_block = NEXT_BLKP(bp);
        size += GET_SIZE(HDRP(next_block));
        PUT(HDRP(bp), PACK(size, 0));
        PUT(FTRP(next_block), PACK(size, 0));
        // Note: the two stale tags that now lie inside the merged block are not cleared. Under normal circumstances there is no need to.
    }
    else if(!pre_alloc && next_alloc){  // case 3: previous free, next allocated
        size += GET_SIZE(HDRP(PREV_BLKP(bp)));
        PUT(FTRP(bp), PACK(size, 0));
        PUT(HDRP(PREV_BLKP(bp)), PACK(size, 0));
        bp = PREV_BLKP(bp);
    }
    else {      // case 4: previous and next both free
        size += GET_SIZE(HDRP(PREV_BLKP(bp))) +
                GET_SIZE(FTRP(NEXT_BLKP(bp)));
        PUT(HDRP(PREV_BLKP(bp)), PACK(size, 0));
        PUT(FTRP(NEXT_BLKP(bp)), PACK(size, 0));
        bp = PREV_BLKP(bp);
    }

    return bp;
}
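
The four cases can also be read as a small decision table. The sketch below models only the bookkeeping (which case applies, how large the merged block is, and where it starts). It does not touch a real heap, and the struct and function names are invented for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Result of merging a freshly freed block with its neighbours. */
struct merge_result {
    int    kase;           /* 1..4, numbered as in coalesce above */
    size_t size;           /* size of the merged free block */
    int    starts_at_prev; /* 1 if the merged block begins at the previous block */
};

struct merge_result merge_plan(size_t prev_size, int prev_alloc,
                               size_t cur_size,
                               size_t next_size, int next_alloc)
{
    struct merge_result r = { 1, cur_size, 0 };
    assert(cur_size > 0);
    if (prev_alloc && next_alloc) {          /* case 1: nothing to merge */
        return r;
    } else if (prev_alloc && !next_alloc) {  /* case 2: absorb next */
        r.kase = 2; r.size += next_size;
    } else if (!prev_alloc && next_alloc) {  /* case 3: absorbed by prev */
        r.kase = 3; r.size += prev_size; r.starts_at_prev = 1;
    } else {                                 /* case 4: all three merge */
        r.kase = 4; r.size += prev_size + next_size; r.starts_at_prev = 1;
    }
    return r;
}
```

Whenever the previous block is free (cases 3 and 4), the returned block pointer has to move back to the previous block, which is why coalesce reassigns bp in exactly those two branches.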

mm_realloc

The realloc implementation is also very simple: allocate a new block of the requested size, then copy the contents of the old block into it. Note that shrinking is handled too: when the new size is smaller, only that many bytes are copied.

/*
 * mm_realloc - Implemented simply in terms of mm_malloc and mm_free
 */
void *mm_realloc(void *ptr, size_t size)
{
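    void *oldptr = ptr;
    void *newptr;
    size_t copySize;

    if (ptr == NULL)            // realloc(NULL, size) behaves like malloc(size)
        return mm_malloc(size);
    if (size == 0) {            // realloc(ptr, 0) behaves like free(ptr)
        mm_free(ptr);
        return NULL;
    }

    newptr = mm_malloc(size);
    if (newptr == NULL)
      return NULL;
    copySize = GET_SIZE(HDRP(oldptr)) - DSIZE;  // old payload size: block size minus header and footer
    if (size < copySize)
      copySize = size;
    memcpy(newptr, oldptr, copySize);
    mm_free(oldptr);
    return newptr;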
}

OK, that is everything about the implicit list. Now let's explain the implementation of the explicit list.

4. Explicit list

The difference between the explicit list and the implicit list is that the former maintains a free list in a logical space, containing only free blocks, while the latter maintains one list over the whole virtual address space, containing both free and allocated blocks. Of course, the explicit list still lives in the virtual address space underneath. The following figure shows the high-level structure of an explicit list:

The block layouts of implicit and explicit are compared below:

As you can see, compared with implicit, each explicit block has just two additional fields, which store the address of the next free block (next) and the address of the previous free block (prev).

The advantage of explicit: searching for a free block is much faster, because only free blocks are visited. However, it is harder to implement than implicit, because there is one more logical space to maintain.

First, how much space do next and prev occupy? On a 32-bit OS an address is 32 bits (4 bytes), while on a 64-bit OS it is 64 bits (8 bytes). For simplicity, only the 32-bit case is considered in this article (the -m32 flag is passed to gcc, and the default Makefile already does this).

OK, with the size of next and prev settled, we can determine the size of the smallest block. It must contain header + next + prev + payload + footer, where the payload is at least 1 byte, and it must satisfy the 8-byte alignment requirement. So the smallest block is:
Without alignment: 4 + 4 + 4 + 1 + 4 = 17 bytes; rounded up to 8-byte alignment: MIN_BLOCK = 24.

OK. Now some conventions used in the code:

  1. The search strategy is first fit
  2. When a freed block is reinserted into the free list, this article uses the LIFO policy
  3. Alignment convention: 8-byte alignment
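
The minimum block size worked out above can be checked with a few lines of C. This is only a sketch of the arithmetic: the function name min_block_size is made up, and the mask is cast to size_t so that the example does not depend on a 32-bit build.

```c
#include <assert.h>
#include <stddef.h>

#define WSIZE     4
#define ALIGNMENT 8
#define ALIGN(size) (((size) + (ALIGNMENT - 1)) & ~(size_t)0x7)

/* Smallest explicit-list block with 4-byte pointers:
 * header + next + prev + footer (one word each) plus a 1-byte payload,
 * rounded up to the 8-byte alignment. */
size_t min_block_size(void)
{
    size_t raw = 4 * WSIZE + 1;      /* 17 bytes before alignment */
    assert(raw == 17);
    return ALIGN(raw);               /* rounds up to 24 */
}
```

So a 1-byte request costs 24 bytes in this layout, compared with 16 bytes in the implicit list. That is the price of the two extra pointer words.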

With the above in place, we can almost start writing code. Start with the macro definitions:

/* single word (4) or double word (8) alignment */
#define ALIGNMENT 8

/* rounds up to the nearest multiple of ALIGNMENT */
#define ALIGN(size) (((size) + (ALIGNMENT - 1)) & ~0x7)

#define SIZE_T_SIZE (ALIGN(sizeof(size_t)))

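#define WSIZE 4             /* Word and header/footer size (bytes) */
#define DSIZE 8             /* Double word size (bytes) */
#define MIN_BLOCK 24        /* Smallest block: header + next + prev + footer + payload, 8-byte aligned */
#define CHUNKSIZE (1 << 12) /* Extend heap by this amount (bytes) */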

#define MAX(x, y) ((x) > (y) ? (x) : (y))

/* Pack a size and allocated bit into a word */
#define PACK(size, alloc) ((size) | (alloc))

/* Read and write a word at address p */
#define GET(p) (*(unsigned int *)(p))
#define PUT(p, val) (*(unsigned int *)(p) = (unsigned int)(val))

/* Read the size and allocated fields from address p */
#define GET_SIZE(p) (GET(p) & ~0x7)
#define GET_ALLOC(p) (GET(p) & 0x1)

/* Given block ptr bp, compute address of its header and footer */
#define HDRP(bp) ((char *)(bp)-3 * WSIZE)
#define FTRP(bp) ((char *)(bp) + GET_SIZE(HDRP(bp)) - 4 * WSIZE)

// free block operation: calculates the "NEXT" pointer field of the current block
// bp: the payload first address of the current block
#define NEXT_PTR(bp) ((char *)(bp)-2 * WSIZE)
#define PREV_PTR(bp) ((char *)(bp)-WSIZE)

// Free block operation: calculate the payload first address of the next free block
// bp: the payload first address of the current block
#define NEXT_FREE_BLKP(bp) ((char *)(*(unsigned int *)(NEXT_PTR(bp))))
#define PREV_FREE_BLKP(bp) ((char *)(*(unsigned int *)(PREV_PTR(bp))))

// virtual address calculation: calculate the payload first address of the next block
// bp: the payload first address of the current block
#define NEXT_BLKP(bp) ((char *)(bp) + GET_SIZE(HDRP(bp)))
#define PREV_BLKP(bp) ((char *)(bp)-GET_SIZE(HDRP(bp) - WSIZE))

As you can see, these are basically the same as the implicit macros, except for the addition of macros such as NEXT_FREE_BLKP. Because the block layout has changed (next and prev were added), some operations, such as HDRP, have to be adjusted accordingly.

Then there are the functions:

NOTE: once again, keep in mind that there are two spaces: the logical space (the free list) and the virtual address space (the heap).

mm_init

The main work of this function:

  1. Allocate one word + MIN_BLOCK
  2. The first word is padding, which ensures that the blocks allocated later are 8-byte aligned, the same as in implicit
  3. The MIN_BLOCK behind it serves as free_list_head, playing the same role as the prologue block in implicit
  4. Finally, a chunk is allocated; the allocation function inserts it into the free list

Overall, the explicit mm_init has the same effect as the implicit mm_init; only the organization has changed.

int mm_init(void)
{
    // Request a block to store the root pointer
    char *init_block_p;
    if ((init_block_p = mem_sbrk(MIN_BLOCK + WSIZE)) == (void *)-1)
    {
        return -1;
    }
    init_block_p = (char *)(init_block_p) + WSIZE; // Skip first alignment block

    free_list_head = init_block_p + 3 * WSIZE;
    PUT(PREV_PTR(free_list_head), NULL);
    PUT(NEXT_PTR(free_list_head), NULL); // Initialize root pointer to NULL (0)
    PUT(HDRP(free_list_head), PACK(MIN_BLOCK, 1));
    PUT(FTRP(free_list_head), PACK(MIN_BLOCK, 1));

    // Extend the empty heap with a free block of CHUNKSIZE bytes
    if ((allocate_from_heap(CHUNKSIZE)) == NULL)
    {
        return -1;
    }
    return 0;
}

allocate_from_heap

The job of allocate_from_heap is very simple: extend the heap, then insert the new block into free_list.

/**
 * @brief Extend the heap and add a block that satisfies the current request to free_list
 * 
 * @param size  Requested size, in bytes
 * @return void*  Success: the payload first address of the new free block
 *                Failure: NULL, which normally only happens when memory runs out
 */
static void *allocate_from_heap(size_t size)
{
    void *cur_bp = NULL;
    size_t extend_size = MAX(size, CHUNKSIZE);
    if ((cur_bp = extend_heap(extend_size / WSIZE)) == NULL)
    {
        return NULL;
    }

    // Insert into free list
    insert_to_free_list(cur_bp);
    return cur_bp;
}

extend_heap

/**
 * @brief Extend current heap
 * 
 * @param words  Words to be extended, in words
 * @return void* The payload first address of the currently available block
 */
static void *extend_heap(size_t words)
{
    char *bp;
    size_t size;

    /* Allocate an even number of words to maintain alignment */
    size = (words % 2) ? (words + 1) * WSIZE : words * WSIZE;
    if ((long)(bp = mem_sbrk(size)) == -1)
    {
        return NULL;
    }
    bp = (char *)(bp) + 3 * WSIZE; // point to payload
    // set this block information
    PUT(HDRP(bp), PACK(size, 0));
    PUT(FTRP(bp), PACK(size, 0));

    return bp;
}

insert_to_free_list

/**
 * @brief Insert free block into free list
 * 
 * @param bp free block The first address of the payload
 */
static void insert_to_free_list(void *bp)
{
    void *head = free_list_head;
    void *p = NEXT_FREE_BLKP(head); // Current first valid node or NULL

    if (p == NULL)
    {
        PUT(NEXT_PTR(head), bp);
        PUT(NEXT_PTR(bp), NULL);
        PUT(PREV_PTR(bp), head);
    }
    else
    {
        // Update the current node to insert
        PUT(NEXT_PTR(bp), p);
        PUT(PREV_PTR(bp), head);
        // Update head
        PUT(NEXT_PTR(head), bp);
        // Update p node (original valid node)
        PUT(PREV_PTR(p), bp);
    }
}

Following the LIFO policy, the block pointed to by bp is inserted into free_list right behind the head.
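
The macro-based version is hard to read at first, so here is the same LIFO insertion written against an ordinary struct. The struct free_node type and the function name are hypothetical stand-ins for the NEXT/PREV words of a free block. The real allocator stores 4-byte addresses inside the block instead.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical node type standing in for the NEXT/PREV words of a free block. */
struct free_node {
    struct free_node *next;
    struct free_node *prev;
};

/* LIFO insert: the new node always goes directly behind the sentinel head. */
void lifo_insert(struct free_node *head, struct free_node *node)
{
    assert(head != NULL && node != NULL);
    struct free_node *first = head->next;   /* current first node, may be NULL */

    node->next = first;
    node->prev = head;
    head->next = node;
    if (first != NULL)
        first->prev = node;
}
```

Written this way, the two branches of insert_to_free_list collapse into one: the only thing that differs when the list is empty is that there is no old first node whose prev has to be updated.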

mm_malloc

The comments describe what this function does: first look for a suitable block in free_list; if there is none, allocate from the heap.

/* 
 * mm_malloc - Return a pointer to the payload of a block that can hold size bytes
 * Main work:
 * 1. Round size up to meet the minimum block size and the alignment constraint
 * 2. Check whether any block in the free list can satisfy asize (the adjusted size). If yes, place it (place may split the block). If not, go to step 3
 * 3. Allocate a new free block from the heap, insert it into the free list, and then place
 * 
 */
void *mm_malloc(size_t size)
{
    size_t asize;      // Adjusted block size
    char *bp;

    if (size == 0)
        return NULL;

    // step1: round size meets the minimum block and alignment constraints
    asize = ALIGN(2 * DSIZE + size); // 2*DSIZE = header+ footer + next + prev

    // Step 2: find free block from free list
    if ((bp = find_fit(asize)) != NULL)
    {
        place(bp, asize);
    }
    else
    {  //Not found in free list
        // step3: allocate from the current heap
        if ((bp = allocate_from_heap(asize)) == NULL)
        {
            return NULL;
        }
        place(bp, asize);
    }

#ifdef DEBUG
    printf("malloc\n");
    debug();
#endif
    return bp;
}

find_fit

Find the first free block that meets the requirement size from the free list and return the payload first address of the block

/**
 * @brief Use first fit policy
 * 
 * @param size 
 * @return void* If successful, the first address of the available block is returned
 *               Failed, NULL returned
 */
static void *find_fit(size_t size)
{
    void *bp;

    for (bp = NEXT_FREE_BLKP(free_list_head); bp != NULL && GET_SIZE(HDRP(bp)) > 0; bp = NEXT_FREE_BLKP(bp))
    {
        if (GET_ALLOC(HDRP(bp)) == 0 && size <= GET_SIZE(HDRP(bp)))
        {
            return bp;
        }
    }
    return NULL;
}

place

This function places the request in the free block pointed to by bp and splits the block when possible. Specifically, a split is performed when the size of the current free block >= requested size + minimum block size.

/**
 * @brief place block
 * 
 * @param bp 
 * @param size 
 */
static void place(void *bp, size_t size)
{
    size_t origin_size;
    size_t remain_size;

    origin_size = GET_SIZE(HDRP(bp));
    remain_size = origin_size - size;
    if (remain_size >= MIN_BLOCK)
    {
        // Large enough to split
        // Set the size and allocated bit of the remainder block
        char *remain_blockp = (char *)(bp) + size;
        PUT(HDRP(remain_blockp), PACK(remain_size, 0));
        PUT(FTRP(remain_blockp), PACK(remain_size, 0));
        // Update the pointer and add the remaining blocks to the free list
        char *prev_blockp = PREV_FREE_BLKP(bp);
        char *next_blockp = NEXT_FREE_BLKP(bp);
        PUT(NEXT_PTR(remain_blockp), next_blockp);
        PUT(PREV_PTR(remain_blockp), prev_blockp);
        PUT(NEXT_PTR(prev_blockp), remain_blockp);
        if (next_blockp != NULL)
        {
            PUT(PREV_PTR(next_blockp), remain_blockp);
        }

        // Set allocated blocks
        PUT(HDRP(bp), PACK(size, 1));
        PUT(FTRP(bp), PACK(size, 1));
        // Disconnect the original block from the free list
        PUT(NEXT_PTR(bp), NULL);
        PUT(PREV_PTR(bp), NULL);
    }
    else
    {
        // Too small to split
        // Update header and footer
        PUT(HDRP(bp), PACK(origin_size, 1));
        PUT(FTRP(bp), PACK(origin_size, 1));
        // Remove free block from free list
        delete_from_free_list(bp);
    }
}

delete_from_free_list

/**
 * @brief Delete the bp node from the free list
 * 
 * @param bp 
 */
static void delete_from_free_list(void *bp)
{
    void *prev_free_block = PREV_FREE_BLKP(bp);
    void *next_free_block = NEXT_FREE_BLKP(bp);

    if (next_free_block == NULL)
    {
        PUT(NEXT_PTR(prev_free_block), NULL);
    }
    else
    {
        PUT(NEXT_PTR(prev_free_block), next_free_block);
        PUT(PREV_PTR(next_free_block), prev_free_block);
        // Disconnect
        PUT(NEXT_PTR(bp), NULL);
        PUT(PREV_PTR(bp), NULL);
    }
}
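
For comparison, here is the same unlink operation on an ordinary struct. Again, struct free_node and list_unlink are illustrative names only. This sketch relies on the list always starting with a sentinel head, as free_list_head does above, so the node being removed always has a predecessor.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical node type standing in for the NEXT/PREV words of a free block. */
struct free_node {
    struct free_node *next;
    struct free_node *prev;
};

/* Unlink `node` from a list whose first element is a sentinel head,
 * so node->prev is never NULL. The node's own links are cleared in
 * every case. */
void list_unlink(struct free_node *node)
{
    assert(node != NULL && node->prev != NULL);
    node->prev->next = node->next;
    if (node->next != NULL)
        node->next->prev = node->prev;
    node->next = NULL;
    node->prev = NULL;
}
```

One small difference from delete_from_free_list: this version clears the removed node's links even when it was the last node, which makes stale pointers easier to spot while debugging.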

mm_free

The free function here is the same as in the implicit list. The interesting part is the coalesce function.

/*
 * mm_free - Mark the block as free, then coalesce it and put it on the free list.
 */
void mm_free(void *ptr)
{
    size_t size = GET_SIZE(HDRP(ptr));

    PUT(HDRP(ptr), PACK(size, 0));
    PUT(FTRP(ptr), PACK(size, 0));
    coalesce(ptr);

#ifdef DEBUG
    printf("free\n");
    debug();
#endif
}

coalesce

coalesce is the heart of every allocator. You need to consider how adjacent blocks in the virtual address space are merged. Like implicit, explicit has four cases:

/**
 * @brief Merge the address space and insert the available free block into the free list
 * 
 * @param bp payload first address of the current block
 *           
 */
static void coalesce(void *bp)
{

    char *prev_blockp = PREV_BLKP(bp);
    char *next_blockp = NEXT_BLKP(bp);
    char *mem_max_addr = (char *)mem_heap_hi() + 1; // Upper boundary of heap
    size_t prev_alloc = GET_ALLOC(HDRP(prev_blockp));
    size_t next_alloc;

    if (next_blockp >= mem_max_addr)
    { // next_blockp lies beyond the upper boundary of the heap, so only prev_blockp is considered
        if (!prev_alloc)
        {
            case3(bp);
        }
        else
        {
            case1(bp);
        }
    }
    else
    {
        next_alloc = GET_ALLOC(HDRP(next_blockp));
        if (prev_alloc && next_alloc)
        { // case 1: previous and next both allocated
            case1(bp);
        }
        else if (!prev_alloc && next_alloc)
        { // case 3: previous free, next allocated
            case3(bp);
        }
        else if (prev_alloc && !next_alloc)
        { // case 2: previous allocated, next free
            case2(bp);
        }
        else
        { // case 4: previous and next both free
            case4(bp);
        }
    }
}

case1

/**
 * @brief Case 1: both the previous and the next block are allocated
 * 
 * @param bp 
 * @return void* 
 */
static void *case1(void *bp)
{
    insert_to_free_list(bp);
    return bp;
}

case2

/**
 * @brief Case 2: the previous block is allocated, the next block is free
 * 
 * @param bp 
 * @return void* 
 */
static void *case2(void *bp)
{
    void *next_blockp = NEXT_BLKP(bp);
    void *prev_free_blockp;
    void *next_free_blockp;
    size_t size = GET_SIZE(HDRP(bp)) + GET_SIZE(HDRP(next_blockp));

    // Update block size
    PUT(HDRP(bp), PACK(size, 0));
    PUT(FTRP(next_blockp), PACK(size, 0));

    // Unlink next_blockp from the free list: fetch its neighbours in the list
    prev_free_blockp = PREV_FREE_BLKP(next_blockp);
    next_free_blockp = NEXT_FREE_BLKP(next_blockp);

    // Boundary check
    if (next_free_blockp == NULL)
    {
        PUT(NEXT_PTR(prev_free_blockp), NULL);
    }
    else
    {
        PUT(NEXT_PTR(prev_free_blockp), next_free_blockp);
        PUT(PREV_PTR(next_free_blockp), prev_free_blockp);
    }

    insert_to_free_list(bp);
    return bp;
}

case3

/**
 * @brief case 3 The previous block is not allocated, and the latter block is allocated
 * 
 * @param bp The payload first address of the current block
 * @return void* Merged payload first address
 */
static void *case3(void *bp)
{
    char *prev_blockp = PREV_BLKP(bp);
    char *prev_free_blockp;
    char *next_free_blockp;
    size_t size = GET_SIZE(HDRP(bp)) + GET_SIZE(HDRP(prev_blockp));

    // Update block size
    PUT(HDRP(prev_blockp), PACK(size, 0));
    PUT(FTRP(prev_blockp), PACK(size, 0));

    // Find the front and back free blocks and update them
    next_free_blockp = NEXT_FREE_BLKP(prev_blockp);
    prev_free_blockp = PREV_FREE_BLKP(prev_blockp);

    // Boundary check
    if (next_free_blockp == NULL)
    {
        PUT(NEXT_PTR(prev_free_blockp), NULL);
    }
    else
    {
        PUT(NEXT_PTR(prev_free_blockp), next_free_blockp);
        PUT(PREV_PTR(next_free_blockp), prev_free_blockp);
    }

    // LIFO policy: insert at the head of the free list
    insert_to_free_list(prev_blockp);
    return prev_blockp;
}

case4

/**
 * @brief Case 4: both the previous and the next block are free
 * 
 * @param bp 
 * @return void* 
 */
static void *case4(void *bp)
{
    void *prev_blockp;
    void *prev1_free_blockp;
    void *next1_free_blockp;
    void *next_blockp;
    void *prev2_free_blockp;
    void *next2_free_blockp;
    size_t size;

    prev_blockp = PREV_BLKP(bp);
    next_blockp = NEXT_BLKP(bp);

    // Update size
    size_t size1 = GET_SIZE(HDRP(prev_blockp));
    size_t size2 = GET_SIZE(HDRP(bp));
    size_t size3 = GET_SIZE(HDRP(next_blockp));
    size = size1 + size2 + size3;
    PUT(HDRP(prev_blockp), PACK(size, 0));
    PUT(FTRP(next_blockp), PACK(size, 0));
    bp = prev_blockp;

    // Unlink the previous block from the free list
    prev1_free_blockp = PREV_FREE_BLKP(prev_blockp);
    next1_free_blockp = NEXT_FREE_BLKP(prev_blockp);
    if (next1_free_blockp == NULL)
    {
        PUT(NEXT_PTR(prev1_free_blockp), NULL);
    }
    else
    {
        PUT(NEXT_PTR(prev1_free_blockp), next1_free_blockp);
        PUT(PREV_PTR(next1_free_blockp), prev1_free_blockp);
    }

    // Unlink the next block from the free list
    prev2_free_blockp = PREV_FREE_BLKP(next_blockp);
    next2_free_blockp = NEXT_FREE_BLKP(next_blockp);
    if (next2_free_blockp == NULL)
    {
        PUT(NEXT_PTR(prev2_free_blockp), NULL);
    }
    else
    {
        PUT(NEXT_PTR(prev2_free_blockp), next2_free_blockp);
        PUT(PREV_PTR(next2_free_blockp), prev2_free_blockp);
    }

    // Insert free list according to LIFO policy
    insert_to_free_list(bp);
    return bp;
}

Other debug functions

static void debug()
{
    print_allocated_info();
    print_free_blocks_info();
    consistent_check();
}
/**
 * @brief  Print allocation
 */

static void print_allocated_info()
{
    char *bp;
    size_t idx = 0;
    char *mem_max_addr = mem_heap_hi();

    printf("=============start allocated info===========\n");
    for (bp = NEXT_BLKP(free_list_head); bp < mem_max_addr && GET_SIZE(HDRP(bp)) > 0; bp = NEXT_BLKP(bp))
    {
        if (GET_ALLOC(HDRP(bp)) == 1)
        {
            ++idx;
            printf("block%zu range %p  %p size=%td, payload %p  %p payload size=%td\n", idx, (void *)HDRP(bp), (void *)(FTRP(bp) + WSIZE), FTRP(bp) - HDRP(bp) + WSIZE, (void *)bp, (void *)FTRP(bp), FTRP(bp) - (char *)(bp));
        }
    }
    printf("=============end allocated info===========\n\n");
}

static void consistent_check()
{
    // Check that every block in the free list is free
    char *bp;
    char *mem_max_heap = mem_heap_hi();

    for (bp = NEXT_FREE_BLKP(free_list_head); bp != NULL; bp = NEXT_FREE_BLKP(bp))
    {
        if (GET_ALLOC(HDRP(bp)))
        {
            printf("line %d: allocated block found in the free list\n", __LINE__);
        }
    }

    // Check that every free block is in the free list
    for (bp = NEXT_BLKP(free_list_head); bp <= mem_max_heap; bp = NEXT_BLKP(bp))
    {
        if (!GET_ALLOC(HDRP(bp)) && !is_in_free_list(bp))
        {
            printf("line %d: free block %p is not in the free list\n", __LINE__, bp);
        }
    }
}

static int is_in_free_list(void *bp)
{
    void *p;
    for (p = NEXT_FREE_BLKP(free_list_head); p != NULL; p = NEXT_FREE_BLKP(p))
    {
        if (p == bp)
        {
            return 1;
        }
    }
    return 0;
}

The above is the implementation of the entire explicit list. The final result, after running all trace files, is a score of 83/100, which is a decent level. To do better, consider building a segregated list or a red-black tree.

Let's talk about some optimizations:

  1. Space optimization: an allocated block does not need to store the NEXT and PREV pointers, so that space can go to the payload and improve space utilization
  2. Encapsulate the free list, then build a segregated list on top of it
  3. The search currently scans the free list from beginning to end, which is relatively slow. You could index the free blocks by size, using a red-black tree as the index structure, which should greatly improve allocation throughput
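
As a rough idea of what item 2 would involve: a size-class based free list keeps one explicit list per size range and first decides which list a block belongs to. The sketch below shows only that mapping. The number of classes, the 32-byte starting bound and the name size_class are all assumptions made for this example. Nothing like it exists in the code above.

```c
#include <assert.h>
#include <stddef.h>

#define NUM_CLASSES 8     /* assumed number of free lists */
#define MIN_BLOCK   24    /* smallest block in the explicit-list layout */

/* Map a block size to a size-class index. Class 0 holds blocks up to 32 bytes,
 * each further class doubles the upper bound, and the last class takes the rest. */
int size_class(size_t size)
{
    assert(size >= MIN_BLOCK);
    int idx = 0;
    size_t limit = 32;
    while (idx < NUM_CLASSES - 1 && size > limit) {
        limit <<= 1;
        idx++;
    }
    return idx;
}
```

With such a mapping, find_fit would start at the list for size_class(asize) and only move on to larger classes when that list has no fit, so most small requests never scan the large blocks at all.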

5. Summary

Well, personally, I think this lab is second only to cachelab in difficulty, and the difficulty lies not in the ideas but in the debugging. A rookie like me cannot get through it without debugging, but this lab uses so many macros that debugging in gdb is hard: gdb can only inspect a contiguous range of memory with the x (examine) command. And when the malloc sizes in a trace file are large, even the x command is hard to read quickly. So while working on this I manually reduced the malloc sizes in the trace file (and changed them back afterwards), which made debugging relatively easier.

Original link

https://zhuanlan.zhihu.com/p/126341872

https://www.ravenxrz.ink/archives/36920455.html
