[High-frequency interview question from big tech companies] A detailed explanation of implementing an LRU (Least Recently Used) cache - C++ version

Posted by lpxxfaintxx on Thu, 03 Mar 2022 04:51:19 +0100

1 What is an LRU cache

LRU (Least Recently Used) is an eviction policy. In short, it works as follows: elements are stored in a container with a fixed capacity. Because the capacity is limited, the container must keep the most recently used elements and evict the ones that have not been used for the longest time, dynamically maintaining its contents. This policy is commonly used for cache maintenance: since cache space is limited, keeping only the most recently used elements in the cache lets the system cache operate efficiently.
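For intuition, here is the behaviour of an LRU cache with capacity 2 (the operation sequence below is just a typical illustration, written against the LRUCache class implemented in section 3):

LRUCache cache(2);   //a cache that can hold at most 2 key-value pairs
cache.put(1, 1);     //cache now holds {1=1}
cache.put(2, 2);     //cache now holds {1=1, 2=2}
cache.get(1);        //returns 1, and key 1 becomes the most recently used
cache.put(3, 3);     //capacity exceeded: key 2 (least recently used) is evicted
cache.get(2);        //returns -1, key 2 is no longer in the cache
cache.get(3);        //returns 3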

2 Data structure and algorithm design

2.1 Requirements

Implement a class LRUCache.

  1. Implement the class's constructor, so that a user of the class can create an LRUCache with a specified capacity by passing in the initial capacity.
  2. Implement the object's external interface, the get() member function. This function checks whether a given key is present in the LRUCache object; if it is, it returns the key's value, otherwise it returns -1. That is, the input parameter is an int key, and an int is returned, representing the value corresponding to key in the LRUCache.
  3. Implement the object's external interface, the put() member function. This function puts a given key-value pair into the LRUCache object and does not return anything. That is, the input parameters are an int key and an int value (together forming a key-value pair). If the key already exists in the LRUCache, its stored value is updated with the value from the parameter list. If the key does not exist in the cache, the key-value pair is inserted into the cache. If the insertion causes the number of keys to exceed the cache's initial capacity, the least recently used key must be evicted from the cache (the resulting public interface is sketched below).
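Put together, the three requirements above describe the following public interface (a sketch only; the full implementation is given in section 3):

class LRUCache {
 public:
    LRUCache(int capacity);        //create a cache with the given fixed capacity
    int get(int key);              //return the value stored under key, or -1 if key is absent
    void put(int key, int value);  //insert or update key; evict the least recently used entry if over capacity
};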

2.2 Design

According to the above requirements, we can determine:

  1. LRUCache itself needs a concrete storage container. The stored elements are key-value pairs, and the container is searched by passing in a key and returning the corresponding value. Therefore, the LRUCache container itself can be implemented with a hash table, and from the parameter types of the keys and values, both the keys and the values of the hash table are of type int (a short sketch of why a hash table alone is not enough follows this list).
  2. The operations performed by the int get(int key) interface of LRUCache are exactly the operations a hash table can perform on its own. At the same time, this interface defines part of what "used" means for a cache element: the key-value pair most recently looked up through get is the most recently used element in the cache.
  3. The operations performed by the void put(int key, int value) interface of LRUCache imply the requirement of dynamically maintaining the elements in the LRUCache. The maintenance method is the LRU algorithm, i.e. the elements (int key-value pairs) of the LRUCache are kept ordered by the time they were last used. Note that this interface defines the other half of what "used" means: the key-value pair most recently inserted or updated through put is also the most recently used element in the cache.
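As a quick illustration of why a hash table alone is not enough (the helper function below is only an assumed sketch, not part of the final implementation): lookups and insertions are O(1), but the map records nothing about access order, so it cannot tell us which key to evict.

#include <unordered_map>

//A plain hash table gives O(1) get/put...
int plainGet(std::unordered_map<int, int> &m, int key) {
    return m.count(key) ? m[key] : -1; //lookup by key works fine
}
//...but when the map reaches capacity there is no way to know which key was
//used least recently, which is exactly the information the doubly linked list
//described next is added to maintain.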

The LRU ordering itself can be realized with a doubly linked list. The details are as follows:

  1. Each node of the doubly linked list stores one element of the hash table used to implement the LRUCache container (an int key-value pair).
  2. The doubly linked list has a length limit: its maximum number of nodes is the capacity of the LRUCache.
  3. Every time an element of the LRUCache is queried through the int get(int key) interface, that element's node is moved to the head of the doubly linked list.
  4. Calling the void put(int key, int value) interface of LRUCache may increase the number of nodes in the doubly linked list (when the key is not in the cache and the cache is not yet full). Every put operation (whether the key exists and only its value is updated, or the key is new and a key-value pair is inserted) moves the corresponding node to the head of the doubly linked list.
  5. With the above operations, the tail node of the doubly linked list is always the element of the LRUCache that has gone unused for the longest time. When the doubly linked list exceeds the length limit, the tail node is removed (a step-by-step trace is sketched after this list).
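To see why the tail is always the least recently used element, here is a hand-worked trace for a cache of capacity 2 (head = most recently used, tail = least recently used; the operation sequence is only an illustration):

put(1, 1)    list: head <-> (1,1) <-> tail
put(2, 2)    list: head <-> (2,2) <-> (1,1) <-> tail
put(1, 10)   list: head <-> (1,10) <-> (2,2) <-> tail   (existing key: value updated, node moved to head)
put(3, 3)    list: head <-> (3,3) <-> (1,10) <-> tail   (over capacity: the tail-side node (2,2) is evicted)
get(2)       returns -1; the list is unchanged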

To realize the above doubly linked list operations, you need to define a doubly-linked-list node type and the corresponding node-movement operations. In C++, you can define a struct or class with the following member properties:

  • key and value: the LRUCache element (key-value pair) stored in the linked list node
  • prev: the pointer to the previous node in the list
  • next: the pointer to the next node in the list

The methods that operate on the doubly linked list are implemented inside the LRUCache class (because the list's initial head and tail dummy nodes must be created when the LRUCache is initialized). Their roles are listed below. Note the common trick of adding head and tail dummy nodes to simplify the list's boundary cases:

  • void addToHead(DLinkedNode *node): given a new node, inserts it at the head of the doubly linked list
  • void removeNode(DLinkedNode *node): given a node in the list, unlinks it from the list
  • void moveToHead(DLinkedNode *node): given a node in the list, moves it to the head of the doubly linked list (this simply calls removeNode() and then addToHead())
  • DLinkedNode *removeTail(): removes the tail-side node of the doubly linked list (the least recently used one) and returns a pointer to the removed node

To sum up, the class design of LRUCache combines a hash table (for O(1) lookup of nodes by key) with a doubly linked list ordered by recency of use; the code below implements this design.

3 Code implementation (C++)

The complete code follows (with detailed comments):

//LRUCache.cpp
#include <unordered_map>
using namespace std;

//Doubly linked list node
struct DLinkedNode {
	//Member properties
    int key, value;
    DLinkedNode *prev;
    DLinkedNode *next;
    //Constructor
    DLinkedNode() : key(0), value(0), prev(nullptr), next(nullptr) {}
    DLinkedNode(int _key, int _value) : key(_key), value(_value), prev(nullptr), next(nullptr) {}
};


class LRUCache {
 private:
    unordered_map<int, DLinkedNode *> cache;
    DLinkedNode *head;
    DLinkedNode *tail;
    int size; //Current size of LRU cache
    int capacity; //Capacity of LRU cache (initialization size)
    //Helper methods for operating on the doubly linked list
    void addToHead(DLinkedNode *node) { //Insert a node at the head of the list
        //Note that head below refers to the dummy node before the first real node
        node->prev = head;
        node->next = head->next;
        head->next->prev = node;
        head->next = node;
    }
    
    void removeNode(DLinkedNode *node) { //Unlink a node from the doubly linked list
        node->prev->next = node->next;
        node->next->prev = node->prev;
    }
    
    void moveToHead(DLinkedNode *node) { //Move a node to the head of the doubly linked list
        removeNode(node);
        addToHead(node);
    }
    
    DLinkedNode *removeTail() { //Remove the least recently used node (the one just before the tail dummy) and return a pointer to it
        //Note that tail below refers to the dummy node after the last real node
        DLinkedNode *node = tail->prev;
        removeNode(node);
        return node;
    }

 public:
    LRUCache(int _capacity) : capacity(_capacity), size(0) {
        //Use dummy head and tail nodes (sentinels) to mark the boundaries, so adding and removing nodes never has to check whether adjacent nodes exist
        head = new DLinkedNode();
        tail = new DLinkedNode();
        head->next = tail;
        tail->prev = head;
        //After construction only the head and tail dummy nodes are linked; size = 0 and capacity = _capacity
    }

    int get(int key) {
        if (!cache.count(key)) { //key does not exist, return -1
            return -1;
        }
        //If the key exists, first locate it through the hash table and then move it to the head of the linked list
        DLinkedNode *node = cache[key];
        moveToHead(node);
        return node->value;
    }

    void put(int key, int value) {
        if (!cache.count(key)) { //If the key does not exist
            DLinkedNode *node = new DLinkedNode(key, value); //Create a new node
            cache[key] = node; //Add it to the hash table
            addToHead(node); //Add it to the head of the bidirectional linked list
            ++size; //Increase the current size of the LRU cache (capacity stays fixed at its initialized value)
            if (size > capacity) { //If the storage exceeds the capacity of the LRU cache
                DLinkedNode *removed = removeTail(); //Remove the least recently used node from the doubly linked list
                cache.erase(removed->key); //Remove the corresponding entry from the hash table
                delete removed; //Free the removed node to prevent a memory leak
                --size;
            }
        } else { //If key exists
            DLinkedNode *node = cache[key]; //Locate the node through the hash table
            node->value = value; //Update its value
            moveToHead(node); //Move it to the head of the doubly linked list
        }
    }
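
    //Optional addition (not in the minimal interview version): a destructor that frees
    //the dummy nodes and every remaining list node, so the cache itself does not leak
    //memory when it is destroyed.
    ~LRUCache() {
        DLinkedNode *cur = head;
        while (cur != nullptr) {
            DLinkedNode *next = cur->next;
            delete cur;
            cur = next;
        }
    }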
    
};
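
A minimal way to exercise the class is a small test driver like the sketch below (the file layout and expected outputs are illustrative; it assumes the LRUCache.cpp above is compiled into the same translation unit):

//main.cpp (illustrative test driver)
#include <iostream>
#include "LRUCache.cpp"

int main() {
    LRUCache cache(2);                      //capacity = 2
    cache.put(1, 1);
    cache.put(2, 2);
    std::cout << cache.get(1) << std::endl; //prints 1
    cache.put(3, 3);                        //evicts key 2 (least recently used)
    std::cout << cache.get(2) << std::endl; //prints -1
    cache.put(4, 4);                        //evicts key 1
    std::cout << cache.get(1) << std::endl; //prints -1
    std::cout << cache.get(3) << std::endl; //prints 3
    std::cout << cache.get(4) << std::endl; //prints 4
    return 0;
}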

Topics: C++, Algorithms, Data Structures, Back-end, Cache