2021SC@SDUSC Hbase project code analysis - LruBlockCache

Posted by ganlal on Thu, 25 Nov 2021 21:58:48 +0100


Contents

1, Introduction

2, Cache level

3, Implementation analysis of LruBlockCache

4, Implementation analysis of cache eviction

1, Introduction

        Going to the HFile on every read is very inefficient, especially for small random reads. To improve IO performance, HBase provides a caching mechanism called BlockCache, and LruBlockCache is one of its implementations.

2, Cache level

        The BlockPriority enum defines three cache levels:

public enum BlockPriority {
  /**
   * Accessed a single time (used for scan-resistance)
   */
  SINGLE,
  /**
   * Accessed multiple times
   */
  MULTI,
  /**
   * Block from in-memory store
   */
  MEMORY
}
  1. SINGLE: a block accessed only once so far; used for scans etc., so that a flood of one-time accesses does not evict the whole working set (scan resistance)
  2. MULTI: a block accessed multiple times
  3. MEMORY: a block from an in-memory store, kept resident in the cache
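
        The scan-resistance idea behind these levels can be illustrated with a toy sketch (this is not the real HBase code; the class, map, and method names here are made up for illustration): a block enters the cache with SINGLE priority and is promoted to MULTI on its second access, so a one-off scan cannot displace frequently used blocks.

```java
import java.util.HashMap;
import java.util.Map;

// Toy sketch of the priority scheme (illustrative names, not HBase code).
public class PrioritySketch {
  enum BlockPriority { SINGLE, MULTI, MEMORY }

  static final Map<String, BlockPriority> cache = new HashMap<>();

  // Cache a block; blocks from an in-memory store get MEMORY priority,
  // everything else starts at SINGLE.
  static void cacheBlock(String key, boolean inMemory) {
    cache.put(key, inMemory ? BlockPriority.MEMORY : BlockPriority.SINGLE);
  }

  // Read a block; a second access of a SINGLE block promotes it to MULTI.
  // Returns the priority the block had at the moment of this access.
  static BlockPriority getBlock(String key) {
    BlockPriority p = cache.get(key);
    if (p == BlockPriority.SINGLE) {
      cache.put(key, BlockPriority.MULTI);
    }
    return p;
  }
}
```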

3, Implementation analysis of LruBlockCache

        cacheBlock() is the method that inserts a block into this cache:

  // BlockCache implementation
 
  /**
   * Cache the block with the specified name and buffer.
   * <p>
   * It is assumed this will NOT be called on an already cached block. In rare cases (HBASE-8547)
   * this can happen, for which we compare the buffer contents.
   * @param cacheKey block's cache key
   * @param buf block buffer
   * @param inMemory if block is in-memory
   * @param cacheDataInL1
   */
  @Override
  public void cacheBlock(BlockCacheKey cacheKey, Cacheable buf, boolean inMemory,
      final boolean cacheDataInL1) {
    if (buf.heapSize() > maxBlockSize) {
      // If there are a lot of blocks that are too
      // big this can make the logs way too noisy.
      // So we log 2%
      if (stats.failInsert() % 50 == 0) {
        LOG.warn("Trying to cache too large a block "
            + cacheKey.getHfileName() + " @ "
            + cacheKey.getOffset()
            + " is " + buf.heapSize()
            + " which is larger than " + maxBlockSize);
      }
      return;
    }

    LruCachedBlock cb = map.get(cacheKey);
    if (cb != null) {
      // compare the contents, if they are not equal, we are in big trouble
      if (compare(buf, cb.getBuffer()) != 0) {
        throw new RuntimeException("Cached block contents differ, which should not have happened."
          + "cacheKey:" + cacheKey);
      }
      String msg = "Cached an already cached block: " + cacheKey + " cb:" + cb.getCacheKey();
      msg += ". This is harmless and can happen in rare cases (see HBASE-8547)";
      LOG.warn(msg);
      return;
    }
    
    long currentSize = size.get();
   
    long currentAcceptableSize = acceptableSize();
    
    long hardLimitSize = (long) (hardCapacityLimitFactor * currentAcceptableSize);
    
    if (currentSize >= hardLimitSize) {
      stats.failInsert();
      if (LOG.isTraceEnabled()) {
        LOG.trace("LruBlockCache current size " + StringUtils.byteDesc(currentSize)
          + " has exceeded acceptable size " + StringUtils.byteDesc(currentAcceptableSize) + "  too many."
          + " the hard limit size is " + StringUtils.byteDesc(hardLimitSize) + ", failed to put cacheKey:"
          + cacheKey + " into LruBlockCache.");
      }
      if (!evictionInProgress) {
        runEviction();
      }
      return;
    }
    
    cb = new LruCachedBlock(cacheKey, buf, count.incrementAndGet(), inMemory);
    long newSize = updateSizeMetrics(cb, false);
    
    map.put(cacheKey, cb);
    
    long val = elements.incrementAndGet();
    if (LOG.isTraceEnabled()) {
      long size = map.size();
      assertCounterSanity(size, val);
    }

    if (newSize > currentAcceptableSize && !evictionInProgress) {
      runEviction();
    }
  }

        The logic of the method is as follows:

  1. Check whether the block to be cached is larger than maxBlockSize; if so, log a warning at a 2% sampling rate (every 50th failed insert) and return
  2. Look up the cacheKey in the cache map
  3. If a cached block already exists: if the contents differ, throw a RuntimeException; otherwise log a warning (the harmless HBASE-8547 case) and return
  4. Obtain the current cache size and the acceptable cache size, and compute the hard limit hardLimitSize
  5. If the current cache size has reached the hard limit, fail the insert, trigger eviction if one is not already in progress, and return
  6. Construct an LruCachedBlock instance cb
  7. Put cb into the cache map
  8. Increment the element count by one
  9. If the new cache size exceeds the acceptable size and no eviction is in progress, run eviction
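
        The two size checks in steps 1 and 5 can be condensed into a small sketch (a simplification under assumed names, not the actual HBase method): a block is rejected outright if it exceeds the maximum block size, and inserts are refused once the cache passes hardCapacityLimitFactor times the acceptable size.

```java
// Minimal sketch of the size gating in cacheBlock(); names mirror the
// fields in the original code but this is a standalone simplification.
public class CacheBlockSizeCheck {
  // Returns true if a block of blockSize bytes may be inserted given the
  // current cache size and the configured limits.
  static boolean canInsert(long blockSize, long currentSize,
      long maxBlockSize, long acceptableSize, float hardCapacityLimitFactor) {
    if (blockSize > maxBlockSize) {
      return false; // block too large to cache at all (step 1)
    }
    long hardLimitSize = (long) (hardCapacityLimitFactor * acceptableSize);
    // At or beyond the hard limit the insert fails and eviction is
    // triggered instead (step 5).
    return currentSize < hardLimitSize;
  }
}
```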

4, Implementation analysis of cache eviction

        There are two ways to evict from the cache:

  1. Eviction runs directly in the calling thread
  2. Eviction runs in a dedicated eviction thread, which holds only a WeakReference to the enclosing LruBlockCache

        Which of the two is used is determined by the evictionThread flag in the constructor:

    if(evictionThread) {
      this.evictionThread = new EvictionThread(this);
      this.evictionThread.start(); // FindBugs SC_START_IN_CTOR
    } else {
      this.evictionThread = null;
    }
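
        The WeakReference pattern mentioned above can be sketched as follows (an illustrative stand-in, not the real EvictionThread: the class and method names here are assumptions). Because the thread holds only a weak reference, a long-lived daemon thread does not prevent the cache itself from being garbage collected.

```java
import java.lang.ref.WeakReference;

// Sketch of an eviction thread holding a WeakReference to its owner.
// A Runnable stands in for the LruBlockCache it would evict from.
class EvictionThreadSketch extends Thread {
  private final WeakReference<Runnable> cacheRef;
  private volatile boolean go = true;

  EvictionThreadSketch(Runnable cache) {
    this.cacheRef = new WeakReference<>(cache);
    setDaemon(true); // do not keep the JVM alive
  }

  @Override
  public void run() {
    while (go) {
      synchronized (this) {
        try {
          this.wait(1000); // sleep until evict() calls notifyAll()
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
          return;
        }
      }
      Runnable cache = cacheRef.get();
      if (cache == null) {
        break; // the cache was garbage collected; exit quietly
      }
      cache.run(); // stand-in for cache.evict()
    }
  }

  // Called from another thread to wake this one up.
  void evict() {
    synchronized (this) {
      this.notifyAll();
    }
  }

  void shutdown() {
    go = false;
    evict();
  }
}
```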

        The runEviction() method dispatches the eviction accordingly:

  /**
   * Multi-threaded call to run the eviction process.
   */
  private void runEviction() {
    if(evictionThread == null) {
      evict();
    } else {
      evictionThread.evict();
    }
  }

        The implementation of evict() is as follows:

    public void evict() {
      synchronized(this) {
        this.notifyAll();
      }
    }

        That is, the eviction thread waits on its own monitor; evict() acquires that monitor via synchronized and calls notifyAll() on the thread object, so the main thread wakes the eviction thread, which then performs the actual eviction.
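
        The wait()/notifyAll() handshake behind this can be sketched in isolation (illustrative code, not HBase's; the flag and method names are made up). A guard flag protects against spurious wakeups and against a signal that arrives before the wait:

```java
// Minimal wait/notifyAll handshake, like runEviction() waking the
// eviction thread. awaitSignal() blocks until signal() is called.
public class WakeupSketch {
  private boolean signalled = false;

  synchronized void awaitSignal() {
    while (!signalled) { // loop guards against spurious wakeups
      try {
        this.wait();
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        return;
      }
    }
    signalled = false; // consume the signal
  }

  synchronized void signal() {
    signalled = true;
    this.notifyAll(); // wake any thread blocked in awaitSignal()
  }

  synchronized boolean isSignalled() {
    return signalled;
  }
}
```

Note that a signal delivered before the wait is not lost: awaitSignal() sees the flag already set and returns immediately.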

        To be continued.

Topics: HBase