2021SC@SDUSC HBase project source code analysis - flush

Posted by Ruski on Fri, 15 Oct 2021 20:35:07 +0200

2021SC@SDUSC

Contents

1, Introduction

2, Source code analysis

1, Introduction

HBase is a distributed database built on the LSM (log-structured merge-tree) model. Compared with the B+ tree used by ordinary Oracle indexes, the LSM model strikes a different balance between reads and writes: it sacrifices some read performance to greatly improve write performance. This is why HBase can write data so quickly: a write returns as soon as the data has been written to memory and to the log file.

However, data cannot stay in memory and in the log forever. Memory is limited and scarce, and continuous writing would eventually exhaust it. The log is only a protective measure against losing in-memory data when the system goes down or the process exits; it is not the means of final data persistence. In addition, log writes are simple appends, so reading data back from the log would be very inefficient.

The flush of MemStore is an effective measure to solve the above problems.
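To make the trade-off concrete, here is a minimal sketch of an LSM-style write path in Java. It is a toy, not HBase code: the class ToyLsmStore, its fields, and the entry-count flush threshold are all assumptions made for illustration. A write appends to a log and inserts into a sorted in-memory map; once the map reaches the threshold, it is frozen as an immutable sorted "file" and a fresh map takes over.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Toy LSM-style store. Hypothetical names throughout; this is not HBase code.
class ToyLsmStore {
    private final List<String> log = new ArrayList<>();                    // stands in for the WAL
    private TreeMap<String, String> memstore = new TreeMap<>();            // sorted in-memory buffer
    private final List<TreeMap<String, String>> files = new ArrayList<>(); // stands in for flushed files
    private final int flushThreshold;                                      // assumed: max entries in memory

    ToyLsmStore(int flushThreshold) {
        this.flushThreshold = flushThreshold;
    }

    void put(String key, String value) {
        log.add(key + "=" + value);   // 1. append-only log write, kept only for recovery
        memstore.put(key, value);     // 2. sorted in-memory write; the caller can return now
        if (memstore.size() >= flushThreshold) {
            flush();
        }
    }

    void flush() {
        if (memstore.isEmpty()) {
            return;                   // nothing to flush
        }
        files.add(memstore);          // freeze the buffer as an immutable sorted "file"
        memstore = new TreeMap<>();   // new writes go to a fresh buffer
        log.clear();                  // flushed edits no longer need the log
    }

    String get(String key) {
        String value = memstore.get(key);
        if (value != null) {
            return value;
        }
        // Reads may have to consult several files, newest first.
        for (int i = files.size() - 1; i >= 0; i--) {
            value = files.get(i).get(key);
            if (value != null) {
                return value;
            }
        }
        return null;
    }

    int fileCount() { return files.size(); }
    int memstoreSize() { return memstore.size(); }
}
```

Note how get() may have to look through several flushed files. That is the read cost the LSM model accepts in exchange for fast writes.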

Following the previous chapter, "2021SC@SDUSC HBase (II) project source code analysis - flush", this chapter analyzes internalFlushcache(), the core method of the MemStore flush on HRegion.

2, Source code analysis

First, check whether the RegionServer services that this HRegion depends on are still running normally; if the server has been aborted, the flush is abandoned:

if (this.rsServices != null && this.rsServices.isAborted()) {
    throw new IOException("Aborting flush because server is aborted...");
}

Record the start time:

final long startTime = EnvironmentEdgeManager.currentTime();

If the memstore holds nothing to flush, the method returns early, but it still has to advance the sequence ID of the Region safely:

    if (this.memstoreSize.get() <= 0) {
      // Take an update lock because am about to change the sequence id and we want the sequence id
      // to be at the border of the empty memstore.

      MultiVersionConsistencyControl.WriteEntry w = null;
      
      this.updatesLock.writeLock().lock();
      try {
        if (this.memstoreSize.get() <= 0) {
          // Presume that if there are still no edits in the memstore, then there are no edits for
          // this region out in the WAL subsystem so no need to do any trickery clearing out
          // edits in the WAL system. Up the sequence number so the resulting flush id is for
          // sure just beyond the last appended region edit (useful as a marker when bulk loading,
          // etc.)
          // wal can be null replaying edits.
          if (wal != null) {
            w = mvcc.beginMemstoreInsert();
            long flushSeqId = getNextSequenceId(wal);
            FlushResult flushResult = new FlushResult(
                FlushResult.Result.CANNOT_FLUSH_MEMSTORE_EMPTY, flushSeqId, "Nothing to flush");
            w.setWriteNumber(flushSeqId);
            mvcc.waitForPreviousTransactionsComplete(w);
            w = null;
            return flushResult;
          } else {
            return new FlushResult(FlushResult.Result.CANNOT_FLUSH_MEMSTORE_EMPTY,
                "Nothing to flush");
          }
        }
      } finally {
        this.updatesLock.writeLock().unlock();
        if (w != null) {
          mvcc.advanceMemstore(w);
        }
      }
    }
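The snippet above checks memstoreSize twice: once without the lock, and again after taking the write lock. The following is a minimal sketch of that check-lock-recheck pattern. It is an illustration only: the class EmptyCheckSketch and its methods are hypothetical, and a plain AtomicLong counter stands in for the WAL-backed sequence ID.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-in for a region; only the locking pattern mirrors the text above.
class EmptyCheckSketch {
    private final AtomicLong memstoreSize = new AtomicLong();
    private final AtomicLong sequenceId = new AtomicLong();   // assumed: simple in-memory counter
    private final ReentrantReadWriteLock updatesLock = new ReentrantReadWriteLock();

    // Writers share the read lock, so they do not block each other.
    void write(long bytes) {
        updatesLock.readLock().lock();
        try {
            sequenceId.incrementAndGet();
            memstoreSize.addAndGet(bytes);
        } finally {
            updatesLock.readLock().unlock();
        }
    }

    // Returns a fresh sequence ID if the memstore is empty, or -1 if a real flush is needed.
    long tryEmptyFlush() {
        if (memstoreSize.get() > 0) {
            return -1L;                          // cheap first check, no lock taken
        }
        updatesLock.writeLock().lock();          // exclude all writers
        try {
            if (memstoreSize.get() > 0) {
                return -1L;                      // a writer slipped in between check and lock
            }
            // No writer can run now, so this ID sits exactly at the border of the empty memstore.
            return sequenceId.incrementAndGet();
        } finally {
            updatesLock.writeLock().unlock();
        }
    }
}
```

The first check keeps the common non-empty case cheap; the second check is the one that can be trusted, because no update can happen while the write lock is held.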

Set the status of the status tracker to "Obtaining lock to block concurrent updates":

status.setStatus("Obtaining lock to block concurrent updates");

Acquire the write lock of updatesLock, which blocks all data update operations on the Region:

this.updatesLock.writeLock().lock();

Set the status of the status tracker to "Preparing to flush by snapshotting stores in", followed by the encoded name of the Region:

status.setStatus("Preparing to flush by snapshotting stores in " +
      getRegionInfo().getEncodedName());

Create two containers, the storeFlushCtxs list and the committedFiles map, which hold the flush contexts and the paths of the completed files during the flush. Also create the flush sequence ID, flushSeqId, and initialize it to -1:

List<StoreFlushContext> storeFlushCtxs = new ArrayList<StoreFlushContext>(stores.size());
TreeMap<byte[], List<Path>> committedFiles = new TreeMap<byte[], List<Path>>(
        Bytes.BYTES_COMPARATOR);
long flushSeqId = -1L;

mvcc begins a write transaction and returns the write entry w from the multi-version consistency controller. At this point the write number in w is still 0:

w = mvcc.beginMemstoreInsert();

Get the flush sequence ID. If wal is not null, obtain the next sequence ID through the WAL (returning CANNOT_FLUSH early if the WAL is closing); otherwise, use the myseqid value passed in by the caller:

if (wal != null) { // the WAL is in use
    if (!wal.startCacheFlush(this.getRegionInfo().getEncodedNameAsBytes())) {
        String msg = "Flush will not be started for ["
            + this.getRegionInfo().getEncodedName() + "] - because the WAL is closing.";
        status.setStatus(msg);
        return new FlushResult(FlushResult.Result.CANNOT_FLUSH, msg);
    }
    flushSeqId = getNextSequenceId(wal);
} else {
    // No WAL for this flush: fall back to the sequence ID supplied by the caller.
    flushSeqId = myseqid;
}

Loop over all Stores in the Region, accumulating the flushable size and populating storeFlushCtxs and committedFiles:

for (Store s : stores.values()) {
    totalFlushableSize += s.getFlushableSize();
    storeFlushCtxs.add(s.createFlushContext(flushSeqId));
    committedFiles.put(s.getFamily().getName(), null);
}

Write a flush start marker to the WAL and obtain a transaction ID, trxId:

if (wal != null) {
    FlushDescriptor desc = ProtobufUtil.toFlushDescriptor(FlushAction.START_FLUSH,
        getRegionInfo(), flushSeqId, committedFiles);
    trxId = WALUtil.writeFlushMarker(wal, this.htableDescriptor, getRegionInfo(),
        desc, sequenceId, false); // no sync. Sync is below where we do not hold the updates lock
}

Loop over storeFlushCtxs and call prepare() on each StoreFlushContext, which mainly takes a snapshot of the memstore:

for (StoreFlushContext flush : storeFlushCtxs) {
    flush.prepare();
}

After the snapshots have been created, release the updatesLock write lock so that updates can proceed again:

this.updatesLock.writeLock().unlock();
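The reason the write lock is held only around prepare() is that taking a snapshot can be a cheap reference swap, while writing the data out is slow. Here is a minimal sketch of that idea, assuming a skip-list map as the in-memory structure. The class SnapshotSketch and its methods are hypothetical and much simpler than a real MemStore.

```java
import java.util.concurrent.ConcurrentSkipListMap;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical, simplified memstore: writers share the read lock, the flusher takes the write lock.
class SnapshotSketch {
    private final ReentrantReadWriteLock updatesLock = new ReentrantReadWriteLock();
    private volatile ConcurrentSkipListMap<String, String> active = new ConcurrentSkipListMap<>();
    private volatile ConcurrentSkipListMap<String, String> snapshot = new ConcurrentSkipListMap<>();

    void put(String key, String value) {
        updatesLock.readLock().lock();      // many writers may hold the read lock together
        try {
            active.put(key, value);
        } finally {
            updatesLock.readLock().unlock();
        }
    }

    // The "prepare" step: only a reference swap happens while updates are blocked.
    void takeSnapshot() {
        updatesLock.writeLock().lock();     // blocks every put() for a short moment
        try {
            snapshot = active;
            active = new ConcurrentSkipListMap<>();
        } finally {
            updatesLock.writeLock().unlock();
        }
    }

    // The slow part (writing the snapshot to a file) would run here, without any lock held.
    int snapshotSize() { return snapshot.size(); }
    int activeSize() { return active.size(); }
}
```

Because the snapshot is no longer written to after the swap, it can be persisted at leisure while new updates land in the fresh active map.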

Set the status of the status tracker to report that memstore snapshotting has finished:

String s = "Finished memstore snapshotting " + this +
    ", syncing WAL and waiting on mvcc, flushsize=" + totalFlushableSize;
status.setStatus(s);

Set the write number of the write entry w to flushSeqId, the sequence ID of this flush, then call the multi-version consistency controller to wait for earlier transactions to complete. Finally, set w to null so that mvcc.advanceMemstore is not called again in the finally block:

w.setWriteNumber(flushSeqId);
mvcc.waitForPreviousTransactionsComplete(w);
w = null;
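To show why waiting on earlier transactions matters, here is a heavily simplified sketch of a multi-version consistency controller. It is not the HBase class: ToyMvcc, its queue, and the method names allPreviousComplete and waitForPreviousAndComplete are assumptions for illustration. Write entries are queued in the order they begin, and the read point only advances over a completed prefix of that queue, so the flush entry becomes visible only after every earlier write has completed.

```java
import java.util.ArrayDeque;

// Toy multi-version consistency controller. Hypothetical names; not the HBase implementation.
class ToyMvcc {
    static final class WriteEntry {
        private long writeNumber;          // stays 0 until the caller assigns one
        private boolean completed;

        void setWriteNumber(long n) { writeNumber = n; }
    }

    private final ArrayDeque<WriteEntry> writeQueue = new ArrayDeque<>();
    private long readPoint = 0L;

    // Start a write transaction; entries are queued in begin order.
    synchronized WriteEntry begin() {
        WriteEntry e = new WriteEntry();
        writeQueue.addLast(e);
        return e;
    }

    // Mark an entry as done and advance the read point over the completed prefix of the queue.
    synchronized void complete(WriteEntry e) {
        e.completed = true;
        while (!writeQueue.isEmpty() && writeQueue.peekFirst().completed) {
            readPoint = Math.max(readPoint, writeQueue.pollFirst().writeNumber);
        }
        notifyAll();                       // wake up anyone waiting for earlier entries
    }

    // True when w is at the head of the queue, i.e. every earlier transaction has completed.
    synchronized boolean allPreviousComplete(WriteEntry w) {
        return writeQueue.peekFirst() == w;
    }

    // Block until all earlier transactions are done, then complete w itself.
    // Assumes w is still queued (it has not been completed before).
    synchronized void waitForPreviousAndComplete(WriteEntry w) throws InterruptedException {
        while (!allPreviousComplete(w)) {
            wait();
        }
        complete(w);
    }

    synchronized long readPoint() { return readPoint; }
}
```

In this sketch, completing a later entry before an earlier one leaves the read point unchanged until the earlier entry finishes, which is the ordering guarantee the flush relies on.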

Set the status of the status tracker to report that the stores are being flushed:

s = "Flushing stores of " + this;
status.setStatus(s);

In the finally block, if w is still non-null (that is, a failure occurred before it was cleared), mark w as completed:

if (w != null) {
    mvcc.advanceMemstore(w);
}

Loop over storeFlushCtxs and call flushCache() on each StoreFlushContext, which writes the snapshot data to a file:

for (StoreFlushContext flush : storeFlushCtxs) {
    flush.flushCache(status);
}

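Loop over storeFlushCtxs and call commit() on each StoreFlushContext, recording whether a compaction is needed and which files were committed for each column family:

// stores.values() and storeFlushCtxs are in the same order, so walk them in step.
Iterator<Store> it = stores.values().iterator();
for (StoreFlushContext flush : storeFlushCtxs) {
    boolean needsCompaction = flush.commit(status);
    if (needsCompaction) {
        compactionRequested = true;
    }
    committedFiles.put(it.next().getFamily().getName(), flush.getCommittedFiles());
}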
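The three loops over storeFlushCtxs (prepare, flushCache, commit) form a small three-phase protocol. The sketch below shows the shape of that protocol with toy types. ToyStore, the FlushContext interface, the "family/seqId" file names, and the "more than 3 files" compaction trigger are all assumptions made for illustration, not the real Store or StoreFlushContext.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Toy three-phase flush. Hypothetical types that only mirror the structure described above.
class ThreePhaseFlushSketch {
    interface FlushContext {
        void prepare();                      // cheap: snapshot the memstore
        void flushCache();                   // slow: write the snapshot to a temporary file
        boolean commit();                    // publish the file; true if a compaction is suggested
        List<String> getCommittedFiles();
    }

    static final class ToyStore {
        final String family;
        final List<String> memstore = new ArrayList<>();
        final List<String> files = new ArrayList<>();

        ToyStore(String family) { this.family = family; }

        FlushContext createFlushContext(final long flushSeqId) {
            return new FlushContext() {
                private List<String> snapshot = new ArrayList<>();
                private String tmpFile;
                private final List<String> committed = new ArrayList<>();

                @Override public void prepare() {
                    snapshot = new ArrayList<>(memstore);
                    memstore.clear();
                }
                @Override public void flushCache() {
                    if (!snapshot.isEmpty()) {
                        tmpFile = family + "/" + flushSeqId;   // pretend the cells were written here
                    }
                }
                @Override public boolean commit() {
                    if (tmpFile != null) {
                        files.add(tmpFile);
                        committed.add(tmpFile);
                    }
                    snapshot = new ArrayList<>();
                    return files.size() > 3;                   // assumed compaction trigger
                }
                @Override public List<String> getCommittedFiles() { return committed; }
            };
        }
    }

    static Map<String, List<String>> flushAll(Map<String, ToyStore> stores, long flushSeqId) {
        List<FlushContext> ctxs = new ArrayList<>();
        for (ToyStore s : stores.values()) ctxs.add(s.createFlushContext(flushSeqId));
        for (FlushContext c : ctxs) c.prepare();       // phase 1 for every store
        for (FlushContext c : ctxs) c.flushCache();    // phase 2 for every store
        Map<String, List<String>> committedFiles = new TreeMap<>();
        Iterator<ToyStore> it = stores.values().iterator();   // same order as ctxs
        for (FlushContext c : ctxs) {                  // phase 3 for every store
            c.commit();
            committedFiles.put(it.next().family, c.getCommittedFiles());
        }
        return committedFiles;
    }
}
```

Running each phase across all stores before starting the next keeps the cheap snapshot step together (so it can sit inside the lock) and leaves the slow file writes outside it.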

Update the memstore size after the flush by subtracting totalFlushableSize:

this.addAndGetGlobalMemstoreSize(-totalFlushableSize);

Write the flush commit marker to the WAL, this time with sync enabled:

if (wal != null) {
    FlushDescriptor desc = ProtobufUtil.toFlushDescriptor(FlushAction.COMMIT_FLUSH,
        getRegionInfo(), flushSeqId, committedFiles);
    WALUtil.writeFlushMarker(wal, this.htableDescriptor, getRegionInfo(),
        desc, sequenceId, true);
}
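One way to picture what the START_FLUSH and COMMIT_FLUSH markers make possible is a recovery pass that skips edits already covered by a committed flush. The sketch below is an assumption-laden illustration, not the actual HBase replay logic: the Entry type, the log layout, and the rule "replay only edits whose sequence ID is greater than the last committed flush" are simplifications chosen for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy log replay driven by flush markers. Hypothetical types and rule; not HBase's replay code.
class FlushMarkerReplaySketch {
    enum Kind { EDIT, START_FLUSH, COMMIT_FLUSH }

    static final class Entry {
        final Kind kind;
        final long seqId;
        final String payload;   // cell data for EDIT entries, unused for markers

        Entry(Kind kind, long seqId, String payload) {
            this.kind = kind;
            this.seqId = seqId;
            this.payload = payload;
        }
    }

    // Returns the payloads of the edits that are not covered by a committed flush.
    static List<String> editsToReplay(List<Entry> log) {
        long lastCommittedFlush = -1L;
        for (Entry e : log) {
            // A START_FLUSH without a matching COMMIT_FLUSH is ignored: that flush never finished.
            if (e.kind == Kind.COMMIT_FLUSH) {
                lastCommittedFlush = Math.max(lastCommittedFlush, e.seqId);
            }
        }
        List<String> replay = new ArrayList<>();
        for (Entry e : log) {
            if (e.kind == Kind.EDIT && e.seqId > lastCommittedFlush) {
                replay.add(e.payload);
            }
        }
        return replay;
    }

    public static void main(String[] args) {
        List<Entry> log = Arrays.asList(
            new Entry(Kind.EDIT, 1, "a"),
            new Entry(Kind.EDIT, 2, "b"),
            new Entry(Kind.START_FLUSH, 3, null),
            new Entry(Kind.COMMIT_FLUSH, 3, null),
            new Entry(Kind.EDIT, 4, "c"),
            new Entry(Kind.START_FLUSH, 5, null));   // crashed before commit
        System.out.println(editsToReplay(log));
    }
}
```

In the example log, edits a and b are covered by the committed flush with sequence ID 3, while edit c must be replayed because the flush that started at sequence ID 5 never committed.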

Call the WAL's completeCacheFlush() method to complete the MemStore flush: it removes the sequence ID recorded for the Region from the lowestFlushingRegionSequenceIds data structure and calls closeBarrier.endOp() to end the operation:

if (wal != null) {
    wal.completeCacheFlush(this.getRegionInfo().getEncodedNameAsBytes());
}

Record the current time as the last flush time, assign the current flush sequence ID to lastFlushSeqId, and wake up any threads waiting on the memstore. Finally, set the status tracker's status to the completion message and return the flush result:

    this.lastFlushTime = EnvironmentEdgeManager.currentTime();
    this.lastFlushSeqId = flushSeqId;
    synchronized (this) {
      notifyAll(); // FindBugs NN_NAKED_NOTIFY
    }
 
    long time = EnvironmentEdgeManager.currentTime() - startTime;
    long memstoresize = this.memstoreSize.get();
    String msg = "Finished memstore flush of ~" +
      StringUtils.byteDesc(totalFlushableSize) + "/" + totalFlushableSize +
      ", currentsize=" +
      StringUtils.byteDesc(memstoresize) + "/" + memstoresize +
      " for region " + this + " in " + time + "ms, sequenceid=" + flushSeqId +
      ", compaction requested=" + compactionRequested +
      ((wal == null)? "; wal=null": "");
    LOG.info(msg);

    status.setStatus(msg);

    return new FlushResult(compactionRequested ? FlushResult.Result.FLUSHED_COMPACTION_NEEDED :
        FlushResult.Result.FLUSHED_NO_COMPACTION_NEEDED, flushSeqId);
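The four result values that appear in this walkthrough (FLUSHED_NO_COMPACTION_NEEDED, FLUSHED_COMPACTION_NEEDED, CANNOT_FLUSH_MEMSTORE_EMPTY and CANNOT_FLUSH) can be read as two independent answers: did a flush happen, and should a compaction follow. The sketch below shows one way a caller might interpret them. The enum is redeclared locally, and the helper names flushSucceeded and compactionNeeded are hypothetical, so treat this as an illustration rather than the real FlushResult class.

```java
// Toy interpretation of the flush outcome. Local enum and helper names are assumptions.
class FlushResultSketch {
    enum Result {
        FLUSHED_NO_COMPACTION_NEEDED,
        FLUSHED_COMPACTION_NEEDED,
        CANNOT_FLUSH_MEMSTORE_EMPTY,   // returned by the early exit when there is nothing to flush
        CANNOT_FLUSH                   // returned when the WAL is closing
    }

    // True if data was actually written out.
    static boolean flushSucceeded(Result r) {
        return r == Result.FLUSHED_NO_COMPACTION_NEEDED
            || r == Result.FLUSHED_COMPACTION_NEEDED;
    }

    // True if at least one store asked for a compaction during commit.
    static boolean compactionNeeded(Result r) {
        return r == Result.FLUSHED_COMPACTION_NEEDED;
    }

    // Mirrors the final return statement: pick the result from the compactionRequested flag.
    static Result fromCommit(boolean compactionRequested) {
        return compactionRequested
            ? Result.FLUSHED_COMPACTION_NEEDED
            : Result.FLUSHED_NO_COMPACTION_NEEDED;
    }
}
```

A caller could, for example, schedule a compaction only when compactionNeeded(...) is true and skip any post-flush bookkeeping when flushSucceeded(...) is false.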

If there are any mistakes, please point them out.

Topics: HBase