ZooKeeper Series: Logging

Posted by jackofalltrades on Mon, 09 Sep 2019 14:17:49 +0200

ZooKeeper records every write operation in a transaction log dedicated to znode changes. Only data changes that have been confirmed in the transaction log take effect across the cluster.

1. TxnLog

TxnLog is the interface for the write transaction log. Its main methods are:

  1. rollLog: roll the log so that a new log file is started
  2. append: add a write transaction to the end of the log
  3. getLastLoggedZxid: read the zxid of the last logged record
  4. truncate: remove trailing write transactions, for cases where a Learner has more transactions than the Leader
  5. commit: commit the log to confirm that the transactions are persistent

The rollLog method provides log rolling. If a single transaction log file grew indefinitely, performance would inevitably suffer. After rollLog is called, the next transaction starts a new transaction log file, and subsequent transactions are written to that file. The resulting log files are distinguished by the zxid in their file names.

The truncate method truncates the log, deleting all transactions after the given zxid. It is typically invoked when a Follower has logged more transactions than the Leader, so that the Follower's database is brought back in sync with the Leader's.

The commit method forces the log files to disk, ensuring that transactions are actually written to disk rather than only sitting in the file system's in-memory cache.

2. Implementing FileTxnLog

The class that implements the TxnLog interface in ZooKeeper is FileTxnLog. Its main methods are described below.

2.1 append method

Main code:

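// Start a new log file only when no log stream is open (first write, or after rollLog)
if (logStream == null) {
    logFileWrite = new File(logDir, ("log." + Long.toHexString(hdr.getZxid())));
    fos = new FileOutputStream(logFileWrite);
    logStream = new BufferedOutputStream(fos);
    oa = BinaryOutputArchive.getArchive(logStream);
    // Every log file begins with a header: magic number, version, database id
    FileHeader fhdr = new FileHeader(TXNLOG_MAGIC, VERSION, dbId);
    fhdr.serialize(oa, "fileheader");
    logStream.flush();
    currentSize = fos.getChannel().position();
    streamsToFlush.add(fos);
}
// Preallocate space in the file if it is running low
padFile(fos);
// Serialize the transaction, then write its checksum followed by its bytes
byte[] buf = Util.marshallTxnEntry(hdr, txn);
Checksum crc = makeChecksumAlgorithm();
crc.update(buf, 0, buf.length);
oa.writeLong(crc.getValue(), "txnEntryCRC");
Util.writeTxnBytes(oa, buf);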

This method adds transactions to the end of the log file.
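The last lines of the excerpt write a checksum ahead of the serialized transaction so that a corrupt entry can be detected when the log is read back. The following is a minimal sketch of that idea using only the JDK. The class name EntryFraming and the simplified layout (checksum, length, payload) are assumptions made for illustration, not ZooKeeper's exact on-disk format, and Adler32 is assumed as the checksum algorithm.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.zip.Adler32;
import java.util.zip.Checksum;

// Hypothetical helper, not part of ZooKeeper.
// Frames a payload as [checksum:long][length:int][payload bytes].
final class EntryFraming {

    // Computes the checksum of the payload and writes it in front of the payload.
    static byte[] write(byte[] payload) throws IOException {
        Checksum crc = new Adler32();
        crc.update(payload, 0, payload.length);
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeLong(crc.getValue());
            out.writeInt(payload.length);
            out.write(payload);
        }
        return bytes.toByteArray();
    }

    // Reads one framed entry and rejects it if the stored checksum does not match.
    static byte[] read(byte[] framed) throws IOException {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(framed))) {
            long expected = in.readLong();
            byte[] payload = new byte[in.readInt()];
            in.readFully(payload);
            Checksum crc = new Adler32();
            crc.update(payload, 0, payload.length);
            if (crc.getValue() != expected) {
                throw new IOException("checksum mismatch: entry is corrupt");
            }
            return payload;
        }
    }
}
```

Storing the checksum before the payload lets a reader verify each entry as it goes and stop at the first damaged one.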

If the log has not been rolled, the transaction is written to the current log file. If the log has been rolled, the transaction is written to a new log file whose name is "log." followed by the hexadecimal value of the current zxid.
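The naming rule can be sketched as a pair of small functions. LogNames and its methods are hypothetical names introduced here for illustration; they mirror the "log." plus hexadecimal zxid convention described above rather than reproduce ZooKeeper's own utility code.

```java
// Hypothetical helper, not part of ZooKeeper.
final class LogNames {

    // Builds a log file name from the zxid of the first transaction in the file.
    static String makeLogName(long zxid) {
        return "log." + Long.toHexString(zxid);
    }

    // Recovers the zxid from a log file name, or returns -1 if the name
    // does not follow the "log.<hex zxid>" pattern.
    static long zxidFromName(String name) {
        String[] parts = name.split("\\.");
        if (parts.length == 2 && parts[0].equals("log")) {
            try {
                return Long.parseUnsignedLong(parts[1], 16);
            } catch (NumberFormatException e) {
                return -1L;
            }
        }
        return -1L;
    }
}
```

Because the zxid is embedded in the name, the right file can be located from a directory listing without opening any file.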

2.2 rollLog method

Rolls the log: the current log stream is flushed and its reference is cleared, so the next append starts a new log file. Main code:

if (logStream != null) {
    this.logStream.flush();
    this.logStream = null;
}
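How the null stream interacts with append can be seen in a small stand-alone sketch. RollingLogSketch is a hypothetical, much simplified writer, not FileTxnLog: it has no file header, padding or checksums, and it closes the old stream on roll instead of deferring that work.

```java
import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;

// Hypothetical, simplified writer used only to illustrate log rolling.
final class RollingLogSketch implements AutoCloseable {
    private final File logDir;
    // null means "no file is open; the next append must create one"
    private OutputStream logStream;

    RollingLogSketch(File logDir) {
        this.logDir = logDir;
    }

    // Appends an entry, creating "log.<hex zxid>" first if no file is open.
    void append(long zxid, byte[] entry) throws IOException {
        if (logStream == null) {
            File file = new File(logDir, "log." + Long.toHexString(zxid));
            logStream = new BufferedOutputStream(new FileOutputStream(file));
        }
        logStream.write(entry);
    }

    // Finishes the current file; the next append will start a new one.
    void rollLog() throws IOException {
        if (logStream != null) {
            logStream.close(); // close() flushes buffered bytes first
            logStream = null;
        }
    }

    @Override
    public void close() throws IOException {
        rollLog();
    }
}
```

Appending zxids 1 and 2, rolling, and then appending zxid 0x1a would leave two files, log.1 and log.1a, each named after the first transaction it contains.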

2.3 getLastLoggedZxid method

Returns the latest zxid recorded in the transaction log. When there are multiple log files, all of them are listed, the file with the largest zxid in its name is selected, and the largest zxid is then read from that file.

Main code:

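public long getLastLoggedZxid() {
    // Log files come back sorted by the zxid in their names
    File[] files = getLogFiles(logDir.listFiles(), 0);
    long maxLog = files.length > 0
            ? Util.getZxidFromName(files[files.length - 1].getName(), "log")
            : -1;
    // Scan the newest log file to find the highest zxid it contains
    long zxid = maxLog;
    TxnIterator itr = null;
    try {
        FileTxnLog txn = new FileTxnLog(logDir);
        itr = txn.read(maxLog);
        while (itr.next()) {
            TxnHeader hdr = itr.getHeader();
            zxid = hdr.getZxid();
        }
    } catch (IOException e) {
        LOG.warn("Unexpected exception", e);
    } finally {
        close(itr);
    }
    return zxid;
}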
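The file selection step is worth isolating, because the zxid in a file name is hexadecimal and has to be compared as a number, not as text. The sketch below assumes a plain list of file names; LatestLogPicker is a hypothetical name and the code is not taken from ZooKeeper.

```java
import java.util.List;

// Hypothetical helper, not part of ZooKeeper.
final class LatestLogPicker {

    // Returns the largest zxid found in "log.<hex zxid>" names,
    // or -1 if no name matches that pattern.
    static long maxLogZxid(List<String> fileNames) {
        long max = -1L;
        for (String name : fileNames) {
            if (!name.startsWith("log.")) {
                continue; // skip snapshots and unrelated files
            }
            try {
                // Parse the suffix as hexadecimal and compare numerically
                max = Math.max(max, Long.parseLong(name.substring(4), 16));
            } catch (NumberFormatException ignored) {
                // not a valid hex suffix; skip this file
            }
        }
        return max;
    }
}
```

A plain string sort would rank "log.9" after "log.10", whereas numerically 0x10 is the larger zxid, which is why the suffix is parsed before comparing.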

 

2.4 truncate method

Truncates the log entries that follow the given zxid, so that the remaining log content is valid and consistent.

Main code:

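public boolean truncate(long zxid) throws IOException {
    // Position an iterator at the transaction with the given zxid
    FileTxnIterator itr = new FileTxnIterator(this.logDir, zxid);
    PositionInputStream input = itr.inputStream;
    // Byte offset in the current log file where the iterator stopped
    long pos = input.getPosition();
    // Cut the file off at that offset, discarding everything after it
    RandomAccessFile raf = new RandomAccessFile(itr.logFile, "rw");
    raf.setLength(pos);
    raf.close();
    return true;
}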
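The core of the method is RandomAccessFile.setLength, which shortens a file in place. A minimal stand-alone sketch of that call follows; TruncateSketch and truncateTo are hypothetical names, and the byte offset is passed in directly instead of being found by iterating over transactions.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;

// Hypothetical helper, not part of ZooKeeper.
final class TruncateSketch {

    // Keeps the first keepBytes bytes of the file and discards the rest.
    // Does nothing if the file is already that short.
    static void truncateTo(File file, long keepBytes) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file, "rw")) {
            if (keepBytes < raf.length()) {
                raf.setLength(keepBytes);
            }
        }
    }
}
```

The length check matters because setLength with a larger value would extend the file instead of leaving it alone.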

 

2.5 commit method

Flushes the buffered log data and, if forceSync is enabled, forces it to disk. Main code:

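public synchronized void commit() throws IOException {
    for (FileOutputStream log : streamsToFlush) {
        // Push buffered bytes down to the operating system
        log.flush();
        if (forceSync) {
            // Start time, used to measure how long the sync takes
            long startSyncNS = System.nanoTime();
            // Ask the OS to write the file content to the storage device
            log.getChannel().force(false);
        }
    }
}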
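The difference between flush and force is the point of this method: flush only hands the bytes to the operating system, while FileChannel.force asks the operating system to write them to the storage device. A minimal sketch using only the JDK is shown below; DurableWriteSketch and writeDurably are hypothetical names chosen for illustration.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;

// Hypothetical helper, not part of ZooKeeper.
final class DurableWriteSketch {

    // Appends data to the file. With forceSync set, the call returns only
    // after the OS has been asked to write the content to the device.
    static void writeDurably(File file, byte[] data, boolean forceSync) throws IOException {
        try (FileOutputStream fos = new FileOutputStream(file, true)) {
            fos.write(data);
            fos.flush();
            if (forceSync) {
                // false: sync the file content, not necessarily its metadata
                fos.getChannel().force(false);
            }
        }
    }
}
```

Skipping the force call is faster, but data that has only been flushed can still be lost if the machine loses power before the OS writes it out.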
