Implementation of MIT 6.824 Lab3 KV Raft

Posted by gazoo on Sat, 05 Mar 2022 10:53:18 +0100

Lab handout: http://nil.csail.mit.edu/6.824/2021/labs/lab-kvraft.html

Preface

Before starting Lab3, it is suggested to revisit the Lab2 implementation together with the Raft paper, since Lab3 builds a fault-tolerant key/value storage service on top of the Raft library implemented in Lab2.

Start

Overall architecture

  1. First, understand the architecture from the architecture diagram in the Lab3 handout; a plain-language version of my own understanding is given below to help.

  2. Second, if you have read the lab handout and the Raft paper, you should know a key point: each KVServer (one per Raft server id) corresponds to the State Machine in the architecture diagram, and each KVServer sits on top of a Raft peer implemented in Lab2. KVServers reach consensus through the Raft service and do not interact with each other directly.

  3. From the Lab3 requirements, KVServer must be able to tell which Client a request comes from and keep each Client's request state. Each Client is therefore assigned a unique ClientId when it is created, and each of its requests carries a serial number (RequestId); together these two IDs uniquely identify a request. client.go and server.go contain the concrete code and comments.

  4. The Client's unique ID is generated randomly by nrand(); in testing there are at most 7 Clients and the IDs do not collide. Each Client also maintains a lastRequestId, the Seq serial number of its most recent request, and picks the first server to contact at random (mathrand over the number of KVServers, as in the routing code later in this post).

  5. KVServer also records each client's lastRequestId so that when the Client issues concurrent calls, the latest result can be identified by the latest RequestId, preserving strong consistency for the application. A timer bounds how long a request waits for consensus (500ms) before falling back. A sketch of the Clerk side is shown after this list.
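
A hedged sketch of the Clerk side under these assumptions: nrand() is the helper from the lab skeleton, RecentLeaderId and GetRandomServer appear in the routing code later in this post, and the other field names are illustrative.

import (
	"crypto/rand"
	"math/big"

	"6.824/labrpc"
)

// Clerk sketch: ClientId comes from nrand(), RequestId increments per call
type Clerk struct {
	Servers        []*labrpc.ClientEnd
	ClientId       int64 // unique, from nrand()
	RecentLeaderId int   // last known leader, initially random
	lastRequestId  int   // Seq number of the most recent request
}

// nrand generates a 62-bit random ID, unique with high probability
func nrand() int64 {
	max := big.NewInt(int64(1) << 62)
	bigx, _ := rand.Int(rand.Reader, max)
	return bigx.Int64()
}

func MakeClerk(servers []*labrpc.ClientEnd) *Clerk {
	ck := new(Clerk)
	ck.Servers = servers
	ck.ClientId = nrand()
	ck.RecentLeaderId = GetRandomServer(len(ck.Servers))
	return ck
}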

Request and response process

Taking Put/Get as an example:

  1. After receiving a Request from the Client, KVServer submits the Op to the Raft library with Raft.Start(), then waits on a channel (waitApplyCh) for the result: once Raft has applied the log entry to the state machine, the apply loop responds to the RPC handler by putting the response into the channel's buffer.
  2. Each Server runs a separate goroutine, the ApplyLoop, that blocks on applyCh until Raft delivers the applied command Op.
  3. The KVServer executes the Op on its state machine (every Get is executed; duplicate Put and Append ops are not executed again).
  4. On the Leader, the apply loop then pushes the execution result into the wait channel keyed by the Op's raft index; only then does the blocked RPC handler receive the result (a sketch follows this list).
  5. Finally, the Leader packages the execution result and returns it to the Client.
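
Step 4 can be made concrete with a minimal sketch of how the apply loop hands a committed Op back to the blocked RPC handler. SendMessageToWaitChan is an assumed name; waitApplyCh is the map in the KVServer struct shown later:

// Called from the apply loop after the Op has been executed on the KVDB
func (kv *KVServer) SendMessageToWaitChan(op Op, raftIndex int) bool {
	kv.mu.Lock()
	defer kv.mu.Unlock()
	ch, exist := kv.waitApplyCh[raftIndex]
	if exist {
		// The channel is buffered with capacity 1, so the apply loop
		// never blocks here even if the handler has already timed out
		ch <- op
	}
	return exist
}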

Specific request process

  1. A goroutine started with go kv.ReadRaftApplyCommandLoop() listens on the KVServer's applyCh and reads applied messages.
  2. The channel applyCh chan raft.ApplyMsg in KVServer provides the blocking: each peer listens on the channel, and only when a message arrives can the next step proceed.
  3. Only the Leader accepts client requests; followers do not trigger this path on their own and can only wait for the Leader to replicate entries.
  4. After receiving a request, the Leader checks that it is (still) the leader; if so, it hands the Op to Raft.Start(Op), and the handler then blocks waiting for the result.
  5. Set up a WaitChan for the result together with a Timeout to decide whether the response has timed out.
  6. Raft then replicates the Leader's Op to all followers, and the applied entry is delivered to each KVServer in the upper layer.
  7. Whether the response arrives within the consistency window is decided by whether the channel read times out:
    1. If timeout:
      1. ifRequestDuplicate() is checked first. If the RequestId has already been applied (it is recorded as the client's latest) and this server is still the Leader, the Get is executed directly on the Leader's state machine and the locally applied result is returned.
    2. No timeout means the reply arrived within the consistency window; it only remains to check that the clientId and RequestId in the Raft response match this request, i.e. that it is the latest request. If they match, the KVServer's KVDB holds the latest data for Op.key, ensuring strong consistency.
  8. After the Get or Put/Append finishes, delete the channel entry for this raftIndex from the wait-channel map (the apply-loop side of this flow is sketched below).
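
Tying steps 1-8 together, a hedged sketch of the apply-loop command handler (GetCommandFromRaft appears again in the snapshot section; ExecutePutOpOnKVDB and ExecuteAppendOpOnKVDB are assumed helpers that also record kv.lastRequestId):

func (kv *KVServer) GetCommandFromRaft(message raft.ApplyMsg) {
	op := message.Command.(Op)
	// Entries already covered by an installed snapshot are skipped
	if message.CommandIndex <= kv.lastSSPointRaftLogIndex {
		return
	}
	// Duplicate Put/Append ops are not executed twice; Gets read on demand
	if !kv.ifRequestDuplicate(op.ClientId, op.RequestId) {
		switch op.Operation {
		case "put":
			kv.ExecutePutOpOnKVDB(op)
		case "append":
			kv.ExecuteAppendOpOnKVDB(op)
		}
	}
	// Wake up the RPC handler waiting on this raft index, if any
	kv.SendMessageToWaitChan(op, message.CommandIndex)
}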

Request blocking problem

KVServer implements the blocking timeout with time.After; on failure the Client retransmits.

Neither the wait channel nor the Call method in labrpc has any notion of a callback timeout, so without one the handler would block forever.

Therefore, a timer-based timeout must be implemented on the Server side (or the Client side) to avoid infinite blocking; see the select / time.After pattern in the Get handler below.

Duplicate request problem

  • One of the core tasks of Lab3A is handling Duplicate Requests: a duplicate request must never be executed twice on the same state machine, and each request is identified by a unique ClientId:RequestId pair.
  • Whether or not a request received from the Client is a repeat, KVServer submits it to Raft as a log entry. When applying that entry, kv.ifRequestDuplicate decides whether the Op is a repeat; if so, it is not executed on the state machine and OK is returned directly. Only Put/Append need this check, since reads do not change the state machine. A sketch of the check follows.
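
A minimal sketch of the duplicate check, assuming the lastRequestId map shown in the KVServer struct below (a request is a repeat iff a request from this client with an equal or newer sequence number has already been applied):

func (kv *KVServer) ifRequestDuplicate(newClientId int64, newRequestId int) bool {
	kv.mu.Lock()
	defer kv.mu.Unlock()
	lastRequestId, ifClientInRecord := kv.lastRequestId[newClientId]
	if !ifClientInRecord {
		return false // first request ever seen from this client
	}
	return newRequestId <= lastRequestId
}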

Routing load balancing problem

The indices of Clerk.Servers[] and KVServer.me are not in one-to-one correspondence; the tester shuffles them randomly. KVServer.me does correspond to raft.me, however. So when the Client sends a request to a KVServer that is not the Leader, that KVServer can learn the Raft leader's index and return it, but the Client cannot use ck.Servers[raftLeader] to find the real Leader. Instead, on receiving ErrWrongLeader the Client simply tries the next KVServer, as the following Clerk-side code does:

ck.RecentLeaderId = GetRandomServer(len(ck.Servers))
server := ck.RecentLeaderId
for {
	// RPC to KVServer's Get method; a successful reply identifies the leader
	ok := ck.Servers[server].Call("KVServer.Get", &args, &reply)
	// Move to the next Server and retry until OK or a definitive error
	if !ok || reply.Err == ErrWrongLeader {
		server = (server + 1) % len(ck.Servers)
		continue
	}
	...
}

Snapshot

A Snapshot is essentially the key/value database maintained by the Server, which can be regarded as an in-memory map.

For the Leader:

// Loop reading the log entries that Raft has applied and dispatch them
func (kv *KVServer) ReadRaftApplyCommandLoop() {
	for message := range kv.applyCh {
		if message.CommandValid {
			kv.GetCommandFromRaft(message)
		}
		if message.SnapshotValid {
			kv.GetSnapshotFromRaft(message)
		}
	}
}
  • After the leader applies a log entry, message.CommandValid is true, indicating that a command log entry was applied (and the persisted Raft state keeps growing).

  • The Leader then executes the corresponding Get or Put operation. Afterwards it decides whether to ask Raft to compact the log into a Snapshot, by comparing the threshold maxraftstate with the current Raft state size (RaftStateSize).

  • If needed, it calls a MakeSnapshot method to package its KVDB, lastRequestId table, and related information into a Snapshot, and invokes the Snapshot interface of the Raft library (sketched below).

  • Installing the Snapshot on the Leader has three parts: the log entries are trimmed, the Snapshot is persisted through the Persister, and the Snapshot is sent to followers that have fallen too far behind (via the InstallSnapshot RPC rather than AppendEntries).

  • Finally, the execution result is returned through the WaitChannel.
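
A hedged sketch of this path. IfNeedToSendSnapShotCommand, the proportion argument, and GetRaftStateSize (a helper assumed to wrap persister.RaftStateSize()) are my assumptions; kv.rf.Snapshot is the Lab2D interface, and bytes plus labgob do the serialization:

// Called after applying a command; proportion/10 scales the threshold
func (kv *KVServer) IfNeedToSendSnapShotCommand(raftIndex int, proportion int) {
	if kv.rf.GetRaftStateSize() > (kv.maxraftstate * proportion / 10) {
		// Hand the serialized state to Raft, which trims its log and
		// persists the snapshot through the Persister
		snapshot := kv.MakeSnapshot()
		kv.rf.Snapshot(raftIndex, snapshot)
	}
}

// Serialize the KVDB and the per-client lastRequestId table with labgob
func (kv *KVServer) MakeSnapshot() []byte {
	kv.mu.Lock()
	defer kv.mu.Unlock()
	w := new(bytes.Buffer)
	e := labgob.NewEncoder(w)
	e.Encode(kv.kvDB)
	e.Encode(kv.lastRequestId)
	return w.Bytes()
}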

For the Follower:

		if message.SnapshotValid {
			kv.GetSnapshotFromRaft(message)
		}
  • 1. After the Leader issues the InstallSnapshot RPC, the follower's Raft layer receives the snapshot data, trims its log, and reports the snapshot to the Server through applyCh (with SnapshotValid: true).
  • 2. The Follower's ApplyLoop receives the message and calls CondInstallSnapshot() to ask Raft whether the snapshot can be installed:

    // Handle a snapshot delivered by Raft
    func (kv *KVServer) GetSnapshotFromRaft(message raft.ApplyMsg) {
    	kv.mu.Lock()
    	defer kv.mu.Unlock()
    	if kv.rf.CondInstallSnapshot(message.SnapshotTerm, message.SnapshotIndex, message.Snapshot) {
    		// Install the snapshot into the local KVDB
    		kv.ReadSnapshotToInstall(message.Snapshot)
    		kv.lastSSPointRaftLogIndex = message.SnapshotIndex
    	}
    }
    
  • 3. CondInstallSnapshot() checks the installation conditions, persists the snapshot, and tells the Server that the snapshot can be installed (the decode side is sketched below).
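
The decode counterpart, used both at startup in StartKVServer and in GetSnapshotFromRaft above, might look like this sketch (it must decode in the same order MakeSnapshot encodes):

func (kv *KVServer) ReadSnapshotToInstall(snapshot []byte) {
	if snapshot == nil || len(snapshot) < 1 {
		return // nothing to install
	}
	r := bytes.NewBuffer(snapshot)
	d := labgob.NewDecoder(r)
	var persistKVDB map[string]string
	var persistLastRequestId map[int64]int
	// Decode in the same order MakeSnapshot encoded
	if d.Decode(&persistKVDB) != nil || d.Decode(&persistLastRequestId) != nil {
		DPrintf("KVServer %d failed to decode snapshot", kv.me)
	} else {
		kv.kvDB = persistKVDB
		kv.lastRequestId = persistLastRequestId
	}
}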

Core code

KVServer data structure

type KVServer struct {
   mu sync.Mutex
   me int
   // Each KVServer corresponds to a Raft
   rf      *raft.Raft
   applyCh chan raft.ApplyMsg
   dead    int32 // set by Kill()

   maxraftstate int // snapshot if log grows this big
   // Your definitions here.
   // Save the data of put, key: value
   kvDB map[string]string
   // raft log index (returned by Raft.Start) -> chan Op, the per-request wait channel
   waitApplyCh map[int]chan Op
   // clientId -> the client's last applied requestId
   lastRequestId map[int64]int

   // raft log index of the last snapshot point
   lastSSPointRaftLogIndex int
}
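
The Op struct itself is not shown here, but its fields can be read off the handlers below; a minimal sketch:

// Op is the command Raft replicates; one per client request
type Op struct {
	Operation string // "get", "put", or "append"
	Key       string
	Value     string // empty for get
	ClientId  int64  // which client issued the request
	RequestId int    // the client's sequence number for this request
}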

Start KVServer

// Start KVServer
func StartKVServer(servers []*labrpc.ClientEnd, me int, persister *raft.Persister, maxraftstate int) *KVServer {
	// call labgob.Register on structures you want
	// Go's RPC library to marshall/unmarshall.
	DPrintf("[InitKVServer---]Server %d", me)
	// Register rpc server
	labgob.Register(Op{})

	kv := new(KVServer)
	kv.me = me
	kv.maxraftstate = maxraftstate

	// You may need initialization code here.

	kv.applyCh = make(chan raft.ApplyMsg)
	kv.rf = raft.Make(servers, me, persister, kv.applyCh)

	// You may need initialization code here.
	// kv initialization
	kv.kvDB = make(map[string]string)
	kv.waitApplyCh = make(map[int]chan Op)
	kv.lastRequestId = make(map[int64]int)

	// snapshot
	snapshot := persister.ReadSnapshot()
	if len(snapshot) > 0 {
		// Read snapshot log
		kv.ReadSnapshotToInstall(snapshot)
	}
	// Loop reading the log entries applied by Raft
	go kv.ReadRaftApplyCommandLoop()
	return kv
}

Get method

// RPC method
func (kv *KVServer) Get(args *GetArgs, reply *GetReply) {
	// Your code here.
	if kv.killed() {
		reply.Err = ErrWrongLeader
		return
	}

	_, ifLeader := kv.rf.GetState()
	// RaftServer must be a Leader
	if !ifLeader {
		reply.Err = ErrWrongLeader
		return
	}

	op := Op{
		Operation: "get",
		Key:       args.Key,
		Value:     "",
		ClientId:  args.ClientId,
		RequestId: args.RequestId,
	}

	// Send command to Raft server
	raftIndex, _, _ := kv.rf.Start(op)
	DPrintf("[GET StartToRaft]From Client %d (Request %d) To Server %d, key %v, raftIndex %d", args.ClientId, args.RequestId, kv.me, op.Key, raftIndex)

	// waitForCh
	kv.mu.Lock()
	// chForRaftIndex is the channel on which the apply loop will deliver
	// the Op committed at raftIndex (the index Raft.Start just returned)
	chForRaftIndex, exist := kv.waitApplyCh[raftIndex]
	// Create the wait channel for this index if it does not exist yet
	if !exist {
		kv.waitApplyCh[raftIndex] = make(chan Op, 1)
		chForRaftIndex = kv.waitApplyCh[raftIndex]
	}
	kv.mu.Unlock()

	// Timeout
	select {
	// Timed out waiting for consensus: if this request was already applied
	// (duplicate) and we are still the Leader, serve the read from local KVDB
	case <-time.After(time.Millisecond * CONSENSUS_TIMEOUT):
		DPrintf("[GET TIMEOUT!!!]From Client %d (Request %d) To Server %d, key %v, raftIndex %d", args.ClientId, args.RequestId, kv.me, op.Key, raftIndex)
		_, ifLeader := kv.rf.GetState()

		// A duplicate RequestId means the op has already been applied to the
		// state machine, so the local value is safe to return; this keeps
		// concurrent calls from the same client seeing the latest result
		if kv.ifRequestDuplicate(op.ClientId, op.RequestId) && ifLeader {
			// Get the latest RequestId of the client according to the command, get and save the value of KVDB
			value, exist := kv.ExecuteGetOpOnKVDB(op)
			if exist {
				reply.Err = OK
				reply.Value = value
			} else {
				reply.Err = ErrNoKey
				reply.Value = ""
			}
		} else {
			reply.Err = ErrWrongLeader
		}

	// Within the consistency window, the committed Op arrived in time:
	case raftCommitOp := <-chForRaftIndex:
		DPrintf("[WaitChanGetRaftApplyMessage<--]Server %d , get Command <-- Index:%d , ClientId %d, RequestId %d, Opreation %v, Key :%v, Value :%v", kv.me, raftIndex, op.ClientId, op.RequestId, op.Operation, op.Key, op.Value)
		// Check that the committed Op is the one this RPC submitted
		if raftCommitOp.ClientId == op.ClientId &&
			raftCommitOp.RequestId == op.RequestId {
			// Read the value from this KVServer's local KVDB
			value, exist := kv.ExecuteGetOpOnKVDB(op)
			if exist {
				reply.Err = OK
				reply.Value = value
			} else {
				reply.Err = ErrNoKey
				reply.Value = ""
			}
		} else {
			reply.Err = ErrWrongLeader
		}
	}

	kv.mu.Lock()
	// After Get, delete the Op corresponding to raftIndex in chan map
	delete(kv.waitApplyCh, raftIndex)
	kv.mu.Unlock()
	return
}

PutAppend method

// RPC method
func (kv *KVServer) PutAppend(args *PutAppendArgs, reply *PutAppendReply) {
	// Your code here.
	if kv.killed() {
		reply.Err = ErrWrongLeader
		return
	}

	_, ifLeader := kv.rf.GetState()
	// RaftServer must be a Leader
	if !ifLeader {
		reply.Err = ErrWrongLeader
		return
	}

	op := Op{
		Operation: args.Op,
		Key:       args.Key,
		Value:     args.Value,
		ClientId:  args.ClientId,
		RequestId: args.RequestId,
	}

	// Send command to Raft server
	raftIndex, _, _ := kv.rf.Start(op)
	DPrintf("[PUTAPPEND StartToRaft]From Client %d (Request %d) To Server %d, key %v, raftIndex %d", args.ClientId, args.RequestId, kv.me, op.Key, raftIndex)

	// waitForCh
	kv.mu.Lock()
	// chForRaftIndex is the channel on which the apply loop will deliver
	// the Op committed at raftIndex (the index Raft.Start just returned)
	chForRaftIndex, exist := kv.waitApplyCh[raftIndex]
	// Create the wait channel for this index if it does not exist yet
	if !exist {
		kv.waitApplyCh[raftIndex] = make(chan Op, 1)
		chForRaftIndex = kv.waitApplyCh[raftIndex]
	}
	kv.mu.Unlock()

	// Timeout
	select {
	// Timed out waiting for consensus: a duplicate request has already been
	// applied, so OK can be returned without re-executing
	case <-time.After(time.Millisecond * CONSENSUS_TIMEOUT):
		DPrintf("[TIMEOUT PUTAPPEND !!!!]Server %d , get Command <-- Index:%d , ClientId %d, RequestId %d, Opreation %v, Key :%v, Value :%v", kv.me, raftIndex, op.ClientId, op.RequestId, op.Operation, op.Key, op.Value)

		// If the request is a duplicate it was already applied to the state
		// machine; answering OK without re-executing keeps Put/Append
		// exactly-once under concurrent client calls
		if kv.ifRequestDuplicate(op.ClientId, op.RequestId) {
			reply.Err = OK
		} else {
			reply.Err = ErrWrongLeader
		}

	// Within the consistency window, the committed Op arrived in time:
	case raftCommitOp := <-chForRaftIndex:
		DPrintf("[WaitChanGetRaftApplyMessage<--]Server %d , get Command <-- Index:%d , ClientId %d, RequestId %d, Opreation %v, Key :%v, Value :%v", kv.me, raftIndex, op.ClientId, op.RequestId, op.Operation, op.Key, op.Value)

		// Check that the committed Op is the one this RPC submitted
		if raftCommitOp.ClientId == op.ClientId &&
			raftCommitOp.RequestId == op.RequestId {
			reply.Err = OK
		} else {
			reply.Err = ErrWrongLeader
		}
	}

	kv.mu.Lock()
	delete(kv.waitApplyCh, raftIndex)
	kv.mu.Unlock()
	return
}
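
For completeness, a hedged sketch of ExecuteGetOpOnKVDB, which both handlers call: it reads the local map and records the request as applied. A matching ExecutePutOpOnKVDB would set kv.kvDB[op.Key] = op.Value and update lastRequestId the same way.

func (kv *KVServer) ExecuteGetOpOnKVDB(op Op) (string, bool) {
	kv.mu.Lock()
	value, exist := kv.kvDB[op.Key]
	// Record this request as the client's latest applied request
	kv.lastRequestId[op.ClientId] = op.RequestId
	kv.mu.Unlock()
	return value, exist
}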