[Nacos] Data Consistency

Posted by killerofet on Sun, 16 Jun 2019 20:35:19 +0200

From: https://blog.csdn.net/liyanan21/article/details/89320872

 

Catalog

I. Raft algorithm

Part of Raft Source Code in Nacos

init()

1. Get Raft cluster nodes

NamingProxy.getServers() Gets the cluster nodes

NamingProxy.refreshSrvIfNeed() Gets node information

NamingProxy.refreshServerListFromDisk() Gets cluster node information

2. Raft Cluster Data Recovery

RaftStore.load()

3. Raft elections

GlobalExecutor.register(new MasterElection()) Register Election Timing Task

MasterElection.sendVote() Sends vote requests

(1) RaftCommands.vote() handles /v1/ns/raft/vote requests

(2) PeerSet.decideLeader() Election

4. Raft heartbeat

GlobalExecutor.register(new HeartBeat()) registers heartbeat timing tasks

HeartBeat.sendBeat() Sends Heartbeat Packets

(1) RaftCommands.beat() handles /v1/ns/raft/beat requests

5. Raft publishes content

Registration Entry

Instance information persistence

(1)Service.put()

(2)RaftCore.signalPublish()

(3) /raft/datum interface and /raft/datum/commit interface

Publish Entry RaftCommands.publish()

6. Raft guarantees content consistency

I. Raft algorithm

Raft reaches consensus through an elected leader. Each server in a Raft cluster is either a leader or a follower, and can become a candidate during an election (that is, when no leader is available). The leader is responsible for replicating the log to the followers, and it regularly tells them that it is alive by sending heartbeat messages. Each follower has a timeout (typically between 150 and 300 milliseconds) within which it expects a heartbeat from the leader. The timeout is reset whenever a heartbeat is received. If no heartbeat arrives before the timeout expires, the follower changes its state to candidate and starts a leader election.

See: Raft algorithm
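
The timeout rule described above can be sketched as a small state holder. This is only an illustration of the generic Raft behaviour, not Nacos code: the class FollowerTimer and its methods are hypothetical names, and the current time is passed in explicitly so that the example is deterministic.

```java
import java.util.concurrent.ThreadLocalRandom;

/** Hypothetical model of a Raft follower's election timer (not Nacos code). */
final class FollowerTimer {
    enum Role { FOLLOWER, CANDIDATE, LEADER }

    static final long MIN_TIMEOUT_MS = 150;
    static final long MAX_TIMEOUT_MS = 300;

    Role role = Role.FOLLOWER;
    long deadlineMs;

    FollowerTimer(long nowMs) {
        resetTimeout(nowMs);
    }

    /** Pick a random timeout in [150, 300] ms so that followers rarely time out together. */
    void resetTimeout(long nowMs) {
        long timeout = ThreadLocalRandom.current().nextLong(MIN_TIMEOUT_MS, MAX_TIMEOUT_MS + 1);
        deadlineMs = nowMs + timeout;
    }

    /** A heartbeat from the leader restarts the countdown. */
    void onHeartbeat(long nowMs) {
        role = Role.FOLLOWER;
        resetTimeout(nowMs);
    }

    /** Called periodically; a follower whose deadline has passed becomes a candidate. */
    void tick(long nowMs) {
        if (role == Role.FOLLOWER && nowMs >= deadlineMs) {
            role = Role.CANDIDATE;
        }
    }
}
```

As long as onHeartbeat() keeps being called before the deadline, tick() never changes the role; once the heartbeats stop, the next tick() after the deadline turns the follower into a candidate.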

Part of Raft Source Code in Nacos

At startup, Nacos server calls the RaftCore.init() method through the RunningConfig.onApplicationEvent() method.

init()

public static void init() throws Exception {
 
    Loggers.RAFT.info("initializing Raft sub-system");
 
    // Start Notifier, poll Datums, and notify RaftListener
    executor.submit(notifier);
     
    // Get the Raft cluster node and update it to PeerSet
    peers.add(NamingProxy.getServers());
 
    long start = System.currentTimeMillis();
 
    // Data recovery by loading Datum and term data from disk
    RaftStore.load();
 
    Loggers.RAFT.info("cache loaded, peer count: {}, datum count: {}, current term: {}",
        peers.size(), datums.size(), peers.getTerm());
 
    while (true) {
        if (notifier.tasks.size() <= 0) {
            break;
        }
        Thread.sleep(1000L);
        System.out.println(notifier.tasks.size());
    }
 
    Loggers.RAFT.info("finish to load data from disk, cost: {} ms.", (System.currentTimeMillis() - start));
 
    GlobalExecutor.register(new MasterElection()); // Leader election
    GlobalExecutor.register1(new HeartBeat()); // Raft heartbeat
    GlobalExecutor.register(new AddressServerUpdater(), GlobalExecutor.ADDRESS_SERVER_UPDATE_INTERVAL_MS);
 
    if (peers.size() > 0) {
        if (lock.tryLock(INIT_LOCK_TIME_SECONDS, TimeUnit.SECONDS)) {
            initialized = true;
            lock.unlock();
        }
    } else {
        throw new Exception("peers is empty.");
    }
 
    Loggers.RAFT.info("timer started: leader timeout ms: {}, heart-beat timeout ms: {}",
        GlobalExecutor.LEADER_TIMEOUT_MS, GlobalExecutor.HEARTBEAT_INTERVAL_MS);
}

The init method does the following main things:

  • 1. Get the Raft cluster nodes: peers.add(NamingProxy.getServers());
  • 2. Recover the Raft cluster data: RaftStore.load();
  • 3. Raft election: GlobalExecutor.register(new MasterElection());
  • 4. Raft heartbeat: GlobalExecutor.register(new HeartBeat());
  • 5. Raft publishes content
  • 6. Raft guarantees content consistency

1. Get Raft cluster nodes

NamingProxy.getServers() Gets the cluster nodes

  • Calls NamingProxy.refreshSrvIfNeed() to refresh the node information
  • Returns List<String> servers

NamingProxy.refreshSrvIfNeed() Gets node information

  • In stand-alone mode, the ip:port of the local host is the only Raft node information.

    Otherwise, NamingProxy.refreshServerListFromDisk() (described below) is called to get the Raft cluster node information.

  • After the Raft cluster node information (i.e. the ip:port list) has been obtained, the List<String> serverlistFromConfig attribute and the List<String> servers attribute of NamingProxy are updated.

NamingProxy.refreshServerListFromDisk() Gets cluster node information

Reads the Raft cluster node information, i.e. the ip:port list, from disk or from system environment variables.
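
To make the "ip:port list" concrete, here is a minimal sketch of turning the lines of a cluster file into such a list. It is not the Nacos implementation: ServerListParser is a hypothetical helper, and both the handling of '#' comment lines and the fallback to port 8848 (the default Nacos server port) for entries without a port are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: turns the lines of a cluster file into an ip:port list. */
final class ServerListParser {
    // Default Nacos server port; assumed here for lines that carry no port.
    static final int DEFAULT_PORT = 8848;

    static List<String> parse(List<String> lines) {
        List<String> servers = new ArrayList<>();
        for (String raw : lines) {
            String line = raw.trim();
            // Assumption: blank lines and '#' comments carry no node information.
            if (line.isEmpty() || line.startsWith("#")) {
                continue;
            }
            servers.add(line.contains(":") ? line : line + ":" + DEFAULT_PORT);
        }
        return servers;
    }
}
```

The resulting list has the same shape as the List<String> servers attribute mentioned above, one ip:port string per cluster node.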

2. Raft Cluster Data Recovery

When Nacos starts/restarts, it loads Datum and term data from disk for data recovery.

Call chain: the Nacos server starts -> RaftCore.init() -> RaftStore.load().

RaftStore.load()

  • Load the Datum data from disk:

    Put each Datum into the ConcurrentMap<String, Datum> datums collection of RaftCore, keyed by the Datum's key.

    Wrap the Datum and ApplyAction.CHANGE in a Pair and put it into the Notifier's tasks queue to notify the relevant RaftListener.

  • Load the term data from disk:

    Call the RaftSet.setTerm(long term) method to update the term value of each node in the Raft cluster.

3. Raft elections

GlobalExecutor.register(new MasterElection()) Register Election Timing Task

Nacos Raft elections are driven by the MasterElection thread task, which does the following:

  • Updates the election timeout and heartbeat timeout of the candidate node.
  • Calls MasterElection.sendVote() to request votes.
public class MasterElection implements Runnable {
    @Override
    public void run() {
        try {
            if (!peers.isReady()) {
                return;
            }
 
            RaftPeer local = peers.local();
            local.leaderDueMs -= GlobalExecutor.TICK_PERIOD_MS;
            if (local.leaderDueMs > 0) {
                return;
            }
 
            // Reset the election timeout; it is also reset whenever a heartbeat or vote packet is received
            local.resetLeaderDue();
            local.resetHeartbeatDue();
 
            // Start the election
            sendVote();
        } catch (Exception e) {
            Loggers.RAFT.warn("[RAFT] error while master election {}", e);
        }
    }
}
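
The countdown pattern used in run() (subtract the tick period on every invocation, act only when the remaining time reaches zero, then start over) can be isolated in a few lines. This is a simplified sketch: TickCountdown is a hypothetical class, and the concrete timeout value is left to the caller because the listing above does not show it.

```java
/** Hypothetical countdown that mirrors the leaderDueMs pattern in MasterElection.run(). */
final class TickCountdown {
    private final long tickPeriodMs;
    private final long timeoutMs;
    private long dueMs;

    TickCountdown(long tickPeriodMs, long timeoutMs) {
        this.tickPeriodMs = tickPeriodMs;
        this.timeoutMs = timeoutMs;
        this.dueMs = timeoutMs;
    }

    /** Restart the countdown, e.g. when a heartbeat arrives. */
    void reset() {
        dueMs = timeoutMs;
    }

    /** Called once per tick; returns true when the countdown expired and the action should fire. */
    boolean tick() {
        dueMs -= tickPeriodMs;
        if (dueMs > 0) {
            return false;
        }
        reset();
        return true;
    }
}
```

With a tick period of 500 ms and an illustrative timeout of 1500 ms, tick() returns true on every third call, unless reset() is called in between, which is what an incoming heartbeat does for the election timeout.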

MasterElection.sendVote() Sends vote requests

  • Reset the Raft cluster data:

The leader is set to null, and the voteFor field of every Raft node is set to null.

  • Update the candidate node data:

The term is incremented by 1. (Incrementing makes the candidate's term differ from the terms of the other nodes; if all terms stayed equal, no node could be elected Leader.)

The voteFor field of the candidate node is set to itself.

The state is set to CANDIDATE.

  • The candidate node sends an HTTP POST request to /v1/ns/raft/vote on every Raft node except itself:

The content of the request is vote: JSON.toJSONString(local)

  • The candidate node receives the RaftPeer data returned by each of the other nodes and hands it to the PeerSet.decideLeader() method.

The RaftPeer that more than half of the voteFor fields point to is set as the Leader.

        public void sendVote() {

            RaftPeer local = peers.get(NetUtils.localServer());
            Loggers.RAFT.info("leader timeout, start voting,leader: {}, term: {}",
                JSON.toJSONString(getLeader()), local.term);

            //Reset Raft Cluster Data
            peers.reset();

            //Update candidate node data
            local.term.incrementAndGet();
            local.voteFor = local.ip;
            local.state = RaftPeer.State.CANDIDATE;


            // The candidate sends an HTTP POST request to /v1/ns/raft/vote on every Raft node except itself
            // The content of the request is vote: JSON.toJSONString(local)
            Map<String, String> params = new HashMap<String, String>(1);
            params.put("vote", JSON.toJSONString(local));
            for (final String server : peers.allServersWithoutMySelf()) {
                final String url = buildURL(server, API_VOTE);
                try {
                    HttpClient.asyncHttpPost(url, null, params, new AsyncCompletionHandler<Integer>() {
                        @Override
                        public Integer onCompleted(Response response) throws Exception {
                            if (response.getStatusCode() != HttpURLConnection.HTTP_OK) {
                                Loggers.RAFT.error("NACOS-RAFT vote failed: {}, url: {}", response.getResponseBody(), url);
                                return 1;
                            }

                            RaftPeer peer = JSON.parseObject(response.getResponseBody(), RaftPeer.class);

                            Loggers.RAFT.info("received approve from peer: {}", JSON.toJSONString(peer));

                            // Hand the RaftPeer returned by the other node to PeerSet.decideLeader()
                            peers.decideLeader(peer);

                            return 0;
                        }
                    });
                } catch (Exception e) {
                    Loggers.RAFT.warn("error while sending vote to server: {}", server);
                }
            }
        }
    }

(1) RaftCommands.vote() handles /v1/ns/raft/vote requests

The HTTP interface for election requests:

@RestController
@RequestMapping(UtilsAndCommons.NACOS_NAMING_CONTEXT + "/raft")
public class RaftController {
 
    ......
 
    @NeedAuth
    @RequestMapping(value = "/vote", method = RequestMethod.POST)
    public JSONObject vote(HttpServletRequest request, HttpServletResponse response) throws Exception {
        // Processing Election Requests
        RaftPeer peer = raftCore.receivedVote(
            JSON.parseObject(WebUtils.required(request, "vote"), RaftPeer.class));
 
        return JSON.parseObject(JSON.toJSONString(peer));
    }
 
 
    ......
}

The controller calls the RaftCore.receivedVote() method.

If the term of the received candidate node is less than or equal to the term of the local node, then:

If the local node has not voted yet, its voteFor is set to itself (meaning: I am better suited to be the leader, so I vote for myself).

Otherwise:

The local node becomes a Follower and resets its election timeout;

Its voteFor is set to the ip of the received candidate node (meaning: as you wish, this vote is cast for you);

Its term is set to the term of the received candidate node.

In both cases the local node is returned as the HTTP response.

@Component
public class RaftCore {
 
    ......
 
    public RaftPeer receivedVote(RaftPeer remote) {
        if (!peers.contains(remote)) {
            throw new IllegalStateException("can not find peer: " + remote.ip);
        }
 
        // If the term of the current node is greater than or equal to the term of the node sending the election request, reject it and vote for itself (only if it has not voted yet).
        RaftPeer local = peers.get(NetUtils.localServer());
        if (remote.term.get() <= local.term.get()) {
            String msg = "received illegitimate vote" +
                ", voter-term:" + remote.term + ", votee-term:" + local.term;
 
            Loggers.RAFT.info(msg);
            if (StringUtils.isEmpty(local.voteFor)) {
                local.voteFor = local.ip;
            }
 
            return local;
        }
 
        local.resetLeaderDue();
 
        // If the term of the current node is less than the term of the node sending the request, the node sending the request is chosen as leader.
        local.state = RaftPeer.State.FOLLOWER;
        local.voteFor = remote.ip;
        local.term.set(remote.term.get());
 
        Loggers.RAFT.info("vote {} as leader, term: {}", remote.ip, remote.term);
 
        return local;
    }
}
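
The decision rule in receivedVote() is easy to try out in isolation. The following stripped-down model keeps only the term comparison and the voteFor update; VoteRule and its nested Peer class are hypothetical names, and logging, peer lookup and the timeout reset are deliberately left out.

```java
/** Hypothetical, stripped-down model of the decision made in receivedVote(). */
final class VoteRule {
    static final class Peer {
        final String ip;
        long term;
        String voteFor;

        Peer(String ip, long term) {
            this.ip = ip;
            this.term = term;
        }
    }

    /** Applies a vote request from remote to local and returns the ip that local now votes for. */
    static String receive(Peer local, Peer remote) {
        if (remote.term <= local.term) {
            // Stale or equal term: refuse, and vote for ourselves if no vote has been cast yet.
            if (local.voteFor == null || local.voteFor.isEmpty()) {
                local.voteFor = local.ip;
            }
            return local.voteFor;
        }
        // Newer term: adopt it and vote for the candidate.
        local.voteFor = remote.ip;
        local.term = remote.term;
        return local.voteFor;
    }
}
```

Note that a candidate whose term is merely equal to the local term is refused. Only a strictly larger term wins the vote, which is why sendVote() increments the candidate's term before asking.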

(2) PeerSet.decideLeader() Election

@Component
@DependsOn("serverListManager")
public class RaftPeerSet implements ServerChangeListener {
 
    ......
 
    public RaftPeer decideLeader(RaftPeer candidate) {
        peers.put(candidate.ip, candidate);
 
        SortedBag ips = new TreeBag();
        int maxApproveCount = 0;
        String maxApprovePeer = null;
        // Count the non-empty voteFor values of all peers, tracking the peer with the most votes and its vote count.
        for (RaftPeer peer : peers.values()) {
            if (StringUtils.isEmpty(peer.voteFor)) {
                continue;
            }
 
            ips.add(peer.voteFor);
            if (ips.getCount(peer.voteFor) > maxApproveCount) {
                maxApproveCount = ips.getCount(peer.voteFor);
                maxApprovePeer = peer.voteFor;
            }
        }
 
        // Set the elected node to leader
        if (maxApproveCount >= majorityCount()) {
            RaftPeer peer = peers.get(maxApprovePeer);
            peer.state = RaftPeer.State.LEADER;
 
            if (!Objects.equals(leader, peer)) {
                leader = peer;
                Loggers.RAFT.info("{} has become the LEADER", leader.ip);
            }
        }
 
        return leader;
    }
}
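
The counting step can also be written with plain JDK collections. The sketch below is not the Nacos code: VoteTally is a hypothetical class, and it assumes that majorityCount() means "more than half of the cluster", as stated in the description of sendVote() above.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical vote tally in the spirit of decideLeader(), using only the JDK. */
final class VoteTally {
    /** Smallest number of votes that is more than half of clusterSize. */
    static int majority(int clusterSize) {
        return clusterSize / 2 + 1;
    }

    /**
     * Returns the ip that holds a majority of the votes, or null if nobody has one yet.
     * votes holds one voteFor value per peer; null or empty means "has not voted".
     */
    static String winner(List<String> votes, int clusterSize) {
        Map<String, Integer> counts = new HashMap<>();
        for (String voteFor : votes) {
            if (voteFor == null || voteFor.isEmpty()) {
                continue;
            }
            int count = counts.merge(voteFor, 1, Integer::sum);
            if (count >= majority(clusterSize)) {
                return voteFor;
            }
        }
        return null;
    }
}
```

In a three-node cluster two matching votes are enough, so a candidate that votes for itself needs only one approval from another node.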

4. Raft heartbeat

GlobalExecutor.register(new HeartBeat()) registers heartbeat timing tasks

  • Reset the heartbeat timeout and election timeout of the Leader node;
  • Call sendBeat() to send the heartbeat packets
public class HeartBeat implements Runnable {
    @Override
    public void run() {
        try {
            if (!peers.isReady()) {
                return;
            }
 
            RaftPeer local = peers.local();
            // heartbeatDueMs defaults to 5 s and TICK_PERIOD_MS to 500 ms: the task checks every 500 ms and sends a heartbeat every 5 s.
            local.heartbeatDueMs -= GlobalExecutor.TICK_PERIOD_MS;
            if (local.heartbeatDueMs > 0) {
                return;
            }
 
            // Reset heartbeatDueMs
            local.resetHeartbeatDue();
 
            // Send the heartbeat packet
            sendBeat();
        } catch (Exception e) {
            Loggers.RAFT.warn("[RAFT] error while sending beat {}", e);
        }
    }
}

HeartBeat.sendBeat() Sends Heartbeat Packets

  • Reset the heartbeat timeout and election timeout of the Leader node;
  • Send an HTTP POST request to the /v1/ns/raft/beat path of every node except itself. The request is built as follows:

JSONObject packet = new JSONObject();

packet.put("peer", local); //local is the RaftPeer object corresponding to the Leader node

packet.put("datums", array); //array holds the keys and timestamps of all Datums in RaftCore

Map<String, String> params = new HashMap<String, String>(1);

params.put("beat", JSON.toJSONString(packet));

  • Read the HTTP response returned by each node, which is a RaftPeer object, and update the Map<String, RaftPeer> peers collection of PeerSet (this keeps the cluster node data consistent).
    public void sendBeat() throws IOException, InterruptedException {
        RaftPeer local = peers.local();
        // Only leader sends heartbeat
        if (local.state != RaftPeer.State.LEADER && !STANDALONE_MODE) {
            return;
        }
 
        Loggers.RAFT.info("[RAFT] send beat with {} keys.", datums.size());
 
        // Reset the leader's own election timeout so that it does not start an election itself
        local.resetLeaderDue();
 
        // Build the heartbeat packet; the local (Leader) node information is stored under the key "peer"
        JSONObject packet = new JSONObject();
        packet.put("peer", local);
 
        JSONArray array = new JSONArray();
 
        // Send only heartbeat packets without data.
        if (switchDomain.isSendBeatOnly()) {
            Loggers.RAFT.info("[SEND-BEAT-ONLY] {}", String.valueOf(switchDomain.isSendBeatOnly()));
        }
 
        // Send the datum keys to the followers in the heartbeat packet
        if (!switchDomain.isSendBeatOnly()) {
            for (Datum datum : datums.values()) {
                JSONObject element = new JSONObject();
 
                // Put each key and its timestamp (used as the version) into element, then add element to array
                if (KeyBuilder.matchServiceMetaKey(datum.key)) {
                    element.put("key", KeyBuilder.briefServiceMetaKey(datum.key));
                } else if (KeyBuilder.matchInstanceListKey(datum.key)) {
                    element.put("key", KeyBuilder.briefInstanceListkey(datum.key));
                }
                element.put("timestamp", datum.timestamp);
 
                array.add(element);
            }
        } else {
            Loggers.RAFT.info("[RAFT] send beat only.");
        }
 
        // Put the array of all keys into the packet
        packet.put("datums", array);
         
        // Convert the packet to a JSON string and put it into params
        Map<String, String> params = new HashMap<String, String>(1);
        params.put("beat", JSON.toJSONString(packet));
 
        String content = JSON.toJSONString(params);
 
        // Compression with gzip
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        GZIPOutputStream gzip = new GZIPOutputStream(out);
        gzip.write(content.getBytes("UTF-8"));
        gzip.close();
 
        byte[] compressedBytes = out.toByteArray();
        String compressedContent = new String(compressedBytes, "UTF-8");
        Loggers.RAFT.info("raw beat data size: {}, size of compressed data: {}",
            content.length(), compressedContent.length());
 
        // Send the heartbeat packet to all followers
        for (final String server : peers.allServersWithoutMySelf()) {
            try {
                final String url = buildURL(server, API_BEAT);
                Loggers.RAFT.info("send beat to server " + server);
                HttpClient.asyncHttpPostLarge(url, null, compressedBytes, new AsyncCompletionHandler<Integer>() {
                    @Override
                    public Integer onCompleted(Response response) throws Exception {
                        if (response.getStatusCode() != HttpURLConnection.HTTP_OK) {
                            Loggers.RAFT.error("NACOS-RAFT beat failed: {}, peer: {}",
                                response.getResponseBody(), server);
                            MetricsMonitor.getLeaderSendBeatFailedException().increment();
                            return 1;
                        }
                        peers.update(JSON.parseObject(response.getResponseBody(), RaftPeer.class));
                        Loggers.RAFT.info("receive beat response from: {}", url);
                        return 0;
                    }
 
                    @Override
                    public void onThrowable(Throwable t) {
                        Loggers.RAFT.error("NACOS-RAFT error while sending heart-beat to peer: {} {}", server, t);
                        MetricsMonitor.getLeaderSendBeatFailedException().increment();
                    }
                });
            } catch (Exception e) {
                Loggers.RAFT.error("error while sending heart-beat to peer: {} {}", server, e);
                MetricsMonitor.getLeaderSendBeatFailedException().increment();
            }
        }
    }
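
The gzip step in sendBeat(), together with the matching decompression on the receiving side, can be reproduced with the JDK alone. The following is a minimal round-trip sketch, not the Nacos utility code: BeatCodec is a hypothetical helper, and the payload used to try it out is arbitrary.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

/** Hypothetical helper showing a gzip round trip for a beat payload. */
final class BeatCodec {
    /** Compresses the UTF-8 bytes of content with gzip. */
    static byte[] compress(String content) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(out)) {
            gzip.write(content.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    /** Reverses compress(): gunzips the bytes and decodes them as UTF-8. */
    static String decompress(byte[] compressed) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPInputStream gzip = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            byte[] buffer = new byte[4096];
            int n;
            while ((n = gzip.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }
}
```

In this sketch the compressed size would be read from the length of the byte array. Decoding gzip output as a UTF-8 String, as the logging statement in the listing does, is not reversible and counts characters rather than bytes, so it is only suitable for a rough log message.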

(1) RaftCommands.beat() handles /v1/ns/raft/beat requests

The HTTP interface for receiving heartbeat packets:

@RestController
@RequestMapping(UtilsAndCommons.NACOS_NAMING_CONTEXT + "/raft")
public class RaftController {
 
    ......
 
    @NeedAuth
    @RequestMapping(value = "/beat", method = RequestMethod.POST)
    public JSONObject beat(HttpServletRequest request, HttpServletResponse response) throws Exception {
        String entity = new String(IoUtils.tryDecompress(request.getInputStream()), "UTF-8");
        String value = URLDecoder.decode(entity, "UTF-8");
        value = URLDecoder.decode(value, "UTF-8");
 
        // Parse the heartbeat packet
        JSONObject json = JSON.parseObject(value);
        JSONObject beat = JSON.parseObject(json.getString("beat"));
 
        // Processing heartbeat packets and returning information from this node as response
        RaftPeer peer = raftCore.receivedBeat(beat);
        return JSON.parseObject(JSON.toJSONString(peer));
    }
 
    ......
}

RaftCore.receivedBeat() handles heartbeat packets

  • If the sender is not in the LEADER state, or the local term is larger than the sender's term, the heartbeat is rejected with an IllegalArgumentException.
  • If the node receiving the heartbeat is not in the Follower role, it is set to Follower and its voteFor is set to the ip of the Leader node.
  • Reset the heartbeat timeout and election timeout of the local node;
  • Call PeerSet.makeLeader() so that this node updates its Leader (that is, the Leader node tells the other nodes who the Leader is through the heartbeat).
  • Check the Datums:

Traverse the datums in the request parameters and collect a datumKey if the Follower does not have it or if its local timestamp is older.

Each time 50 datum keys have been collected (or the end of the list has been reached), a request is sent to the /v1/ns/raft/get path of the Leader node. The request parameter is the batch of datum keys, and the response contains the latest Datum object for each of them.

For each of these Datum objects, the follower then does much the same as the RaftCore.onPublish() method:
1. Call RaftStore.write() to serialize the Datum to JSON and write it to the cacheFile
2. Store the Datum in RaftCore's datums collection, keyed by the Datum's key
3. Reset the election timeout of the local node
4. Update the term of the local node
5. Persist the term of the local node to the properties file
6. Call notifier.addTask(datum, Notifier.ApplyAction.CHANGE) to notify the corresponding RaftListener

RaftCore.deleteDatum(String key) is used to delete stale Datums:
Delete the Datum corresponding to the key from the datums collection;
RaftStore.delete() deletes the Datum file on disk;
notifier.addTask(deleted, Notifier.ApplyAction.DELETE) notifies the corresponding RaftListener of a DELETE event.

  • The RaftPeer of the local node is returned as the HTTP response.
@Component
public class RaftCore {
 
    ......
 
    public RaftPeer receivedBeat(JSONObject beat) throws Exception {
        final RaftPeer local = peers.local();
        // Parsing node information for sending heartbeat packets
        final RaftPeer remote = new RaftPeer();
        remote.ip = beat.getJSONObject("peer").getString("ip");
        remote.state = RaftPeer.State.valueOf(beat.getJSONObject("peer").getString("state"));
        remote.term.set(beat.getJSONObject("peer").getLongValue("term"));
        remote.heartbeatDueMs = beat.getJSONObject("peer").getLongValue("heartbeatDueMs");
        remote.leaderDueMs = beat.getJSONObject("peer").getLongValue("leaderDueMs");
        remote.voteFor = beat.getJSONObject("peer").getString("voteFor");
 
        // If the heartbeat packet received is not sent by the leader node, an exception is thrown
        if (remote.state != RaftPeer.State.LEADER) {
            Loggers.RAFT.info("[RAFT] invalid state from master, state: {}, remote peer: {}",
                remote.state, JSON.toJSONString(remote));
            throw new IllegalArgumentException("invalid state from master, state: " + remote.state);
        }
 
        // If the local term is larger than the term of the heartbeat packet, the heartbeat packet is not processed
        if (local.term.get() > remote.term.get()) {
            Loggers.RAFT.info("[RAFT] out of date beat, beat-from-term: {}, beat-to-term: {}, remote peer: {}, and leaderDueMs: {}"
                , remote.term.get(), local.term.get(), JSON.toJSONString(remote), local.leaderDueMs);
            throw new IllegalArgumentException("out of date beat, beat-from-term: " + remote.term.get()
                + ", beat-to-term: " + local.term.get());
        }
 
        // If the current node is not a follower node, it is updated to a follower node
        if (local.state != RaftPeer.State.FOLLOWER) {
            Loggers.RAFT.info("[RAFT] make remote as leader, remote peer: {}", JSON.toJSONString(remote));
            // mk follower
            local.state = RaftPeer.State.FOLLOWER;
            local.voteFor = remote.ip;
        }
 
        final JSONArray beatDatums = beat.getJSONArray("datums");
        // Reset the election timeout and the heartbeat timeout of the local node
        local.resetLeaderDue();
        local.resetHeartbeatDue();
 
        // Update the leader information, set remote to the new leader, update the node information of the original leader
        peers.makeLeader(remote);
 
        // Store the keys of the current node in a map with value 0 (meaning: not yet seen in this beat)
        Map<String, Integer> receivedKeysMap = new HashMap<String, Integer>(datums.size());
        for (Map.Entry<String, Datum> entry : datums.entrySet()) {
            receivedKeysMap.put(entry.getKey(), 0);
        }
 
        // Check the received datum list
        List<String> batch = new ArrayList<String>();
        if (!switchDomain.isSendBeatOnly()) {
            int processedCount = 0;
            Loggers.RAFT.info("[RAFT] received beat with {} keys, RaftCore.datums' size is {}, remote server: {}, term: {}, local term: {}",
                beatDatums.size(), datums.size(), remote.ip, remote.term, local.term);
            for (Object object : beatDatums) {
                processedCount = processedCount + 1;
 
                JSONObject entry = (JSONObject) object;
                String key = entry.getString("key");
                final String datumKey;
                // Build a datumKey (with a prefix, which is removed when the key is sent)
                if (KeyBuilder.matchServiceMetaKey(key)) {
                    datumKey = KeyBuilder.detailServiceMetaKey(key);
                } else if (KeyBuilder.matchInstanceListKey(key)) {
                    datumKey = KeyBuilder.detailInstanceListkey(key);
                } else {
                    // ignore corrupted key:
                    continue;
                }
 
                // Get the corresponding version of the received key
                long timestamp = entry.getLong("timestamp");
 
                // Mark the received key as 1 in the map of the local key
                receivedKeysMap.put(datumKey, 1);
 
                try {
                    // If the key exists locally, the local version is at least as new as the received one, and there are still entries left to process, skip it
                    if (datums.containsKey(datumKey) && datums.get(datumKey).timestamp.get() >= timestamp && processedCount < beatDatums.size()) {
                        continue;
                    }
 
                    // If the received key is not available locally, or the local version is smaller than the received version, put it in batch and prepare for the next step to get the data.
                    if (!(datums.containsKey(datumKey) && datums.get(datumKey).timestamp.get() >= timestamp)) {
                        batch.add(datumKey);
                    }
 
                    // Fetch the data only when the batch has reached 50 keys or the last entry has been processed
                    if (batch.size() < 50 && processedCount < beatDatums.size()) {
                        continue;
                    }
 
                    String keys = StringUtils.join(batch, ",");
 
                    if (batch.size() <= 0) {
                        continue;
                    }
 
                    Loggers.RAFT.info("get datums from leader: {}, batch size is {}, processedCount is {}, datums' size is {}, RaftCore.datums' size is {}"
                        , getLeader().ip, batch.size(), processedCount, beatDatums.size(), datums.size());
 
                    // Get the data for the corresponding key
                    // update datum entry
                    String url = buildURL(remote.ip, API_GET) + "?keys=" + URLEncoder.encode(keys, "UTF-8");
                    HttpClient.asyncHttpGet(url, null, null, new AsyncCompletionHandler<Integer>() {
                        @Override
                        public Integer onCompleted(Response response) throws Exception {
                            if (response.getStatusCode() != HttpURLConnection.HTTP_OK) {
                                return 1;
                            }
 
                            List<Datum> datumList = JSON.parseObject(response.getResponseBody(), new TypeReference<List<Datum>>() {
                            });
 
                            // Update local data
                            for (Datum datum : datumList) {
                                OPERATE_LOCK.lock();
                                try {
                                    Datum oldDatum = getDatum(datum.key);
 
                                    if (oldDatum != null && datum.timestamp.get() <= oldDatum.timestamp.get()) {
                                        Loggers.RAFT.info("[NACOS-RAFT] timestamp is smaller than that of mine, key: {}, remote: {}, local: {}",
                                            datum.key, datum.timestamp, oldDatum.timestamp);
                                        continue;
                                    }
 
                                    raftStore.write(datum);
 
                                    if (KeyBuilder.matchServiceMetaKey(datum.key)) {
                                        Datum<Service> serviceDatum = new Datum<>();
                                        serviceDatum.key = datum.key;
                                        serviceDatum.timestamp.set(datum.timestamp.get());
                                        serviceDatum.value = JSON.parseObject(JSON.toJSONString(datum.value), Service.class);
                                        datum = serviceDatum;
                                    }
 
                                    if (KeyBuilder.matchInstanceListKey(datum.key)) {
                                        Datum<Instances> instancesDatum = new Datum<>();
                                        instancesDatum.key = datum.key;
                                        instancesDatum.timestamp.set(datum.timestamp.get());
                                        instancesDatum.value = JSON.parseObject(JSON.toJSONString(datum.value), Instances.class);
                                        datum = instancesDatum;
                                    }
 
                                    datums.put(datum.key, datum);
                                    notifier.addTask(datum.key, ApplyAction.CHANGE);
 
                                    local.resetLeaderDue();
 
                                    if (local.term.get() + 100 > remote.term.get()) {
                                        getLeader().term.set(remote.term.get());
                                        local.term.set(getLeader().term.get());
                                    } else {
                                        local.term.addAndGet(100);
                                    }
 
                                    raftStore.updateTerm(local.term.get());
 
                                    Loggers.RAFT.info("data updated, key: {}, timestamp: {}, from {}, local term: {}",
                                        datum.key, datum.timestamp, JSON.toJSONString(remote), local.term);
 
                                } catch (Throwable e) {
                                    Loggers.RAFT.error("[RAFT-BEAT] failed to sync datum from leader, key: {} {}", datum.key, e);
                                } finally {
                                    OPERATE_LOCK.unlock();
                                }
                            }
                            TimeUnit.MILLISECONDS.sleep(200);
                            return 0;
                        }
                    });
 
                    batch.clear();
                } catch (Exception e) {
                    Loggers.RAFT.error("[NACOS-RAFT] failed to handle beat entry, key: {}", datumKey);
                }
            }
 
            // If a key exists locally but is not in the list of keys received from the leader, the leader has deleted it, so the local copy must be deleted as well.
            List<String> deadKeys = new ArrayList<String>();
            for (Map.Entry<String, Integer> entry : receivedKeysMap.entrySet()) {
                if (entry.getValue() == 0) {
                    deadKeys.add(entry.getKey());
                }
            }
 
            for (String deadKey : deadKeys) {
                try {
                    deleteDatum(deadKey);
                } catch (Exception e) {
                    Loggers.RAFT.error("[NACOS-RAFT] failed to remove entry, key={} {}", deadKey, e);
                }
            }
        }
 
        return local;
    }
}
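
The dead-key cleanup at the end of the beat handler amounts to a mark-and-sweep over the local keys. Below is a minimal sketch of that idea, assuming every local key is marked 0 up front and flipped to 1 when the leader's beat mentions it (which is what the check for 0 above suggests). The class name DeadKeySweepSketch, the method findDeadKeys and the sample keys are hypothetical and are not part of Nacos.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

class DeadKeySweepSketch {

    // Returns the local keys that the leader's beat did not mention.
    static List<String> findDeadKeys(Set<String> localKeys, List<String> keysInBeat) {
        // Mark phase: every local key starts as 0 ("not seen in this beat").
        Map<String, Integer> receivedKeysMap = new HashMap<>();
        for (String key : localKeys) {
            receivedKeysMap.put(key, 0);
        }
        // Keys carried by the beat are set to 1 ("still present on the leader").
        for (String key : keysInBeat) {
            receivedKeysMap.put(key, 1);
        }
        // Sweep phase: keys still at 0 no longer exist on the leader.
        List<String> deadKeys = new ArrayList<>();
        for (Map.Entry<String, Integer> entry : receivedKeysMap.entrySet()) {
            if (entry.getValue() == 0) {
                deadKeys.add(entry.getKey());
            }
        }
        return deadKeys;
    }

    public static void main(String[] args) {
        Set<String> local = new TreeSet<>(Arrays.asList("svc-a", "svc-b", "svc-c"));
        List<String> fromLeader = Arrays.asList("svc-a", "svc-c");
        // svc-b is the only local key missing from the beat.
        System.out.println(findDeadKeys(local, fromLeader));
    }
}
```

In the real handler each dead key is then passed to deleteDatum(), and a failure for one key is logged without stopping the loop.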

5. Raft publishes content

Registration Entry

HTTP interface for registration

@RestController
@RequestMapping(UtilsAndCommons.NACOS_NAMING_CONTEXT + "/instance")
public class InstanceController {
 
    ......
 
    @CanDistro
    @RequestMapping(value = "", method = RequestMethod.POST)
    public String register(HttpServletRequest request) throws Exception {
        // Get namespace and serviceName
        String serviceName = WebUtils.required(request, CommonParams.SERVICE_NAME);
        String namespaceId = WebUtils.optional(request, CommonParams.NAMESPACE_ID, Constants.DEFAULT_NAMESPACE_ID);
 
        // Execute registration logic
        serviceManager.registerInstance(namespaceId, serviceName, parseInstance(request));
        return "ok";
    }
}

Instance registration

@Component
@DependsOn("nacosApplicationContext")
public class ServiceManager implements RecordListener<Service> {
 
    ......
 
    private Map<String, Map<String, Service>> serviceMap = new ConcurrentHashMap<>();
 
 
    ......
 
    // Register new instances
    public void registerInstance(String namespaceId, String serviceName, Instance instance) throws NacosException {
        // Create an empty service if needed. All services are stored in serviceMap, whose type is Map<String, Map<String, Service>>: the key of the outer map is the namespace and the key of the inner map is the service name.
        // Each service maintains a clusterMap, and two sets in the clusterMap are used to store the instances.
        if (ServerMode.AP.name().equals(switchDomain.getServerMode())) {
            createEmptyService(namespaceId, serviceName);
        }
 
        Service service = getService(namespaceId, serviceName);
 
        if (service == null) {
            throw new NacosException(NacosException.INVALID_PARAM,
                "service not found, namespace: " + namespaceId + ", service: " + serviceName);
        }
 
        // Check whether the instance already exists (compared by IP)
        if (service.allIPs().contains(instance)) {
            throw new NacosException(NacosException.INVALID_PARAM, "instance already exist: " + instance);
        }
 
        // Add a new instance
        addInstance(namespaceId, serviceName, instance.isEphemeral(), instance);
    }
 
 
    // Create an empty service
    public void createEmptyService(String namespaceId, String serviceName) throws NacosException {
        Service service = getService(namespaceId, serviceName);
        if (service == null) {
            service = new Service();
            service.setName(serviceName);
            service.setNamespaceId(namespaceId);
            service.setGroupName(Constants.DEFAULT_GROUP);
            // now validate the service. if failed, exception will be thrown
            service.setLastModifiedMillis(System.currentTimeMillis());
            service.recalculateChecksum();
            service.validate();
            putService(service);
            service.init();
            // Register the service as a listener so that data changes are synchronized to it
            consistencyService.listen(KeyBuilder.buildInstanceListKey(service.getNamespaceId(), service.getName(), true), service);
            consistencyService.listen(KeyBuilder.buildInstanceListKey(service.getNamespaceId(), service.getName(), false), service);
        }
    }
 
    // Add instance to the cache and persist
    public void addInstance(String namespaceId, String serviceName, boolean ephemeral, Instance... ips) throws NacosException {
        String key = KeyBuilder.buildInstanceListKey(namespaceId, serviceName, ephemeral);
 
        Service service = getService(namespaceId, serviceName);
 
        // Add instance to local cache
        List<Instance> instanceList = addIpAddresses(service, ephemeral, ips);
 
        Instances instances = new Instances();
        instances.setInstanceList(instanceList);
 
        // Persistence of instance information
        consistencyService.put(key, instances);
    }
 
    // Add instances to the cache
    public List<Instance> addIpAddresses(Service service, boolean ephemeral, Instance... ips) throws NacosException {
        return updateIpAddresses(service, UtilsAndCommons.UPDATE_INSTANCE_ACTION_ADD, ephemeral, ips);
    }
 
    // The actual logic for adding instances to (or removing them from) the cache
    public List<Instance> updateIpAddresses(Service service, String action, boolean ephemeral, Instance... ips) throws NacosException {
        Datum datum = consistencyService.get(KeyBuilder.buildInstanceListKey(service.getNamespaceId(), service.getName(), ephemeral));
 
        Map<String, Instance> oldInstanceMap = new HashMap<>(16);
        List<Instance> currentIPs = service.allIPs(ephemeral);
        Map<String, Instance> map = new ConcurrentHashMap<>(currentIPs.size());
 
        for (Instance instance : currentIPs) {
            map.put(instance.toIPAddr(), instance);
        }
        if (datum != null) {
            oldInstanceMap = setValid(((Instances) datum.value).getInstanceList(), map);
        }
 
        // use HashMap for deep copy:
        HashMap<String, Instance> instanceMap = new HashMap<>(oldInstanceMap.size());
        instanceMap.putAll(oldInstanceMap);
 
        for (Instance instance : ips) {
            if (!service.getClusterMap().containsKey(instance.getClusterName())) {
                Cluster cluster = new Cluster(instance.getClusterName());
                cluster.setService(service);
                service.getClusterMap().put(instance.getClusterName(), cluster);
                Loggers.SRV_LOG.warn("cluster: {} not found, ip: {}, will create new cluster with default configuration.",
                    instance.getClusterName(), instance.toJSON());
            }
 
            if (UtilsAndCommons.UPDATE_INSTANCE_ACTION_REMOVE.equals(action)) {
                instanceMap.remove(instance.getDatumKey());
            } else {
                instanceMap.put(instance.getDatumKey(), instance);
            }
        }
 
        if (instanceMap.size() <= 0 && UtilsAndCommons.UPDATE_INSTANCE_ACTION_ADD.equals(action)) {
            throw new IllegalArgumentException("ip list can not be empty, service: " + service.getName() + ", ip list: "
                + JSON.toJSONString(instanceMap.values()));
        }
 
        return new ArrayList<>(instanceMap.values());
    }
 
    // Merge the old instance list with the new instance
    private Map<String, Instance> setValid(List<Instance> oldInstances, Map<String, Instance> map) {
        Map<String, Instance> instanceMap = new HashMap<>(oldInstances.size());
        for (Instance instance : oldInstances) {
            Instance instance1 = map.get(instance.toIPAddr());
            if (instance1 != null) {
                instance.setHealthy(instance1.isHealthy());
                instance.setLastBeat(instance1.getLastBeat());
            }
            instanceMap.put(instance.getDatumKey(), instance);
        }
        return instanceMap;
    }
 
    ......
}
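
The merge performed by updateIpAddresses() and setValid() can be hard to follow in the full listing. The sketch below shows only its shape: start from the persisted list, copy the live health state onto it, then apply the add or remove. It is a simplified illustration, not the Nacos implementation: Inst is a hypothetical stand-in for Instance, a single "ip:port" key replaces both toIPAddr() and getDatumKey(), and cluster creation is left out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class InstanceMergeSketch {

    // Hypothetical stand-in for Instance: only the fields the merge touches.
    static class Inst {
        final String ip;
        final int port;
        boolean healthy;

        Inst(String ip, int port, boolean healthy) {
            this.ip = ip;
            this.port = port;
            this.healthy = healthy;
        }

        String key() {
            return ip + ":" + port;
        }
    }

    // persisted: list stored in the consistency service; live: instances currently in the cache.
    static List<Inst> merge(List<Inst> persisted, List<Inst> live, boolean remove, Inst... changes) {
        Map<String, Inst> liveByKey = new HashMap<>();
        for (Inst inst : live) {
            liveByKey.put(inst.key(), inst);
        }

        // Step 1 (setValid): keep the persisted entries, refreshed with the live health state.
        Map<String, Inst> merged = new HashMap<>();
        for (Inst old : persisted) {
            Inst current = liveByKey.get(old.key());
            if (current != null) {
                old.healthy = current.healthy;
            }
            merged.put(old.key(), old);
        }

        // Step 2: apply the requested add or remove on top of the merged view.
        for (Inst change : changes) {
            if (remove) {
                merged.remove(change.key());
            } else {
                merged.put(change.key(), change);
            }
        }
        return new ArrayList<>(merged.values());
    }

    public static void main(String[] args) {
        Inst persisted = new Inst("10.0.0.1", 80, true);
        Inst live = new Inst("10.0.0.1", 80, false);
        List<Inst> result = merge(Arrays.asList(persisted), Arrays.asList(live),
            false, new Inst("10.0.0.2", 80, true));
        for (Inst inst : result) {
            System.out.println(inst.key() + " healthy=" + inst.healthy);
        }
    }
}
```

The returned list is what addInstance() wraps in an Instances object and hands to consistencyService.put().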

Instance information persistence

The RaftConsistencyServiceImpl.put() method persists the instance information; this is the consistencyService.put(key, instances) call mentioned above.

(1)Service.put()

@Service
public class RaftConsistencyServiceImpl implements PersistentConsistencyService {
 
    ......
 
    @Override
    public void put(String key, Record value) throws NacosException {
        try {
            raftCore.signalPublish(key, value);
        } catch (Exception e) {
            Loggers.RAFT.error("Raft put failed.", e);
            throw new NacosException(NacosException.SERVER_ERROR, "Raft put failed, key:" + key + ", value:" + value);
        }
    }
}

This in turn calls the signalPublish() method of RaftCore:

(2)RaftCore.signalPublish()

@Component
public class RaftCore {
 
    ......
 
    public void signalPublish(String key, Record value) throws Exception {
        // If this node is not the leader, forward the request directly to the leader
        if (!isLeader()) {
            JSONObject params = new JSONObject();
            params.put("key", key);
            params.put("value", value);
            Map<String, String> parameters = new HashMap<>(1);
            parameters.put("key", key);
 
            // Call the /raft/datum interface
            raftProxy.proxyPostLarge(getLeader().ip, API_PUB, params.toJSONString(), parameters);
            return;
        }
 
        // If this node is the leader, send the data to all followers
        try {
            OPERATE_LOCK.lock();
            long start = System.currentTimeMillis();
            final Datum datum = new Datum();
            datum.key = key;
            datum.value = value;
            if (getDatum(key) == null) {
                datum.timestamp.set(1L);
            } else {
                datum.timestamp.set(getDatum(key).timestamp.incrementAndGet());
            }
 
            JSONObject json = new JSONObject();
            json.put("datum", datum);
            json.put("source", peers.local());
 
            // The local onPublish method is used to handle persistence logic
            onPublish(datum, peers.local());
 
            final String content = JSON.toJSONString(json);
 
            final CountDownLatch latch = new CountDownLatch(peers.majorityCount());
            // Send the data to all followers by calling the /raft/datum/commit interface
            for (final String server : peers.allServersIncludeMyself()) {
                if (isLeader(server)) {
                    latch.countDown();
                    continue;
                }
                final String url = buildURL(server, API_ON_PUB);
                HttpClient.asyncHttpPostLarge(url, Arrays.asList("key=" + key), content, new AsyncCompletionHandler<Integer>() {
                    @Override
                    public Integer onCompleted(Response response) throws Exception {
                        if (response.getStatusCode() != HttpURLConnection.HTTP_OK) {
                            Loggers.RAFT.warn("[RAFT] failed to publish data to peer, datumId={}, peer={}, http code={}",
                                datum.key, server, response.getStatusCode());
                            return 1;
                        }
                        latch.countDown();
                        return 0;
                    }
 
                    @Override
                    public STATE onContentWriteCompleted() {
                        return STATE.CONTINUE;
                    }
                });
            }
 
            if (!latch.await(UtilsAndCommons.RAFT_PUBLISH_TIMEOUT, TimeUnit.MILLISECONDS)) {
                // only majority servers return success can we consider this update success
                Loggers.RAFT.info("data publish failed, caused failed to notify majority, key={}", key);
                throw new IllegalStateException("data publish failed, caused failed to notify majority, key=" + key);
            }
 
            long end = System.currentTimeMillis();
            Loggers.RAFT.info("signalPublish cost {} ms, key: {}", (end - start), key);
        } finally {
            OPERATE_LOCK.unlock();
        }
    }
}
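
The key step in signalPublish() is waiting for a majority of acknowledgements with a CountDownLatch. The following is a minimal, self-contained sketch of that pattern. It assumes that peers.majorityCount() means "half the cluster plus one", which the source does not show. The names MajorityAckSketch and publish, and the Predicate used in place of the HTTP call, are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.Predicate;

class MajorityAckSketch {

    // Returns true once a majority of the cluster (leader included) has acknowledged.
    // 'send' stands in for the asynchronous POST to /raft/datum/commit.
    static boolean publish(List<String> followers, Predicate<String> send, long timeoutMs)
            throws InterruptedException {
        int clusterSize = followers.size() + 1;   // followers plus the leader itself
        int majority = clusterSize / 2 + 1;       // assumed meaning of majorityCount()
        CountDownLatch latch = new CountDownLatch(majority);
        latch.countDown();                        // the leader counts as one acknowledgement

        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, followers.size()));
        try {
            for (String follower : followers) {
                pool.execute(() -> {
                    // Only a successful response moves the latch; failures are simply not counted.
                    if (send.test(follower)) {
                        latch.countDown();
                    }
                });
            }
            // A timeout means the majority was not reached, so the publish is treated as failed.
            return latch.await(timeoutMs, TimeUnit.MILLISECONDS);
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        List<String> followers = Arrays.asList("10.0.0.2", "10.0.0.3");
        // Only one follower answers, but leader + one follower is already 2 of 3.
        boolean committed = publish(followers, peer -> peer.endsWith(".2"), 500);
        System.out.println("committed: " + committed);
    }
}
```

Note that in the listing above the leader has already applied the datum locally through onPublish() before it waits, so a timeout raises an exception but does not roll back the local write.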

(3)/raft/datum interface and /raft/datum/commit interface

@RestController
@RequestMapping(UtilsAndCommons.NACOS_NAMING_CONTEXT + "/raft")
public class RaftController {
 
    ......
 
    @NeedAuth
    @RequestMapping(value = "/datum", method = RequestMethod.POST)
    public String publish(HttpServletRequest request, HttpServletResponse response) throws Exception {
 
        response.setHeader("Content-Type", "application/json; charset=" + getAcceptEncoding(request));
        response.setHeader("Cache-Control", "no-cache");
        response.setHeader("Content-Encode", "gzip");
 
        String entity = IOUtils.toString(request.getInputStream(), "UTF-8");
        String value = URLDecoder.decode(entity, "UTF-8");
        JSONObject json = JSON.parseObject(value);
 
        // RaftConsistencyServiceImpl.put() is called here as well, so the request rejoins the service registration path and ends up in signalPublish() again, this time on the leader.
        String key = json.getString("key");
        if (KeyBuilder.matchInstanceListKey(key)) {
            raftConsistencyService.put(key, JSON.parseObject(json.getString("value"), Instances.class));
            return "ok";
        }
 
        if (KeyBuilder.matchSwitchKey(key)) {
            raftConsistencyService.put(key, JSON.parseObject(json.getString("value"), SwitchDomain.class));
            return "ok";
        }
 
        if (KeyBuilder.matchServiceMetaKey(key)) {
            raftConsistencyService.put(key, JSON.parseObject(json.getString("value"), Service.class));
            return "ok";
        }
 
        throw new NacosException(NacosException.INVALID_PARAM, "unknown type publish key: " + key);
    }
 
 
    @NeedAuth
    @RequestMapping(value = "/datum/commit", method = RequestMethod.POST)
    public String onPublish(HttpServletRequest request, HttpServletResponse response) throws Exception {
        response.setHeader("Content-Type", "application/json; charset=" + getAcceptEncoding(request));
        response.setHeader("Cache-Control", "no-cache");
        response.setHeader("Content-Encode", "gzip");
 
        String entity = IOUtils.toString(request.getInputStream(), "UTF-8");
        String value = URLDecoder.decode(entity, "UTF-8");
        JSONObject jsonObject = JSON.parseObject(value);
        String key = "key";
 
        RaftPeer source = JSON.parseObject(jsonObject.getString("source"), RaftPeer.class);
        JSONObject datumJson = jsonObject.getJSONObject("datum");
 
        Datum datum = null;
        if (KeyBuilder.matchInstanceListKey(datumJson.getString(key))) {
            datum = JSON.parseObject(jsonObject.getString("datum"), new TypeReference<Datum<Instances>>() {});
        } else if (KeyBuilder.matchSwitchKey(datumJson.getString(key))) {
            datum = JSON.parseObject(jsonObject.getString("datum"), new TypeReference<Datum<SwitchDomain>>() {});
        } else if (KeyBuilder.matchServiceMetaKey(datumJson.getString(key))) {
            datum = JSON.parseObject(jsonObject.getString("datum"), new TypeReference<Datum<Service>>() {});
        }
 
        // This call ultimately reaches RaftCore.onPublish()
        raftConsistencyService.onPut(datum, source);
        return "ok";
    }
 
    ......
}

Publish Entry RaftCommands.publish()

@Component
public class RaftCore {
 
    ......
 
    public void onPublish(Datum datum, RaftPeer source) throws Exception {
        RaftPeer local = peers.local();
        if (datum.value == null) {
            Loggers.RAFT.warn("received empty datum");
            throw new IllegalStateException("received empty datum");
        }
 
        // If the data was not published by the leader, throw an exception
        if (!peers.isLeader(source.ip)) {
            Loggers.RAFT.warn("peer {} tried to publish data but wasn't leader, leader: {}",
                JSON.toJSONString(source), JSON.toJSONString(getLeader()));
            throw new IllegalStateException("peer(" + source.ip + ") tried to publish " +
                "data but wasn't leader");
        }
 
        // The source term is smaller than the local current term and throws an exception
        if (source.term.get() < local.term.get()) {
            Loggers.RAFT.warn("out of date publish, pub-term: {}, cur-term: {}",
                JSON.toJSONString(source), JSON.toJSONString(local));
            throw new IllegalStateException("out of date publish, pub-term:"
                + source.term.get() + ", cur-term: " + local.term.get());
        }
 
        // Update election timeouts
        local.resetLeaderDue();
 
        // Node information persistence
        // if data should be persistent, usually this is always true:
        if (KeyBuilder.matchPersistentKey(datum.key)) {
            raftStore.write(datum);
        }
 
        // Add to Cache
        datums.put(datum.key, datum);
 
        // Update term information
        if (isLeader()) {
            local.term.addAndGet(PUBLISH_TERM_INCREASE_COUNT);
        } else {
            if (local.term.get() + PUBLISH_TERM_INCREASE_COUNT > source.term.get()) {
                //set leader term:
                getLeader().term.set(source.term.get());
                local.term.set(getLeader().term.get());
            } else {
                local.term.addAndGet(PUBLISH_TERM_INCREASE_COUNT);
            }
        }
        raftStore.updateTerm(local.term.get());
 
        // Notify the application node that information has changed
        notifier.addTask(datum.key, ApplyAction.CHANGE);
 
        Loggers.RAFT.info("data added/updated, key={}, term={}", datum.key, local.term);
    }
}
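
The term update at the end of onPublish() can be read as a small pure function. The sketch below isolates it, assuming PUBLISH_TERM_INCREASE_COUNT is 100 (the source only shows the literal 100 in the equivalent branch of the beat handler, so treat the value as an assumption). TermUpdateSketch and nextTerm are hypothetical names.

```java
class TermUpdateSketch {

    // Assumed value; the beat handler uses the literal 100 for the same step.
    static final long PUBLISH_TERM_INCREASE_COUNT = 100;

    // Computes the local term after a successful publish.
    static long nextTerm(boolean isLeader, long localTerm, long sourceTerm) {
        if (isLeader) {
            // The leader always advances its own term by a fixed step.
            return localTerm + PUBLISH_TERM_INCREASE_COUNT;
        }
        if (localTerm + PUBLISH_TERM_INCREASE_COUNT > sourceTerm) {
            // Within one step of the leader: adopt the leader's term directly.
            return sourceTerm;
        }
        // Further behind: advance by one step only.
        return localTerm + PUBLISH_TERM_INCREASE_COUNT;
    }

    public static void main(String[] args) {
        System.out.println(nextTerm(true, 500, 500));   // leader: 600
        System.out.println(nextTerm(false, 500, 550));  // follower close to the leader: 550
        System.out.println(nextTerm(false, 100, 900));  // follower far behind: 200
    }
}
```

Because onPublish() has already rejected any source whose term is lower than the local term, a follower that takes the second branch never moves its term backwards.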

6. Raft guarantees content consistency

Nacos publishes content through Raft. Publishing is performed only on the leader node, and consistency across the cluster is maintained through the Raft heartbeat mechanism.

During registration, the addInstance() method adds the instance to the local cache. When Raft synchronizes data from the leader to a follower, however, the follower receives the data and persists it in onPublish() without updating its local cache directly. The cache update is done through a listener instead.

At the end of onPublish() there is the line notifier.addTask(datum.key, ApplyAction.CHANGE), which adds the change to the notification task queue. The notification task is handled as follows:

@Component
public class RaftCore {
 
    ......
 
    public class Notifier implements Runnable {
        private ConcurrentHashMap<String, String> services = new ConcurrentHashMap<>(10 * 1024);
        private BlockingQueue<Pair> tasks = new LinkedBlockingQueue<Pair>(1024 * 1024);
 
        // Add change tasks to task queue
        public void addTask(String datumKey, ApplyAction action) {
 
            if (services.containsKey(datumKey) && action == ApplyAction.CHANGE) {
                return;
            }
            if (action == ApplyAction.CHANGE) {
                services.put(datumKey, StringUtils.EMPTY);
            }
            tasks.add(Pair.with(datumKey, action));
        }
 
        public int getTaskSize() {
            return tasks.size();
        }
 
        // Processing task threads
        @Override
        public void run() {
            Loggers.RAFT.info("raft notifier started");
 
            while (true) {
                try {
                    Pair pair = tasks.take();
 
                    if (pair == null) {
                        continue;
                    }
 
                    String datumKey = (String) pair.getValue0();
                    ApplyAction action = (ApplyAction) pair.getValue1();
 
                    // Delete the key from the service list
                    services.remove(datumKey);
 
                    int count = 0;
 
                    if (listeners.containsKey(KeyBuilder.SERVICE_META_KEY_PREFIX)) {
                        if (KeyBuilder.matchServiceMetaKey(datumKey) && !KeyBuilder.matchSwitchKey(datumKey)) {
                            for (RecordListener listener : listeners.get(KeyBuilder.SERVICE_META_KEY_PREFIX)) {
                                try {
                                    // Depending on the type of change, different callback methods are invoked to update the cache
                                    if (action == ApplyAction.CHANGE) {
                                        listener.onChange(datumKey, getDatum(datumKey).value);
                                    }
 
                                    if (action == ApplyAction.DELETE) {
                                        listener.onDelete(datumKey);
                                    }
                                } catch (Throwable e) {
                                    Loggers.RAFT.error("[NACOS-RAFT] error while notifying listener of key: {} {}", datumKey, e);
                                }
                            }
                        }
                    }
 
                    if (!listeners.containsKey(datumKey)) {
                        continue;
                    }
 
                    for (RecordListener listener : listeners.get(datumKey)) {
                        count++;
 
                        try {
                            if (action == ApplyAction.CHANGE) {
                                listener.onChange(datumKey, getDatum(datumKey).value);
                                continue;
                            }
 
                            if (action == ApplyAction.DELETE) {
                                listener.onDelete(datumKey);
                                continue;
                            }
                        } catch (Throwable e) {
                            Loggers.RAFT.error("[NACOS-RAFT] error while notifying listener of key: {} {}", datumKey, e);
                        }
                    }
 
                    if (Loggers.RAFT.isDebugEnabled()) {
                        Loggers.RAFT.debug("[NACOS-RAFT] datum change notified, key: {}, listener count: {}", datumKey, count);
                    }
                } catch (Throwable e) {
                    Loggers.RAFT.error("[NACOS-RAFT] Error while handling notifying task", e);
                }
            }
        }
    }
}
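
One detail of Notifier that is easy to miss is that CHANGE tasks are coalesced: while a CHANGE for a key is still waiting in the queue, further CHANGE tasks for the same key are dropped, because the listener reads the latest datum when it runs anyway. Below is a minimal sketch of that behavior. It is not the Nacos class: NotifierSketch, Task and take() are hypothetical, and putIfAbsent is used here in place of the separate containsKey/put calls in the listing above.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.LinkedBlockingQueue;

class NotifierSketch {

    enum Action { CHANGE, DELETE }

    static final class Task {
        final String key;
        final Action action;

        Task(String key, Action action) {
            this.key = key;
            this.action = action;
        }
    }

    // Keys that already have a CHANGE task waiting in the queue.
    private final ConcurrentHashMap<String, String> pending = new ConcurrentHashMap<>();
    private final BlockingQueue<Task> tasks = new LinkedBlockingQueue<>();

    void addTask(String key, Action action) {
        // A second CHANGE for the same key is redundant while the first is still queued.
        if (action == Action.CHANGE && pending.putIfAbsent(key, "") != null) {
            return;
        }
        tasks.add(new Task(key, action));
    }

    // Called by the worker thread; from this point on the key may be queued again.
    Task take() throws InterruptedException {
        Task task = tasks.take();
        pending.remove(task.key);
        return task;
    }

    int getTaskSize() {
        return tasks.size();
    }

    public static void main(String[] args) throws InterruptedException {
        NotifierSketch notifier = new NotifierSketch();
        notifier.addTask("svc-a", Action.CHANGE);
        notifier.addTask("svc-a", Action.CHANGE); // coalesced with the first one
        notifier.addTask("svc-a", Action.DELETE); // DELETE is never coalesced
        System.out.println("queued: " + notifier.getTaskSize());
        Task first = notifier.take();
        System.out.println(first.key + " " + first.action);
    }
}
```

In the real Notifier, the worker thread then looks up the listeners registered for the key and calls onChange() or onDelete(), which is how a follower's service cache catches up with the data persisted by onPublish().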
