skynet cluster learning
Before studying cluster, take a look at cluster1.lua and cluster2.lua in the examples directory. To make them easier to follow, I have modified the two examples as follows:
--cluster1.lua
local skynet = require "skynet"
local cluster = require "skynet.cluster"
local snax = require "skynet.snax"
require "skynet.manager"
skynet.start(function()
    cluster.reload {
        db = "127.0.0.1:2528",
        db2 = "127.0.0.1:2529",
    }
    local sdb = skynet.newservice("simpledb")
    skynet.name("sdb", sdb)
    print(skynet.call(sdb, "lua", "SET", "a", "foobar"))
    cluster.open "db"
    cluster.open "db2"
end)
--cluster2.lua
local skynet = require "skynet"
local cluster = require "skynet.cluster"
skynet.start(function()
    print(cluster.call("db", "sdb", "GET", "a"))
end)
Now let's analyze them in detail.
--cluster1.lua
cluster.reload {
    db = "127.0.0.1:2528",
    db2 = "127.0.0.1:2529",
}
--clusterd.lua
local function loadconfig(tmp)
    ...
    for name,address in pairs(tmp) do
        assert(address == false or type(address) == "string")
        if node_address[name] ~= address then
            -- address changed; rawget is used so the lookup does not trigger the metatable
            if rawget(node_channel, name) then
                node_channel[name] = nil -- reset connection
            end
            node_address[name] = address
        end
        ...
    end
end
function command.reload(source, config)
    loadconfig(config)
    skynet.ret(skynet.pack(nil))
end
cluster.reload mainly saves each node name and its address in the table node_address, which is used by subsequent remote requests such as cluster.send or cluster.call.
--cluster1.lua
local sdb = skynet.newservice("simpledb")
skynet.name("sdb", sdb)
--clusterd.lua
local register_name = {}
function command.register(source, name, addr)
    assert(register_name[name] == nil)
    addr = addr or source
    local old_name = register_name[addr]
    if old_name then
        register_name[old_name] = nil
    end
    register_name[addr] = name
    register_name[name] = addr
    skynet.ret(nil)
    skynet.error(string.format("Register [%s] :%08x", name, addr))
end
This creates a simpledb service and gives its address sdb the alias "sdb". Here I have made a slight modification: instead of the cluster.register("sdb", sdb) used in the original example, I call skynet.name("sdb", sdb), mainly so that cluster2.lua can reach the service remotely by node name + service name.
--cluster1.lua
cluster.open "db"
cluster.open "db2"
--cluster.lua
function cluster.open(port)
    if type(port) == "string" then
        skynet.call(clusterd, "lua", "listen", port)
    else
        skynet.call(clusterd, "lua", "listen", "0.0.0.0", port)
    end
end
--clusterd.lua
function command.listen(source, addr, port)
    local gate = skynet.newservice("gate")
    if port == nil then
        local address = assert(node_address[addr], addr .. " is down")
        addr, port = string.match(address, "([^:]+):(.*)$")
    end
    skynet.call(gate, "lua", "open", { address = addr, port = port })
    skynet.ret(skynet.pack(nil))
end
--gate.lua
...
gateserver.start(handler)
--gateserver.lua
function gateserver.start(handler)
    assert(handler.message)
    assert(handler.connect)
    function CMD.open( source, conf )
        ...
        skynet.error(string.format("Listen on %s:%d", address, port))
        socket = socketdriver.listen(address, port) -- listen on the IP address and port
        socketdriver.start(socket)
        if handler.open then
            return handler.open(source, conf)
        end
    end
    ...
Next come cluster.open "db" and cluster.open "db2". For each of them, command.listen looks up the address saved earlier in node_address, creates a gate service, and calls the gate's open method. Because the gate's message dispatch is written in gateserver.lua, skynet.call(gate, "lua", "open", { address = addr, port = port }) actually runs CMD.open in gateserver.lua. When the call completes, the gate is listening on that address and port.
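The address lookup in command.listen relies on splitting a "host:port" string. The following is a minimal C sketch of that split, written only for illustration: split_address is a hypothetical helper, not a skynet function, and it assumes a well-formed input with a single colon, which is the shape the Lua pattern "([^:]+):(.*)$" is applied to above.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not part of skynet): split "host:port" at the first
 * colon. Returns 0 on success, -1 if there is no colon, the host part is
 * empty, or the host does not fit into the caller's buffer. */
static int split_address(const char *address, char *host, size_t hostcap, int *port) {
    assert(address != NULL && host != NULL && port != NULL);
    const char *colon = strchr(address, ':');
    if (colon == NULL || colon == address) {
        return -1;
    }
    size_t hostlen = (size_t)(colon - address);
    if (hostlen + 1 > hostcap) {
        return -1;
    }
    memcpy(host, address, hostlen);
    host[hostlen] = '\0';
    /* Everything after the colon is the port; converted to a number here. */
    *port = atoi(colon + 1);
    return 0;
}
```

With the configuration from cluster1.lua, "127.0.0.1:2528" would split into the host "127.0.0.1" and the port 2528.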
Next, take a look at how cluster2.lua calls cluster1.lua remotely:
--cluster2.lua
print(cluster.call("db", "sdb", "GET", "a"))
The remote call here is also very simple. You only need to know the listening address of the cluster1.lua node and which services it provides (looked up by the alias set with skynet.name).
Next, let's look at how cluster.call("db", "sdb", "GET", "a") sends data to the sdb service on the cluster1 node.
--cluster.lua
function cluster.call(node, address, ...)
    return skynet.call(clusterd, "lua", "req", node, address, skynet.pack(...))
end
--clusterd.lua
local function send_request(source, node, addr, msg, sz)
    local session = node_session[node] or 1
    -- msg is a local pointer, cluster.packrequest will free it
    local request, new_session, padding = cluster.packrequest(addr, session, msg, sz)
    node_session[node] = new_session
    -- node_channel[node] may yield or throw error
    local c = node_channel[node]
    return c:request(request, session, padding)
end
function command.req(...)
    local ok, msg, sz = pcall(send_request, ...)
    if ok then
        if type(msg) == "table" then
            skynet.ret(cluster.concat(msg))
        else
            skynet.ret(msg)
        end
    else
        skynet.error(msg)
        skynet.response()(false)
    end
end
Here the user's data is first packed with skynet.pack(...). How skynet.pack packs data will be described in detail in a later chapter because space is limited. For now you only need to know that it returns a pointer msg (a userdata) and a length sz.
cluster.packrequest(addr, session, msg, sz) packs the msg produced by skynet.pack once more: it prepends the header information and copies the result into a new piece of memory.
//lua-cluster.c
// Macro definitions
#define TEMP_LENGTH 0x8200 // decimal 33280
#define MULTI_PART 0x8000 // decimal 32768
// Pack the session into 4 bytes of buf, low byte first
static void
fill_uint32(uint8_t * buf, uint32_t n) {
    buf[0] = n & 0xff;
    buf[1] = (n >> 8) & 0xff;
    buf[2] = (n >> 16) & 0xff;
    buf[3] = (n >> 24) & 0xff;
}
// Pack the message length into 2 bytes of buf, high byte first
static void
fill_header(lua_State *L, uint8_t *buf, int sz) {
    assert(sz < 0x10000);
    buf[0] = (sz >> 8) & 0xff; // shift sz right by 8 bits to get the high 8 bits
    buf[1] = sz & 0xff; // mask off the high bits to get the low 8 bits of sz
}
static int
packreq_string(lua_State *L, int session, void * msg, uint32_t sz, int is_push) {
    size_t namelen = 0;
    const char *name = lua_tolstring(L, 1, &namelen);
    if (name == NULL || namelen < 1 || namelen > 255) {
        skynet_free(msg);
        luaL_error(L, "name is too long %s", name);
    }
    uint8_t buf[TEMP_LENGTH];
    if (sz < MULTI_PART) {
        fill_header(L, buf, sz+6+namelen);
        buf[2] = 0x80;
        buf[3] = (uint8_t)namelen;
        memcpy(buf+4, name, namelen);
        fill_uint32(buf+4+namelen, is_push ? 0 : (uint32_t)session);
        memcpy(buf+8+namelen,msg,sz);
        lua_pushlstring(L, (const char *)buf, sz+8+namelen);
        return 0;
    } else {
        int part = (sz - 1) / MULTI_PART + 1;
        fill_header(L, buf, 10+namelen);
        buf[2] = is_push ? 0xc1 : 0x81; // multi push or request
        buf[3] = (uint8_t)namelen;
        memcpy(buf+4, name, namelen);
        fill_uint32(buf+4+namelen, (uint32_t)session);
        fill_uint32(buf+8+namelen, sz);
        lua_pushlstring(L, (const char *)buf, 12+namelen);
        return part;
    }
}
static int
packrequest(lua_State *L, int is_push) {
    void *msg = lua_touserdata(L,3);
    if (msg == NULL) {
        return luaL_error(L, "Invalid request message");
    }
    uint32_t sz = (uint32_t)luaL_checkinteger(L,4);
    int session = luaL_checkinteger(L,2);
    ...
    int addr_type = lua_type(L,1);
    int multipak;
    ...
    multipak = packreq_string(L, session, msg, sz, is_push);
    ...
    uint32_t new_session = (uint32_t)session + 1;
    ...
    lua_pushinteger(L, new_session);
    ...
    skynet_free(msg);
    return 2;
    ...
}
static int
lpackrequest(lua_State *L) {
    return packrequest(L, 0);
}
static int
lpackpush(lua_State *L) {
    return packrequest(L, 1);
}
The packrequest function checks whether addr is a number or a string and packs the request according to its type. Here we take a string addr as the example. It first fetches msg, the third argument of cluster.packrequest(addr, session, msg, sz), and raises an error if it is NULL. It then reads the fourth argument sz and the second argument session. The session is incremented by one to produce new_session, which is recorded for the next remote request to that node. This corresponds to the following line in the code above:
node_session[node] = new_session
packreq_string(L, session, msg, sz, is_push) checks the length sz. If sz is at least MULTI_PART (32K bytes), only a header is packed here and the body is sent in several pieces according to part. In that case, when is_push is 0 (lpackrequest), the RPC is a request + response exchange and buf[2] = 0x81; when is_push is 1 (lpackpush), the request is a push that needs no reply and buf[2] = 0xc1. If sz is less than 32K bytes, everything fits in one packet and the contents of buf are as follows:
- Bytes 0 ~ 1: length information (msg length sz + 6 + service name length namelen)
- Byte 2: type (0x80)
- Byte 3: length of the service name
- Bytes 4 ~ 4 + namelen - 1: service name
- Bytes 4 + namelen ~ 4 + namelen + 3 (4 bytes): session
- Bytes 8 + namelen onward: the contents of msg
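The layout listed above can be sketched as a stand-alone C function. This is only an illustration of the small-packet branch of packreq_string: pack_small_request is a hypothetical name, there is no Lua state, and the caller is assumed to supply an output buffer of at least sz + 8 + namelen bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MULTI_PART 0x8000

/* Hypothetical stand-alone version of the small-request branch of
 * packreq_string. Returns the number of bytes written: sz + 8 + namelen. */
static size_t pack_small_request(uint8_t *out, const char *name, uint32_t session,
                                 const void *msg, uint32_t sz) {
    size_t namelen = strlen(name);
    assert(namelen >= 1 && namelen <= 255);
    assert(sz < MULTI_PART);
    size_t body = sz + 6 + namelen;          /* everything after the 2-byte length */
    out[0] = (uint8_t)((body >> 8) & 0xff);  /* length, high byte first */
    out[1] = (uint8_t)(body & 0xff);
    out[2] = 0x80;                           /* type: string address, single packet */
    out[3] = (uint8_t)namelen;
    memcpy(out + 4, name, namelen);
    uint8_t *s = out + 4 + namelen;          /* session, low byte first */
    s[0] = (uint8_t)(session & 0xff);
    s[1] = (uint8_t)((session >> 8) & 0xff);
    s[2] = (uint8_t)((session >> 16) & 0xff);
    s[3] = (uint8_t)((session >> 24) & 0xff);
    memcpy(out + 8 + namelen, msg, sz);
    return (size_t)sz + 8 + namelen;
}
```

For the service name "sdb", session 1 and a 2-byte message, the packet is 13 bytes long and its length field holds 11, that is, 2 + 6 + 3.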
At this point the C-layer call is complete and returns buf and new_session (padding exists only when sz is at least 32K).
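To make the 32K threshold concrete, here is a small C sketch of the part count computed in packreq_string. multipart_count is a hypothetical helper that only repeats the arithmetic from the code above; it returns 0 for the single-packet case, as packreq_string does.

```c
#include <assert.h>
#include <stdint.h>

#define MULTI_PART 0x8000

/* Hypothetical helper: how many body parts a message of sz bytes needs.
 * Returns 0 when the message fits in a single packet (sz < MULTI_PART),
 * otherwise (sz - 1) / MULTI_PART + 1, as in packreq_string. */
static int multipart_count(uint32_t sz) {
    if (sz < MULTI_PART) {
        return 0;
    }
    return (int)((sz - 1) / MULTI_PART + 1);
}
```

A message of exactly 32768 bytes therefore takes the multi-part path with a single part, and one more byte pushes it to two parts.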
--clusterd.lua
local function open_channel(t, key)
    ...
    local address = node_address[key]
    ...
    if address then
        local host, port = string.match(address, "([^:]+):(.*)$")
        c = sc.channel {
            host = host,
            port = tonumber(port),
            response = read_response,
            nodelay = true,
        }
        succ, err = pcall(c.connect, c, true) -- initiate the remote connection
        if succ then
            t[key] = c
            ct.channel = c
        end
    else
        err = "cluster node [" .. key .. "] is down."
    end
    ...
    return c
end
-- node_channel's metatable uses open_channel as __index
local node_channel = setmetatable({}, { __index = open_channel })
local function send_request(source, node, addr, msg, sz)
    local session = node_session[node] or 1
    -- msg is a local pointer, cluster.packrequest will free it
    local request, new_session, padding = cluster.packrequest(addr, session, msg, sz)
    node_session[node] = new_session
    -- now look at the following two lines
    -- node_channel[node] may yield or throw error
    local c = node_channel[node]
    return c:request(request, session, padding)
end
By the time local c = node_channel[node] has executed, the remote connection has already been initiated. Why? Lua has the concept of a metatable: if a key cannot be found in a table and the table's metatable has an __index field, Lua consults __index. Here __index is open_channel, so open_channel is called and initiates the remote connection.
The function that sends the connection request is in socketchannel.lua:
--socketchannel.lua
local function connect_once(self)
    ...
    local fd,err = socket.open(self.__host, self.__port) -- calls the underlying C network API
    ...
end
local function try_connect(self , once)
    local t = 0
    while not self.__closed do
        local ok, err = connect_once(self)
        ...
    end
end
local function block_connect(self, once)
    ...
    if #self.__connecting > 0 then
        -- connecting in other coroutine
        local co = coroutine.running()
        table.insert(self.__connecting, co)
        skynet.wait(co)
    else
        self.__connecting[1] = true
        err = try_connect(self, once)
        self.__connecting[1] = nil
        for i=2, #self.__connecting do
            local co = self.__connecting[i]
            self.__connecting[i] = nil
            skynet.wakeup(co)
        end
    end
    ...
end
function channel:connect(once)
    ...
    return block_connect(self, once)
end
Next, let's look at the functions called by c:request(request, session, padding):
--socketchannel.lua
function channel:request(request, response, padding)
    -- connect once; the connection was already initiated earlier, so no new connection is made here
    assert(block_connect(self, true))
    local fd = self.__sock[1]
    if padding then
        -- padding may be a table, to support multi part request
        -- multi part request use low priority socket write
        -- now socket_lwrite returns as socket_write
        if not socket_lwrite(fd , request) then
            sock_err(self)
        end
        -- a packet of 32K or more is sent in several parts
        for _,v in ipairs(padding) do
            if not socket_lwrite(fd, v) then
                sock_err(self)
            end
        end
    else
        -- a packet smaller than 32K is sent in one write
        if not socket_write(fd , request) then
            sock_err(self)
        end
    end
    if response == nil then
        -- no response
        return
    end
    -- after sending, suspend the current coroutine and wait for the peer's response
    return wait_for_response(self, response)
end
Now let's see what wait_for_response() does.
--socketchannel.lua
local function wait_for_response(self, response)
    local co = coroutine.running()
    push_response(self, response, co)
    skynet.wait(co) -- suspend the current coroutine
    local result = self.__result[co] -- status stored for this coroutine
    self.__result[co] = nil
    local result_data = self.__result_data[co] -- data returned by the remote service; this is the result we want
    self.__result_data[co] = nil
    if result == socket_error then
        if result_data then
            error(result_data)
        else
            error(socket_error)
        end
    else
        assert(result, result_data)
        return result_data -- on success the data is returned, still packed
    end
end
So you may ask: since the coroutine is suspended, when is it woken up? Remember the open_channel(t, key) function mentioned earlier? It contains this code:
--clusterd.lua
c = sc.channel {
    host = host,
    port = tonumber(port),
    response = read_response, -- callback that reads the response
    nodelay = true,
}
This is what the read_response function is for. Let's take a closer look:
--clusterd.lua
local function read_response(sock)
    local sz = socket.header(sock:read(2)) -- blocking read of the 2-byte length header
    local msg = sock:read(sz) -- read the content
    return cluster.unpackresponse(msg) -- session, ok, data, padding (discussed later)
end
--socketchannel.lua
local function dispatch_by_session(self)
    local response = self.__response
    -- response() return session
    while self.__sock do
        -- response is the read_response function set earlier;
        -- this blocks until the callback returns a result
        local ok , session, result_ok, result_data, padding = pcall(response, self.__sock)
        -- result_data is the peer's response, as packed by skynet.pack
        if ok and session then
            local co = self.__thread[session]
            if co then
                if padding and result_ok then
                    -- If padding is true, append result_data to a table (self.__result_data[co])
                    local result = self.__result_data[co] or {}
                    self.__result_data[co] = result
                    table.insert(result, result_data)
                else
                    self.__thread[session] = nil
                    self.__result[co] = result_ok
                    if result_ok and self.__result_data[co] then
                        table.insert(self.__result_data[co], result_data)
                    else
                        self.__result_data[co] = result_data
                    end
                    skynet.wakeup(co) -- woken up here, so wait_for_response can continue
                end
            ...
        end
    end
    exit_thread(self)
end
local function dispatch_function(self)
    if self.__response then
        -- the channel has a response callback (clusterd sets read_response), so responses are matched by session
        return dispatch_by_session
    else
        -- no channel-level response callback: responses are matched in request order
        return dispatch_by_order
    end
end
local function connect_once(self)
    ...
    -- fork a coroutine, which runs in the next frame;
    -- this is the key entry point that waits for responses
    self.__dispatch_thread = skynet.fork(dispatch_function(self), self)
    ...
end
Another key C-layer call is involved here: cluster.unpackresponse(msg) inside read_response. Let's see what it does:
//lua-cluster.c
static int
lunpackresponse(lua_State *L) {
    size_t sz;
    const char * buf = luaL_checklstring(L, 1, &sz);
    if (sz < 5) {
        return 0;
    }
    uint32_t session = unpack_uint32((const uint8_t *)buf); // the session takes 4 bytes, mirroring the packing side
    lua_pushinteger(L, (lua_Integer)session); // push the session as the first return value
    switch(buf[4]) {
    case 0: // error
        lua_pushboolean(L, 0);
        lua_pushlstring(L, buf+5, sz-5);
        return 3;
    case 1: // ok
    case 4: // multi end
        lua_pushboolean(L, 1);
        lua_pushlstring(L, buf+5, sz-5);
        return 3;
    case 2: // multi begin
        if (sz != 9) {
            return 0;
        }
        sz = unpack_uint32((const uint8_t *)buf+5);
        lua_pushboolean(L, 1);
        lua_pushinteger(L, sz);
        lua_pushboolean(L, 1);
        return 4;
    case 3: // multi part
        lua_pushboolean(L, 1);
        lua_pushlstring(L, buf+5, sz-5);
        lua_pushboolean(L, 1);
        return 4;
    default:
        return 0;
    }
}
The lunpackresponse function unpacks the first layer of msg according to its header. The header consists of:
- Bytes 0 ~ 3: session
- Byte 4: type
- Byte 5 onward: the contents packed by skynet.pack
- padding: not a byte in the packet but the fourth return value; it is true only for multi-part responses (types 2 and 3), so it is absent when the response does not exceed 32K
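The response header listed above can be decoded without a Lua state, as in this C sketch. It is an illustration only: parse_response and struct response_view are hypothetical names, and the session is assumed to be stored low byte first, mirroring fill_uint32 on the packing side.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Response types, numbered as in the switch of lunpackresponse. */
enum {
    RESP_ERROR = 0,
    RESP_OK = 1,
    RESP_MULTI_BEGIN = 2,
    RESP_MULTI_PART = 3,
    RESP_MULTI_END = 4
};

/* Hypothetical view onto a response (the 2-byte length is already stripped). */
struct response_view {
    uint32_t session;
    int type;
    const uint8_t *payload;  /* points into the caller's buffer */
    size_t payload_sz;
    int padding;             /* 1 if more parts follow (types 2 and 3) */
};

/* Returns 0 on success, -1 if the buffer is too short or the type is unknown. */
static int parse_response(const uint8_t *buf, size_t sz, struct response_view *out) {
    assert(buf != NULL && out != NULL);
    if (sz < 5) {
        return -1;
    }
    out->session = (uint32_t)buf[0] | (uint32_t)buf[1] << 8 |
                   (uint32_t)buf[2] << 16 | (uint32_t)buf[3] << 24;
    out->type = buf[4];
    if (out->type > RESP_MULTI_END) {
        return -1;
    }
    if (out->type == RESP_MULTI_BEGIN && sz != 9) {
        return -1;           /* multi begin carries exactly a 4-byte total size */
    }
    out->payload = buf + 5;
    out->payload_sz = sz - 5;
    out->padding = (out->type == RESP_MULTI_BEGIN || out->type == RESP_MULTI_PART);
    return 0;
}
```

The padding flag here plays the role of the fourth return value of lunpackresponse: it tells the caller to keep collecting parts instead of waking the waiting coroutine.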
From here, we can follow the result as it is returned layer by layer:
--skynet.lua
function skynet.call(addr, typename, ...)
    local p = proto[typename]
    local session = c.send(addr, p.id , nil , p.pack(...))
    if session == nil then
        error("call to invalid address " .. skynet.address(addr))
    end
    -- the second layer is unpacked here, giving the remote response the user wants
    return p.unpack(yield_call(addr, session))
end
--clusterd.lua
function command.req(...)
    local ok, msg, sz = pcall(send_request, ...)
    if ok then
        -- the data is returned unchanged
        if type(msg) == "table" then
            skynet.ret(cluster.concat(msg))
        else
            skynet.ret(msg)
        end
    ...
end
--cluster.lua
function cluster.call(node, address, ...)
    -- skynet.pack(...) will free by cluster.core.packrequest
    return skynet.call(clusterd, "lua", "req", node, address, skynet.pack(...))
end
OK, so far we have seen how cluster2.lua sends the request and obtains the response. That is half of the remote call.
Next, let's see how cluster1.lua forwards the received data to the corresponding service and how the service replies.
As mentioned before, cluster.open "db" eventually creates a gate service to listen.
--gate.lua
function handler.message(fd, msg, sz)
    -- recv a package, forward it
    local c = connection[fd]
    local agent = c.agent
    -- clusterd.lua did not specify an agent when it created the gate service, so agent is nil here
    if agent then
        skynet.redirect(agent, c.client, "client", 1, msg, sz)
    else
        -- forward to command.socket in clusterd.lua
        skynet.send(watchdog, "lua", "socket", "data", fd, netpack.tostring(msg, sz))
    end
end
--gateserver.lua
local function dispatch_msg(fd, msg, sz)
    if connection[fd] then
        handler.message(fd, msg, sz) -- calls back handler.message in gate.lua
    else
        skynet.error(string.format("Drop message from fd (%d) : %s", fd, netpack.tostring(msg,sz)))
    end
end
MSG.data = dispatch_msg
-- register the socket message type
skynet.register_protocol {
    name = "socket",
    id = skynet.PTYPE_SOCKET, -- PTYPE_SOCKET = 6
    unpack = function ( msg, sz )
        return netpack.filter( queue, msg, sz)
    end,
    dispatch = function (_, _, q, type, ...)
        queue = q
        if type then
            MSG[type](...) -- invoke the handler registered in MSG
        end
    end
}
--clusterd.lua
function command.listen(source, addr, port)
    local gate = skynet.newservice("gate")
    ...
    skynet.call(gate, "lua", "open", { address = addr, port = port })
end
That reviews how clusterd.lua creates the gate service and how messages sent remotely are received. Next, let's see how clusterd.lua handles a message received by the gate.
--clusterd.lua
function command.socket(source, subcmd, fd, msg)
    if subcmd == "data" then
        local sz
        local addr, session, msg, padding, is_push = cluster.unpackrequest(msg)
        if padding then --(1)
            local requests = large_request[fd]
            if requests == nil then
                requests = {}
                large_request[fd] = requests
            end
            local req = requests[session] or { addr = addr , is_push = is_push }
            requests[session] = req
            table.insert(req, msg)
            return
        else
            local requests = large_request[fd]
            if requests then
                local req = requests[session]
                if req then
                    requests[session] = nil
                    table.insert(req, msg)
                    msg,sz = cluster.concat(req)
                    addr = req.addr
                    is_push = req.is_push
                end
            end
            if not msg then
                local response = cluster.packresponse(session, false, "Invalid large req")
                socket.write(fd, response)
                return
            end
        end
        local ok, response
        if addr == 0 then
            local name = skynet.unpack(msg, sz)
            local addr = register_name[name]
            if addr then
                ok = true
                msg, sz = skynet.pack(addr)
            else
                ok = false
                msg = "name not found"
            end
        elseif is_push then --(2)
            skynet.rawsend(addr, "lua", msg, sz)
            return -- no response
        else --(3)
            ok , msg, sz = pcall(skynet.rawcall, addr, "lua", msg, sz)
        end
        if ok then
            response = cluster.packresponse(session, true, msg, sz)
            if type(response) == "table" then
                for _, v in ipairs(response) do
                    socket.lwrite(fd, v)
                end
            else
                socket.write(fd, response)
            end
        else
            -- returned to the requester identified by session
            response = cluster.packresponse(session, false, msg)
            socket.write(fd, response)
        end
    elseif subcmd == "open" then
        skynet.error(string.format("socket accept from %s", msg))
        skynet.call(source, "lua", "accept", fd)
    else
        large_request[fd] = nil
        skynet.error(string.format("socket %s %d %s", subcmd, fd, msg or ""))
    end
end
For simplicity, assume padding is nil and the data packet does not exceed 32K, so the code in branch (1) is not executed. If the request was initiated by the remote node with cluster.send (push mode), branch (2) is taken. If it was initiated with cluster.call (request/response mode), branch (3) is taken.
In branch (2), skynet.rawsend(addr, "lua", msg, sz) sends the message to the service addr. addr can be a string or a number; as noted before, here it is the string "sdb". No response is needed, so the function returns right away at the return -- no response line just before (3).
In branch (3), ok, msg, sz = pcall(skynet.rawcall, addr, "lua", msg, sz) returns the response message. If the call succeeds, msg is packed with a header and sent back with socket.write(fd, response), which completes one remote procedure call.
If you are not familiar with skynet.rawsend and skynet.rawcall, have a look at "skynet source code appreciation" first.
Now let's analyze cluster.unpackrequest(msg) and see how it unpacks.
//lua-cluster.c
static int
unpackreq_string(lua_State *L, const uint8_t * buf, int sz) {
    if (sz < 2) {
        return luaL_error(L, "Invalid cluster message (size=%d)", sz);
    }
    size_t namesz = buf[1]; // service name length
    if (sz < namesz + 6) {
        return luaL_error(L, "Invalid cluster message (size=%d)", sz);
    }
    lua_pushlstring(L, (const char *)buf+2, namesz); // return the service name
    uint32_t session = unpack_uint32(buf + namesz + 2);
    lua_pushinteger(L, (uint32_t)session); // return the session
    lua_pushlstring(L, (const char *)buf+2+namesz+4, sz - namesz - 6); // return the message content msg
    if (session == 0) {
        lua_pushnil(L);
        lua_pushboolean(L,1); // is_push, no reponse
        return 5;
    }
    return 3;
}
static int
lunpackrequest(lua_State *L) {
    size_t ssz;
    const char *msg = luaL_checklstring(L,1,&ssz);
    int sz = (int)ssz;
    switch (msg[0]) {
    ...
    case '\x80': // the address is a string and the content is smaller than 32K
        return unpackreq_string(L, (const uint8_t *)msg, sz);
    ...
    }
}
This mirrors the earlier packreq_string(L, session, msg, sz, is_push). It first reads the service name length namesz, then uses namesz to read the service name, then the session, and finally the message body.
Let's look at how cluster.packresponse(session, false, msg) packs msg and adds a header.
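The same unpacking steps can be sketched in plain C, without the Lua stack. parse_string_request and struct request_view are hypothetical names introduced for illustration; the buffer is assumed to start at the type byte, as in unpackreq_string, and the session is assumed to be stored low byte first, as written by fill_uint32.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical view onto a small string-addressed request. */
struct request_view {
    const char *name;        /* not NUL-terminated, namesz bytes long */
    size_t namesz;
    uint32_t session;
    const uint8_t *payload;  /* points into the caller's buffer */
    size_t payload_sz;
    int is_push;             /* session 0 marks a push: no response expected */
};

/* Returns 0 on success, -1 if the buffer is not a valid 0x80 request. */
static int parse_string_request(const uint8_t *buf, size_t sz, struct request_view *out) {
    assert(buf != NULL && out != NULL);
    if (sz < 2 || buf[0] != 0x80) {
        return -1;
    }
    size_t namesz = buf[1];
    if (sz < namesz + 6) {   /* type + namelen + name + 4-byte session */
        return -1;
    }
    out->name = (const char *)buf + 2;
    out->namesz = namesz;
    const uint8_t *s = buf + 2 + namesz;
    out->session = (uint32_t)s[0] | (uint32_t)s[1] << 8 |
                   (uint32_t)s[2] << 16 | (uint32_t)s[3] << 24;
    out->payload = buf + 6 + namesz;
    out->payload_sz = sz - namesz - 6;
    out->is_push = (out->session == 0);
    return 0;
}
```

The offsets are two bytes smaller than on the packing side because the 2-byte length prefix has already been consumed before the request reaches this code.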
//lua-cluster.c
static int
lpackresponse(lua_State *L) {
    uint32_t session = (uint32_t)luaL_checkinteger(L,1);
    // clusterd.lua:command.socket call lpackresponse,
    // and the msg/sz is return by skynet.rawcall , so don't free(msg)
    int ok = lua_toboolean(L,2);
    void * msg;
    size_t sz;
    if (lua_type(L,3) == LUA_TSTRING) {
        msg = (void *)lua_tolstring(L, 3, &sz); // msg points to the message body
    }
    ...
    // the next step packs the header information
    uint8_t buf[TEMP_LENGTH];
    fill_header(L, buf, sz+5);
    fill_uint32(buf+2, session);
    buf[6] = ok;
    memcpy(buf+7,msg,sz);
    lua_pushlstring(L, (const char *)buf, sz+7);
    return 1;
}
Header information includes:
- 0 ~ 1 bytes: message length
- 2 ~ 5 bytes: session
- The 6th byte: status code
- From the 7th byte: msg message
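The response header listed above can be written out in a stand-alone C sketch. pack_small_response is a hypothetical name; it covers only the single-packet case shown in lpackresponse and assumes the caller provides an output buffer of at least sz + 7 bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-alone version of the header packing in lpackresponse.
 * Returns the number of bytes written: sz + 7. */
static size_t pack_small_response(uint8_t *out, uint32_t session, int ok,
                                  const void *msg, size_t sz) {
    assert(sz + 5 < 0x10000);                /* same limit as fill_header */
    size_t body = sz + 5;                    /* session + status + msg */
    out[0] = (uint8_t)((body >> 8) & 0xff);  /* length, high byte first */
    out[1] = (uint8_t)(body & 0xff);
    out[2] = (uint8_t)(session & 0xff);      /* session, low byte first */
    out[3] = (uint8_t)((session >> 8) & 0xff);
    out[4] = (uint8_t)((session >> 16) & 0xff);
    out[5] = (uint8_t)((session >> 24) & 0xff);
    out[6] = ok ? 1 : 0;                     /* status code: 1 ok, 0 error */
    memcpy(out + 7, msg, sz);
    return sz + 7;
}
```

Once the receiver strips the 2-byte length, the remaining bytes start with the session at offset 0 and the status at offset 4, which is exactly what lunpackresponse expects.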