Distributed Snowflake ID Generation in Go

Posted by sphinx9999 on Sat, 15 Jan 2022 17:59:44 +0100

I don't know who said it: no two snowflakes in the world are identical, but they will have the same ID.

In a distributed system, a business service is deployed on multiple servers, and each user request lands on one of them more or less at random. The point of going distributed is to let the whole system carry more traffic. Take the order number as an example: it must be globally unique, and we routinely use it as a query key. For security, an order number should not be easy for others to guess, and competitors should not be able to infer the company's business volume from it. To keep the system responsive, generating one must not be time-consuming. The snowflake algorithm addresses these requirements.

The SnowFlake algorithm is a distributed ID generation algorithm open-sourced by Twitter. The core idea is to use a 64-bit integer as the globally unique ID. Its structure, from the highest bit to the lowest, is as follows:

1-bit sign | 41-bit timestamp | 10-bit machine ID | 12-bit sequence number

Let's further analyze each part:

Sign bit (1 bit): the highest bit distinguishes negative (1) from positive (0) numbers; IDs are normally positive, so this bit is fixed to 0;
41-bit timestamp (ms): the current time minus the start time. Once the algorithm is in use, the start time set in the program must not be changed arbitrarily, otherwise duplicate IDs may appear;
Because the timestamp has only 41 bits, the algorithm can be used for about 70 years: (2 ^ 41) / (1000 * 60 * 60 * 24 * 365) ≈ 69.7 years;
10-bit machine ID: 1024 nodes in total, usually split into two parts: machine room ID (dataCenterId) and machine ID (workerId);
12-bit sequence number: a counter within each millisecond, 4096 values in total (0 to 4095); it restarts from 0 every millisecond.
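
The field layout above can be sketched with shifts and masks. This is only an illustration, assuming the 10-bit machine ID is split 5 + 5 as in the listing later in this post; `compose` and `decompose` are hypothetical helper names, not part of the original code.

```go
package main

import "fmt"

// Field widths from the list above; the 5 + 5 split of the
// 10-bit machine ID is an assumption for this sketch.
const (
	sequenceBits   = 12
	workerBits     = 5
	dataCenterBits = 5

	workerShift     = sequenceBits
	dataCenterShift = sequenceBits + workerBits
	timestampShift  = sequenceBits + workerBits + dataCenterBits
)

// compose packs the four fields into one int64 (hypothetical helper).
// elapsedMs is the number of milliseconds since the start time.
func compose(elapsedMs, dataCenter, worker, seq int64) int64 {
	return elapsedMs<<timestampShift |
		dataCenter<<dataCenterShift |
		worker<<workerShift |
		seq
}

// decompose reverses compose by shifting each field down and masking it.
func decompose(id int64) (elapsedMs, dataCenter, worker, seq int64) {
	seq = id & (1<<sequenceBits - 1)
	worker = (id >> workerShift) & (1<<workerBits - 1)
	dataCenter = (id >> dataCenterShift) & (1<<dataCenterBits - 1)
	elapsedMs = id >> timestampShift
	return
}

func main() {
	id := compose(1000, 3, 7, 42)
	fmt.Println(id)
	fmt.Println(decompose(id))
}
```

Because the timestamp sits in the highest bits, comparing two IDs from the same node as plain integers also compares their generation order.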

To summarize the SnowFlake algorithm: IDs are ordered by increasing time, no ID collisions occur across the distributed system (nodes are distinguished by machine room ID and machine ID), and generation is efficient. Up to 1024 machines are supported, and each machine can generate up to 4096 IDs per millisecond. Theoretically, the whole cluster can generate 1024 * 1000 * 4096 ≈ 4.2 billion IDs per second.
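
The two capacity figures quoted above (about 69.7 years and about 4.2 billion IDs per second) can be checked with a few lines of arithmetic. A minimal sketch; the function names `lifetimeYears` and `clusterIDsPerSecond` are made up for this example.

```go
package main

import "fmt"

const (
	timestampBits = 41
	machineBits   = 10
	sequenceBits  = 12
)

// lifetimeYears returns how many 365-day years a millisecond
// counter of the given bit width lasts before it overflows.
func lifetimeYears(bits uint) float64 {
	const msPerYear = 1000 * 60 * 60 * 24 * 365
	return float64(int64(1)<<bits) / msPerYear
}

// clusterIDsPerSecond returns the theoretical peak for the whole
// cluster: nodes * milliseconds per second * IDs per millisecond.
func clusterIDsPerSecond() int64 {
	nodes := int64(1) << machineBits
	perMs := int64(1) << sequenceBits
	return nodes * 1000 * perMs
}

func main() {
	fmt.Printf("timestamp lasts about %.1f years\n", lifetimeYears(timestampBits))
	fmt.Printf("cluster peak: %d IDs per second\n", clusterIDsPerSecond())
}
```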

package main

import (
	"errors"
	"fmt"
	"sync"
	"time"
)

// SnowFlakeIdWorker holds the bit-layout configuration and the
// mutable state of one ID generator.
type SnowFlakeIdWorker struct {
	// Start timestamp in milliseconds
	twepoch int64

	// Number of bits occupied by the machine ID
	workerIdBits int64

	// Number of bits occupied by the machine room (data center) ID
	dataCenterIdBits int64

	// Maximum machine ID supported
	maxWorkerId int64

	// Maximum machine room ID supported
	maxDataCenterId int64

	// Number of bits the sequence occupies in the ID
	sequenceBits int64

	// Left shift applied to the machine ID
	workerIdShift int64

	// Left shift applied to the machine room ID
	dataCenterIdShift int64

	// Left shift applied to the timestamp
	timestampLeftShift int64

	// Mask that keeps the sequence within its bit width
	sequenceMask int64

	// Machine ID of this worker
	workerId int64

	// Machine room ID of this worker
	dataCenterId int64

	// Sequence within the current millisecond
	sequence int64

	// Timestamp of the last generated ID
	lastTimestamp int64

	// Protects sequence and lastTimestamp
	lock sync.Mutex
}

// init sets up the bit layout and validates the two IDs.
// It panics if either ID is out of range.
func (p *SnowFlakeIdWorker) init(dataCenterId int64, workerId int64) {
	// Start timestamp in milliseconds: June 1, 2021, 00:00:00 UTC+8
	p.twepoch = 1622476800000
	// Number of bits occupied by the machine ID
	p.workerIdBits = 5
	// Number of bits occupied by the machine room ID
	p.dataCenterIdBits = 5
	// The maximum machine ID supported is 31
	p.maxWorkerId = -1 ^ (-1 << p.workerIdBits)
	// The maximum machine room ID supported is 31
	p.maxDataCenterId = -1 ^ (-1 << p.dataCenterIdBits)
	// Number of bits the sequence occupies in the ID
	p.sequenceBits = 12
	// The machine ID is shifted 12 bits to the left
	p.workerIdShift = p.sequenceBits
	// The machine room ID is shifted 17 bits to the left
	p.dataCenterIdShift = p.sequenceBits + p.workerIdBits
	// The timestamp is shifted 22 bits to the left
	p.timestampLeftShift = p.sequenceBits + p.workerIdBits + p.dataCenterIdBits
	// The sequence mask is 4095
	p.sequenceMask = -1 ^ (-1 << p.sequenceBits)

	if workerId > p.maxWorkerId || workerId < 0 {
		panic(errors.New(fmt.Sprintf("Worker ID can't be greater than %d or less than 0", p.maxWorkerId)))
	}
	if dataCenterId > p.maxDataCenterId || dataCenterId < 0 {
		panic(errors.New(fmt.Sprintf("DataCenter ID can't be greater than %d or less than 0", p.maxDataCenterId)))
	}

	p.workerId = workerId
	p.dataCenterId = dataCenterId
	// Sequence within the current millisecond (0 ~ 4095)
	p.sequence = 0
	// Timestamp of the last generated ID; -1 means none yet
	p.lastTimestamp = -1
}

// nextId generates the next ID. It is safe for concurrent use
// because the whole method runs under the lock.
func (p *SnowFlakeIdWorker) nextId() int64 {
	p.lock.Lock()
	defer p.lock.Unlock()

	timestamp := p.timeGen()
	// If the current time is earlier than the timestamp of the last ID,
	// the clock has moved backwards. Panic rather than risk a duplicate ID.
	if timestamp < p.lastTimestamp {
		panic(errors.New(fmt.Sprintf("Clock moved backwards. Refusing to generate id for %d milliseconds", p.lastTimestamp-timestamp)))
	}

	if p.lastTimestamp == timestamp {
		// Same millisecond as the last ID: increment the sequence
		p.sequence = (p.sequence + 1) & p.sequenceMask
		// The sequence wrapped to 0: all 4096 values of this millisecond are used
		if p.sequence == 0 {
			// Block until the next millisecond and take the new timestamp
			timestamp = p.tilNextMillis(p.lastTimestamp)
		}
	} else {
		// The timestamp changed: reset the sequence
		p.sequence = 0
	}
	// Remember the timestamp of this ID
	p.lastTimestamp = timestamp

	// Shift each field into place and combine them with bitwise OR
	return ((timestamp - p.twepoch) << p.timestampLeftShift) |
		(p.dataCenterId << p.dataCenterIdShift) |
		(p.workerId << p.workerIdShift) | p.sequence
}

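// tilNextMillis busy-waits until the clock passes lastTimestamp
// and returns the new timestamp.
func (p *SnowFlakeIdWorker) tilNextMillis(lastTimestamp int64) int64 {
	timestamp := p.timeGen()
	for timestamp <= lastTimestamp {
		timestamp = p.timeGen()
	}
	return timestamp
}

// timeGen returns the current Unix time in milliseconds.
func (p *SnowFlakeIdWorker) timeGen() int64 {
	return time.Now().UnixNano() / 1e6
}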

func main() {
	idWorker := &SnowFlakeIdWorker{}
	idWorker.init(0, 2) // Machine room ID and machine ID cannot exceed 31
	fmt.Println(idWorker.nextId())
}
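
The listing above panics when the clock moves backwards or an ID is out of range. As a hedged alternative, the sketch below returns errors instead and then draws IDs from several goroutines to check that none repeat. `Generator`, `NewGenerator`, `Next` and `countUnique` are hypothetical names for this example, and the bit layout and start timestamp are assumed to match the listing.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

const (
	epochMs    = int64(1622476800000) // same start timestamp as the listing above
	seqBits    = 12
	workerBits = 5
	dcBits     = 5
	maxID      = 1<<workerBits - 1 // 31, for both worker and data center IDs
	seqMask    = 1<<seqBits - 1    // 4095
)

// Generator is a compact, error-returning variant of SnowFlakeIdWorker.
type Generator struct {
	mu               sync.Mutex
	dc, worker       int64
	seq, lastMs      int64
}

// NewGenerator validates both IDs and returns an error instead of panicking.
func NewGenerator(dc, worker int64) (*Generator, error) {
	if dc < 0 || dc > maxID || worker < 0 || worker > maxID {
		return nil, fmt.Errorf("IDs must be in [0, %d], got dc=%d worker=%d", maxID, dc, worker)
	}
	return &Generator{dc: dc, worker: worker, lastMs: -1}, nil
}

func nowMs() int64 { return time.Now().UnixNano() / int64(time.Millisecond) }

// Next returns the next ID, or an error if the clock moved backwards.
func (g *Generator) Next() (int64, error) {
	g.mu.Lock()
	defer g.mu.Unlock()
	now := nowMs()
	if now < g.lastMs {
		return 0, fmt.Errorf("clock moved backwards by %d ms", g.lastMs-now)
	}
	if now == g.lastMs {
		g.seq = (g.seq + 1) & seqMask
		if g.seq == 0 { // sequence exhausted: wait for the next millisecond
			for now <= g.lastMs {
				now = nowMs()
			}
		}
	} else {
		g.seq = 0
	}
	g.lastMs = now
	return (now-epochMs)<<(seqBits+workerBits+dcBits) |
		g.dc<<(seqBits+workerBits) | g.worker<<seqBits | g.seq, nil
}

// countUnique draws perGoroutine IDs from each goroutine and returns
// the number of distinct IDs seen, plus the first error if any.
func countUnique(g *Generator, goroutines, perGoroutine int) (int, error) {
	var (
		wg       sync.WaitGroup
		mu       sync.Mutex
		firstErr error
		seen     = make(map[int64]struct{}, goroutines*perGoroutine)
	)
	for i := 0; i < goroutines; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for j := 0; j < perGoroutine; j++ {
				id, err := g.Next()
				mu.Lock()
				if err != nil && firstErr == nil {
					firstErr = err
				} else if err == nil {
					seen[id] = struct{}{}
				}
				mu.Unlock()
			}
		}()
	}
	wg.Wait()
	return len(seen), firstErr
}

func main() {
	g, err := NewGenerator(0, 2)
	if err != nil {
		fmt.Println(err)
		return
	}
	n, err := countUnique(g, 8, 1000)
	fmt.Println("distinct IDs:", n, "error:", err)
}
```

Returning an error leaves the decision to the caller: retry after a short wait, fail the request, or alert. Whether that is preferable to panicking depends on how the surrounding service handles failures.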

