Primary key generation strategy

Posted by Ehailey on Fri, 11 Feb 2022 04:04:00 +0100

Primary key generation strategy

1. What is a primary key?

Primary key, that is, primary key, is the candidate key selected as the unique identification of the row of the table.

A table has only one primary key. A primary key can also be called a primary key. A primary key can consist of one field or multiple fields, which are called single field primary key or multi field primary key respectively. Also known as master code.

2. Necessity of generating primary key?

The primary key can be used to represent a specific row that is precisely located. If there is no primary key, you cannot accurately locate whether a record is the relevant row record you want, which will make it difficult to update or delete a specific row in the table.

3. Generate primary key policy?

  1. Auto increment
  2. UUID
  3. Redis
  4. SnowFlake
  5. Generate unique ID with zookeeper

4. Auto increment

Database self growth sequence or field.

4.1 implementation method

Add the auto increment statement when creating the table.

4.2 advantages and disadvantages

Advantages: simple implementation.

Disadvantages: it is cumbersome to divide databases and tables. When creating a new table, you need to know the id of the last primary key of the previous table, and then increase it automatically.

5.UUID

UUID is a universal unique identifier, which can generate a global unique identifier with a length of 32 bits.

5.1 implementation method

It can be generated by database or program.

String uuid = UUID.randomUUID().toString();

5.2 advantages and disadvantages

Advantages: simple implementation and convenient code

Disadvantages: 1 Without sorting, the trend growth cannot be guaranteed.

2. 32 bits are too long and the warehousing performance is poor (involving the splitting of B + index tree)

6.Redis

It mainly depends on that Redis is single threaded, so it can also be used to generate globally unique ID s. It can be realized by Redis's atomic operations INCR and INCRBY.

6.1 implementation method

Set the initial value of step size and Redis

6.2 advantages and disadvantages

Advantages: 1 Independent of database

​ 2. Digital ID natural sorting is helpful for paging or sorting results.

Disadvantages: heavy workload of coding and configuration

7.SnowFlake

snowflake is Twitter's open source distributed ID generation algorithm, and the result is a long ID.

7.1 implementation method

41bit is used as the number of milliseconds, 10bit is used as the ID of the machine (5 bits are the data center and 5 bits are the machine ID), 12bit is used as the serial number within milliseconds (meaning that each node can generate 4096 IDS per millisecond), and finally there is a symbol bit, which is always 0.

public class IdWorker{

    //Each of the following two 5 digits adds up to a 10 digit working machine id
    private long workerId;    //Job id
    private long datacenterId;   //Data id
    //12 digit serial number
    private long sequence;

    public IdWorker(long workerId, long datacenterId, long sequence){
        // sanity check for workerId
        if (workerId > maxWorkerId || workerId < 0) {
            throw new IllegalArgumentException(String.format("worker Id can't be greater than %d or less than 0",maxWorkerId));
        }
        if (datacenterId > maxDatacenterId || datacenterId < 0) {
            throw new IllegalArgumentException(String.format("datacenter Id can't be greater than %d or less than 0",maxDatacenterId));
        }
        System.out.printf("worker starting. timestamp left shift %d, datacenter id bits %d, worker id bits %d, sequence bits %d, workerid %d",
                timestampLeftShift, datacenterIdBits, workerIdBits, sequenceBits, workerId);

        this.workerId = workerId;
        this.datacenterId = datacenterId;
        this.sequence = sequence;
    }

    //Initial timestamp
    private long twepoch = 1288834974657L;

    //The length is 5 digits
    private long workerIdBits = 5L;
    private long datacenterIdBits = 5L;
    //Maximum
    private long maxWorkerId = -1L ^ (-1L << workerIdBits);
    private long maxDatacenterId = -1L ^ (-1L << datacenterIdBits);
    //Serial number id length
    private long sequenceBits = 12L;
    //Maximum serial number
    private long sequenceMask = -1L ^ (-1L << sequenceBits);
    
    //The number of bits of work id that need to be shifted to the left, 12 bits
    private long workerIdShift = sequenceBits;
   //The data id needs to be shifted left by 12 + 5 = 17 bits
    private long datacenterIdShift = sequenceBits + workerIdBits;
    //The time stamp needs to be shifted left by 12 + 5 + 5 = 22 bits
    private long timestampLeftShift = sequenceBits + workerIdBits + datacenterIdBits;
    
    //Last timestamp, initial value is negative
    private long lastTimestamp = -1L;

    public long getWorkerId(){
        return workerId;
    }

    public long getDatacenterId(){
        return datacenterId;
    }

    public long getTimestamp(){
        return System.currentTimeMillis();
    }

     //Next ID generation algorithm
    public synchronized long nextId() {
        long timestamp = timeGen();

        //Get the current timestamp. If it is less than the last timestamp, it indicates that the timestamp acquisition is abnormal
        if (timestamp < lastTimestamp) {
            System.err.printf("clock is moving backwards.  Rejecting requests until %d.", lastTimestamp);
            throw new RuntimeException(String.format("Clock moved backwards.  Refusing to generate id for %d milliseconds",
                    lastTimestamp - timestamp));
        }

        //Get the current timestamp. If it is equal to the last timestamp (within the same millisecond), add one to the serial number; Otherwise, the serial number is assigned to 0, starting from 0.
        if (lastTimestamp == timestamp) {
            sequence = (sequence + 1) & sequenceMask;
            if (sequence == 0) {
                timestamp = tilNextMillis(lastTimestamp);
            }
        } else {
            sequence = 0;
        }
        
        //Refresh the last timestamp value
        lastTimestamp = timestamp;

        /**
          * Return result:
          * (timestamp - twepoch) << timestampLeftShift) Indicates that the initial timestamp is subtracted from the timestamp, and then the corresponding digit is shifted to the left
          * (datacenterId << datacenterIdShift) Indicates that the data id is shifted to the left by the corresponding digit
          * (workerId << workerIdShift) Indicates that the work id is shifted to the left by the corresponding digit
          * | It is a bitwise OR operator, such as x | y. The result is 0 only when x and y are both 0, and 1 in other cases.
          * Because only the value in the corresponding bit is meaningful in one part and 0 in other bits, the final spliced id can be obtained by | operation on the value of each part
        */
        return ((timestamp - twepoch) << timestampLeftShift) |
                (datacenterId << datacenterIdShift) |
                (workerId << workerIdShift) |
                sequence;
    }

    //Get the timestamp and compare it with the last timestamp
    private long tilNextMillis(long lastTimestamp) {
        long timestamp = timeGen();
        while (timestamp <= lastTimestamp) {
            timestamp = timeGen();
        }
        return timestamp;
    }

    //Get system timestamp
    private long timeGen(){
        return System.currentTimeMillis();
    }

    //---------------Testing---------------
    public static void main(String[] args) {
        IdWorker worker = new IdWorker(1,1,1);
        for (int i = 0; i < 30; i++) {
            System.out.println(worker.nextId());
        }
    }

}

7.2 advantages and disadvantages

Advantages: it does not depend on the database, and its performance is better than that of the database.

Disadvantages: it is incremental on a single machine, but due to the distributed environment, the clock on each machine cannot be fully synchronized. The problem of time callback should be solved in the algorithm.

8. Use zookeeper to generate unique ID

zookeeper mainly generates serial numbers through its znode data version. It can generate 32-bit and 64 bit data version numbers. The client can use this version number as the unique serial number.

Topics: Database MySQL Distribution