How can Django use the Snowflake algorithm to generate primary keys instead of auto-increment?

Posted by ivi on Wed, 17 Jun 2020 08:11:54 +0200

Previously, IDs were generated by auto-increment. Now I want to generate primary keys with the Snowflake algorithm. What changes are needed?

Background

The current project stack is as follows:

  • Django
  • django.db.backends.postgresql_psycopg2

At present, the model declaration and the save() call look like this:

# models.py
from django.db import models


# Uses the default primary key
class User(models.Model):
    name = models.CharField(max_length=100, verbose_name="name")


# Creating a user with the default primary key
user = User()
user.name = "xiaoming"
user.save()

The SQL statement executed by user.save() is:

INSERT INTO "polls_user" ("name") VALUES ('xiaoming') RETURNING "polls_user"."id"; args=('xiaoming',)

Clearly, the statement does not set the id primary key.

Checking the table structure shows that id is filled in automatically from an auto-incrementing sequence:

CREATE TABLE "public"."polls_user" (
  "id" int4 NOT NULL DEFAULT nextval('polls_user_id_seq'::regclass),
  "name" varchar(100) COLLATE "pg_catalog"."default" NOT NULL,
  CONSTRAINT "polls_user_pkey" PRIMARY KEY ("id")
)

Given this setup, how can the Snowflake algorithm be used to generate IDs instead of relying on the automatically generated primary key?
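The snippets in this post call snowflake.next_id() without showing the generator itself. As a minimal sketch, assuming the widely used 64-bit layout (41-bit millisecond timestamp, 10-bit worker id, 12-bit sequence), such a module might look like the following. The class name, the constants, the custom epoch, and the worker id are all illustrative choices and are not taken from the original project:

```python
import threading
import time

# Assumed bit layout: 41-bit timestamp | 10-bit worker id | 12-bit sequence
EPOCH_MS = 1577836800000  # hypothetical custom epoch: 2020-01-01 00:00:00 UTC
WORKER_BITS = 10
SEQUENCE_BITS = 12
MAX_WORKER_ID = (1 << WORKER_BITS) - 1
SEQUENCE_MASK = (1 << SEQUENCE_BITS) - 1


class SnowflakeGenerator:
    """Generates increasing 64-bit integer IDs for one worker."""

    def __init__(self, worker_id):
        if not 0 <= worker_id <= MAX_WORKER_ID:
            raise ValueError("worker_id must be between 0 and %d" % MAX_WORKER_ID)
        self.worker_id = worker_id
        self.last_ms = -1
        self.sequence = 0
        self.lock = threading.Lock()

    def next_id(self):
        with self.lock:
            # Never go backwards, even if the system clock does
            now = max(int(time.time() * 1000), self.last_ms)
            if now == self.last_ms:
                self.sequence = (self.sequence + 1) & SEQUENCE_MASK
                if self.sequence == 0:
                    # 4096 IDs already issued in this millisecond: wait for the next one
                    while now <= self.last_ms:
                        now = int(time.time() * 1000)
            else:
                self.sequence = 0
            self.last_ms = now
            timestamp = now - EPOCH_MS
            return (
                (timestamp << (WORKER_BITS + SEQUENCE_BITS))
                | (self.worker_id << SEQUENCE_BITS)
                | self.sequence
            )


_generator = SnowflakeGenerator(worker_id=1)  # worker id 1 is an arbitrary choice


def next_id():
    """Module-level helper so callers can write snowflake.next_id()."""
    return _generator.next_id()
```

Saved as snowflake.py, this would make snowflake.next_id() available to the models that follow. When more than one process generates IDs, each one needs its own worker_id, otherwise two processes can produce the same value.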

Implementation plan

Scenario 1 - add primary key manually

The implementation is as follows

  • (1) First, declare id explicitly as a non-AutoField type

    If id is not declared, Django on PostgreSQL makes the primary key a serial column by default, so a sequence generates the key

# models.py

class User(models.Model):
    id = models.BigIntegerField(primary_key=True)
    name = models.CharField(max_length=100, verbose_name="name")
  • (2) Second, set the primary key manually before calling save()
# Creating a user
# snowflake: the ID generator module (assumed to expose next_id(); not shown here)

user = User()
user.id = snowflake.next_id()
user.name = "123"
# force_insert=True is optional. Without it, save() first tries an UPDATE for this id
# and runs an INSERT only if no row was updated
user.save(force_insert=True)
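The force_insert comment describes an update-then-insert sequence. The following is a rough SQL-level illustration of that sequence, using Python's built-in sqlite3 instead of PostgreSQL. The save() function here is a toy stand-in written for this example and is not Django's implementation:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE polls_user (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")


def save(conn, pk, name, force_insert=False):
    """Toy version of saving a row whose primary key is already set.

    Returns the list of statement types that were executed.
    """
    statements = []
    if not force_insert:
        # Try to update an existing row first
        cursor = conn.execute(
            "UPDATE polls_user SET name = ? WHERE id = ?", (name, pk)
        )
        statements.append("UPDATE")
        if cursor.rowcount > 0:
            return statements  # a row existed, nothing to insert
    # No row was updated (or the caller forced it): insert a new row
    conn.execute("INSERT INTO polls_user (id, name) VALUES (?, ?)", (pk, name))
    statements.append("INSERT")
    return statements


first = save(conn, 1001, "xiaoming")                    # new id, no force_insert
second = save(conn, 1001, "renamed")                    # same id again
third = save(conn, 1002, "xiaoming", force_insert=True)  # new id, forced insert
row_count = conn.execute("SELECT COUNT(*) FROM polls_user").fetchone()[0]
```

For a brand-new ID the unforced path costs one extra UPDATE that matches nothing, which is the round trip force_insert=True avoids.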

Changes: (1) Add an id declaration to every model

(2) Set id = snowflake.next_id() before every save()

Scenario 2 - override save() method

The implementation is as follows

  • Override the save() method of each model individually
# models.py

class User(models.Model):
    id = models.BigIntegerField(primary_key=True)
    name = models.CharField(max_length=100, verbose_name="name")
    
    def save(self, *args, **kwargs):
        # Generate an ID only for new objects; existing rows keep theirs
        if not self.id:
            self.id = snowflake.next_id()
        super().save(*args, **kwargs)
        return self.id
  • Call sites need no changes and do not have to set user.id = snowflake.next_id()

With this approach, every new model has to override save(), so the workload is still large.

Changes: (1) Add an id declaration to every model

(2) Override save() in every model

Scenario 3 - using pre_save in Django Signals

Official document address: https://docs.djangoproject.com/en/3.0/topics/signals/

The implementation is as follows

  • Modify the models.py file

Use the pre_save signal, which is sent at the start of a model's save() method

Note

(1) Models behind tables whose names start with auth_ or django_ also go through this handler, so their saves must be excluded

(2) save() covers both update and insert, so instances that are being updated must be filtered out

from django.db import models
from django.db.models.signals import pre_save
from django.dispatch import receiver
import logging


@receiver(pre_save)
def pre_save_set_snowflake_id(sender, instance, *args, **kwargs):
    """
    Django Signals, pre_save
    Applies to all models.
    If we don't include the sender argument in the decorator,
    like @receiver(pre_save, sender=MyModel), the callback will be called for all models.
    """
    # print(__name__)  # = polls.models
    # print(type(instance)) # = <class 'polls.models.Question'>
    if __name__ in str(type(instance)) and not instance.id:
        # Generate a snowflake ID only when (1) the model is declared in this models.py and (2) its id is empty
        # Without condition (1), the auth_ and django_ tables would also get snowflake IDs, but their id columns are too short
        instance.id = snowflake.next_id()

        
class User(models.Model):
    id = models.BigIntegerField(primary_key=True)
    name = models.CharField(max_length=100, verbose_name="name")
            

With models.py revised as above, save() can be used safely, and the existing business logic does not need to change.
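When the handler filters by module, an exact comparison on sender.__module__ is stricter than a substring test against str(type(instance)). A small pure-Python sketch of the difference, using plain classes as stand-ins for models; the polls.models_audit module and the helper names are hypothetical:

```python
def make_model(module_name, class_name):
    """Create a bare class that reports the given module (stand-in for a model)."""
    return type(class_name, (), {"__module__": module_name})


def substring_match(current_module, instance):
    # Matches whenever the module name appears anywhere in "<class '...'>"
    return current_module in str(type(instance))


def exact_match(current_module, sender):
    # Matches only classes declared in exactly this module
    return sender.__module__ == current_module


Question = make_model("polls.models", "Question")
Audit = make_model("polls.models_audit", "Audit")  # hypothetical sibling module
AuthUser = make_model("django.contrib.auth.models", "User")

for model in (Question, Audit, AuthUser):
    print(
        str(model),
        substring_match("polls.models", model()),
        exact_match("polls.models", model),
    )
```

Both tests reject the auth model, but only the exact comparison also rejects a class from a module whose name merely starts with polls.models.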

Changes: (1) Add an id declaration to every model, uniformly id = models.BigIntegerField(primary_key=True)

(2) Add the pre_save handler, which uniformly sets id = snowflake.next_id()
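Why id has to be a BigIntegerField, and why tables with shorter id columns have to be skipped, can be checked with a little arithmetic. This sketch assumes the common Snowflake layout in which the millisecond timestamp is shifted left by 22 bits (10 worker bits plus 12 sequence bits); the helper names are made up for illustration:

```python
INT4_MAX = 2**31 - 1  # upper bound of a PostgreSQL int4 / serial column
INT8_MAX = 2**63 - 1  # upper bound of a PostgreSQL int8 / bigint column
TIMESTAMP_SHIFT = 22  # assumed: 10 worker bits + 12 sequence bits


def snowflake_like(ms_since_epoch, worker_id=0, sequence=0):
    """Build a value with the assumed Snowflake bit layout."""
    return (ms_since_epoch << TIMESTAMP_SHIFT) | (worker_id << 12) | sequence


def fits(value, max_value):
    return 0 <= value <= max_value


after_one_second = snowflake_like(1000)
after_one_year = snowflake_like(365 * 24 * 60 * 60 * 1000)

# Largest timestamp, in milliseconds, that still fits into int4 under this layout
max_ms_in_int4 = INT4_MAX >> TIMESTAMP_SHIFT

print(fits(after_one_second, INT4_MAX))  # False
print(fits(after_one_year, INT8_MAX))    # True
print(max_ms_in_int4)                    # 511
```

Under this layout an int4 column overflows about half a second after the generator's epoch, while a bigint column has room for the full 41-bit timestamp.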

Scenario 4 - custom django.db.backends or Field

I'm trying, but I haven't written it yet.

The idea is to write a new field type similar to models.UUIDField, for example a models.SnowflakeIDField.

Summary

Choose scenario 3 above and use pre_save in Django Signals.

The code is less intrusive, the existing business logic does not need to be modified, and SQL parameter checks can be added.
