Distributed service data consistency - redis

Posted by jackel15 on Fri, 11 Feb 2022 23:18:28 +0100

Go redis distributed lock: github.com/go-redsync/redsync

client := goredislib.NewClient(&goredislib.Options{
   Addr: "", // set your Redis address here
})
pool := goredis.NewPool(client) // or, pool := redigo.NewPool(...)
rs := redsync.New(pool)

// Set the lock name as required
mutexName := "goods-1"
var wg sync.WaitGroup
wg.Add(20)
for i := 0; i < 20; i++ {
   go func() {
      defer wg.Done()
      var goods Goods
      mutex := rs.NewMutex(mutexName)
      fmt.Println("Start acquiring lock")
      if err := mutex.Lock(); err != nil {
         fmt.Println("Failed to acquire lock:", err)
         return
      }
      fmt.Println("Lock acquired")
      // db is a GORM handle; decrement the inventory while holding the lock
      db.Where(Goods{ProductId: 1}).First(&goods)
      result := db.Model(&Goods{}).Where("product_id = ?", 1).Updates(Goods{Inventory: goods.Inventory - 1})
      if result.RowsAffected == 0 {
         fmt.Println("Update failed")
      }
      fmt.Println("Start releasing lock")
      if ok, err := mutex.Unlock(); !ok || err != nil {
         panic("unlock failed")
      }
      fmt.Println("Lock released successfully")
   }()
}
wg.Wait()

Interpretation of redsync source code

Use the Redis SETNX command:

Set a key to a value only if the key does not already exist: it returns 1 if the key was set and 0 if it was not. This makes checking for the lock and setting it a single atomic operation.

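To illustrate the set-if-absent semantics that SETNX provides, here is a hypothetical in-memory analogue (the `store` type below is invented for illustration and is not part of redsync; in Redis the atomicity comes from the server, here it comes from a mutex):

```go
package main

import (
	"fmt"
	"sync"
)

// store mimics a tiny slice of Redis: a string keyspace with a SETNX operation.
type store struct {
	mu sync.Mutex
	m  map[string]string
}

// setNX sets key to value only if the key is absent, and reports success.
// The mutex makes the check and the write one atomic step, like SETNX.
func (s *store) setNX(key, value string) bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	if _, exists := s.m[key]; exists {
		return false // setting failed: the "lock" is already held
	}
	s.m[key] = value
	return true // set succeeded: the "lock" is acquired
}

func main() {
	s := &store{m: map[string]string{}}
	fmt.Println(s.setNX("goods-1", "token-A")) // true: first caller wins
	fmt.Println(s.setNX("goods-1", "token-B")) // false: key already exists
}
```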

Set an expiration time:

This avoids deadlock: if the service crashes mid-execution and never releases the lock, the key still expires on its own.

func (m *Mutex) acquire(ctx context.Context, pool redis.Pool, value string) (bool, error) {
   conn, err := pool.Get(ctx)
   if err != nil {
      return false, err
   }
   defer conn.Close()
   // SET key value NX PX expiry; the default expiry is 8 seconds
   reply, err := conn.SetNX(m.name, value, m.expiry)
   if err != nil {
      return false, err
   }
   return reply, nil
}

What if the lock expires before the business logic finishes?

1. Refresh the expiration time before it runs out.
2. Do the refreshing in a separate goroutine, and cap the number of extensions, so that a hung service does not keep extending its lock forever and starve every other service.

var touchScript = redis.NewScript(1, `
   if redis.call("GET", KEYS[1]) == ARGV[1] then
      return redis.call("PEXPIRE", KEYS[1], ARGV[2])
   else
      return 0
   end
`)

Problems a distributed lock must solve - implemented with Lua scripts where atomicity is needed:

1. Mutual exclusion - SETNX
2. Deadlock avoidance - expiration time
3. Safety - the lock may only be deleted by its holder, never by another client. This is enforced through the value: only the goroutine that set the lock knows the value, and the value is read and compared before deleting.
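Point 3 is what redsync's delete script enforces atomically in Lua (GET, compare, then DEL). A hypothetical in-memory analogue of that compare-then-delete step (the `lockStore` type is invented for illustration):

```go
package main

import (
	"fmt"
	"sync"
)

// lockStore mimics the Redis keyspace holding lock tokens.
type lockStore struct {
	mu sync.Mutex
	m  map[string]string
}

// unlock deletes the lock only if the stored value matches the caller's
// token, mirroring the GET-compare-DEL that Redis runs atomically in Lua.
func (s *lockStore) unlock(key, token string) bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.m[key] != token {
		return false // a different client holds the lock; refuse to delete
	}
	delete(s.m, key)
	return true
}

func main() {
	s := &lockStore{m: map[string]string{"goods-1": "token-A"}}
	fmt.Println(s.unlock("goods-1", "token-B")) // false: wrong holder
	fmt.Println(s.unlock("goods-1", "token-A")) // true: holder releases
}
```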

Redlock algorithm

In the distributed version of the algorithm, we assume N completely independent Redis master nodes, with no replication or other implicit coordination between them. We have already described how to acquire and release a lock safely on a single node, so we naturally use that method on each node individually. In this example we set N to 5, a reasonably chosen value, and run the five masters on different machines or virtual machines so that in most failure scenarios they will not all go down at once. To acquire the lock, the client performs the following steps:
1. Get the current time (in milliseconds).
2. Request the lock on all N nodes in turn, using the same key and random value on each. In this step, the client uses a per-node timeout that is much smaller than the total lock auto-release time: for example, if the auto-release time is 10 seconds, the per-node timeout might be in the range of 5-50 milliseconds. This prevents the client from blocking for a long time on a failed master; if a node is unavailable, we move on to the next one as soon as possible.
3. The client computes how long step 2 took. The lock is considered acquired only if the client obtained it on a majority of the master nodes (three in this case) and the total elapsed time is less than the lock's auto-release time.
4. If the lock was acquired successfully, its effective validity time is the initial auto-release time minus the time spent acquiring it.
5. If lock acquisition failed, whether because fewer than a majority of nodes (N/2+1) granted it or because the total elapsed time exceeded the auto-release time, the client releases the lock on every master node, including the ones where it believes acquisition did not succeed.

Topics: Go