Redis Distributed Locks


Distributed lock on a single Redis instance

Acquiring the lock

SET resource_name my_random_value NX PX 30000
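As a rough illustration, the command above could be issued from Python as sketched below. This assumes client is a redis-py redis.Redis connection (or any object with a compatible set method); the helper name acquire_lock and its parameters are made up for this example and are not part of any library.

```python
import secrets


def acquire_lock(client, resource, ttl_ms=30000):
    """Try to take the lock once; return the random token, or None if it is held.

    `client` is assumed to behave like redis-py's `redis.Redis` (hypothetical setup).
    """
    # A random value unique to this holder, so that only the holder can release the lock.
    token = secrets.token_hex(16)
    # Equivalent to: SET resource token NX PX ttl_ms
    ok = client.set(resource, token, nx=True, px=ttl_ms)
    return token if ok else None
```

The caller has to keep the returned token, because it is needed later to release the lock safely.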

Releasing the lock

if redis.call("get",KEYS[1]) == ARGV[1] then
    return redis.call("del",KEYS[1])
else
    return 0
end
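The script compares the stored value with the caller's random value before deleting, so a client whose lock has already expired cannot delete a lock that another client now holds. A minimal sketch of calling it from Python, again assuming a redis-py style client with an eval(script, numkeys, *keys_and_args) method; release_lock is a hypothetical helper name:

```python
UNLOCK_SCRIPT = """
if redis.call("get", KEYS[1]) == ARGV[1] then
    return redis.call("del", KEYS[1])
else
    return 0
end
"""


def release_lock(client, resource, token):
    """Delete the key only if it still holds our token; return True if it was deleted."""
    # EVAL runs the compare-and-delete atomically on the server.
    deleted = client.eval(UNLOCK_SCRIPT, 1, resource, token)
    return deleted == 1
```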

Distributed lock on clustered Redis

A single Redis node is a single point of failure, so production environments generally run a cluster.

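However, master-replica and cluster deployments have a problem of their own: a client acquires the lock on the master, but the master fails before the lock key has been replicated to the replica (slave). Failover then promotes the replica to master, and the lock is lost.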
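The race can be illustrated with a toy model. The Node class below is only a dict-backed stand-in invented for this sketch, not a real Redis server, and replication is modeled simply as "has not happened yet":

```python
class Node:
    """Toy stand-in for a Redis node: just a dict of keys (not a real server)."""

    def __init__(self):
        self.data = {}

    def set_nx(self, key, value):
        # Mimics SET key value NX: succeed only if the key does not exist.
        if key in self.data:
            return False
        self.data[key] = value
        return True


master, replica = Node(), Node()

# Client A locks on the master; replication is asynchronous and has not run yet.
a_locked = master.set_nx("resource_name", "token-A")

# The master crashes before the key reaches the replica, and the replica is promoted.
master = replica

# Client B now succeeds on the new master, so A and B both believe they hold the lock.
b_locked = master.set_nx("resource_name", "token-B")
both_hold_lock = a_locked and b_locked
```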

Redlock can be used to address this.

Redlock

Quoted from: https://redis.io/topics/distlock#the-redlock-algorithm

The Redlock algorithm

In the distributed version of the algorithm we assume we have N Redis masters. Those nodes are totally independent, so we don’t use replication or any other implicit coordination system. We already described how to acquire and release the lock safely in a single instance. We take for granted that the algorithm will use this method to acquire and release the lock in a single instance. In our examples we set N=5, which is a reasonable value, so we need to run 5 Redis masters on different computers or virtual machines in order to ensure that they’ll fail in a mostly independent way. In order to acquire the lock, the client performs the following operations:

  1. It gets the current time in milliseconds.
  2. It tries to acquire the lock in all the N instances sequentially, using the same key name and random value in all the instances. During step 2, when setting the lock in each instance, the client uses a timeout which is small compared to the total lock auto-release time in order to acquire it. For example if the auto-release time is 10 seconds, the timeout could be in the ~ 5-50 milliseconds range. This prevents the client from remaining blocked for a long time trying to talk with a Redis node which is down: if an instance is not available, we should try to talk with the next instance ASAP.
  3. The client computes how much time elapsed in order to acquire the lock, by subtracting from the current time the timestamp obtained in step 1. If and only if the client was able to acquire the lock in the majority of the instances (at least 3), and the total time elapsed to acquire the lock is less than lock validity time, the lock is considered to be acquired.
  4. If the lock was acquired, its validity time is considered to be the initial validity time minus the time elapsed, as computed in step 3.
  5. If the client failed to acquire the lock for some reason (either it was not able to lock N/2+1 instances or the validity time is negative), it will try to unlock all the instances (even the instances it believed it was not able to lock).
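The five steps above can be sketched in Python roughly as follows. This is an illustration under assumptions, not a production implementation: each entry in clients is assumed to be a redis-py style connection to one independent master, already configured with a short per-node timeout; redlock_acquire is a hypothetical name; and elapsed time is measured with time.monotonic() rather than a wall-clock timestamp, which is a choice made for this sketch.

```python
import secrets
import time

UNLOCK_SCRIPT = """
if redis.call("get", KEYS[1]) == ARGV[1] then
    return redis.call("del", KEYS[1])
else
    return 0
end
"""


def redlock_acquire(clients, resource, ttl_ms):
    """Return (token, validity_ms) on success, or None on failure."""
    token = secrets.token_hex(16)
    start = time.monotonic()  # step 1: remember when we started
    acquired = 0
    for client in clients:  # step 2: same key and random value on every instance
        try:
            if client.set(resource, token, nx=True, px=ttl_ms):
                acquired += 1
        except Exception:
            # Unreachable or slow node: move on to the next instance.
            pass
    elapsed_ms = (time.monotonic() - start) * 1000  # step 3: time spent acquiring
    validity_ms = ttl_ms - elapsed_ms  # step 4: remaining validity time
    quorum = len(clients) // 2 + 1
    if acquired >= quorum and validity_ms > 0:
        return token, validity_ms
    # step 5: failed, so unlock everywhere, including nodes we may not have locked
    for client in clients:
        try:
            client.eval(UNLOCK_SCRIPT, 1, resource, token)
        except Exception:
            pass
    return None
```

With N=5, the function succeeds as long as at least 3 instances accept the SET and the remaining validity time is still positive.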


Redisson provides an implementation of Redlock.

Why Redlock can fail

  1. The system clock jumps;
  2. A long GC pause;
  3. A long network delay.
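The second case can be made concrete with made-up numbers: a client acquires the lock, stalls for longer than the validity time, and resumes still believing it holds the lock. The helper lock_still_valid and the timings below are invented for illustration only.

```python
def lock_still_valid(acquired_at_ms, validity_ms, now_ms):
    """True while the lock's validity window has not yet elapsed."""
    return now_ms - acquired_at_ms < validity_ms


# Client A acquires the lock at t = 0 with 10 s of validity.
acquired_at_ms, validity_ms = 0, 10_000

# Checked 1 s later, before any pause: the lock is still valid.
before_pause = lock_still_valid(acquired_at_ms, validity_ms, now_ms=1_000)

# The process then stalls (for example in GC) for 15 s. When it resumes at
# t = 16 s the key has already expired on the Redis side, but the client's
# earlier check said "valid", so it may go on to write to the shared resource.
after_pause = lock_still_valid(acquired_at_ms, validity_ms, now_ms=16_000)
```

A clock jump or a delayed network packet has the same effect: the time the client thinks has passed no longer matches the time that actually passed on the Redis nodes.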

Choosing an approach

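  • Production environments usually run Redis Cluster, and Redlock is comparatively wasteful of resources: it needs at least 3 mutually independent master nodes deployed on different servers;
  • ZooKeeper-based distributed locks are not 100% reliable either;
  • If a certain degree of unreliability is acceptable, plain SET NX PX or SET NX EX is a reasonable choice.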

References

https://redis.io/topics/distlock
https://martin.kleppmann.com/2016/02/08/how-to-do-distributed-locking.html
http://antirez.com/news/101

Published on 2019-10-31
