Fixing a problem encountered when using Python with a Redis Cluster

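While testing redis-py-cluster today I ran into a strange problem. At first I assumed it was a bug in the Python Redis Cluster client, and the project's issue tracker showed that other people had hit the same thing. The error looks like this: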

root@ubuntu:~# python test.py
[{'host': '127.0.0.1', 'port': '7000'}]
Traceback (most recent call last):
  File "test.py", line 24, in <module>
    main()
  File "test.py", line 21, in main
    r.set('nima', 'a')
  File "/usr/local/lib/python2.7/dist-packages/redis/client.py", line 1055, in set
    return self.execute_command('SET', *pieces)
  File "build/bdist.linux-x86_64/egg/rediscluster/utils.py", line 82, in inner
  File "build/bdist.linux-x86_64/egg/rediscluster/client.py", line 308, in execute_command
rediscluster.exceptions.RedisClusterException: Too many Cluster redirections

Here is my test code. The logic itself is fine, yet it still fails with rediscluster.exceptions.RedisClusterException: Too many Cluster redirections.

# -*- coding: utf-8 -*-
from __future__ import print_function

import sys

from rediscluster import StrictRedisCluster

def main():
    serverip = '127.0.0.1'
    # Alternative with string ports, kept for reference:
    # startup_nodes = [{"host": serverip, "port": str(i)} for i in range(7000, 7007)]
    # range(7000, 7007) covers ports 7000-7006.
    startup_nodes = [{"host": serverip, "port": i} for i in range(7000, 7007)]
    print(startup_nodes)
    try:
        r = StrictRedisCluster(startup_nodes=startup_nodes)
    except Exception as err:
        print(err)
        print('failed to connect cluster')
        sys.exit(1)  # non-zero exit status signals the failure

    # for i in range(1000):
    r.set('nima', 'a')

if __name__ == '__main__':
    main()

Since the code is fine, the problem is probably in the Redis cluster itself:

root@ubuntu:~# redis-cli -c -p 7000
127.0.0.1:7000> set nima nia
-> Redirected to slot [16259] located at :0
Could not connect to Redis at :0: Name or service not known
Could not connect to Redis at :0: Name or service not known
not connected>
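
The redirect target ":0" suggests that the node owning slot 16259 is still listed by the cluster but has no usable address. A minimal sketch for spotting such nodes in the text returned by CLUSTER NODES is shown here. It assumes the space-separated line layout (id, address, flags, master, ping-sent, pong-recv, config-epoch, link-state, slots). The function names and the SAMPLE text are made up for illustration and are not output from this cluster.

```python
def parse_cluster_nodes(text):
    """Parse raw CLUSTER NODES text into a list of dicts."""
    nodes = []
    for line in text.strip().splitlines():
        parts = line.split()
        if len(parts) < 8:
            continue  # skip lines that do not match the assumed layout
        # Newer Redis versions append "@cport" to the address; drop it.
        addr = parts[1].split("@")[0]
        host, _, port = addr.rpartition(":")
        nodes.append({
            "id": parts[0],
            "host": host,
            "port": int(port),
            "flags": set(parts[2].split(",")),
            "master_id": None if parts[3] == "-" else parts[3],
            "link_state": parts[7],
            "slots": parts[8:],
        })
    return nodes


def suspicious_nodes(nodes):
    """Return nodes with no address, a failure flag, or a broken link."""
    bad_flags = {"fail", "fail?", "noaddr", "handshake"}
    return [
        n for n in nodes
        if not n["host"]
        or n["port"] == 0
        or n["flags"] & bad_flags
        or n["link_state"] != "connected"
    ]


# Made-up sample with shortened node IDs, for illustration only.
SAMPLE = """\
aaa1 127.0.0.1:7000 myself,master - 0 0 1 connected 0-4095
bbb2 127.0.0.1:7004 slave aaa1 0 1431000000000 5 connected
ccc3 :0 master,fail,noaddr - 1431000000000 1431000000000 4 disconnected 12288-16383
"""

if __name__ == "__main__":
    for node in suspicious_nodes(parse_cluster_nodes(SAMPLE)):
        print(node["id"], node["host"] or "<no address>", sorted(node["flags"]))
```

Against a live cluster, the text could come from redis-cli -p 7000 cluster nodes and be passed to parse_cluster_nodes unchanged.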

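Running /redis-3.0.1/src/redis-trib.rb check 127.0.0.1:7000 showed that the cluster, which used to have four masters, was down to three, and that several slave nodes had been dropped as well. Starting the Redis instances that were down fixed it, and the check then passes: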

root@ubuntu:~# ./redis-3.0.1/src/redis-trib.rb check 127.0.0.1:7000|more
Connecting to node 127.0.0.1:7000: OK
Connecting to node 127.0.0.1:7007: OK
Connecting to node 127.0.0.1:7004: OK
Connecting to node 127.0.0.1:7003: OK
Connecting to node 127.0.0.1:7002: OK
Connecting to node 127.0.0.1:7006: OK
Connecting to node 127.0.0.1:7001: OK
Connecting to node 127.0.0.1:7005: OK
>>> Performing Cluster Check (using node 127.0.0.1:7000)
M: b4f0e1fde9abbcef6cfaea86232afb07cc19eb77 127.0.0.1:7000
   slots:0-4095 (4096 slots) master
   1 additional replica(s)
S: d75e766de5f82734ffb3ff8ef944d353b78db740 127.0.0.1:7007
   slots: (0 slots) slave
   replicates 82b7b92878dc1af4a6f9a0e7f20535f86771a61f
S: 95272ed07fa70058c8167a93936f6e7093152a90 127.0.0.1:7004
   slots: (0 slots) slave
   replicates b4f0e1fde9abbcef6cfaea86232afb07cc19eb77
M: 82b7b92878dc1af4a6f9a0e7f20535f86771a61f 127.0.0.1:7003
   slots:12288-16383 (4096 slots) master
   1 additional replica(s)
M: ee2e48c90a4ac6fa9128c4f06600234eb9a05d4b 127.0.0.1:7002
   slots:8192-12287 (4096 slots) master
   1 additional replica(s)
S: e83464a6948147193be25cf8bb8aea66ce3616a6 127.0.0.1:7006
   slots: (0 slots) slave
   replicates ee2e48c90a4ac6fa9128c4f06600234eb9a05d4b
M: 5bd0681b67d775ec0a67b84481c5bbe802216f49 127.0.0.1:7001
   slots:4096-8191 (4096 slots) master
   1 additional replica(s)
S: 39ef91564db8a85ca4caa03f65b03ed24cc79cd7 127.0.0.1:7005
   slots: (0 slots) slave
   replicates 5bd0681b67d775ec0a67b84481c5bbe802216f49
[OK] All nodes agree about slots configuration.
... Check for open slots...
... Check slots coverage...
[OK] All 16384 slots covered.
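
The "slots coverage" step boils down to checking that the masters' slot ranges together cover 0 through 16383. A minimal sketch of that check follows, using the four ranges printed above as input. The helper names are illustrative and this is not how redis-trib.rb itself is implemented.

```python
TOTAL_SLOTS = 16384


def parse_slot_ranges(spec):
    """Turn a string such as '0-4095' or '5 10-20' into (start, end) tuples."""
    ranges = []
    for part in spec.replace(",", " ").split():
        low, _, high = part.partition("-")
        ranges.append((int(low), int(high or low)))
    return ranges


def uncovered_slots(ranges, total=TOTAL_SLOTS):
    """Return the (start, end) gaps that no range covers."""
    gaps = []
    next_needed = 0  # lowest slot not yet known to be covered
    for low, high in sorted(ranges):
        if low > next_needed:
            gaps.append((next_needed, low - 1))
        next_needed = max(next_needed, high + 1)
    if next_needed < total:
        gaps.append((next_needed, total - 1))
    return gaps


if __name__ == "__main__":
    # Slot ranges of the four masters in the check output above.
    healthy = parse_slot_ranges("0-4095 4096-8191 8192-12287 12288-16383")
    print(uncovered_slots(healthy))
    # Same cluster with the master for 12288-16383 missing.
    print(uncovered_slots(healthy[:3]))
```

An empty result corresponds to "[OK] All 16384 slots covered."; any gap means some keys have no master to serve them.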

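Because the Redis logging was misconfigured, the cause of that first problem can no longer be tracked down. So let's deliberately kill one master/slave pair and see what happens. The result looks different from the earlier problem: before, the client reported Too many Cluster redirections while cluster info still showed an OK state; this time cluster info reports cluster_state:fail.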

>>> Check for open slots…
>>> Check slots coverage…
[ERR] Not all 16384 slots are covered by nodes.
root@ubuntu:~# redis-cli -c -p 7000
127.0.0.1:7000> set a a
(error) CLUSTERDOWN The cluster is down
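
Since the first incident happened while cluster info still reported an OK state, it can help to look at more than cluster_state. The sketch below parses the key:value text that CLUSTER INFO returns and lists anything that looks off. The field names are the ones CLUSTER INFO reports; the function names and the SAMPLE_INFO values are made up for illustration.

```python
def parse_cluster_info(text):
    """Parse raw CLUSTER INFO text into a dict, converting numbers to int."""
    info = {}
    for line in text.splitlines():
        key, sep, value = line.strip().partition(":")
        if sep:
            info[key] = int(value) if value.isdigit() else value
    return info


def cluster_problems(info, total_slots=16384):
    """Return human-readable warnings derived from CLUSTER INFO fields."""
    problems = []
    if info.get("cluster_state") != "ok":
        problems.append("cluster_state is %s" % info.get("cluster_state"))
    if info.get("cluster_slots_assigned", 0) < total_slots:
        problems.append("only %d slots assigned" % info.get("cluster_slots_assigned", 0))
    if info.get("cluster_slots_pfail", 0) > 0:
        problems.append("%d slots on possibly failing nodes" % info["cluster_slots_pfail"])
    if info.get("cluster_slots_fail", 0) > 0:
        problems.append("%d slots on failed nodes" % info["cluster_slots_fail"])
    return problems


# Made-up values, for illustration only.
SAMPLE_INFO = """\
cluster_state:fail
cluster_slots_assigned:16384
cluster_slots_ok:12288
cluster_slots_pfail:0
cluster_slots_fail:4096
cluster_known_nodes:8
cluster_size:4
"""

if __name__ == "__main__":
    for problem in cluster_problems(parse_cluster_info(SAMPLE_INFO)):
        print(problem)
```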

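The cluster can be repaired with redis-trib.rb fix, for example /redis-3.0.1/src/redis-trib.rb fix 127.0.0.1:7000. If the nodes still will not come up, you can delete the relevant cluster-config-file files, which hold each node's synced cluster state. Keep in mind that a node whose file is deleted starts with a new node ID and has to be joined to the cluster again.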

I am also reposting the list of Redis Cluster commands here. Operating a Redis cluster is a real pain, so it is worth trying these commands out:

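Cluster
CLUSTER INFO Prints information about the cluster, including whether it is healthy.
CLUSTER NODES Lists all nodes currently known to the cluster, along with information about each of them.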
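Nodes
CLUSTER MEET <ip> <port> Adds the node at ip and port to the cluster, making it a member.
CLUSTER FORGET <node_id> Removes the node identified by node_id from the cluster.
CLUSTER REPLICATE <node_id> Makes the current node a slave of the node identified by node_id.
CLUSTER SAVECONFIG Saves the node's cluster configuration file to disk.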
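Slots
CLUSTER ADDSLOTS <slot> [slot …] Assigns one or more slots to the current node.
CLUSTER DELSLOTS <slot> [slot …] Removes the assignment of one or more slots from the current node.
CLUSTER FLUSHSLOTS Removes every slot assigned to the current node, leaving it with no slots at all.
CLUSTER SETSLOT <slot> NODE <node_id> Assigns slot to the node identified by node_id. If the slot is already assigned to another node, that node is made to drop it first.
CLUSTER SETSLOT <slot> MIGRATING <node_id> Migrates slot from this node to the node identified by node_id.
CLUSTER SETSLOT <slot> IMPORTING <node_id> Imports slot into this node from the node identified by node_id.
CLUSTER SETSLOT <slot> STABLE Cancels an import or migration of slot.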
Keys
CLUSTER KEYSLOT <key> Computes which slot key belongs to.
CLUSTER COUNTKEYSINSLOT <slot> Returns the number of keys currently held in slot.
CLUSTER GETKEYSINSLOT <slot> <count> Returns up to count keys from slot.
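
The result of CLUSTER KEYSLOT can also be reproduced on the client side, which helps when working out which node a key such as nima should land on. The sketch below assumes the mapping described in the Redis Cluster specification: CRC16 (XMODEM variant) of the key modulo 16384, hashing only the text between the first { and the next } when that hash tag is non-empty. The names crc16_xmodem and key_slot are illustrative and are not part of redis-py-cluster.

```python
TOTAL_SLOTS = 16384


def crc16_xmodem(data):
    """CRC16 with polynomial 0x1021, initial value 0, no bit reflection."""
    crc = 0
    for byte in bytearray(data):
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def key_slot(key):
    """Return the cluster slot for a key given as str or bytes."""
    raw = key.encode("utf-8") if isinstance(key, str) else key
    start = raw.find(b"{")
    if start != -1:
        end = raw.find(b"}", start + 1)
        # Only a non-empty hash tag replaces the full key.
        if end != -1 and end != start + 1:
            raw = raw[start + 1:end]
    return crc16_xmodem(raw) % TOTAL_SLOTS


if __name__ == "__main__":
    # The Redis Cluster specification gives 0x31C3 as the CRC16 of "123456789".
    print(hex(crc16_xmodem(b"123456789")))
    print(key_slot("nima"))
    # Keys sharing a hash tag map to the same slot.
    print(key_slot("{user1000}.following") == key_slot("{user1000}.followers"))
```

Comparing key_slot('nima') with the slot number in the redis-cli redirect shown earlier is a quick sanity check; the server's own CLUSTER KEYSLOT remains the authority.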

The takeaway: experiment with Redis Cluster a lot beforehand, so that you are not caught out once it is running in production.

Original blog address: xiaorui.cc
