Widespread connection closing and blocking when Logstash connects to Elasticsearch

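When I got to work this morning, Zhu Wei told me there was a flood of alert emails about the Logstash Redis queue backing up. The backlog was large, already around 1,000,000 entries. This was frustrating, because we had hit the same problem before. At that time we had just upgraded and retuned Redis, assumed that was the cause, and resolved it by restarting the Logstash server. Later it happened again: Logstash simply stopped working. So today I set out to track the problem down.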

[ruifengyun@bj-log-1 ~]$ redis-cli -c llen key_count
(integer) 2250942
[ruifengyun@bj-log-1 ~]$ redis-cli -c llen key_count
(integer) 2250983
[ruifengyun@bj-log-1 ~]$ redis-cli -c llen key_count
(integer) 2251359
[ruifengyun@bj-log-1 ~]$ redis-cli -c llen key_count
(integer) 2251281
[ruifengyun@bj-log-1 ~]$ redis-cli -c llen key_count
(integer) 2251312
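
Individual readings go up and down, so the trend is easier to judge from the differences between them. Here is a minimal sketch that summarizes the five readings above. The function name queue_trend is made up for illustration, and the numbers are copied by hand from the output rather than fetched from Redis.

```python
from typing import Sequence


def queue_trend(samples: Sequence[int]) -> dict:
    """Summarize successive LLEN readings of a Redis list (hypothetical helper)."""
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    # Change between each pair of consecutive readings.
    deltas = [later - earlier for earlier, later in zip(samples, samples[1:])]
    net = samples[-1] - samples[0]
    return {"net_growth": net, "deltas": deltas, "growing": net > 0}


# The five readings from the redis-cli output above.
readings = [2250942, 2250983, 2251359, 2251281, 2251312]
print(queue_trend(readings))
```

One reading is lower than the one before it, so something is still consuming entries occasionally, but the net change across the samples is positive.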

The queue length keeps growing, yet the Logstash processes are still running.

[ruifengyun@bj-log-1 ~]$ ps uax|grep logstash|grep -v grep
503       5905 53.3  0.9 4472976 935488 pts/1  Sl   Apr24 3606:37 /usr/java/jdk1.8.0_25/bin/java -Xmx500m -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -Djava.awt.headless=true -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly -jar /opt/logstash-1.4.2/vendor/jar/jruby-complete-1.7.11.jar -I/opt/logstash-1.4.2/lib /opt/logstash-1.4.2/lib/logstash/runner.rb agent -f /opt/logstash-1.4.2/logstash.conf
503       6020 51.1  0.9 4276360 931504 pts/1  Sl   Apr24 3458:11 /usr/java/jdk1.8.0_25/bin/java -Xmx500m -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -Djava.awt.headless=true -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly -jar /opt/logstash-1.4.2/vendor/jar/jruby-complete-1.7.11.jar -I/opt/logstash-1.4.2/lib /opt/logstash-1.4.2/lib/logstash/runner.rb agent -f /opt/logstash-1.4.2/logstash.conf
[ruifengyun@bj-log-1 ~]$

Next, look at the state of the Logstash process, using strace to trace its system calls.

[ruifengyun@bj-log-1 ~]$ sudo strace -p 5905
Process 5905 attached - interrupt to quit
futex(0x7fe1219599d0, FUTEX_WAIT, 5914, NULL


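lsof shows a large number of connections to Elasticsearch in the CLOSE_WAIT state. I checked the system's sysctl.conf, and the TCP tuning had already been applied, but the problem persists.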

 
java    5905 ruifengyun 3594u  IPv6         2236112621      0t0        TCP 192.168.1.50:40662->192.168.1.103:cslistener (CLOSE_WAIT)
java    5905 ruifengyun 3595u  IPv6         2236150475      0t0        TCP 192.168.1.50:40667->192.168.1.103:cslistener (CLOSE_WAIT)
java    5905 ruifengyun 3596u  IPv6         2236192556      0t0        TCP 192.168.1.50:40673->192.168.1.103:cslistener (CLOSE_WAIT)
java    5905 ruifengyun 3597u  IPv6         2236236259      0t0        TCP 192.168.1.50:40680->192.168.1.103:cslistener (CLOSE_WAIT)
java    5905 ruifengyun 3598u  IPv6         2236277898      0t0        TCP 192.168.1.50:40685->192.168.1.103:cslistener (CLOSE_WAIT)
java    5905 ruifengyun 3599u  IPv6         2236314998      0t0        TCP 192.168.1.50:40690->192.168.1.103:cslistener (CLOSE_WAIT)
java    5905 ruifengyun 3600u  IPv6         2236355853      0t0        TCP 192.168.1.50:40698->192.168.1.103:cslistener (CLOSE_WAIT)
java    5905 ruifengyun 3601u  IPv6         2236394084      0t0        TCP 192.168.1.50:40702->192.168.1.103:cslistener (CLOSE_WAIT)
java    5905 ruifengyun 3602u  IPv6         2236439308      0t0        TCP 192.168.1.50:40710->192.168.1.103:cslistener (CLOSE_WAIT)
java    5905 ruifengyun 3603u  IPv6         2236481496      0t0        TCP 192.168.1.50:40717->192.168.1.103:cslistener (CLOSE_WAIT)
java    5905 ruifengyun 3604u  IPv6         2236520014      0t0        TCP 192.168.1.50:40722->192.168.1.103:cslistener (CLOSE_WAIT)
java    5905 ruifengyun 3605u  IPv6         2236564971      0t0        TCP 192.168.1.50:40728->192.168.1.103:cslistener (CLOSE_WAIT)
java    5905 ruifengyun 3606u  IPv6         2236585984      0t0        TCP 192.168.1.50:40735->192.168.1.103:cslistener (CLOSE_WAIT)
java    5905 ruifengyun 3607u  IPv6         2236604549      0t0        TCP 192.168.1.50:40743->192.168.1.103:cslistener (CLOSE_WAIT)
java    5905 ruifengyun 3608u  IPv6         2236642216      0t0        TCP 192.168.1.50:40759->192.168.1.103:cslistener (CLOSE_WAIT)
java    5905 ruifengyun 3609u  IPv6         2236681436      0t0        TCP 192.168.1.50:40772->192.168.1.103:cslistener (CLOSE_WAIT)
java    5905 ruifengyun 3610u  IPv6         2236723744      0t0        TCP 192.168.1.50:40789->192.168.1.103:cslistener (CLOSE_WAIT)
java    5905 ruifengyun 3611u  IPv6         2236623466      0t0        TCP 192.168.1.50:40751->192.168.1.103:cslistener (CLOSE_WAIT)
java    5905 ruifengyun 3612u  IPv6         2236742961      0t0        TCP 192.168.1.50:40797->192.168.1.103:cslistener (CLOSE_WAIT)
java    5905 ruifengyun 3613u  IPv6         2236662022      0t0        TCP 192.168.1.50:40764->192.168.1.103:cslistener (CLOSE_WAIT)
java    5905 ruifengyun 3614u  IPv6         2236761054      0t0        TCP 192.168.1.50:40805->192.168.1.103:cslistener (CLOSE_WAIT)
java    5905 ruifengyun 3615u  IPv6         2236702037      0t0        TCP 192.168.1.50:40780->192.168.1.103:cslistener (CLOSE_WAIT)
java    5905 ruifengyun 3616u  IPv6         2236779867      0t0        TCP 192.168.1.50:40810->192.168.1.103:cslistener (CLOSE_WAIT)
java    5905 ruifengyun 3617u  IPv6         2236798527      0t0        TCP 192.168.1.50:40818->192.168.1.103:cslistener (CLOSE_WAIT)
java    5905 ruifengyun 3618u  IPv6         2236818794      0t0        TCP 192.168.1.50:40826->192.168.1.103:cslistener (CLOSE_WAIT)
java    5905 ruifengyun 3619u  IPv6         2236840105      0t0        TCP 192.168.1.50:40835->192.168.1.103:cslistener (CLOSE_WAIT)
java    5905 ruifengyun 3620u  IPv6         2236861229      0t0        TCP 192.168.1.50:csccfirewall->192.168.1.103:cslistener (CLOSE_WAIT)
java    5905 ruifengyun 3621u  IPv6         2236881917      0t0        TCP 192.168.1.50:40852->192.168.1.103:cslistener (CLOSE_WAIT)
java    5905 ruifengyun 3622u  IPv6         2236938623      0t0        TCP 192.168.1.50:40873->192.168.1.103:cslistener (CLOSE_WAIT)
java    5905 ruifengyun 3623u  IPv6         2236955560      0t0        TCP 192.168.1.50:40881->192.168.1.103:cslistener (CLOSE_WAIT)
java    5905 ruifengyun 3624u  IPv6         2236902415      0t0        TCP 192.168.1.50:40860->192.168.1.103:cslistener (CLOSE_WAIT)
java    5905 ruifengyun 3625u  IPv6         2236973880      0t0        TCP 192.168.1.50:40890->192.168.1.103:cslistener (CLOSE_WAIT)
java    5905 ruifengyun 3626u  IPv6         2236921262      0t0        TCP 192.168.1.50:40865->192.168.1.103:cslistener (CLOSE_WAIT)
java    5905 ruifengyun 3627u  IPv6         2237010942      0t0        TCP 192.168.1.50:40906->192.168.1.103:cslistener (CLOSE_WAIT)
java    5905 ruifengyun 3628u  IPv6         2237030179      0t0        TCP 192.168.1.50:40915->192.168.1.103:cslistener (CLOSE_WAIT)
java    5905 ruifengyun 3629u  IPv6         2236992291      0t0        TCP 192.168.1.50:40898->192.168.1.103:cslistener (CLOSE_WAIT)
java    5905 ruifengyun 3630u  IPv6         2237049196      0t0        TCP 192.168.1.50:40920->192.168.1.103:cslistener (CLOSE_WAIT)
java    5905 ruifengyun 3631u  IPv6         2237066370      0t0        TCP 192.168.1.50:40929->192.168.1.103:cslistener (CLOSE_WAIT)
java    5905 ruifengyun 3632u  IPv6         2237101082      0t0        TCP 192.168.1.50:40946->192.168.1.103:cslistener (CLOSE_WAIT)
java    5905 ruifengyun 3633u  IPv6         2237138192      0t0        TCP 192.168.1.50:40962->192.168.1.103:cslistener (ESTABLISHED)
java    5905 ruifengyun 3634r  FIFO                0,8      0t0 2237149520 pipe
java    5905 ruifengyun 3635u  IPv6         2237085095      0t0        TCP 192.168.1.50:40937->192.168.1.103:cslistener (CLOSE_WAIT)
java    5905 ruifengyun 3636w  FIFO                0,8      0t0 2237149520 pipe
java    5905 ruifengyun 3637u  IPv6         2237119814      0t0        TCP 192.168.1.50:40954->192.168.1.103:cslistener (CLOSE_WAIT)
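
With this many rows, it helps to count connections per remote endpoint and state instead of reading them one by one. The following is a rough sketch only: the function name tcp_state_counts is hypothetical, and the sample is four lines copied from the listing above. For context, CLOSE_WAIT means the remote side has already closed its end and the local process has not yet closed the socket. A socket stays in that state until the application closes it, which may be why the kernel TCP tuning made no difference. cslistener is the service name conventionally mapped to TCP port 9000.

```python
from collections import Counter


def tcp_state_counts(lsof_output: str) -> Counter:
    """Count (remote endpoint, state) pairs in lsof-style TCP rows (hypothetical helper)."""
    counts = Counter()
    for line in lsof_output.splitlines():
        fields = line.split()
        # TCP rows end with: TCP local->remote (STATE). Pipes and other rows are skipped.
        if len(fields) < 3 or fields[-3] != "TCP" or "->" not in fields[-2]:
            continue
        remote = fields[-2].split("->", 1)[1]
        state = fields[-1].strip("()")
        counts[(remote, state)] += 1
    return counts


# Four rows copied from the lsof listing above (three TCP sockets and one pipe).
sample = """\
java    5905 ruifengyun 3632u  IPv6         2237101082      0t0        TCP 192.168.1.50:40946->192.168.1.103:cslistener (CLOSE_WAIT)
java    5905 ruifengyun 3633u  IPv6         2237138192      0t0        TCP 192.168.1.50:40962->192.168.1.103:cslistener (ESTABLISHED)
java    5905 ruifengyun 3634r  FIFO                0,8      0t0 2237149520 pipe
java    5905 ruifengyun 3635u  IPv6         2237085095      0t0        TCP 192.168.1.50:40937->192.168.1.103:cslistener (CLOSE_WAIT)
"""

for (remote, state), n in sorted(tcp_state_counts(sample).items()):
    print(remote, state, n)
```

Running it on the complete lsof output for the process shows how many sockets are stuck in CLOSE_WAIT compared with how many are actually established.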

Later I enabled keepalive on the nginx side, which reduced the number of CLOSE_WAIT connections somewhat. But the problem still shows up, which is a real pain!


Original blog address: xiaorui.cc

3 Responses

  1. 不明真相的苦逼运维 November 2, 2015 / 6:11 PM

    How about trying scribe?
    Use it as a file buffer.

  2. 王超 April 30, 2015 / 7:05 AM

    OK then.
