Errors
upstream server temporarily disabled while connecting to upstream
no live upstreams while connecting to upstream
max_fails and fail_timeout
max_fails defaults to 1 and fail_timeout defaults to 10 seconds.
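nginx uses two per-server parameters to decide when a backend node is taken out of rotation: max_fails (the maximum number of failed attempts) and fail_timeout (the failure window). Once a node accumulates max_fails failed attempts within fail_timeout, nginx marks it as unavailable and stops connecting to it for the following fail_timeout period. The node is not restored during that period unless every node has become unavailable. When the period expires, or when all nodes are unavailable, the node is marked available again and nginx resumes probing it.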
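To make the defaults concrete, here is a minimal sketch of an upstream block. The upstream name defaults_demo and the 10.0.0.x addresses are placeholders, not part of the original setup. The first two server lines should behave identically, since the second only spells out the documented defaults. The third uses max_fails=0, which turns off failure accounting for that server, so it is never marked unavailable.

```nginx
# Illustrative only: the upstream name and addresses are hypothetical.
upstream defaults_demo {
    # Implicit defaults: max_fails=1, fail_timeout=10s.
    server 10.0.0.1:8080;

    # The same defaults written out explicitly.
    server 10.0.0.2:8080 max_fails=1 fail_timeout=10s;

    # max_fails=0 disables the accounting of failed attempts for this server.
    server 10.0.0.3:8080 max_fails=0;
}
```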
An upstream block that sets these parameters:
upstream backend {
    server backend1.example.com weight=5;
    server 127.0.0.1:8080 max_fails=3 fail_timeout=30s;
    server unix:/tmp/backend3;
    server backup1.example.com backup;
}
Criteria for a failure
For example, a refused connection:
connect() failed (111: Connection refused) while connecting to upstream, client: 127.0.0.1, server: , request: "POST /demo HTTP/1.1", subrequest: "/capture/getstatus", upstream: "http://192.168.99.100:8080/api/demo/
Or a timeout:
upstream timed out (110: Connection timed out) while reading response header from upstream
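Probing mechanism
By default, nginx treats a node as failed only on connection refused and timeout conditions. HTTP error statuses do not count, because a node that can return any HTTP status is still reachable, so nginx considers it alive. The exception is when proxy_next_upstream is configured to pass requests to the next server on errors such as 404, 502, 503, 504, 500 and timeout: during that next_upstream handling, fails is incremented, and if the next server also fails, the error is returned to the client directly. A 404 is never added to the failure count, and an HTTP error status that is not listed in proxy_next_upstream is not counted either. In summary, nginx counts only six conditions as failures: timeout, connection refused, 502, 500, 503 and 504. Timeout and connection refused are always counted, while 502, 500, 503 and 504 are counted in fails only when they are listed in proxy_next_upstream. Once fails is greater than or equal to max_fails, the node is marked unavailable.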
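The rule above can be illustrated with a short configuration sketch. The upstream name app_backend, the 10.0.0.x addresses, and the timeout values are assumptions chosen for illustration, and the fragment is meant to sit inside the http context. With this proxy_next_upstream line, the four listed 5xx statuses should be counted toward fails in addition to connection errors and timeouts; without it, only the latter two would be.

```nginx
# Hypothetical upstream; the name and addresses are placeholders.
upstream app_backend {
    server 10.0.0.11:8080 max_fails=3 fail_timeout=30s;
    server 10.0.0.12:8080 max_fails=3 fail_timeout=30s;
}

server {
    listen 80;

    location /api/ {
        proxy_pass http://app_backend;

        # error and timeout always count as failed attempts.
        # The http_5xx statuses count only because they are listed here.
        proxy_next_upstream error timeout http_500 http_502 http_503 http_504;

        # Optional: cap how many servers are tried per request
        # (directive available in nginx 1.7.5 or later).
        proxy_next_upstream_tries 2;

        # Illustrative timeouts that decide when a timeout is recorded.
        proxy_connect_timeout 2s;
        proxy_read_timeout 5s;
    }
}
```

Listing the 5xx statuses is a trade-off: a backend that returns application-level 500s can then be taken out of rotation even though it is reachable.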
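If probing finds that every node, including the backup servers, has been marked unavailable, nginx restores all nodes to the available state and starts probing again. If a healthy node is found, its response is returned. If every node still fails, probing continues on later requests: with no successful response available, the request gets a 502 by default, but the next request probes the nodes again, and this repeats until a healthy node is found.
Experiment
upstream test_server {
    server 192.168.99.100:80801;
    server 192.168.99.100:80802;
    server 192.168.99.100:80803;
}

## for capture
location /api/test/demo {
    proxy_pass http://test_server/api/demo;
}

location /api/demo {
    default_type application/json;
    content_by_lua_file conf/lua/demo.lua;
}
lua
local cjson = require "cjson.safe"

-- Issue a POST subrequest to the location that proxies to the upstream.
local testres = ngx.location.capture("/api/test/demo", {
    method = ngx.HTTP_POST,
    body = "arg1=xxxx&arg2=xxxxx"
})
ngx.log(ngx.ERR, "status" .. testres.status)

-- cjson.safe returns nil instead of raising an error on invalid JSON.
local testbody = cjson.decode(testres.body)
ngx.log(ngx.ERR, testbody == nil)
A request to 192.168.99.100:8080/api/demo runs this Lua script, which issues a capture subrequest to /api/test/demo.
log
First request: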
2017/02/09 14:48:57 [error] 5#5: *1 connect() failed (111: Connection refused) while connecting to upstream, client: 192.168.99.1, server: , request: "POST /api/demo HTTP/1.1", subrequest: "/api/test/demo", upstream: "http://192.168.99.100:80801/api/demo", host: "192.168.99.100:8080"
2017/02/09 14:48:57 [warn] 5#5: *1 upstream server temporarily disabled while connecting to upstream, client: 192.168.99.1, server: , request: "POST /api/demo HTTP/1.1", subrequest: "/api/test/demo", upstream: "http://192.168.99.100:80801/api/demo", host: "192.168.99.100:8080"
2017/02/09 14:48:57 [error] 5#5: *1 connect() failed (111: Connection refused) while connecting to upstream, client: 192.168.99.1, server: , request: "POST /api/demo HTTP/1.1", subrequest: "/api/test/demo", upstream: "http://192.168.99.100:80802/api/demo", host: "192.168.99.100:8080"
2017/02/09 14:48:57 [warn] 5#5: *1 upstream server temporarily disabled while connecting to upstream, client: 192.168.99.1, server: , request: "POST /api/demo HTTP/1.1", subrequest: "/api/test/demo", upstream: "http://192.168.99.100:80802/api/demo", host: "192.168.99.100:8080"
2017/02/09 14:48:57 [error] 5#5: *1 connect() failed (111: Connection refused) while connecting to upstream, client: 192.168.99.1, server: , request: "POST /api/demo HTTP/1.1", subrequest: "/api/test/demo", upstream: "http://192.168.99.100:80803/api/demo", host: "192.168.99.100:8080"
2017/02/09 14:48:57 [warn] 5#5: *1 upstream server temporarily disabled while connecting to upstream, client: 192.168.99.1, server: , request: "POST /api/demo HTTP/1.1", subrequest: "/api/test/demo", upstream: "http://192.168.99.100:80803/api/demo", host: "192.168.99.100:8080"
2017/02/09 14:48:57 [error] 5#5: *1 [lua] demo.lua:44: status502 while sending to client, client: 192.168.99.1, server: , request: "POST /api/demo HTTP/1.1", host: "192.168.99.100:8080"
nginx tries each server in the upstream in turn. All of them fail, so the capture subrequest returns 502. The status code returned to the client depends on the Lua script.
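Since the client-facing status is up to the script, one option is to pass the subrequest status through. The following is a minimal sketch, assuming OpenResty (lua-nginx-module) and written as an inline alternative to the content_by_lua_file location used in the experiment. Forwarding every 5xx status and the plain-text error message are illustrative choices, not what demo.lua actually does.

```nginx
location /api/demo {
    default_type text/plain;

    content_by_lua_block {
        local res = ngx.location.capture("/api/test/demo", {
            method = ngx.HTTP_POST,
            body = "arg1=xxxx&arg2=xxxxx"
        })

        -- Forward upstream failures (for example the 502 returned when
        -- every server is down) instead of answering with the default 200.
        if res.status >= 500 then
            ngx.status = res.status
            ngx.say("upstream unavailable")
            -- Ends the request here; the status set above is kept.
            return ngx.exit(ngx.HTTP_OK)
        end

        ngx.print(res.body)
    }
}
```

With a handler like this, a client would see the 502 on both the first and the second request of the experiment, rather than whatever status the script happens to produce.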
Second request:
2017/02/09 15:09:34 [error] 6#6: *11 no live upstreams while connecting to upstream, client: 192.168.99.1, server: , request: "POST /api/demo HTTP/1.1", subrequest: "/api/test/demo", upstream: "http://test_server/api/demo", host: "192.168.99.100:8080"
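When every server under the upstream is down, nginx logs no live upstreams while connecting to upstream.
doc
ngx_http_upstream_module
In-depth analysis of the health check (health_check) mechanism in nginx
nginx upstream fault-tolerance mechanism (original article by 胡志广)
Analysis of a production nginx "no live upstreams while connecting to upstream" incident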
When reposting, please cite this article's address: https://www.ucloud.cn/yun/39466.html