nginx will be blocked while calling getaddrinfo #134

Open
bluestn opened this issue Jan 9, 2017 · 12 comments

@bluestn

bluestn commented Jan 9, 2017

If DNS resolution is slow, the call to getaddrinfo stalls the entire nginx process, because getaddrinfo is a synchronous operation.

#0 0x00007ffff5d10680 in __poll_nocancel () from /lib64/libc.so.6
#1 0x00007ffff4a97dd4 in __libc_res_nsend () from /lib64/libresolv.so.2
#2 0x00007ffff4a95cce in __libc_res_nquery () from /lib64/libresolv.so.2
#3 0x00007ffff4a968b0 in __libc_res_nsearch () from /lib64/libresolv.so.2
#4 0x00007ffff26c6c53 in _nss_dns_gethostbyname4_r () from /lib64/libnss_dns.so.2
#5 0x00007ffff5d00c08 in gaih_inet () from /lib64/libc.so.6
#6 0x00007ffff5d042cd in getaddrinfo () from /lib64/libc.so.6
#7 0x000000000064441e in ngx_http_upsync_init_server (event=0x7ffff1f8e248) at ../modules/nginx-upsync-module/src/ngx_http_upsync_module.c:3042
#8 0x00000000006430ad in ngx_http_upsync_connect_handler (event=0x7ffff1f8e248) at ../modules/nginx-upsync-module/src/ngx_http_upsync_module.c:2536
#9 0x000000000064307f in ngx_http_upsync_begin_handler (event=0x7ffff1f8e248) at ../modules/nginx-upsync-module/src/ngx_http_upsync_module.c:2519
#10 0x000000000053fcc0 in ngx_event_expire_timers () at src/event/ngx_event_timer.c:94
#11 0x000000000053dd89 in ngx_process_events_and_timers (cycle=0x7ffff3394e10) at src/event/ngx_event.c:256
#12 0x000000000054836a in ngx_single_process_cycle (cycle=0x7ffff3394e10) at src/os/unix/ngx_process_cycle.c:332
#13 0x00000000005126bb in main (argc=1, argv=0x7fffffffe558) at src/core/nginx.c:364

@xiaokai-wang
Member

Have you actually tried it? This function shouldn't block, should it?

@bluestn
Author

bluestn commented Jan 10, 2017

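Yes. I saw the whole process hang, and it only responded again after a long while, once the domain name had finally resolved.
if (getaddrinfo((char *) host, NULL, &hints, &res) != 0) {
This call should be a synchronous function call.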

@xiaokai-wang
Member

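It hung once; what happened after that? My understanding is that this function does not issue a remote DNS query on every call, but only performs a conversion?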

@bluestn
Author

bluestn commented Jan 10, 2017

The system should have a cache, but it is unlikely to keep entries cached for a long time.

@xiaokai-wang
Member

Right. My understanding is that this function reads that cache and performs a format conversion :)

@bluestn
Author

bluestn commented Jan 10, 2017

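Even with a cache, cached entries expire, and that will cause the blocking problem. I would still suggest changing this to an asynchronous call. Thanks!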
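To illustrate what moving the lookup off the event loop could look like, here is a minimal sketch. It is not the patch mentioned later in this thread and not code from nginx-upsync-module; `resolve_start` and `resolve_poll` are hypothetical names. It hands the blocking `getaddrinfo` call to a forked child and reads the result from a non-blocking pipe, assuming IPv4 only for brevity. nginx core also has its own non-blocking resolver (configured with the `resolver` directive), which may be a better fit inside a module.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <netdb.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper, not part of nginx or nginx-upsync-module.
 * Forks a child that performs the blocking getaddrinfo() and writes the
 * first IPv4 address into a pipe. Returns the non-blocking read end of
 * the pipe (to be registered with an event loop), or -1 on error. */
static int
resolve_start(const char *host, pid_t *child)
{
    int  fds[2];

    assert(host != NULL && child != NULL);

    if (pipe(fds) == -1) {
        return -1;
    }

    *child = fork();
    if (*child == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (*child == 0) {                        /* child: allowed to block */
        struct addrinfo  hints, *res = NULL;

        close(fds[0]);
        memset(&hints, 0, sizeof(hints));
        hints.ai_family = AF_INET;            /* assumption: IPv4 only */
        hints.ai_socktype = SOCK_STREAM;

        if (getaddrinfo(host, NULL, &hints, &res) == 0 && res != NULL) {
            struct sockaddr_in  *sin = (struct sockaddr_in *) res->ai_addr;
            ssize_t              n;

            n = write(fds[1], &sin->sin_addr, sizeof(sin->sin_addr));
            (void) n;
        }
        _exit(0);                             /* exit closes the pipe: EOF means done */
    }

    close(fds[1]);                            /* parent keeps only the read end */
    fcntl(fds[0], F_SETFL, O_NONBLOCK);
    return fds[0];
}

/* Returns 1 when an address was read into *addr, 0 when the lookup is
 * still pending, and -1 when the lookup failed. Does not wait on DNS. */
static int
resolve_poll(int fd, pid_t child, struct in_addr *addr)
{
    ssize_t  n = read(fd, addr, sizeof(*addr));

    if (n == -1 && (errno == EAGAIN || errno == EWOULDBLOCK || errno == EINTR)) {
        return 0;                             /* not ready; the caller keeps running */
    }

    close(fd);
    waitpid(child, NULL, 0);                  /* reap the child, which exits right after writing */
    return n == (ssize_t) sizeof(*addr) ? 1 : -1;
}
```

An event loop would register the returned descriptor for read events and call `resolve_poll` when it fires. Forking from a worker has costs of its own, so this only shows the shape of the idea.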

@xiaokai-wang
Member

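Taking a step back: even if it does block, normal requests won't be affected; the configuration update is just delayed slightly. Personally I think that is acceptable for now.
:)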

@bluestn
Author

bluestn commented Jan 11, 2017

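That's not the case. The nginx-upsync-module timer runs inside the worker process, and nginx workers are single-threaded. Once the nginx-upsync-module timer handler runs and blocks, the processing of all other connections in that worker is stalled. I'm not sure whether my understanding is correct. Thanks!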
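The argument above can be seen in a toy model. This is a sketch of a generic single-threaded loop, not nginx's event code; every name in it is made up for illustration. A timer handler that sleeps for 200 ms stands in for the slow lookup, and the request handler queued behind it cannot start until the timer handler returns.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Toy model of a single-threaded event loop; none of these names exist in nginx. */
typedef struct {
    void   (*handler)(void);   /* callback run by the loop */
    double   started_ms;       /* when the loop got around to running it */
} toy_event_t;

static double
now_ms(void)
{
    struct timespec  ts;

    clock_gettime(CLOCK_MONOTONIC, &ts);
    return ts.tv_sec * 1000.0 + ts.tv_nsec / 1e6;
}

/* Stands in for a timer handler stuck in a slow DNS lookup (200 ms here). */
static void
slow_timer_handler(void)
{
    struct timespec  delay = { 0, 200 * 1000 * 1000 };

    nanosleep(&delay, NULL);
}

/* Stands in for handling an ordinary client connection. */
static void
request_handler(void)
{
}

/* One loop iteration: handlers run strictly one after another, so each
 * event waits for every handler ahead of it to return. */
static void
toy_process_events(toy_event_t *events, size_t n)
{
    size_t  i;

    assert(events != NULL || n == 0);

    for (i = 0; i < n; i++) {
        events[i].started_ms = now_ms();
        events[i].handler();
    }
}
```

Queuing `slow_timer_handler` ahead of `request_handler` and comparing the two `started_ms` values shows the request starting roughly 200 ms late, even though it does no DNS work itself.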

@xiaokai-wang
Member

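Yes, if it blocks, it affects the requests handled by that worker.

How did you test that it blocked? With gdb, on the first execution? For now I don't think it blocks; could you share more information?

Thanks for the feedback!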

@bluestn
Author

bluestn commented Jan 11, 2017

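I ran nginx in non-daemon mode with a single worker for debugging. DNS resolution happened to be having problems at the time and was slow, and as a result the whole nginx hung. Since a single worker is affected, and under your design every worker runs the consul upstream fetch logic on its own, a multi-worker setup in a real environment should be affected as well.
I have already finished the code for asynchronous DNS resolution, but I don't know how to submit it.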

@xiaokai-wang
Member

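Have you tried testing the getaddrinfo interface on its own, by writing a test case?

Any patch is welcome; just submit it following the code style. Thank you very much!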
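A standalone test along these lines might look like the sketch below. It is not taken from the module; `time_getaddrinfo_ms` is a hypothetical helper, and the `hints` values are assumptions because the module's own are not shown in this thread. It measures how long the calling thread is held inside a single `getaddrinfo` call.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <netdb.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <time.h>

/* Hypothetical helper: resolve `host` once and return the wall-clock time
 * the call took, in milliseconds. The getaddrinfo() return code is stored
 * in *rc (0 on success). */
static double
time_getaddrinfo_ms(const char *host, int *rc)
{
    struct addrinfo  hints, *res = NULL;
    struct timespec  start, end;

    assert(host != NULL && rc != NULL);

    memset(&hints, 0, sizeof(hints));
    hints.ai_family = AF_UNSPEC;       /* assumption: the module's hints may differ */
    hints.ai_socktype = SOCK_STREAM;

    clock_gettime(CLOCK_MONOTONIC, &start);
    *rc = getaddrinfo(host, NULL, &hints, &res);   /* the caller waits here until this returns */
    clock_gettime(CLOCK_MONOTONIC, &end);

    if (*rc == 0) {
        freeaddrinfo(res);
    }

    return (end.tv_sec - start.tv_sec) * 1000.0
           + (end.tv_nsec - start.tv_nsec) / 1e6;
}
```

Calling it from a small `main` with a numeric address such as `127.0.0.1` should return almost immediately, since no DNS query is needed. Calling it with a hostname while the configured DNS server is slow or unreachable would be expected to show the delay described in this issue.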

@rouberg

rouberg commented Dec 8, 2021

Could domain name resolution be skipped, so that a configured domain name is set directly as the server? The domain name may map to a dynamic IP.
@xiaokai-wang
