Add consul_services_max_fails, consul_services_fail_timeout, fallback_peer and upsync_weight NodeMeta #200
base: master
Conversation
1. Backport changes from the lyokha/nginx-upsync-module repository (see the configuration sketch after this list):
   1.1 Add the consul_services_max_fails and consul_services_fail_timeout directives to control the max_fails and fail_timeout values for backends.
   1.2 Add the fallback_peer directive. It allows deleting all backends in an upstream by inserting the fallback_peer address, marked as down, as the only remaining backend; otherwise the error "cannot delete all peers" occurs.
2. Add functionality to change a backend's weight based on the consul agent node's NodeMeta (available since Consul 0.7.3), using a meta key named "upsync_weight". This is useful when your environment mixes different hardware configurations and can be more flexible than the "weight" ServiceTag. For example, Intel and AMD CPUs run at different speeds, so we should be able to balance requests according to CPU speed (i.e. node weight): more powerful nodes get a higher weight and can serve more requests.
3. Seed the srandom PRNG with the PID, seconds, and milliseconds.
4. Add current_weight, conns, max_conns and fails to show_upstream, which is more useful for backend status monitoring.
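Roughly, the new directives might sit inside an upstream block as sketched below. This is only an illustration: the directive names follow the PR title and description, while the upstream name, addresses, ports and parameter values are assumptions, not taken from this patch.

    upstream test_backend {
        # discover backends for this upstream from the consul services catalog
        upsync 127.0.0.1:8500/v1/catalog/service/test_backend upsync_timeout=6m upsync_interval=500ms upsync_type=consul_services strong_dependency=off;
        upsync_dump_path /usr/local/nginx/conf/servers/servers_test_backend.conf;

        # assumed defaults for backends whose consul registration carries no
        # max_fails / fail_timeout tags (item 1.1 above)
        consul_services_max_fails 3;
        consul_services_fail_timeout 10s;

        # assumed: address inserted (marked down) when consul returns zero instances,
        # so the upstream can be emptied without "cannot delete all peers" (item 1.2)
        fallback_peer 127.0.0.1:11111;

        # placeholder so the pool is non-empty at startup
        server 127.0.0.1:11111 down;
    }

The weight from NodeMeta (item 2) would be set on the consul node registration side, via a meta key named upsync_weight, rather than in the nginx configuration itself.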
hi eugenepaniot,
Thanks again.
Yes, these two directives, upsync_consul_services_max_fails and upsync_consul_services_fail_timeout, control the default values when a service hasn't been tagged.
Ok, agreed. I'll remove it.
When our application has 0 instances, the upstream should be emptied too. In our environment we use Mesos/Marathon for container orchestration. When a container starts on a compute node, Mesos assigns a random port for the iptables NAT rules; in the worst case (which we have hit a few times because we have a huge container cluster) nginx proxies requests to the wrong destination because the upstream hasn't been cleared.
Typo in "if (tags == NULL)"
Sorry eugenepaniot, you haven't persuaded me. The server attributes can be obtained from consul. If you haven't tagged a service, it seems you don't care about those values, which is why you haven't tagged it.
The module does not support deleting all servers of an upstream; this is to avoid rendering the service unusable.
But what if this is expected behaviour? In my case I need to clear the whole backend list because I have suspended my application.
Yes, correct.
You mean "fallback_peer" as a tag? But I won't have a service registration (and hence no tags) because I don't have any running service instances. Or do you mean from KV?
We also have cases where all upstream servers are unregistered from consul. That's why we put a dummy "server" in the "upstream", e.g.:
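The original snippet isn't preserved in this thread; a minimal sketch of such a placeholder follows, with the upstream name, address and port made up for illustration.

    upstream test_backend {
        upsync 127.0.0.1:8500/v1/catalog/service/test_backend upsync_timeout=6m upsync_interval=500ms upsync_type=consul_services strong_dependency=off;
        upsync_dump_path /usr/local/nginx/conf/servers/servers_test_backend.conf;

        # dummy entry, marked down so it never receives traffic; it only keeps the
        # pool non-empty and is replaced once consul returns live servers
        server 127.0.0.1:11111 down;
    }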
This is to make nginx happy, since nginx requires at least one server in the pool. Why can't you use that for your fallback peer? It will get replaced once consul returns live servers.
2. Add max_stale GET parameter to the service discovery request
3. Add wait GET parameter to the service discovery request
Why do you need this?
Scaling up: curl -X PUT -d '{"id":"111","name":"testnginx","address":"127.0.0.1","port":8001, "tags":["weight=4", "fail_timeout=13s", "max_fails=19"]}' http://127.0.0.1:8500/v1/agent/service/register
Scaling down: curl -X PUT -d '{"id":"111","name":"testnginx","address":"127.0.0.1","port":8001, "tags":["weight=4", "fail_timeout=13s", "max_fails=19", "down"]}' http://127.0.0.1:8500/v1/agent/service/register
Or simply take the service down.