Use internal LB's IP for intra-node communication #5209
Skipping CI for Draft Pull Request.
Codecov Report
All modified and coverable lines are covered by tests ✅
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #5209      +/-   ##
==========================================
+ Coverage   52.66%   53.02%   +0.36%
==========================================
  Files         273      273
  Lines       29189    29226      +37
==========================================
+ Hits        15371    15498     +127
+ Misses      13029    12926     -103
- Partials      789      802      +13
☔ View full report in Codecov by Sentry.
/hold for squash. Eager to see what's breaking.
9b5151d
to
7734183
Compare
spec:
  networkSpec:
    apiServerLB:
      # Hack: We pre-create this public IP and the DNS name to use it in the
We can get rid of this comment, as (arguably) this is a normal configuration and not a hack, and the security rules referred to are not universal across Azure.
Another solution would be to update the IP address that is passed to CAPI kubeadm. The approach taken here has the advantage of not needing the management cluster to be peered to the worker node vnet. I guess in scenarios where you want it fully private, you could go with a private cluster and bolt on a public IP address with ASO later, if desired, once Azure/azure-service-operator#4368 is resolved.
Anyway, keeping some form of comment here is probably useful for our future selves.
I like the idea of a comment that clarifies any housekeeping to ensure that nodes are able to connect to the apiserver locally (whether or not there is a private or public apiserver endpoint), but I don't think we want to reference a "security policy", which is not general for Azure, and is subject to change.
> Another solution would be to update the IP address that is passed to CAPI kubeadm
Is that IP address present in /etc/kubernetes/kubelet.conf, as referenced in Jack's proposal here #5209 (comment)?
> I like the idea of a comment that clarifies any housekeeping to ensure that nodes are able to connect to the apiserver locally (whether or not there is a private or public apiserver endpoint), but I don't think we want to reference a "security policy", which is not general for Azure, and is subject to change.
Addressed via 4996a36
@@ -189,7 +195,9 @@ spec:
       kubeletExtraArgs:
         cloud-provider: external
       name: '{{ ds.meta_data["local_hostname"] }}'
-  preKubeadmCommands: []
+  preKubeadmCommands:
+  - echo '10.0.0.100 ${CLUSTER_NAME}-${APISERVER_LB_DNS_SUFFIX}.${AZURE_LOCATION}.cloudapp.azure.com'
It might be cleaner to do this in postKubeadmCommands:
- echo "10.0.0.100 $(grep 'server:' /etc/kubernetes/kubelet.conf | awk -F'[/:]' '{print $5}')" >> /etc/hosts
This approach would allow us to entirely skip the need to pass in the dnsName manually in another part of the template, and we'd just rely upon the existing foo and kubeadm to tell us this info.
I'm pretty sure we'd need to do this as a postKubeadmCommand, and not a preKubeadmCommand. I would assume that kubeadm hasn't yet paved the kubeconfig by the time we invoke preKubeadmCommands.
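The extraction step suggested above can be sketched as a small shell function. This is only an illustration, not code from the PR: the function name `apiserver_host_from_kubeconfig` is hypothetical, and it assumes the kubeconfig holds a single `server: https://<host>:<port>` line in the usual layout.

```shell
#!/bin/sh
# Hypothetical helper (not part of the PR): print the host portion of the
# first `server:` URL found in a kubeconfig file.
# Usage: apiserver_host_from_kubeconfig [path]
apiserver_host_from_kubeconfig() {
  conf="${1:-/etc/kubernetes/kubelet.conf}"
  # Fail early if the file has not been written yet.
  [ -f "$conf" ] || return 1
  # Split on "/" and ":" so "    server: https://host:6443" yields the host in $5.
  awk -F'[/:]' '/server:/ { print $5; exit }' "$conf"
}
```

A post-kubeadm command could then append the entry with something like `host="$(apiserver_host_from_kubeconfig)" && echo "10.0.0.100 ${host}" >> /etc/hosts`. The early return makes the ordering concern visible: if the file does not exist yet, the function fails and nothing is written.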
> This approach would allow us to entirely skip the need to pass in the dnsName manually in another part of the template, and we'd just rely upon the existing foo and kubeadm to tell us this info.
I like the idea. One less variable to manage.
However, isn't it better to set up custom DNS resolution for the API server in preKubeadmCommands than in postKubeadmCommands?
By using preKubeadmCommands in this scenario, we can be sure that no Azure network policies will interfere with the kubeadm join step.
Meaning, if Azure policies (blocking internet access on the VM) were to kick in immediately after a VM comes up, preKubeadmCommands will ensure that kubeadm join runs successfully.
What do you say?
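As a rough sketch of the pre-kubeadm variant, the hosts entry can be written idempotently so that a re-run does not duplicate it. The function name `ensure_hosts_entry` and the optional hosts-file argument are hypothetical and are not part of the PR; the PR itself uses a plain `echo`.

```shell
#!/bin/sh
# Hypothetical helper (not part of the PR): append "IP NAME" to a hosts file
# unless NAME already appears as a hostname on some line of that file.
# Usage: ensure_hosts_entry IP NAME [hosts_file]
ensure_hosts_entry() {
  ip="$1"; name="$2"; hosts="${3:-/etc/hosts}"
  # Compare NAME against every hostname field (columns 2..NF) for an exact match.
  if ! awk -v n="$name" \
      '{ for (i = 2; i <= NF; i++) if ($i == n) found = 1 } END { exit !found }' \
      "$hosts" 2>/dev/null; then
    printf '%s %s\n' "$ip" "$name" >> "$hosts"
  fi
}
```

In a template this might be invoked as `ensure_hosts_entry 10.0.0.100 "${CLUSTER_NAME}-${APISERVER_LB_DNS_SUFFIX}.${AZURE_LOCATION}.cloudapp.azure.com"`, assuming the variables are substituted when the template is rendered. Because it does not read any file written by kubeadm, it can run before `kubeadm join`.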
Will /etc/kubernetes/kubelet.conf be present at the time that preKubeadmCommands runs? I assume that kubeadm is responsible for paving that file, which means we have to wait.
But that may not be practical for Windows nodes that come online later (it may be too late and, as you mention, kubeadm join may fail and block postKubeadmCommands from ever executing).
cc @jsturtevant
Let's postpone worrying about this for now so we can land this change sooner.
> Will /etc/kubernetes/kubelet.conf be present at the time that preKubeadmCommands runs?
I don't think it will be present; kubeadm creates it.
{
	Name: s.APIServerLB().Name + "-ilb-frontEnd",
	FrontendIPClass: infrav1.FrontendIPClass{
		PrivateIPAddress: infrav1.DefaultInternalLBIPAddress,
This could be made configurable via the spec in the future as @willie-yao mentioned below. Might be worth opening an issue for improvement
Dropping all the tweaks that enabled running Ginkgo tests locally.
/retest
0b5d19c
to
4996a36
Compare
- Pre-create a DNS name for the frontend LB's IP that gets assigned to the control plane VM.
- Pre-create an internal LB for the API server and set a default internal IP for it.
- Update templates with the new DNS name for the API server.
- Update /etc/hosts on the worker nodes with the internal load balancer's IP and the DNS name of the public LB.
4996a36
to
25bca8d
Compare
/lgtm
/approve
LGTM label has been added. Git tree hash: 454a533cb3eb8019dc9babf2ed32b32320c4b4f4
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: jackfrancis The full list of commands accepted by this bot can be found here. The pull request process is described here
/lgtm Thanks for helping fix this up!
@nawazkh: The following test failed, say
Can we drop the hold now that commits are squashed?
/retest
/unhold
/label kind/feature
@jackfrancis: The label(s) In response to this:
/kind feature
What type of PR is this?
/kind feature
What this PR does / why we need it:
Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes: #5258
Special notes for your reviewer:
Below templates are the only valid changes:
- templates/cluster-template-windows.yaml
- templates/cluster-template.yaml
PR: Update self-managed templates to use internal LB for node-to-node communication #5210 should fix this issue.
Below templates are not expected to change. Will probe further to update the kustomization.yaml for respective flavors:
- templates/cluster-template-ephemeral.yaml
- templates/cluster-template-edgezone.yaml
- templates/cluster-template-azure-cni-v1.yaml
- templates/cluster-template-azure-bastion.yaml
cherry-pick candidate
TODOs:
Release note: