Use internal LB's IP for intra-node communication #5209

Merged: 1 commit, Oct 30, 2024

Conversation

nawazkh
Member

@nawazkh nawazkh commented Oct 24, 2024

What type of PR is this?
/kind feature

What this PR does / why we need it:

  • Add pre-kubeadm join commands to worker nodes
  • Also pre-create a DNS name for the frontend LB's IP that gets assigned to the control plane VM.

Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes: #5258

Special notes for your reviewer:

  • The templates below are the only valid changes

    • templates/cluster-template-windows.yaml
    • templates/cluster-template.yaml
  • PR #5210 (Update self-managed templates to use internal LB for node-to-node communication) should fix this issue. The templates below are not expected to change; I will investigate further and update the kustomization.yaml for the respective flavors.

    • templates/cluster-template-ephemeral.yaml
    • templates/cluster-template-edgezone.yaml
    • templates/cluster-template-azure-cni-v1.yaml
    • templates/cluster-template-azure-bastion.yaml
  • cherry-pick candidate

TODOs:

  • squashed commits
  • includes documentation
  • adds unit tests

Release note:

Use internal LB's IP for intra-node communication

@k8s-ci-robot
Contributor

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@k8s-ci-robot k8s-ci-robot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. release-note Denotes a PR that will be considered when it comes time to generate release notes. kind/bug Categorizes issue or PR as related to a bug. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Oct 24, 2024
@k8s-ci-robot k8s-ci-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Oct 24, 2024

codecov bot commented Oct 24, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 53.02%. Comparing base (9ba44ee) to head (25bca8d).
Report is 29 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #5209      +/-   ##
==========================================
+ Coverage   52.66%   53.02%   +0.36%     
==========================================
  Files         273      273              
  Lines       29189    29226      +37     
==========================================
+ Hits        15371    15498     +127     
+ Misses      13029    12926     -103     
- Partials      789      802      +13     


azure/scope/cluster.go (outdated, resolved)
scripts/aks-as-mgmt.sh (outdated, resolved)
@nawazkh
Member Author

nawazkh commented Oct 28, 2024

/hold for squash. Eager to see what's breaking.

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Oct 28, 2024
@nawazkh nawazkh marked this pull request as ready for review October 28, 2024 18:54
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Oct 28, 2024
spec:
networkSpec:
apiServerLB:
# Hack: We pre-create this public IP and the DNS name to use it in the
Contributor

We can get rid of this comment, as (arguably) this is a normal configuration and not a hack, and the security rules referred to are not universal across Azure.

Contributor

Another solution would be to update the IP address that is passed to CAPI kubeadm. The approach taken here has the advantage of not needing to have the management cluster peered to the worker node vnet. I guess in scenarios where you want it fully private, you could go with a private cluster and bolt on a public IP address with ASO later, if desired, once Azure/azure-service-operator#4368 is resolved.

Anyway, keeping some form of comment here is probably useful for our future selves.

Contributor

I like the idea of a comment that clarifies any housekeeping to ensure that nodes are able to connect to the apiserver locally (whether or not there is a private or public apiserver endpoint), but I don't think we want to reference a "security policy", which is not general for Azure, and is subject to change.

Member Author

> Another solution would be to update the IP address that is passed to CAPI kubeadm

Is that IP Address present in /etc/kubernetes/kubelet.conf as referenced in Jack's proposal here #5209 (comment) ?

Member Author

> I like the idea of a comment that clarifies any housekeeping to ensure that nodes are able to connect to the apiserver locally (whether or not there is a private or public apiserver endpoint), but I don't think we want to reference a "security policy", which is not general for Azure, and is subject to change.

Addressed via 4996a36

@@ -189,7 +195,9 @@ spec:
 kubeletExtraArgs:
 cloud-provider: external
 name: '{{ ds.meta_data["local_hostname"] }}'
-preKubeadmCommands: []
+preKubeadmCommands:
+- echo '10.0.0.100 ${CLUSTER_NAME}-${APISERVER_LB_DNS_SUFFIX}.${AZURE_LOCATION}.cloudapp.azure.com'
Contributor

It might be cleaner to do this in postKubeadmCommands:

- echo "10.0.0.100 $(grep 'server:' /etc/kubernetes/kubelet.conf | awk -F'[/:]' '{print $5}')" >> /etc/hosts

This approach would allow us to entirely skip the need to pass in the dnsName manually in another part of the template, and we'd just rely upon the existing foo and kubeadm to tell us this info.

I'm pretty sure we'd need to do this as a postKubeadmCommand, and not as a preKubeadmCommand. I would assume that kubeadm hasn't yet paved the kubeconfig by the time we invoke preKubeadmCommands.
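
The lookup this suggestion relies on can be sketched as a small shell helper. This is only an illustration, assuming the kubelet kubeconfig contains a line shaped like `server: https://<host>:<port>`; the function name `apiserver_host` is hypothetical and not part of the PR, while the `grep`/`awk` pipeline is the one quoted in the comment.

```shell
#!/bin/sh
# Hypothetical helper (not from the PR): print the API server host name
# found in a kubeconfig-style file. Defaults to the kubelet's kubeconfig.
# Assumes a line shaped like "server: https://<host>:<port>".
apiserver_host() {
  conf="${1:-/etc/kubernetes/kubelet.conf}"
  # Splitting on "/" and ":" leaves the host in field 5:
  #   "    server" | " https" | "" | "" | "<host>" | "<port>"
  grep 'server:' "$conf" | awk -F'[/:]' '{print $5; exit}'
}
```

In a postKubeadmCommands entry this would be used roughly as `echo "10.0.0.100 $(apiserver_host)" >> /etc/hosts`. Note the double quotes: inside single quotes the `$(...)` substitution would not be expanded.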

Member Author

@nawazkh nawazkh Oct 29, 2024

> This approach would allow us to entirely skip the need to pass in the dnsName manually in another part of the template, and we'd just rely upon the existing foo and kubeadm to tell us this info.

I like the idea. One less variable to manage.

However, isn't it better to set up custom DNS resolution for the API server in preKubeadmCommands than in postKubeadmCommands?
By using preKubeadmCommands in this scenario, we can be sure that no Azure network policies will interfere with the kubeadm join step.
In other words, if Azure policies (blocking internet access on the VM) were to kick in immediately after a VM comes up, preKubeadmCommands would still ensure that kubeadm join runs successfully.

What do you say?
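
For reference, the preKubeadmCommands variant amounts to appending one line to the hosts file before `kubeadm join` runs. A minimal sketch follows, assuming the DNS name is known up front from the template variables; the helper name `add_hosts_entry` is hypothetical, and the duplicate check is an addition for illustration rather than something the PR does.

```shell
#!/bin/sh
# Hypothetical helper (not from the PR): append "<ip> <name>" to a hosts
# file unless that exact line is already present, so that running the
# command twice does not create duplicate entries.
add_hosts_entry() {
  ip="$1"; name="$2"; hosts="${3:-/etc/hosts}"
  # -x matches the whole line, -F treats the pattern as a fixed string.
  grep -qxF "$ip $name" "$hosts" 2>/dev/null || echo "$ip $name" >> "$hosts"
}
```

With the values from the template under review, the call would look like `add_hosts_entry 10.0.0.100 "${CLUSTER_NAME}-${APISERVER_LB_DNS_SUFFIX}.${AZURE_LOCATION}.cloudapp.azure.com"`.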

Contributor

Will /etc/kubernetes/kubelet.conf be present at the time that preKubeadmCommands runs? I assume that kubeadm is responsible for paving that file, which means we have to wait.

But that may not be practical for Windows nodes that come online later (it may be too late and, as you mention, kubeadm join may fail and block postKubeadmCommands from ever executing).

cc @jsturtevant

Contributor

Let's postpone worrying about this for now so we can land this change sooner.

Contributor

> Will /etc/kubernetes/kubelet.conf be present at the time that preKubeadmCommands runs?

I don't think it will be present; kubeadm creates it.

{
Name: s.APIServerLB().Name + "-ilb-frontEnd",
FrontendIPClass: infrav1.FrontendIPClass{
PrivateIPAddress: infrav1.DefaultInternalLBIPAddress,
Contributor

This could be made configurable via the spec in the future as @willie-yao mentioned below. Might be worth opening an issue for improvement

azure/scope/machine.go (outdated, resolved)
@nawazkh
Member Author

nawazkh commented Oct 29, 2024

Dropping all the tweaks that enabled running Ginkgo tests locally.

@nawazkh
Member Author

nawazkh commented Oct 29, 2024

/retest

- Pre-create a DNS name for the frontend LB's IP that gets assigned to the control plane VM.
- Pre-create an internal LB for the API server and set a default internal IP for it.
- Update templates with the new DNS name for the API server.
- Update /etc/hosts on the worker nodes with the internal load balancer's IP and the DNS name of the public LB.
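
One way to confirm the last item on a worker node is to look up which IP the hosts file maps the API server name to. The helper below is a hedged sketch; the name `hosts_ip_for` is hypothetical and not part of this change.

```shell
#!/bin/sh
# Hypothetical check (not from the PR): print the IP that a hosts file
# maps a given name to. The first match wins and comment lines are skipped.
hosts_ip_for() {
  name="$1"; hosts="${2:-/etc/hosts}"
  awk -v n="$name" '
    $1 !~ /^#/ {
      for (i = 2; i <= NF; i++) if ($i == n) { print $1; exit }
    }' "$hosts"
}
```

On a node configured by these templates, `hosts_ip_for "<api server DNS name>"` would be expected to print the internal load balancer's IP (10.0.0.100 in the templates shown in this PR), and nothing if the entry is missing.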
@nawazkh nawazkh changed the title from "Update Dev Templates" to "Use internal LB's IP for intra-node communication" Oct 29, 2024
Contributor

@jackfrancis jackfrancis left a comment

/lgtm
/approve

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Oct 29, 2024
@k8s-ci-robot
Contributor

LGTM label has been added.

Git tree hash: 454a533cb3eb8019dc9babf2ed32b32320c4b4f4

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: jackfrancis

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Oct 29, 2024
@jsturtevant
Contributor

/lgtm

Thanks for helping fix this up!

@k8s-ci-robot
Contributor

@nawazkh: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| pull-cluster-api-provider-azure-windows-custom-builds | 25bca8d | link | false | /test pull-cluster-api-provider-azure-windows-custom-builds |

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@nojnhuh
Contributor

nojnhuh commented Oct 30, 2024

Can we drop the hold now that commits are squashed?

@nawazkh
Member Author

nawazkh commented Oct 30, 2024

/retest

@nawazkh
Member Author

nawazkh commented Oct 30, 2024

/unhold

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Oct 30, 2024
@k8s-ci-robot k8s-ci-robot merged commit e591f24 into kubernetes-sigs:main Oct 30, 2024
32 of 33 checks passed
@k8s-ci-robot k8s-ci-robot added this to the v1.18 milestone Oct 30, 2024
@nawazkh nawazkh deleted the update_kubeadm_configs branch October 30, 2024 06:06
@jackfrancis
Contributor

/label kind/feature

@k8s-ci-robot
Contributor

@jackfrancis: The label(s) /label kind/feature cannot be applied. These labels are supported: api-review, tide/merge-method-merge, tide/merge-method-rebase, tide/merge-method-squash, team/katacoda, refactor, ci-short, ci-extended, ci-full. Is this label configured under labels -> additional_labels or labels -> restricted_labels in plugin.yaml?

In response to this:

/label kind/feature

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@jackfrancis jackfrancis added kind/feature Categorizes issue or PR as related to a new feature. and removed kind/bug Categorizes issue or PR as related to a bug. labels Oct 31, 2024
@nojnhuh
Contributor

nojnhuh commented Oct 31, 2024

/kind feature

Labels
  • approved: Indicates a PR has been approved by an approver from all required OWNERS files.
  • cncf-cla: yes: Indicates the PR's author has signed the CNCF CLA.
  • kind/feature: Categorizes issue or PR as related to a new feature.
  • lgtm: "Looks good to me", indicates that a PR is ready to be merged.
  • release-note: Denotes a PR that will be considered when it comes time to generate release notes.
  • size/L: Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

Add internal Load balancer for workload cluster's API Server
6 participants