Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use internal LB's IP for intra-node communication #5209

Merged
merged 1 commit into from
Oct 30, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
6 changes: 6 additions & 0 deletions Makefile
Original file line number Diff line number Diff line change
Expand Up @@ -203,6 +203,12 @@ export CI_RG ?= $(AZWI_RESOURCE_GROUP)
export USER_IDENTITY ?= $(addsuffix $(RANDOM_SUFFIX),$(CI_RG))
export AZURE_IDENTITY_ID_FILEPATH ?= $(ROOT_DIR)/azure_identity_id

# ensure that the APISERVER_LB_DNS_SUFFIX is of length 10 and contains only alphanumeric characters
# LC_ALL=C is used to set the locale to C to ensure that the output is ASCII
# head /dev/urandom generates random bytes. Will work on Linux and MacOS. Also works on Windows with WSL.
# Ignore SIGPIPE error. This will suppress error messages if head closes the pipe before tr finishes.
export APISERVER_LB_DNS_SUFFIX := $(shell LC_ALL=C tr -dc 'a-z0-9' < /dev/urandom | head -c 10 2>/dev/null)

## --------------------------------------
## Binaries
## --------------------------------------
Expand Down
4 changes: 4 additions & 0 deletions api/v1beta1/tags.go
Original file line number Diff line number Diff line change
Expand Up @@ -119,6 +119,10 @@ const (
// APIServerRole describes the value for the apiserver role.
APIServerRole = "apiserver"

// APIServerRoleInternal describes the value for the apiserver-internal role,
// an identifier for an internal load balancer serving apiserver traffic for cluster nodes.
APIServerRoleInternal = "apiserver-internal"
nawazkh marked this conversation as resolved.
Show resolved Hide resolved

// NodeOutboundRole describes the value for the node outbound LB role.
NodeOutboundRole = "nodeOutbound"

Expand Down
29 changes: 29 additions & 0 deletions azure/scope/cluster.go
Original file line number Diff line number Diff line change
Expand Up @@ -265,6 +265,35 @@ func (s *ClusterScope) LBSpecs() []azure.ResourceSpecGetter {
},
}

if s.APIServerLB().Type != infrav1.Internal {
specs = append(specs, &loadbalancers.LBSpec{
Name: s.APIServerLB().Name + "-internal",
ResourceGroup: s.ResourceGroup(),
SubscriptionID: s.SubscriptionID(),
ClusterName: s.ClusterName(),
Location: s.Location(),
ExtendedLocation: s.ExtendedLocation(),
VNetName: s.Vnet().Name,
VNetResourceGroup: s.Vnet().ResourceGroup,
SubnetName: s.ControlPlaneSubnet().Name,
FrontendIPConfigs: []infrav1.FrontendIP{
{
Name: s.APIServerLB().Name + "-internal-frontEnd", // TODO: improve this name.
FrontendIPClass: infrav1.FrontendIPClass{
PrivateIPAddress: infrav1.DefaultInternalLBIPAddress,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could be made configurable via the spec in the future as @willie-yao mentioned below. Might be worth opening an issue for improvement

},
},
},
APIServerPort: s.APIServerPort(),
Type: infrav1.Internal,
SKU: s.APIServerLB().SKU,
Role: infrav1.APIServerRoleInternal,
BackendPoolName: s.APIServerLB().BackendPool.Name + "-internal",
IdleTimeoutInMinutes: s.APIServerLB().IdleTimeoutInMinutes,
AdditionalTags: s.AdditionalTags(),
})
}

// Node outbound LB
if s.NodeOutboundLB() != nil {
specs = append(specs, &loadbalancers.LBSpec{
Expand Down
38 changes: 35 additions & 3 deletions azure/scope/cluster_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -2232,20 +2232,25 @@ func TestBackendPoolName(t *testing.T) {
}
clusterScope.AzureCluster.SetBackendPoolNameDefault()
got := clusterScope.LBSpecs()
g.Expect(got).To(HaveLen(3))
g.Expect(got).To(HaveLen(4))

// API server backend pool name
apiServerLBSpec := got[0].(*loadbalancers.LBSpec)
g.Expect(apiServerLBSpec.BackendPoolName).To(Equal(tc.expectedAPIServerBackendPoolName))
g.Expect(apiServerLBSpec.Role).To(Equal(infrav1.APIServerRole))

// API server backend pool name
apiServerILBSpec := got[1].(*loadbalancers.LBSpec)
g.Expect(apiServerILBSpec.BackendPoolName).To(Equal(tc.expectedAPIServerBackendPoolName + "-internal"))
g.Expect(apiServerILBSpec.Role).To(Equal(infrav1.APIServerRoleInternal))

// Node backend pool name
NodeLBSpec := got[1].(*loadbalancers.LBSpec)
NodeLBSpec := got[2].(*loadbalancers.LBSpec)
g.Expect(NodeLBSpec.BackendPoolName).To(Equal(tc.expectedNodeBackendPoolName))
g.Expect(NodeLBSpec.Role).To(Equal(infrav1.NodeOutboundRole))

// Control Plane backend pool name
controlPlaneLBSpec := got[2].(*loadbalancers.LBSpec)
controlPlaneLBSpec := got[3].(*loadbalancers.LBSpec)
g.Expect(controlPlaneLBSpec.BackendPoolName).To(Equal(tc.expectedControlPlaneBackendPoolName))
g.Expect(controlPlaneLBSpec.Role).To(Equal(infrav1.ControlPlaneOutboundRole))
})
Expand Down Expand Up @@ -2667,6 +2672,33 @@ func TestClusterScope_LBSpecs(t *testing.T) {
"foo": "bar",
},
},
&loadbalancers.LBSpec{
Name: "api-server-lb-internal",
ResourceGroup: "my-rg",
SubscriptionID: "123",
ClusterName: "my-cluster",
Location: "westus2",
VNetName: "my-vnet",
VNetResourceGroup: "my-rg",
SubnetName: "cp-subnet",
FrontendIPConfigs: []infrav1.FrontendIP{
{
Name: "api-server-lb-internal-frontEnd",
FrontendIPClass: infrav1.FrontendIPClass{
PrivateIPAddress: infrav1.DefaultInternalLBIPAddress,
},
},
},
APIServerPort: 6443,
Type: infrav1.Internal,
SKU: infrav1.SKUStandard,
Role: infrav1.APIServerRoleInternal,
BackendPoolName: "api-server-lb-backend-pool-internal",
IdleTimeoutInMinutes: ptr.To[int32](30),
AdditionalTags: infrav1.Tags{
"foo": "bar",
},
},
&loadbalancers.LBSpec{
Name: "node-outbound-lb",
ResourceGroup: "my-rg",
Expand Down
2 changes: 2 additions & 0 deletions azure/scope/machine.go
Original file line number Diff line number Diff line change
Expand Up @@ -298,6 +298,8 @@ func (m *MachineScope) BuildNICSpec(nicName string, infrav1NetworkInterface infr
spec.InternalLBName = m.APIServerLBName()
spec.InternalLBAddressPoolName = m.APIServerLBPoolName()
} else {
spec.InternalLBName = m.APIServerLBName() + "-internal"
spec.InternalLBAddressPoolName = m.APIServerLBPoolName() + "-internal"
spec.PublicLBNATRuleName = m.Name()
spec.PublicLBAddressPoolName = m.APIServerLBPoolName()
}
Expand Down
8 changes: 4 additions & 4 deletions azure/scope/machine_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -2467,8 +2467,8 @@ func TestMachineScope_NICSpecs(t *testing.T) {
PublicLBName: "api-lb",
PublicLBAddressPoolName: "api-lb-backendPool",
PublicLBNATRuleName: "machine-name",
InternalLBName: "",
InternalLBAddressPoolName: "",
InternalLBName: "api-lb-internal",
InternalLBAddressPoolName: "api-lb-backendPool-internal",
PublicIPName: "",
AcceleratedNetworking: nil,
DNSServers: nil,
Expand Down Expand Up @@ -2578,8 +2578,8 @@ func TestMachineScope_NICSpecs(t *testing.T) {
PublicLBName: "api-lb",
PublicLBAddressPoolName: "api-lb-backendPool",
PublicLBNATRuleName: "machine-name",
InternalLBName: "",
InternalLBAddressPoolName: "",
InternalLBName: "api-lb-internal",
InternalLBAddressPoolName: "api-lb-backendPool-internal",
PublicIPName: "",
AcceleratedNetworking: nil,
DNSServers: []string{"123.123.123.123", "124.124.124.124"},
Expand Down
4 changes: 2 additions & 2 deletions azure/services/loadbalancers/spec.go
Original file line number Diff line number Diff line change
Expand Up @@ -214,7 +214,7 @@ func getOutboundRules(lbSpec LBSpec, frontendIDs []*armnetwork.SubResource) []*a
}

func getLoadBalancingRules(lbSpec LBSpec, frontendIDs []*armnetwork.SubResource) []*armnetwork.LoadBalancingRule {
if lbSpec.Role == infrav1.APIServerRole {
if lbSpec.Role == infrav1.APIServerRole || lbSpec.Role == infrav1.APIServerRoleInternal {
// We disable outbound SNAT explicitly in the HTTPS LB rule and enable TCP and UDP outbound NAT with an outbound rule.
// For more information on Standard LB outbound connections see https://learn.microsoft.com/azure/load-balancer/load-balancer-outbound-connections.
var frontendIPConfig *armnetwork.SubResource
Expand Down Expand Up @@ -255,7 +255,7 @@ func getBackendAddressPools(lbSpec LBSpec) []*armnetwork.BackendAddressPool {
}

func getProbes(lbSpec LBSpec) []*armnetwork.Probe {
if lbSpec.Role == infrav1.APIServerRole {
if lbSpec.Role == infrav1.APIServerRole || lbSpec.Role == infrav1.APIServerRoleInternal {
return []*armnetwork.Probe{
{
Name: ptr.To(httpsProbe),
Expand Down
25 changes: 21 additions & 4 deletions scripts/aks-as-mgmt.sh
Original file line number Diff line number Diff line change
Expand Up @@ -24,12 +24,11 @@ source "${REPO_ROOT}/hack/ensure-azcli.sh" # install az cli and login using WI
source "${REPO_ROOT}/hack/ensure-tags.sh" # set the right timestamp and job name

KUBECTL="${REPO_ROOT}/hack/tools/bin/kubectl"
KIND="${REPO_ROOT}/hack/tools/bin/kind"
AZWI="${REPO_ROOT}/hack/tools/bin/azwi"
make --directory="${REPO_ROOT}" "${KUBECTL##*/}" "${KIND##*/}" "${AZWI##*/}"
make --directory="${REPO_ROOT}" "${KUBECTL##*/}" "${AZWI##*/}"

export MGMT_CLUSTER_NAME="${MGMT_CLUSTER_NAME:-aks-mgmt-capz}-${RANDOM_SUFFIX}" # management cluster name
export AKS_RESOURCE_GROUP="${AKS_RESOURCE_GROUP:-aks-mgmt-capz}-${RANDOM_SUFFIX}" # resource group name
export MGMT_CLUSTER_NAME="${MGMT_CLUSTER_NAME:-aks-mgmt-capz-${RANDOM_SUFFIX}}" # management cluster name
export AKS_RESOURCE_GROUP="${AKS_RESOURCE_GROUP:-aks-mgmt-capz-${RANDOM_SUFFIX}}" # resource group name
export AKS_NODE_RESOURCE_GROUP="node-${AKS_RESOURCE_GROUP}"
export KUBERNETES_VERSION="${KUBERNETES_VERSION:-v1.30.2}"
export AZURE_LOCATION="${AZURE_LOCATION:-westus2}"
Expand All @@ -48,6 +47,16 @@ export AZURE_SUBSCRIPTION_ID="${AZURE_SUBSCRIPTION_ID:-}"
export AZURE_CLIENT_ID="${AZURE_CLIENT_ID:-}"
export AZURE_TENANT_ID="${AZURE_TENANT_ID:-}"

# to suppress unbound variable error message
export APISERVER_LB_DNS_SUFFIX="${APISERVER_LB_DNS_SUFFIX:-}"
export AKS_MI_CLIENT_ID="${AKS_MI_CLIENT_ID:-}"
export AKS_MI_OBJECT_ID="${AKS_MI_OBJECT_ID:-}"
export AKS_MI_RESOURCE_ID="${AKS_MI_RESOURCE_ID:-}"
export MANAGED_IDENTITY_NAME="${MANAGED_IDENTITY_NAME:-}"
export MANAGED_IDENTITY_RG="${MANAGED_IDENTITY_RG:-}"
export ASO_CREDENTIAL_SECRET_MODE="${ASO_CREDENTIAL_SECRET_MODE:-}"
export SKIP_AKS_CREATE="${SKIP_AKS_CREATE:-false}"

main() {

echo "--------------------------------"
Expand All @@ -66,12 +75,19 @@ main() {
echo "SERVICE_ACCOUNT_SIGNING_PUB_FILEPATH: $SERVICE_ACCOUNT_SIGNING_PUB_FILEPATH"
echo "SERVICE_ACCOUNT_SIGNING_KEY_FILEPATH: $SERVICE_ACCOUNT_SIGNING_KEY_FILEPATH"
echo "REGISTRY: $REGISTRY"
echo "APISERVER_LB_DNS_SUFFIX: $APISERVER_LB_DNS_SUFFIX"

echo "AZURE_SUBSCRIPTION_ID: $AZURE_SUBSCRIPTION_ID"
echo "AZURE_CLIENT_ID: $AZURE_CLIENT_ID"
echo "AZURE_TENANT_ID: $AZURE_TENANT_ID"
echo "--------------------------------"

# if using SKIP_AKS_CREATE=true, skip creating the AKS cluster
if [[ "${SKIP_AKS_CREATE}" == "true" ]]; then
echo "Skipping AKS cluster creation"
return
fi

create_aks_cluster
set_env_varaibles
}
Expand Down Expand Up @@ -178,6 +194,7 @@ kustomize_substitutions:
CLUSTER_IDENTITY_TYPE: "UserAssignedMSI"
ASO_CREDENTIAL_SECRET_MODE: "${ASO_CREDENTIAL_SECRET_MODE}"
REGISTRY: "${REGISTRY}"
APISERVER_LB_DNS_SUFFIX: "${APISERVER_LB_DNS_SUFFIX}"
allowed_contexts:
- "$MGMT_CLUSTER_NAME"
- "kind-capz"
Expand Down
10 changes: 9 additions & 1 deletion templates/cluster-template-azure-bastion.yaml

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

10 changes: 9 additions & 1 deletion templates/cluster-template-azure-cni-v1.yaml

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

10 changes: 9 additions & 1 deletion templates/cluster-template-edgezone.yaml

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

10 changes: 9 additions & 1 deletion templates/cluster-template-ephemeral.yaml

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

14 changes: 12 additions & 2 deletions templates/cluster-template-windows.yaml

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

10 changes: 9 additions & 1 deletion templates/cluster-template.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -29,6 +29,12 @@ spec:
name: ${CLUSTER_IDENTITY_NAME}
location: ${AZURE_LOCATION}
networkSpec:
apiServerLB:
frontendIPs:
- name: ${CLUSTER_NAME}-api-lb
publicIP:
dnsName: ${CLUSTER_NAME}-${APISERVER_LB_DNS_SUFFIX}.${AZURE_LOCATION}.cloudapp.azure.com
name: ${CLUSTER_NAME}-api-lb
subnets:
- name: control-plane-subnet
role: control-plane
Expand Down Expand Up @@ -189,7 +195,9 @@ spec:
kubeletExtraArgs:
cloud-provider: external
name: '{{ ds.meta_data["local_hostname"] }}'
preKubeadmCommands: []
preKubeadmCommands:
- echo '10.0.0.100 ${CLUSTER_NAME}-${APISERVER_LB_DNS_SUFFIX}.${AZURE_LOCATION}.cloudapp.azure.com'
nawazkh marked this conversation as resolved.
Show resolved Hide resolved
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be cleaner to do this in postKubeadmCommands:

- echo '10.0.0.100 $(grep 'server:' /etc/kubernetes/kubelet.conf | awk -F[/:] '{print $5}') >> /etc/hosts

This approach would allow us to entirely skip the need to pass in the dnsName manually in another part of the template, and we'd just rely upon the existing foo and kubeadm to tell us this info.

I'm pretty sure we'd need to do this as a postKubeadmCommand, and not preKubeadmCommand, kubeadm command. I would assume that kubeadm hasn't yet paved the kubeconfig by the time we invoke preKubeadmCommands.

Copy link
Member Author

@nawazkh nawazkh Oct 29, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This approach would allow us to entirely skip the need to pass in the dnsName manually in another part of the template, and we'd just rely upon the existing foo and kubeadm to tell us this info.

I like the idea. One less variable to manage.

However, isn't it better to setup custom DNS resolution for the API Server in preKubeadmCommands than in postKubeadmCommands?
By using preKubeadmCommand, in this scenario, we can be sure that no Azure network policies will interfere in the kubeadm joinstep.
Meaning, if Azure policies (blocking internet access on the VM) were to kick in immediately after a VM comes up, preKubeadmCommands will ensure that kubeadm join runs successfully.

What do you say?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will /etc/kubernetes/kubelet.conf be present at the time that preKubeadmCommands runs? I assume that kubeadm is responsible for paving that file, which means we have to wait.

But that may not be practical for Windows nodes that come online later (it may be too late and as you mention, kubeadm join may fail and block postKubeadmCommands from ever executing.

cc @jsturtevant

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's postpone worrying about this for now so we can land this change sooner.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will /etc/kubernetes/kubelet.conf be present at the time that preKubeadmCommands runs?

I don't think it will be present, kubeadm creates it.

>> /etc/hosts
---
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: AzureClusterIdentity
Expand Down
3 changes: 3 additions & 0 deletions templates/flavors/default/kustomization.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -5,8 +5,11 @@ resources:
- ../base
- machine-deployment.yaml
- ../../azure-cluster-identity

patches:
- path: ../../azure-cluster-identity/azurecluster-identity-ref.yaml
- path: patches/control-plane.yaml
- path: patches/kubeadm-config-template.yaml

sortOptions:
order: fifo
Loading
Loading