Pods fail to start with annotation "kubernetes.io/egress-bandwidth" #791

Open
paulben opened this issue Sep 26, 2024 · 4 comments

Comments

@paulben

paulben commented Sep 26, 2024

Pods with the annotation kubernetes.io/egress-bandwidth: 10M fail to start when Network Observability Operator 1.6.2 is installed. Pod events show:

...failed to create pod network sandbox k8s_php-sample-6cfff549d-7fvw5_mywebapp_88fa15ea-5251-4931-99f0-9c021f2f34a9_0(ebbdf6643f2ad7cf4b6cd0c82f7008db13219987206fb54d46355865b6e7aeda): error adding pod mywebapp_php-sample-6cfff549d-7fvw5 to CNI network "multus-cni-network"...

Which raises the question: Are there OS requirements for nodes?

The above failure occurs on OpenShift 4.14.34 with (AMD64) nodes at:

sh-4.4# cat /etc/redhat-release
Red Hat Enterprise Linux release 8.10 (Ootpa)
sh-4.4# uname -a
Linux kube-cotssgfw0jdq7e85d7sg-lsprototype-default-000002a3 4.18.0-513.24.1.el8_9.x86_64 #1 SMP Thu Mar 14 14:20:09 EDT 2024 x86_64 x86_64 x86_64 GNU/Linux
sh-4.4# 

The failure does not occur on OpenShift 4.14.27 with nodes at:

sh-4.4# cat /etc/redhat-release 
Red Hat Enterprise Linux release 8.6 (Ootpa)
sh-4.4# uname -a
Linux worker0.paul-network-metrics.cp.fyre.ibm.com 5.14.0-284.66.1.el9_2.x86_64 #1 SMP PREEMPT_DYNAMIC Mon May 6 14:51:27 EDT 2024 x86_64 x86_64 x86_64 GNU/Linux
sh-4.4#
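
For anyone trying to reproduce this, a pod carrying the annotation might look like the minimal sketch below. The pod name and namespace are taken from the event message above; the container name and image are hypothetical placeholders, not the actual workload from this report.

```yaml
# Minimal reproduction sketch (assumed manifest, not the reporter's actual one).
apiVersion: v1
kind: Pod
metadata:
  name: php-sample
  namespace: mywebapp
  annotations:
    # Egress rate limit requested through the CNI bandwidth annotation.
    kubernetes.io/egress-bandwidth: 10M
spec:
  containers:
    - name: php-sample                               # hypothetical container name
      image: registry.example.com/php-sample:latest  # placeholder image
```

If the conflict described in this issue applies, a pod like this would be expected to stay in ContainerCreating with the sandbox error shown above; that is an expectation from this report, not something verified here.
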
@jotak
Member

jotak commented Sep 27, 2024

Hi @paulben ,

Thanks for reporting this issue. Do you know which CNI implements this rate-limiting annotation? Is it Calico? I ask because we are already aware of a limitation when a similar annotation is used with Calico alongside netobserv: there is a conflict between the eBPF programs. As far as I can tell, the program loaded by netobserv should support chaining with other BPF programs, but that might not be the case for the other program that is loaded.
If this is what I suspect, we might also need to ask the folks maintaining this upstream for their collaboration.

cc @msherif1234 - we need to see if we must create an issue upstream in containernetworking.

@jotak
Member

jotak commented Sep 27, 2024

@paulben do the 2 clusters that you mention have a similar network configuration regarding CNIs / multus?

@paulben
Author

paulben commented Sep 27, 2024

@jotak On the failing cluster:

$ oc get network.config/cluster -o jsonpath='{.status.networkType}{"\n"}'
Calico

On the "working" cluster:

$ oc get network.config/cluster -o jsonpath='{.status.networkType}{"\n"}'
OVNKubernetes

I'm not sure how to get further CNI/Multus configuration details. Can you advise?

@jotak
Member

jotak commented Oct 3, 2024

I don't see anything we can do on our side; it's up to Calico / the container plugins to allow other probes to run.
In OpenShift 4.16, however, the problem should be solved because it provides new TCx hooks that better handle this sort of conflict. We haven't tested with Calico and bandwidth annotations, though, so it would be good to check.
