This repository has been archived by the owner on Mar 20, 2024. It is now read-only.

Feat kind support map pinning v3 #67

Merged: 20 commits into intel:main from feat_kind_support_map_pinning_v3 on Oct 17, 2023


maryamtahhan
Contributor

Rebasing the previous PR to support bpf map pinning after Kind support was merged to main.

Going to close PR 59

Will transition from draft after some local testing.

@maryamtahhan
Contributor Author

maryamtahhan commented Jul 18, 2023

Tested with a kind cluster (make run-on-kind).

First, configure the unprivileged_bpf_disabled kernel flag on the kind worker nodes:

$ docker exec af-xdp-deployment-worker sysctl kernel.unprivileged_bpf_disabled=0
kernel.unprivileged_bpf_disabled = 0
$ docker exec af-xdp-deployment-worker2 sysctl kernel.unprivileged_bpf_disabled=0
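
For clusters with more workers, the same step could be scripted. The following is only a sketch: the function names are hypothetical, and it assumes worker nodes follow the `<cluster>-worker`, `<cluster>-worker2` naming seen above and that `kind` and `docker` are on the PATH.

```shell
# Hypothetical helper: keep only worker node names from a list on stdin
# (assumes names end in "-worker" or "-worker<N>", as in the commands above).
filter_workers() {
    grep -- '-worker[0-9]*$'
}

# Hypothetical helper: set kernel.unprivileged_bpf_disabled=0 on every
# worker node of a kind cluster.
# $1: kind cluster name (af-xdp-deployment in this test).
enable_unprivileged_bpf() {
    kind get nodes --name "$1" | filter_workers | while read -r node; do
        docker exec "$node" sysctl kernel.unprivileged_bpf_disabled=0
    done
}
```

It would be invoked as `enable_unprivileged_bpf af-xdp-deployment`.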

Used the following NAD:

apiVersion: "k8s.cni.cncf.io/v1"
kind: NetworkAttachmentDefinition
metadata:
 name: afxdp-network
 annotations:
   k8s.v1.cni.cncf.io/resourceName: afxdp/myPool
spec:
 config: '{
     "cniVersion": "0.3.0",
     "type": "afxdp",
     "mode": "primary",
     "logFile": "afxdp-cni.log",
     "logLevel": "debug",
     "dpSyncer": true,
     "ipam": {
       "type": "host-local",
       "subnet": "192.168.1.0/24",
       "rangeStart": "192.168.1.200",
       "rangeEnd": "192.168.1.220",
       "routes": [
         { "dst": "0.0.0.0/0" }
       ],
       "gateway": "192.168.1.1"
     }
   }'

and the following pod spec:

apiVersion: v1
kind: Pod
metadata:
 name: cndp-0-0
 annotations:
   k8s.v1.cni.cncf.io/networks: afxdp-network 
spec:
 containers:
   - name: cndp-0
     command: ["/bin/bash"]
     args: ["-c", "./jsonc_gen.sh -kp ; cndpfwd -c config.jsonc lb;"]
     image:  quay.io/mtahhan/cndp-map-pinning:latest 
     imagePullPolicy: IfNotPresent
     securityContext:
       capabilities:
         add:
           - NET_RAW
           - IPC_LOCK
     resources:
       requests:
         afxdp/myPool: '1'
       limits:
         afxdp/myPool: '1'

The container image also needs to be loaded onto the kind workers:

$ kind load --name af-xdp-deployment docker-image quay.io/mtahhan/cndp-map-pinning:latest 

Then, on creating and deleting the cndp pod, the Device Plugin (DP) logs are updated accordingly with BPF map pinning messages.

The cndp pod log itself should show:

**** PINNED_BPF_MAP is enabled
libbpf: can't get next link: Operation not permitted

*** CNDPFWD Forward Application, API: XSKDEV, Mode: Loopback, Burst Size: 256 
   Initial Thread ID    1 on lcore 1
   Forwarding Thread ID 28 on lcore 0

DP logs on pod creation:

DEBU[2023-07-18 12:23:24] [poolManager.go:220] [Allocate] Primary mode                                 
DEBU[2023-07-18 12:23:24] [poolManager.go:232] [Allocate] Cycling state of device veth12               
INFO[2023-07-18 12:23:24] [poolManager.go:250] [Allocate] Loading BPF program on device: veth12 and pinning the map 
INFO[2023-07-18 12:23:24] [mapManager.go:163] [CreateBPFFS] created a directory /var/run/afxdp_dp/afxdp-maps/6c1eab5d-e6ba-472a-8eff-ce90a45481bc 
INFO[2023-07-18 12:23:24] [mapManager.go:168] [CreateBPFFS] Created BPFFS mount point at /var/run/afxdp_dp/afxdp-maps/6c1eab5d-e6ba-472a-8eff-ce90a45481bc 
INFO[2023-07-18 12:23:24] [bpfWrapper.go:135] [Infof] Load_bpf_pin_xsk_map: if_index for interface veth12 is 10 
libbpf: Error in bpf_create_map_xattr(xsks_map):No error information(-524). Retrying without BTF.
INFO[2023-07-18 12:23:24] [bpfWrapper.go:135] [Infof] Load_bpf_pin_xsk_map: bpf: Attach prog to ifindex 10 
INFO[2023-07-18 12:23:24] [bpfWrapper.go:135] [Infof] Load_bpf_pin_xsk_map: xsk map pinned to /var/run/afxdp_dp/afxdp-maps/6c1eab5d-e6ba-472a-8eff-ce90a45481bc/xsks_map 
DEBU[2023-07-18 12:23:24] [poolManager.go:267] [Allocate] mapping /var/run/afxdp_dp/afxdp-maps/6c1eab5d-e6ba-472a-8eff-ce90a45481bc/xsks_map to /tmp/xsks_map 
DEBU[2023-07-18 12:23:24] [poolManager.go:289] [Allocate] Container environment variables: {
  "AFXDP_DEVICES": "veth12"
} 

DP logs on pod deletion:

DEBU[2023-07-18 12:23:24] [poolManager.go:220] [Allocate] Primary mode                                 
DEBU[2023-07-18 12:23:24] [poolManager.go:232] [Allocate] Cycling state of device veth12               
INFO[2023-07-18 12:23:24] [poolManager.go:250] [Allocate] Loading BPF program on device: veth12 and pinning the map 
INFO[2023-07-18 12:23:24] [mapManager.go:163] [CreateBPFFS] created a directory /var/run/afxdp_dp/afxdp-maps/6c1eab5d-e6ba-472a-8eff-ce90a45481bc 
INFO[2023-07-18 12:23:24] [mapManager.go:168] [CreateBPFFS] Created BPFFS mount point at /var/run/afxdp_dp/afxdp-maps/6c1eab5d-e6ba-472a-8eff-ce90a45481bc 
INFO[2023-07-18 12:23:24] [bpfWrapper.go:135] [Infof] Load_bpf_pin_xsk_map: if_index for interface veth12 is 10 
libbpf: Error in bpf_create_map_xattr(xsks_map):No error information(-524). Retrying without BTF.
INFO[2023-07-18 12:23:24] [bpfWrapper.go:135] [Infof] Load_bpf_pin_xsk_map: bpf: Attach prog to ifindex 10 
INFO[2023-07-18 12:23:24] [bpfWrapper.go:135] [Infof] Load_bpf_pin_xsk_map: xsk map pinned to /var/run/afxdp_dp/afxdp-maps/6c1eab5d-e6ba-472a-8eff-ce90a45481bc/xsks_map 
DEBU[2023-07-18 12:23:24] [poolManager.go:267] [Allocate] mapping /var/run/afxdp_dp/afxdp-maps/6c1eab5d-e6ba-472a-8eff-ce90a45481bc/xsks_map to /tmp/xsks_map 
DEBU[2023-07-18 12:23:24] [poolManager.go:289] [Allocate] Container environment variables: {
  "AFXDP_DEVICES": "veth12"
} 
INFO[2023-07-18 12:27:16] [server.go:66] [DelNetDev] Looking up Map Manager for veth12            
INFO[2023-07-18 12:27:16] [server.go:83] [DelNetDev] Map Manager found, deleting BPFFS for veth12 
INFO[2023-07-18 12:27:16] [mapManager.go:293] [DeleteBPFFS] Deleted BPFFS mount point at /var/run/afxdp_dp/afxdp-maps/6c1eab5d-e6ba-472a-8eff-ce90a45481bc 
INFO[2023-07-18 12:27:16] [server.go:90] [DelNetDev] Network interface veth12 deleted 
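
To check that every BPFFS mount point created by the DP is also cleaned up, the log lines above could be paired up with a small filter. This is a sketch that relies only on the `[CreateBPFFS] Created BPFFS mount point at` and `[DeleteBPFFS] Deleted BPFFS mount point at` messages shown in these logs; the function name is hypothetical.

```shell
# Hypothetical helper: read a DP log on stdin and print every BPFFS mount
# point that was created but has no matching deletion message.
leaked_bpffs_mounts() {
    awk '
        /\[CreateBPFFS\] Created BPFFS mount point at / { live[$NF] = 1 }
        /\[DeleteBPFFS\] Deleted BPFFS mount point at / { delete live[$NF] }
        END { for (path in live) print path }
    '
}
```

Feeding it the deletion log above should print nothing, since the mount point created at 12:23:24 is deleted at 12:27:16. The log could be piped in from wherever the DP logs are collected, for example `kubectl logs "$DP_POD" | leaked_bpffs_mounts`, where `DP_POD` is a placeholder for the device plugin pod name.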

@maryamtahhan maryamtahhan marked this pull request as ready for review July 18, 2023 12:28
@maryamtahhan
Contributor Author

I just spotted this in the cndp pod log:

libbpf: can't get next link: Operation not permitted

This is not expected, but at least it doesn't block this PR. Let me see if CAP_BPF is the issue here; it should be either CAP_BPF or unprivileged_bpf_disabled.

@maryamtahhan
Contributor Author

maryamtahhan commented Jul 18, 2023

TL;DR: not a blocker for this PR.
OK, I think we can ignore that warning.
Per libbpf/libbpf@8628610c322a, it looks like it's a probe under the hood of libbpf: when bpf link support isn't detected, it falls back to the netlink-based XDP prog. We indeed don't have permission to make this bpf call if the pod is unprivileged.
I did notice, however, that even CAP_BPF didn't provide enough privilege for this call; the pod can only make that call when it is privileged.

@garyloug
Contributor

Hey @maryamtahhan, we've seen libbpf: can't get next link: Operation not permitted before. It's pod privileges we think, running as root fixed it, but obviously that's not the solution.

ok so it's a probe for libbpf rather than breaking functionality? Or is it breaking functionality?

Is it something we could ask the DP to configure for us?

Kind of related:
Client being created in #65 and we'll put a C wrapper on this once finalised.

@maryamtahhan
Contributor Author

Hey @maryamtahhan, we've seen libbpf: can't get next link: Operation not permitted before. It's pod privileges we think, running as root fixed it, but obviously that's not the solution.

ok so it's a probe for libbpf rather than breaking functionality? Or is it breaking functionality?

Yeah - it's an internal probe under the hood of libbpf :( it doesn't break functionality from what I can see. CNDP can still successfully create the AF_XDP socket and doesn't fail.

Is it something we could ask the DP to configure for us?

I don't think so.

Kind of related: Client being created in #65 and we'll put a C wrapper on this once finalised.

Cool, I will check it out.

@johnoloughlin

Capabilities don't get added to a non-root user's shell. You need to use setcap on the specific binary that needs the capability in the Dockerfile (you can't do it in a running container). Then you also need to have the matching capability in the pod spec.
You can easily check the capabilities of the current shell with capsh --print.
getcap can be used to check the specific capabilities of a binary.
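
Where capsh isn't installed in the image, roughly the same check can be made from the CapEff line in /proc. A minimal sketch, assuming the capability bit numbers from linux/capability.h (CAP_NET_RAW=13, CAP_IPC_LOCK=14, CAP_SYS_ADMIN=21, CAP_BPF=39); the function names are hypothetical.

```shell
# Capability bit numbers, as defined in <linux/capability.h>.
CAP_NET_RAW=13
CAP_IPC_LOCK=14
CAP_SYS_ADMIN=21
CAP_BPF=39

# Hypothetical helper: has_cap HEXMASK BIT succeeds if bit BIT is set in
# HEXMASK, a hex mask in the format of the CapEff line in /proc/<pid>/status.
has_cap() {
    [ $(( (0x$1 >> $2) & 1 )) -eq 1 ]
}

# Hypothetical helper: print the effective capability mask of the current shell.
current_capeff() {
    awk '/^CapEff:/ { print $2 }' "/proc/$$/status"
}
```

For example, `has_cap "$(current_capeff)" "$CAP_BPF" && echo "CAP_BPF is effective"` reports whether the shell itself holds CAP_BPF, which is separate from any file capabilities set on a binary with setcap.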

@maryamtahhan
Contributor Author

Capabilities don't get added to a non-root user's shell. You need to use setcap on the specific binary that needs the capability in the Dockerfile (you can't do it in a running container). Then you also need to have the matching capability in the pod spec. You can easily check the capabilities of the current shell with capsh --print. getcap can be used to check the specific capabilities of a binary.

Sorry, what's the context here?

@johnoloughlin

My mistake, I thought that you were running as a non-root user, based on Gary's comment:
"libbpf: can't get next link: Operation not permitted before. It's pod privileges we think, running as root fixed it"
such as https://github.com/intel/afxdp-plugins-for-kubernetes/blob/main/test/e2e/pod-1c1d.yaml#L9, where you set runAsUser: 1500, or that you had baked the non-root user into the Dockerfile.
I realise now that he meant an unprivileged root user.

garyloug previously approved these changes Oct 12, 2023
Signed-off-by: Maryam Tahhan <[email protected]>
Note gRPC is implemented over UDS at this point. Next step
is to look into mTLS between the CNI and the DP.

@maryamtahhan
Contributor Author

I've rebased on main and tested in Kind... everything is working as expected.

@patrickog11 patrickog11 merged commit 0ccf674 into intel:main Oct 17, 2023
6 checks passed
@maryamtahhan maryamtahhan deleted the feat_kind_support_map_pinning_v3 branch May 13, 2024 13:15
4 participants