docs: update the data-plane README.md
The old README.md was from a long time ago when the data-plane
used to utilize XDP instead of TC, and was wildly out-of-date.

Signed-off-by: Shane Utt <[email protected]>
shaneutt committed Aug 16, 2024
1 parent d011de2 commit bf21226
Showing 2 changed files with 61 additions and 166 deletions.
2 changes: 1 addition & 1 deletion controlplane/README.md
@@ -12,7 +12,7 @@ https://github.com/kubernetes-sigs/blixt/discussions/150.
- [ ] TCPRoute reconciler
- [ ] UDPRoute reconciler

## Getting started
## Development

First you'll need to create a Kubernetes cluster (with `kind`):

225 changes: 60 additions & 165 deletions dataplane/README.md
@@ -1,191 +1,86 @@
# Some helpful hints for debugging this XDP program
# Blixt DataPlane

## Tracing XDP redirect (on first interface where main XDP program is attached)
In this directory you'll find the data-plane code for Blixt. The [extended
Berkeley Packet Filter (eBPF)][eBPF] available in the [Linux Kernel] is used as
the data-plane to support TCP and UDP ingress.

(TODO finish tracing the XDP path through the kernel)
1. Entry at `xdp_do_redirect`
- Frags Don't work `xdp_buff_has_frags`
- If map == XSKMAP -> `__xdp_do_redirect_xsk`
- Returns `__xdp_do_redirect_frame`
[eBPF]:https://www.kernel.org/doc/html/latest/bpf/index.html
[Linux Kernel]:https://www.kernel.org/

2. Entry `__xdp_do_redirect_frame` (Can't trace internal functions?)
## Overview

In this directory you'll find the following sub-directories:

## Tracing Once packet meets host end of veth
* `common` - shared libraries with common types and tools
* `ebpf` - the [eBPF] code, which programs routes into the Kernel
* `loader` - a userland program which loads the eBPF code into the Kernel
* `api-server` - a [gRPC] API where configuration changes are pushed

(TODO finish tracing the XDP path through the kernel)
__netif_receive_skb_core
This enables serving TCP and UDP ingress traffic.

The data-plane is normally programmed by pairing it with the control-plane,
using the [TCPRoute] and [UDPRoute] APIs in Kubernetes. However, it can also
be programmed directly with a gRPC client when needed for
development/testing.

## Debugging UDP Checksum issues
> **Note**: Before the gRPC API existed, the data-plane pulled its
> configuration from the Kubernetes API rather than having configuration
> pushed to it. The current approach was adopted because it made development
> and debugging easier in the interim, but we expect to eventually drop the
> gRPC API and move back to a Kubernetes controller design.
We can use `tcpdump` to see whether the checksums are correct once the packets reach the container:
[eBPF]:https://www.kernel.org/doc/html/latest/bpf/index.html
[gRPC]:https://grpc.io/
[TCPRoute]:https://gateway-api.sigs.k8s.io/reference/spec/#gateway.networking.k8s.io/v1alpha2.TCPRoute
[UDPRoute]:https://gateway-api.sigs.k8s.io/reference/spec/#gateway.networking.k8s.io/v1alpha2.UDPRoute

```bash
sudo tcpdump -vvv -i <Container Interface> -neep udp
## Development

First you'll need to create a Kubernetes cluster (with `kind`):

```console
make build.cluster
```

`__sum16 __skb_checksum_complete(struct sk_buff *skb)` is the kernel function
that actually verifies the checksum. Its return value can be traced with
`bpftrace` and the following kretprobe:
With that cluster in place, you can make your changes locally, and then
build and push those changes to the cluster with:

```bash
kretprobe:__skb_checksum_complete
{
    printf("skb_checksum_complete returned: %x\n", retval);
}
```console
make load.image.dataplane TAG=latest
```

## Manually Calculating UDP Checksums
This will build the container image, and load it into the cluster.

A UDP cksum is calculated with the following:
Then deploy the manifest, which will create the `DaemonSet` which uses the
image you just loaded in the cluster:

```bash
1's Complement {
    Source IP +
    Destination IP +
    17 (0x0011 - UDP protocol code) +
    UDP Packet Length +
    Source Port +
    Destination Port +
    UDP Packet Length +
    Data
}
```console
kubectl kustomize config/dataplane | kubectl apply -f -
```

A raw `tcpdump` capture of a packet is shown below:
```bash
13:23:15.756911 06:56:87:ec:fd:1f > 86:ad:33:29:ff:5e, ethertype IPv4 (0x0800), length 60: (tos 0x0, ttl 57, id 20891, offset 0, flags [DF], proto UDP (17), length 33)
10.8.125.12.58980 > 192.168.10.2.sapv1: [bad udp cksum 0xd301 -> 0xaf43!] UDP, length 5
0x0000: 86ad 3329 ff5e 0656 87ec fd1f 0800 4500
0x0010: 0021 519b 4000 3911 9e72 0a08 7d0c c0a8
0x0020: 0a02 e664 2693 000d d301 7465 7374 0a00
0x0030: 0000 0000 d2f2 935d 0000 0000
```
From here on out, any time you want to push your new changes to the cluster
all you have to do is re-run:

Using this along with our knowledge of the UDP packet layout, we can manually
calculate the checksum like so:
```bash
0x0a08 Src IP (high 16 bits)
0x7d0c Src IP (low 16 bits)
0xc0a8 Dst IP (high 16 bits)
0x0a02 Dst IP (low 16 bits)
0x0011 Proto
0x000d Length
0xe664 Src Port
0x2693 Dst Port
0x000d Length
0x7465 Data
0x7374 Data
0x0a00 Data
+
-------------
350b9 -> fold carry = 50bc -> 1's complement = af43
```console
make load.image.dataplane TAG=latest
```
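
The manual UDP checksum calculation above can also be reproduced with a short script. This is only a minimal sketch in Python, assuming IPv4 and the field values read from the tcpdump capture; the helper names `ones_complement_sum` and `udp_checksum` are made up for illustration and are not part of Blixt.

```python
import struct


def ones_complement_sum(data: bytes) -> int:
    """Sum 16-bit big-endian words, folding any carry back into the low 16 bits."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length data with a trailing zero byte
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


def udp_checksum(src_ip: bytes, dst_ip: bytes,
                 src_port: int, dst_port: int, payload: bytes) -> int:
    """Compute the UDP checksum over the IPv4 pseudo-header, UDP header and data."""
    udp_len = 8 + len(payload)  # the UDP header is 8 bytes
    # Pseudo-header: source IP, destination IP, zero byte, protocol (17), UDP length.
    pseudo_header = src_ip + dst_ip + struct.pack("!BBH", 0, 17, udp_len)
    # UDP header with the checksum field set to zero while computing.
    udp_header = struct.pack("!HHHH", src_port, dst_port, udp_len, 0)
    checksum = ~ones_complement_sum(pseudo_header + udp_header + payload) & 0xFFFF
    # A computed value of 0 is transmitted as 0xFFFF, since 0 means "no checksum".
    return checksum or 0xFFFF


# Values taken from the capture: 10.8.125.12.58980 > 192.168.10.2.9875, data "test\n".
checksum = udp_checksum(bytes([10, 8, 125, 12]), bytes([192, 168, 10, 2]),
                        58980, 9875, b"test\n")
print(hex(checksum))  # 0xaf43, the value tcpdump reports as the expected checksum
```

The odd-length payload (5 bytes) is padded with a zero byte for the sum only, which is why the last data word in the manual calculation is `0x0a00`.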

To explore this same raw data in Wireshark, convert the text from the hex dump
to the following format. With this in a file, you can then use
"Import from Hex Dump" in Wireshark.
This will build the image, load the image, and perform a rollout to restart
the `Pods` with the new image.

```bash
13:23:15
0000 86 ad 33 29 ff 5e 06 56 87 ec fd 1f 08 00 45 00
0010 00 21 51 9b 40 00 39 11 9e 72 0a 08 7d 0c c0 a8
0020 0a 02 e6 64 26 93 00 0d d3 01 74 65 73 74 0a 00
0030 00 00 00 00 d2 f2 93 5d 00 00 00 00
```
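
The conversion from the tcpdump hex rows to the format above can be scripted. The following is a rough sketch in Python, assuming the dump contains only hex columns (no ASCII column); the function name `tcpdump_hex_to_wireshark` is hypothetical, and the timestamp line shown in the example above is not generated.

```python
import re

# Matches tcpdump hex rows such as "0x0010:  0021 519b 4000 ..."
HEX_ROW = re.compile(r"\s*0x([0-9a-fA-F]{4}):\s+(.*)")


def tcpdump_hex_to_wireshark(lines):
    """Turn tcpdump hex rows into 'offset byte byte ...' rows, skipping other lines."""
    converted = []
    for line in lines:
        match = HEX_ROW.match(line)
        if not match:
            continue  # header lines (timestamps, addresses) are not hex rows
        offset, groups = match.group(1), match.group(2)
        hex_digits = "".join(groups.split())
        # Split the digit run into single octets separated by spaces.
        octets = [hex_digits[i:i + 2] for i in range(0, len(hex_digits), 2)]
        converted.append(f"{offset} " + " ".join(octets))
    return converted


dump = [
    "0x0020: 0a02 e664 2693 000d d301 7465 7374 0a00",
    "0x0030: 0000 0000 d2f2 935d 0000 0000",
]
for row in tcpdump_hex_to_wireshark(dump):
    print(row)
```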
To push test configurations to the data-plane you can use the [xtask] provided
in this directory which includes a `grpc-client` command for manually sending
data-plane configuration to the data-plane's [gRPC] API.

To view the documentation for this, run:

![Above Raw packet shown in wireshark](./wireshark.png)

## Tracing Non XDP stack (native kernel) with PWRU

Cilium's [PWRU](https://github.com/cilium/pwru) is a great tool for tracing packets
as they make their way through the Linux kernel. It currently doesn't really
trace the XDP stack, but it's still very helpful for debugging other issues.
### Working Trace (manually re-writing Cksums)
```bash
0xffff96d3956d4f00 8 [ksoftirqd/8] udp4_gro_receive
0xffff96d3956d4f00 8 [ksoftirqd/8] udp_gro_receive
0xffff96d3956d4f00 8 [ksoftirqd/8] skb_defer_rx_timestamp
0xffff96d3956d4f00 8 [ksoftirqd/8] tpacket_rcv
0xffff96d3956d4f00 8 [ksoftirqd/8] skb_push
0xffff96d3956d4f00 8 [ksoftirqd/8] tpacket_get_timestamp
0xffff96d3956d4f00 8 [ksoftirqd/8] consume_skb
0xffff96d3956d4f00 10 [nc] skb_consume_udp
0xffff96d3956d4f00 10 [nc] skb_consume_udp
0xffff96d3956d4f00 10 [nc] __consume_stateless_skb
0xffff96d3956d4f00 10 [nc] skb_release_data
0xffff96d3956d4f00 10 [nc] skb_free_head
0xffff96d3956d4f00 10 [nc] kfree_skbmem
0xffff96d3956d4f00 8 [ksoftirqd/8] ip_rcv_core
0xffff96d3956d4f00 8 [ksoftirqd/8] pskb_trim_rcsum_slow
0xffff96d3956d4f00 8 [ksoftirqd/8] udp_v4_early_demux
0xffff96d3956d4f00 8 [ksoftirqd/8] ip_route_input_noref
0xffff96d3956d4f00 8 [ksoftirqd/8] ip_route_input_rcu
0xffff96d3956d4f00 8 [ksoftirqd/8] ip_route_input_slow
0xffff96d3956d4f00 8 [ksoftirqd/8] fib_validate_source
0xffff96d3956d4f00 8 [ksoftirqd/8] __fib_validate_source
0xffff96d3956d4f00 8 [ksoftirqd/8] ip_local_deliver
0xffff96d3956d4f00 8 [ksoftirqd/8] ip_local_deliver_finish
0xffff96d3956d4f00 8 [ksoftirqd/8] ip_protocol_deliver_rcu
0xffff96d3956d4f00 8 [ksoftirqd/8] raw_local_deliver
0xffff96d3956d4f00 8 [ksoftirqd/8] udp_rcv
0xffff96d3956d4f00 8 [ksoftirqd/8] __udp4_lib_rcv
0xffff96d3956d4f00 8 [ksoftirqd/8] __skb_checksum_complete
0xffff96d3956d4f00 8 [ksoftirqd/8] udp_unicast_rcv_skb
0xffff96d3956d4f00 8 [ksoftirqd/8] udp_queue_rcv_skb
0xffff96d3956d4f00 8 [ksoftirqd/8] udp_queue_rcv_one_skb
0xffff96d3956d4f00 8 [ksoftirqd/8] sk_filter_trim_cap
0xffff96d3956d4f00 8 [ksoftirqd/8] security_sock_rcv_skb
0xffff96d3956d4f00 8 [ksoftirqd/8] selinux_socket_sock_rcv_skb
0xffff96d3956d4f00 8 [ksoftirqd/8] selinux_sock_rcv_skb_compat
0xffff96d3956d4f00 8 [ksoftirqd/8] selinux_netlbl_sock_rcv_skb
0xffff96d3956d4f00 8 [ksoftirqd/8] selinux_xfrm_sock_rcv_skb
0xffff96d3956d4f00 8 [ksoftirqd/8] bpf_lsm_socket_sock_rcv_skb
```console
cargo xtask grpc-client --help
```

### Working Trace (ignoring cksums, i.e. setting them to 0)
```bash
0xffff96d35c18f000 8 [<empty>] udp4_gro_receive
0xffff96d35c18f000 8 [<empty>] udp_gro_receive
0xffff96d35c18f000 8 [<empty>] skb_defer_rx_timestamp
0xffff96d35c18f000 8 [<empty>] tpacket_rcv
0xffff96d35c18f000 8 [<empty>] skb_push
0xffff96d35c18f000 8 [<empty>] tpacket_get_timestamp
0xffff96d35c18f000 10 [nc] skb_consume_udp
0xffff96d35c18f000 10 [nc] skb_consume_udp
0xffff96d35c18f000 10 [nc] __consume_stateless_skb
0xffff96d35c18f000 10 [nc] skb_release_data
0xffff96d35c18f000 10 [nc] skb_free_head
0xffff96d35c18f000 10 [nc] kfree_skbmem
0xffff96d35c18f000 8 [<empty>] consume_skb
0xffff96d35c18f000 8 [<empty>] ip_rcv_core
0xffff96d35c18f000 8 [<empty>] pskb_trim_rcsum_slow
0xffff96d35c18f000 8 [<empty>] udp_v4_early_demux
0xffff96d35c18f000 8 [<empty>] ip_route_input_noref
0xffff96d35c18f000 8 [<empty>] ip_route_input_rcu
0xffff96d35c18f000 8 [<empty>] ip_route_input_slow
0xffff96d35c18f000 8 [<empty>] fib_validate_source
0xffff96d35c18f000 8 [<empty>] __fib_validate_source
0xffff96d35c18f000 8 [<empty>] ip_local_deliver
0xffff96d35c18f000 8 [<empty>] ip_local_deliver_finish
0xffff96d35c18f000 8 [<empty>] ip_protocol_deliver_rcu
0xffff96d35c18f000 8 [<empty>] raw_local_deliver
0xffff96d35c18f000 8 [<empty>] udp_rcv
0xffff96d35c18f000 8 [<empty>] __udp4_lib_rcv # ----> No CKSUM so we don't call __skb_checksum_complete
0xffff96d35c18f000 8 [<empty>] udp_unicast_rcv_skb
0xffff96d35c18f000 8 [<empty>] udp_queue_rcv_skb
0xffff96d35c18f000 8 [<empty>] udp_queue_rcv_one_skb
0xffff96d35c18f000 8 [<empty>] sk_filter_trim_cap
0xffff96d35c18f000 8 [<empty>] security_sock_rcv_skb
0xffff96d35c18f000 8 [<empty>] selinux_socket_sock_rcv_skb
0xffff96d35c18f000 8 [<empty>] selinux_sock_rcv_skb_compat
0xffff96d35c18f000 8 [<empty>] selinux_netlbl_sock_rcv_skb
0xffff96d35c18f000 8 [<empty>] selinux_xfrm_sock_rcv_skb
0xffff96d35c18f000 8 [<empty>] bpf_lsm_socket_sock_rcv_skb
0xffff96d35c18f000 8 [<empty>] skb_pull_rcsum
```
> **Note**: You can alternatively deploy the control-plane to develop and test
> as well. This is helpful in any case, since any changes made here eventually
> need to be reflected in the control-plane code.
[xtask]:https://docs.rs/xtasks/latest/xtasks/
[gRPC]:https://grpc.io/
