=================
== CJ Virtucio ==
=================

Cilium (Part 1)

cilium kubernetes

Introduction

This is the beginning of a series explaining the Cilium CNI plugin. Part 1 will explain what Cilium is and how it works. Part 2 will discuss its implementation of the Gateway API. Finally, part 3 will talk about securing your cluster with network policies.

What is Cilium?

Cilium is a plugin that implements the Container Network Interface (CNI). Like every CNI plugin, it determines how traffic is handled in a Kubernetes cluster.

An eBPF technology

What makes Cilium stand out is that it is an eBPF technology. eBPF stands for extended Berkeley Packet Filter. It is a way to extend the Linux kernel without patching it or loading kernel modules. eBPF programs are written in C, compiled into eBPF bytecode, and then loaded into the kernel, where they must pass a validation process before they are allowed to run. There are several benefits to eBPF, the most notable of which are this validation process and access to efficient data structures like maps.
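As a rough illustration of that workflow outside of Cilium (a hedged sketch: the file names are hypothetical, and it assumes a Linux host with clang, bpftool, root access, and a mounted BPF filesystem), a standalone eBPF program is typically built and loaded like this:

% clang -O2 -g -target bpf -c my_prog.c -o my_prog.o    # compile restricted C into eBPF bytecode
% bpftool prog load my_prog.o /sys/fs/bpf/my_prog       # the kernel verifier validates the program while loading it
% bpftool prog show pinned /sys/fs/bpf/my_prog          # confirm the program was accepted and pinned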

How does Cilium use eBPF?

Cilium manages traffic by segregating the network into endpoints. Traffic between endpoints is governed by network policies. When a policy is added by creating a custom resource (CR), Cilium translates its rules into state represented by eBPF maps within the kernel. Those maps are then read by the eBPF programs for each endpoint, which Cilium also compiles and loads. All of this happens through a combination of an operator (for managing the CRs), a daemon, and an Envoy proxy (for dealing with L7 traffic forwarded to it by the eBPF programs).
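As a quick, hedged way to see that translation at work (the pod name is taken from the listings later in this article and will differ in your cluster; newer Cilium releases ship the agent CLI as cilium-dbg), you can ask the agent for its realized policy and the maps it maintains:

% kubectl --namespace cilium exec --stdin -c cilium-agent cilium-dsztz -- cilium policy get    # the policy repository built from the CRs
% kubectl --namespace cilium exec --stdin -c cilium-agent cilium-dsztz -- cilium map list      # the eBPF maps the agent maintains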

Key components

Operator

The operator manages the statuses of CRs.
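For example (the policy name and namespace below are hypothetical), you can inspect these CRs and their status fields with kubectl:

% kubectl get ciliumnetworkpolicies --all-namespaces
% kubectl --namespace my-namespace describe ciliumnetworkpolicy my-policy    # the Status section is where those updates land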

Daemon

The daemon listens for changes to the CRs and tries to “regenerate” eBPF state by compiling the CRs’ rules into maps. The success or failure of this regeneration will prompt the operator to update the CRs’ statuses accordingly.

The daemon also segregates the network into “endpoints”, assigning a numeric identity to each one. Some endpoints derive their labels from the ones they already carry as Kubernetes resources. Others, such as the Ingress proxy and the Kubernetes host, get reserved labels like ingress and host.

The daemon handles L3 and L4 traffic.
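You can list those endpoints and identities through the agent CLI (a hedged example: the pod name is taken from the listings later in this article, and newer releases name the CLI cilium-dbg):

% kubectl --namespace cilium exec --stdin -c cilium-agent cilium-dsztz -- cilium endpoint list    # numeric endpoint IDs, identities, and labels
% kubectl --namespace cilium exec --stdin -c cilium-agent cilium-dsztz -- cilium identity list    # includes reserved identities such as host and ingress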

Ingress proxy

The eBPF programs that the daemon compiles and loads into the kernel are insufficient for complex L7 traffic. The eBPF programs forward L7 traffic to an Ingress proxy (Envoy) that runs alongside the daemon in the same Kubernetes pod. The daemon translates L7-related state into config files for Envoy to use.
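If you want to poke around yourself (a loose sketch: the path below is an assumption and depends on your Cilium version and on whether Envoy runs embedded in the agent or as a separate DaemonSet), the generated Envoy artifacts and the CRs that feed them can be found with:

% kubectl --namespace cilium exec --stdin -c cilium-agent cilium-dsztz -- ls /var/run/cilium/envoy/    # assumed location of the bootstrap config and xDS sockets
% kubectl get ciliumenvoyconfigs --all-namespaces                                                      # CRs carrying L7 listener/route config for Envoy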

How does it work?

Here is a diagram summarizing some of the Cilium daemon’s relevant components:

[diagram: cilium daemon]

The daemon handles API calls made by the CNI plugin to its REST API. Each call causes the daemon to execute a particular handler function, which enqueues an event that eventually results in a method call kicking off the BPF state regeneration process.
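One hedged way to observe the outcome of a regeneration is to read an endpoint’s status log through the agent CLI (the endpoint ID 152 and the pod name are taken from the listings later in this article and will differ in your cluster):

% kubectl --namespace cilium exec --stdin -c cilium-agent cilium-dsztz -- cilium endpoint log 152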

During state regeneration, the maps are updated in the kernel through a process called “pinning”. Pinning exposes a BPF object as a file on the BPF filesystem (in Cilium’s case, a file within a subfolder nested under /sys/fs/bpf/tc/globals):

% kubectl --namespace cilium exec --stdin -c cilium-agent cilium-dsztz -- ls /sys/fs/bpf/tc/globals/
cilium_auth_map
cilium_call_policy
cilium_calls_00152
cilium_calls_00322
cilium_calls_00348
cilium_calls_00369
cilium_calls_00406

The file will appear empty from the perspective of userspace applications:

% kubectl --namespace cilium exec --stdin -c cilium-agent cilium-dsztz -- stat /sys/fs/bpf/tc/globals/cilium_policy_02834
  File: /sys/fs/bpf/tc/globals/cilium_policy_02834
  Size: 0         	Blocks: 0          IO Block: 4096   regular empty file
Device: 1bh/27d	Inode: 144954474   Links: 1
Access: (0600/-rw-------)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2024-03-28 19:27:48.556132927 +0000
Modify: 2024-03-28 19:27:48.533132217 +0000
Change: 2024-03-28 19:27:48.533132217 +0000
 Birth: -
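The emptiness is only from userspace’s point of view; the contents can still be read through the bpf() syscall, which is what bpftool wraps. As a hedged example (it assumes bpftool is available inside the agent container, which may not hold for every image), a pinned map can be dumped by its path:

% kubectl --namespace cilium exec --stdin -c cilium-agent cilium-dsztz -- bpftool map show pinned /sys/fs/bpf/tc/globals/cilium_policy_02834
% kubectl --namespace cilium exec --stdin -c cilium-agent cilium-dsztz -- bpftool map dump pinned /sys/fs/bpf/tc/globals/cilium_policy_02834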

In reality, the file is a handle to maps and similar data structures that are visible only to the kernel. Meanwhile, the eBPF programs that will consume these maps are compiled (or re-compiled) and loaded into the kernel. Their object files are placed in each endpoint’s directory under /var/run/cilium/state:

% kubectl --namespace cilium exec --stdin -c cilium-agent cilium-dsztz -- ls /var/run/cilium/state/152/
bpf_lxc.o
ep_config.h
template.o

These programs appear as Executable and Linkable Format (ELF) object files.
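One way to verify that (hedged: it assumes readelf is installed on your workstation, and the endpoint ID will differ in your cluster) is to copy an object file out of the pod and read its ELF header and sections:

% kubectl --namespace cilium cp -c cilium-agent cilium-dsztz:/var/run/cilium/state/152/bpf_lxc.o ./bpf_lxc.o
% readelf -h ./bpf_lxc.o    # the ELF header reports the machine type as Linux BPF
% readelf -S ./bpf_lxc.o    # sections holding the program code and map definitions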

The eBPF programs are only sufficient for L3/L4 traffic. L7 traffic is redirected to the Envoy proxy, which manages it using the Envoy configuration files generated by the daemon.

That concludes part 1 of this series. This hopefully demystifies Cilium’s implementation details. The next article in this series will explain how to use Cilium’s Gateway API implementation.