
Like most cloud providers, Google does not charge users for networking traffic within a single availability zone. Google currently has 124 availability zones, which are spread out over 41 regions.
Packets traveling across zones, however, have to pay a gatekeeper. At last check, packets traveling between the U.S. and Europe cost $0.05 per GB, and Europe to/from Asia, $0.08. It can add up quickly.
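To get a feel for how quickly these charges accumulate, here is a rough back-of-the-envelope sketch. The per-GB rates are the ones quoted above; the traffic volume, the region keys and the function name `transfer_cost` are hypothetical, chosen only for illustration.

```python
# Per-GB rates as quoted in the article; real pricing may differ.
RATE_PER_GB = {
    ("us", "europe"): 0.05,
    ("europe", "asia"): 0.08,
}


def transfer_cost(src: str, dst: str, gigabytes: float) -> float:
    """Return the charge in dollars for moving `gigabytes` between two areas."""
    # The quoted rates apply in both directions, so try the pair in either order.
    key = (src, dst) if (src, dst) in RATE_PER_GB else (dst, src)
    if key not in RATE_PER_GB:
        raise ValueError(f"no rate known for {src} -> {dst}")
    return gigabytes * RATE_PER_GB[key]


# Hypothetical volume: 2,000 GB per day for 30 days.
monthly_gb = 2_000 * 30
print(f"${transfer_cost('us', 'europe', monthly_gb):,.2f} per month")
```

Even at a nickel per gigabyte, a steady stream of traffic turns into a noticeable line item on the monthly bill.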
By the nature of its monitoring service, Polar Signals has an immense amount of Kubernetes cross-zone network traffic. Polar Signals Cloud is a hosted service to store and analyze continuous profiling data of users’ systems.
Continuous profiling collects data on the current stack from every process in a network infrastructure. And though it uses a relatively low sampling rate, user data still builds up quickly. By mid-2024, cross-zone traffic accounted for nearly half of the company’s Google Cloud bill.
With no tools that could measure cross-zone Kubernetes traffic specifically, company engineers built a monitor that was instrumental in cutting cross-zone traffic bills by 50%, according to a case study posted Thursday by the eBPF Foundation.
The secret sauce? eBPF.
eBPF to the Rescue
At first, the company turned to commercial and open source network monitoring tools to characterize this traffic. For its commercial platform, the company uses Kubernetes and Cilium for container network management.
For Kubernetes, kubectl has the top nodes command, which could deliver reports on network traffic (transmit and receive bytes) at the node level. Also, integrated into the Kubelet is cAdvisor, which collects data on network traffic.
None of these tools, however, could easily identify the network traffic crossing between Google zones. Cilium itself could only report daily aggregated costs of network traffic, with no granularity at the pod or workload level.
So, the company built one itself, using eBPF.
The company was no stranger to eBPF. It had already created a memory tracker tool for its customers, called Parca, which uses eBPF.
eBPF is a new Linux kernel technology that has been generating interest across system providers over the past few years. It is, in effect, a sandbox environment that runs within the Linux kernel with minimal overhead.
Small eBPF programs are triggered by hooks, or events, within the kernel, such as system calls or network events.
The initial requirements to build such a program consisted of the following:
- Accurately monitor and log cross-zone traffic,
- Integrate seamlessly with Kubernetes metadata,
- Provide real-time metrics.
Introducing Kubezonnet
The resulting open source program, Kubezonnet (KUBErnetes cross-ZONe NETwork monitoring), was released last week.
Built by four Polar Signals engineers, Kubezonnet monitors and measures Google cross-zone network traffic generated and consumed by Kubernetes clusters.
It was eBPF’s programmability, as well as its integration with Kubernetes metadata, that allowed Polar Signals to develop software specifically for monitoring pod traffic.
The software is deployed on Kubernetes, so installing it is as simple as applying the manifest. It requires the Cilium CNI to be run in legacy host routing mode and Linux kernel 6.4 or above, the version that introduced support for the Netfilter eBPF programs the tool relies on.
The software uses eBPF to trace network packets and aggregate traffic data.
Kubezonnet has two components: an agent and a server.
An agent is deployed on each node.
As the engineers explained, Kubezonnet uses Netfilter post-routing hooks to trace the network packets leaving pods, aggregating traffic data over 10-second intervals.
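The windowed aggregation can be pictured with a small user-space sketch. This is not Kubezonnet's code: the real agent does this work in the kernel with eBPF, whereas the snippet below simply sums packet sizes per source and destination IP in 10-second buckets. The tuple layout and the function name `aggregate` are assumptions made for illustration.

```python
from collections import defaultdict

INTERVAL_SECONDS = 10  # aggregation window described in the article


def aggregate(packets):
    """Sum bytes per (window_start, src_ip, dst_ip).

    `packets` is an iterable of (timestamp_seconds, src_ip, dst_ip, size_bytes)
    tuples, a hypothetical record format used only in this sketch.
    """
    totals = defaultdict(int)
    for ts, src, dst, size in packets:
        # Round the timestamp down to the start of its 10-second window.
        window = int(ts // INTERVAL_SECONDS) * INTERVAL_SECONDS
        totals[(window, src, dst)] += size
    return dict(totals)


sample = [
    (0.5, "10.0.0.1", "10.0.1.2", 1500),
    (9.9, "10.0.0.1", "10.0.1.2", 500),
    (10.0, "10.0.0.1", "10.0.1.2", 700),
]
print(aggregate(sample))
```

Aggregating before sending keeps the volume of data shipped off each node small compared with reporting every packet individually.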
The collected data is sent to a central server, or set of servers, which resolves the source and destination IPs to Kubernetes pods and nodes. This process determines the zones of the nodes to identify cross-zone traffic.
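A minimal sketch of that resolution step might look like the following. The lookup tables, pod names, node names and zone names are all made up for illustration. In a real cluster the mappings would come from the Kubernetes API (a node's zone is commonly exposed through the `topology.kubernetes.io/zone` label), and nothing here should be read as Kubezonnet's actual implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pod:
    name: str
    node: str


# Hypothetical cluster metadata, hard-coded for the example.
POD_BY_IP = {
    "10.0.0.1": Pod("api-7f9c", "node-a"),
    "10.0.1.2": Pod("db-0", "node-b"),
    "10.0.0.3": Pod("cache-1", "node-a"),
}
ZONE_BY_NODE = {
    "node-a": "us-central1-a",
    "node-b": "us-central1-b",
}


def classify_flow(src_ip: str, dst_ip: str):
    """Return (src_pod, dst_pod, is_cross_zone), or None if an IP is unknown."""
    src = POD_BY_IP.get(src_ip)
    dst = POD_BY_IP.get(dst_ip)
    if src is None or dst is None:
        return None  # traffic to or from something that is not a known pod
    cross_zone = ZONE_BY_NODE[src.node] != ZONE_BY_NODE[dst.node]
    return src.name, dst.name, cross_zone


print(classify_flow("10.0.0.1", "10.0.1.2"))  # pods on nodes in different zones
print(classify_flow("10.0.0.1", "10.0.0.3"))  # pods on the same node
```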
The statistics are exposed as Prometheus metrics to monitor total cross-zone traffic by pod, as well as through flow logs to provide detailed insights into traffic patterns between specific pods.
Servers can be deployed on each cluster, or for each zone.
The software can produce metrics, such as the top 20 pods by cross-zone network traffic-per-second in the prior five minutes, as measured in megabytes.
It can also do cumulative amounts, such as the top 20 pods by cross-zone network traffic in the last week, as measured in gigabytes. Cumulatives such as this one can be more useful in trying to suss out why the cloud bill went through the roof.
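Both kinds of ranking boil down to sorting per-pod byte counts and converting units. The sketch below shows the arithmetic in plain Python rather than as a Prometheus query; the function names and sample figures are hypothetical and are not taken from Kubezonnet.

```python
import heapq


def top_pods_by_total(bytes_by_pod, n=20):
    """Rank pods by cumulative cross-zone bytes, reported in gigabytes."""
    ranked = heapq.nlargest(n, bytes_by_pod.items(), key=lambda kv: kv[1])
    return [(pod, total / 1e9) for pod, total in ranked]


def top_pods_by_rate(bytes_by_pod, window_seconds=300, n=20):
    """Rank pods by average rate over the window, reported in megabytes per second."""
    ranked = heapq.nlargest(n, bytes_by_pod.items(), key=lambda kv: kv[1])
    return [(pod, total / window_seconds / 1e6) for pod, total in ranked]


# Hypothetical byte counts observed for three pods.
weekly = {"api-7f9c": 3_000_000_000, "db-0": 1_000_000_000, "rules-2": 5_000_000_000}
print(top_pods_by_total(weekly, n=2))

five_minutes = {"api-7f9c": 600_000_000, "db-0": 150_000_000}
print(top_pods_by_rate(five_minutes, n=1))
```

The rate view is handy for spotting a spike as it happens, while the cumulative view lines up more naturally with what appears on the invoice.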
How Polar Signals Cut Cross-Zone Traffic
Deploying Kubezonnet on its own network, Polar Signals identified areas where cross-zone traffic appeared to be excessive.
One culprit was the databases.
Database traffic “was previously hard to detect because many services interact with the main database and many services move a lot of bytes over the network,” wrote Polar Signals founder Frederic Branczyk, in a blog post explaining the technology.
The flow logs revealed a single workload dominating the traffic. Fixing this issue was trivial, Branczyk wrote.
The largest boundary busters, however, were the many rule evaluations that traveled across the company’s monitoring stack. A bit of re-engineering mitigated this traffic jam; engineers simply set up a Thanos stack in each zone, which not only cut traffic but improved the robustness of the monitoring system through redundancy.
Going forward, the company plans to set up an alert system so that engineers are notified when certain network thresholds are hit, preventing any surprisingly large cloud bills in the future. It also plans to build support for IPv6, as well as add even more traffic metrics.
Do you want to know how much of your Kubernetes traffic (expensively) travels across your cloud providers’ availability zones? Kubezonnet could help.
The post eBPF Tool Identifies Cross-Zone Kubernetes Network Traffic appeared first on The New Stack.