Conversation
0df8cbd to
f1bcf41
Compare
SRodi
left a comment
There was a problem hiding this comment.
Thanks for driving this @nddq I agree the IG incompatibility forces us to move DNS/tcpretrans off inspektor-gadget as part of getting to Cilium v1.19.2.
One suggestion to make this easier to review/debug (without overcomplicating rollout): could we split this into two PRs, but keep the plugin names the same and do a straight replacement (no dual-plugin naming / feature flags)?
Proposed split:
- PR: Replace the existing
dns+tcpretransplugins with the ebpf-native implementations and remove the inspektor-gadget dependency (keep external behavior/plugin names the same). - PR: Upgrade
github.com/cilium/ciliumto v1.19.2 (and any required Go/tooling + Hubble/operator adaptation changes).
Rationale: this isolates the “plugin implementation change” from the broader Cilium/control-plane dependency churn, making review and failure attribution simpler. Rollback is still straightforward at the Retina release level if needed (we can do a release straight after the plugins change), so I don’t think we need to introduce extra flags or temporary plugin naming.
|
@SRodi Sounds good! I'll make that change |
# Description **PR 2 of 2** for removing inspektor-gadget from retina plugins (split from #2148). Replaces the IG-based DNS tracer with a native eBPF socket filter and adds comprehensive kernel-level and Go unit tests. - **BPF program** (`dns.c`): Socket filter on AF_PACKET captures DNS queries/responses (UDP + TCP, IPv4 + IPv6). Uses `bpf_skb_load_bytes` helpers compatible with `BPF_PROG_TEST_RUN`. Query type extracted in BPF via QNAME walking. - **Per-interface dedup filter**: Socket binds to all interfaces (`ifindex=0`) with a BPF-side filter that only captures `PACKET_HOST` on non-default interfaces (veths) while capturing both directions on the default interface (eth0). This avoids double-counting while preserving visibility into same-node pod-to-CoreDNS and NodeLocal DNS traffic. - **Go plugin** (`dns_linux.go`): Perf buffer event loop with gopacket DNS parsing. Populates `dns_config` BPF map with default interface index at Init. No runtime compilation — pre-compiled via `bpf2go@v0.18.0`. - **Test infrastructure** (`ebpftest/packet.go`): `TCPPacketOpts.Payload`, IPv6 `BuildUDPPacket`, DNS packet builders (`BuildDNSQueryPacket`, `BuildDNSResponsePacket`, `BuildDNSTCPQueryPacket`). - **eBPF + unit tests**: 28 tests covering IPv4/IPv6, TCP/UDP, mDNS, edge cases, counter deltas, mock enricher, and `parseDNSPayload`. ## Related Issue Partial fix for #1788 Split from #2148 ## Checklist - [x] I have read the [contributing documentation](https://retina.sh/docs/Contributing/overview). - [x] I signed and signed-off the commits (`git commit -S -s ...`). - [x] I have correctly attributed the author(s) of the code. - [x] I have tested the changes locally. - [x] I have followed the project's style guidelines. - [ ] I have updated the documentation, if necessary. - [x] I have added tests, if applicable. ## Testing Completed All 28 tests pass (14 eBPF + 14 unit). Validated on AKS cluster (`retinaTest-v119`, 3-node, k8s v1.33.6): | Scenario | Result | |---|---| | Pod → CoreDNS same node | Captured, count=1 per query | | Pod → CoreDNS different node | Captured, count=1 per query | | NodeLocal DNS (169.254.20.10) | Both hops captured (pod→cache, cache→CoreDNS) — expected | | Multi-NIC | Not tested (single-NIC AKS nodes) | | Double-counting | None detected | Hubble and standard mode DNS metrics flowing with correct pod-level labels. ## Additional Notes - After both plugin PRs merge, inspektor-gadget can be fully removed in the Cilium v1.19 upgrade PR - The old IG tracer used `ifindex=0` with no dedup, which caused double-counting on multi-interface paths. The new BPF dedup filter resolves this while maintaining same-node and NodeLocal DNS visibility. --- Please refer to the [CONTRIBUTING.md](../CONTRIBUTING.md) file for more information on how to contribute to this project. Signed-off-by: Quang Nguyen <nguyenquang@microsoft.com>
# Description **PR 1 of 2** for removing inspektor-gadget from retina plugins (split from #2148 per review feedback) Cilium v1.19.2 pins `cilium/ebpf` at a version that removed `CollectionSpec.RewriteConstants()`, which inspektor-gadget relies on. Upgrading IG is not viable — v0.42.0+ removed the built-in `trace/tcpretrans` gadget entirely. This PR replaces the IG-based implementation with a native cilium/ebpf one ahead of the Cilium upgrade. - **BPF program**: New `tracepoint/tcp/tcp_retransmit_skb` program (kernel 5.8+ for `bpf_ktime_get_boot_ns`) with `BPF_CORE_READ` for CO-RE field relocation. Handles the kernel 6.17 tracepoint struct rename via CO-RE type flavors. TCP flags read from `tcp_skb_cb`. Supports IPv4 and IPv6. - **Go plugin**: Perf buffer event loop with worker goroutines, replacing the IG tracer/gadget-context pattern. `SetupChannel` now wires up the external channel (was a no-op under IG), enabling the advanced metrics module to receive events directly. - **Build**: Pre-compiled via `bpf2go@v0.18.0` with per-arch targets (amd64 + arm64) and embedded in the binary — no runtime compilation. `Compile()` and `Generate()` retained as no-ops for plugin manager lifecycle compatibility. - **`ToFlow` IPv6 labeling** (`pkg/utils/flow_utils.go`): now derives `IpVersion` from `sourceIP.To4()` instead of hardcoding `IPVersion_IPv4`, so IPv6 retransmission flows are labeled correctly. Backward-compatible for every existing IPv4 caller. - **No dependency changes**: Works with existing `cilium/ebpf` v0.18.0 in `go.mod`. ## Related Issue Partial fix for #1788 Split from #2148 Related: #2162 (chart fix surfaced while validating this PR — latent helm-upgrade reload bug for the operator ConfigMap) ## Checklist - [x] I have read the [contributing documentation](https://retina.sh/docs/Contributing/overview). - [x] I signed and signed-off the commits (`git commit -S -s ...`). - [x] I have correctly attributed the author(s) of the code. - [x] I have tested the changes locally. - [x] I have followed the project's style guidelines. - [x] I have updated the documentation, if necessary. - [x] I have added tests, if applicable. ## Testing Completed ### Standard / Advanced mode (AKS cluster `retinaTest-v119`, 3-node, k8s v1.33.6, kernel 5.15.0-1102-azure) Image: `acnpublic.azurecr.io/microsoft/retina/retina-agent:f4adc1e0-fix1-linux-amd64` Plugin initialized and attached on all three agent pods: ``` level=info caller=common/common_linux.go:79 msg="perf reader created" Map=PerfEventArray(retina_tcpretrans_events)#100 PageSize=4096 BufferSize=65536 level=info caller=tcpretrans/tcpretrans_linux.go:106 msg="tcpretrans plugin initialized" level=info caller=pluginmanager/pluginmanager.go:174 msg="starting plugin tcpretrans" ``` `BufferSize=65536` = 4096 × 16 pages, confirming the starting buffer size is live (down from an initial oversized default). Advanced metric flowing with real pod-level labels, counter incrementing in real time: ``` networkobservability_adv_tcpretrans_count{direction="egress",ip="10.224.1.183",namespace="kube-system",podname="konnectivity-agent-5858c6b5d7-882m2"} 6 → 7 → 9 networkobservability_adv_tcpretrans_count{direction="egress",ip="10.224.2.41",namespace="kube-system",podname="metrics-server-b957f9d87-tg22p"} 1 ``` Cross-checked against the node-level TCP stats from linuxutil on the same pod: `networkobservability_tcp_connection_stats{statistic_name="TCPLostRetransmit"} = 55` — different data source (kernel TCP stats vs kernel tracepoint), same order of magnitude. ### Advanced mode with remote context (source + destination labels) With `remoteContext=true` and `destinationLabels` set on the `tcp_retransmission_count` entry in the `MetricsConfiguration` CRD: ``` networkobservability_adv_tcpretrans_count{ source_ip="10.224.1.183",source_namespace="kube-system",source_podname="konnectivity-agent-5858c6b5d7-882m2", destination_ip="51.143.116.92",destination_namespace="kubernetes-apiserver",destination_podname="kubernetes-apiserver", direction="EGRESS" } 4 ``` No plugin changes were needed for remote context — the rewrite already populates both sides of the flow through `utils.ToFlow`, the enricher handles both sides, and the metrics module is already wired to emit both label sets when configured. Discovering this path is what led to filing #2162. ### Build / vet / test ```bash go build ./... # clean go vet ./pkg/utils/... ./pkg/plugin/tcpretrans/... # clean gofumpt -l pkg/utils/flow_utils.go pkg/plugin/tcpretrans/ # clean go test -tags=unit,dashboard ./pkg/plugin/... ./pkg/module/... # all pass ``` All existing `utils.ToFlow` callers (dropreason, dns, packetparser, latency_test.go) still pass — the `IpVersion` derivation is backward-compatible for every IPv4 caller. ## Additional Notes - PR 2 of 2 (#2153, `plugin/dns-ebpf`) rewrites the DNS plugin with the same approach. - After both merge, inspektor-gadget can be fully removed in the Cilium v1.19 upgrade PR. Signed-off-by: Quang Nguyen <nguyenquang@microsoft.com>
4be36a4 to
bddee40
Compare
Signed-off-by: Quang Nguyen <nguyenquang@microsoft.com>
bddee40 to
92771e9
Compare
Description
Upgrade Cilium from v1.18.0-pre.1 to v1.19.3 and adapt the codebase for all breaking API changes.
This PR is now a single squashed commit covering the dependency upgrade and all required adaptations. The DNS and tcpretrans plugin rewrites that were previously part of this branch have already been merged separately (#2152, #2153).
Changes by area
Logging (logrus → slog)
slog.Handlerandlogr.Loggerbridges inpkg/log/zap.gowith thread-safe init viasync.Oncelogrus.FieldLoggerto*slog.Loggeracross agent, operator, and Hubble componentsInitializeMetrics()now takes*slog.Loggerwith nil-safety guardSetDefaultSlog()at startup soslog.Default()routes through zap → Application InsightsHubble control plane
ServiceCachecell (replaced by frontends table), updateInitK8sSubsystemsignature, replaceNoOpOrchestratorwithfakedp.FakeOrchestratorIdentityUpdaterin IPCache config to prevent nil-panic on label injection pathsskipUnknownCGroupIDsparameter from Hubble parser constructor (removed in v1.19)Operator
CreateUpdateCRD,GetPregeneratedCRDnow take*slog.Logger)cilium/operator/k8sbehind_linuxbuild tagBuild & CI
ITracermock (pkg/plugin/common/mocks/)Related Issue
Fixes #1788
Checklist
git commit -S -s ...).Testing Completed
retinaTest-v119) with Hubble mode, all 3 agent pods + operator healthy (1/1):10093/metricsquay.io/cilium/v1.19.3 imagesgo build ./...,make lint(0 issues), unit tests all pass