Kubernetes Blog

The Kubernetes blog is used by the project to communicate new features, community reports, and any news that might be relevant to the Kubernetes community.
-
Announcing Ingress2Gateway 1.0: Your Path to Gateway API
on March 20, 2026 at 7:00 pm
With the Ingress-NGINX retirement scheduled for March 2026, the Kubernetes networking landscape is at a turning point. For most organizations, the question isn’t whether to migrate to Gateway API, but how to do so safely.

Migrating from Ingress to Gateway API is a fundamental shift in API design. Gateway API provides a modular, extensible API with strong support for Kubernetes-native RBAC. By contrast, the Ingress API is simple, and implementations such as Ingress-NGINX extend the API through esoteric annotations, ConfigMaps, and CRDs. Migrating away from Ingress controllers such as Ingress-NGINX presents the daunting task of capturing all the nuances of the Ingress controller and mapping that behavior to Gateway API.

Ingress2Gateway is an assistant that helps teams confidently move from Ingress to Gateway API. It translates Ingress manifests, along with implementation-specific annotations, to Gateway API while warning you about untranslatable configuration and offering suggestions. Today, SIG Network is proud to announce the 1.0 release of Ingress2Gateway. This milestone represents a stable, tested migration assistant for teams ready to modernize their networking stack.

Ingress2Gateway 1.0

Ingress-NGINX annotation support

The main improvement in the 1.0 release is more comprehensive Ingress-NGINX support. Before 1.0, Ingress2Gateway supported only three Ingress-NGINX annotations. With the 1.0 release, it supports over 30 common annotations (CORS, backend TLS, regex matching, path rewrite, etc.).

Comprehensive integration testing

Each supported Ingress-NGINX annotation, as well as representative combinations of common annotations, is backed by controller-level integration tests that verify the behavioral equivalence of the Ingress-NGINX configuration and the generated Gateway API configuration. These tests exercise real controllers in live clusters and compare runtime behavior (routing, redirects, rewrites, etc.), not just YAML structure.
The tests:

- spin up an Ingress-NGINX controller
- spin up multiple Gateway API controllers
- apply Ingress resources that have implementation-specific configuration
- translate the Ingress resources to Gateway API with ingress2gateway and apply the generated manifests
- verify that the Gateway API controllers and the Ingress controller exhibit equivalent behavior

A comprehensive test suite not only catches bugs in development, but also ensures the correctness of the translation, especially given surprising edge cases and unexpected defaults, so that you don’t find out about them in production.

Notification & error handling

Migration is not a “one-click” affair. Surfacing subtleties and untranslatable behavior is as important as translating supported configuration. The 1.0 release cleans up the formatting and content of notifications, so it is clear what is missing and how you can fix it.

Using Ingress2Gateway

Ingress2Gateway is a migration assistant, not a one-shot replacement. Its goals are to:

- migrate supported Ingress configuration and behavior
- identify unsupported configuration and suggest alternatives
- reevaluate and potentially discard undesirable configuration

The rest of this section shows you how to safely migrate the following Ingress-NGINX configuration:

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  annotations:
    nginx.ingress.kubernetes.io/proxy-body-size: "1G"
    nginx.ingress.kubernetes.io/use-regex: "true"
    nginx.ingress.kubernetes.io/proxy-send-timeout: "1"
    nginx.ingress.kubernetes.io/proxy-read-timeout: "1"
    nginx.ingress.kubernetes.io/enable-cors: "true"
    nginx.ingress.kubernetes.io/configuration-snippet: |
      more_set_headers "Request-Id: $req_id";
  name: my-ingress
  namespace: my-ns
spec:
  ingressClassName: nginx
  rules:
  - host: my-host.example.com
    http:
      paths:
      - backend:
          service:
            name: website-service
            port:
              number: 80
        path: /users/(\d+)
        pathType: ImplementationSpecific
  tls:
  - hosts:
    - my-host.example.com
    secretName: my-secret
```

1. Install Ingress2Gateway

If you have a Go environment set up, you can install Ingress2Gateway with:

```shell
go install github.com/kubernetes-sigs/ingress2gateway@v1.0.0
```

Otherwise:

```shell
brew install ingress2gateway
```

You can also download the binary from GitHub or build from source.

2. Run Ingress2Gateway

You can pass Ingress2Gateway Ingress manifests, or have the tool read directly from your cluster.

```shell
# Pass it files
ingress2gateway print --input-file my-manifest.yaml,my-other-manifest.yaml --providers=ingress-nginx > gwapi.yaml

# Use a namespace in your cluster
ingress2gateway print --namespace my-api --providers=ingress-nginx > gwapi.yaml

# Or your whole cluster
ingress2gateway print --providers=ingress-nginx --all-namespaces > gwapi.yaml
```

Note: You can also pass --emitter <agentgateway|envoy-gateway|kgateway> to output implementation-specific extensions.

3. Review the output

This is the most critical step. The commands from the previous section output a Gateway API manifest to gwapi.yaml, and they also emit warnings that explain what did not translate exactly and what to review manually.
```yaml
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  annotations:
    gateway.networking.k8s.io/generator: ingress2gateway-dev
  name: nginx
  namespace: my-ns
spec:
  gatewayClassName: nginx
  listeners:
  - hostname: my-host.example.com
    name: my-host-example-com-http
    port: 80
    protocol: HTTP
  - hostname: my-host.example.com
    name: my-host-example-com-https
    port: 443
    protocol: HTTPS
    tls:
      certificateRefs:
      - group: ""
        kind: Secret
        name: my-secret
---
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  annotations:
    gateway.networking.k8s.io/generator: ingress2gateway-dev
  name: my-ingress-my-host-example-com
  namespace: my-ns
spec:
  hostnames:
  - my-host.example.com
  parentRefs:
  - name: nginx
    port: 443
  rules:
  - backendRefs:
    - name: website-service
      port: 80
    filters:
    - cors:
        allowCredentials: true
        allowHeaders:
        - DNT
        - Keep-Alive
        - User-Agent
        - X-Requested-With
        - If-Modified-Since
        - Cache-Control
        - Content-Type
        - Range
        - Authorization
        allowMethods:
        - GET
        - PUT
        - POST
        - DELETE
        - PATCH
        - OPTIONS
        allowOrigins:
        - '*'
        maxAge: 1728000
      type: CORS
    matches:
    - path:
        type: RegularExpression
        value: (?i)/users/(\d+).*
    name: rule-0
    timeouts:
      request: 10s
---
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  annotations:
    gateway.networking.k8s.io/generator: ingress2gateway-dev
  name: my-ingress-my-host-example-com-ssl-redirect
  namespace: my-ns
spec:
  hostnames:
  - my-host.example.com
  parentRefs:
  - name: nginx
    port: 80
  rules:
  - filters:
    - requestRedirect:
        scheme: https
        statusCode: 308
      type: RequestRedirect
```

Ingress2Gateway successfully translated some annotations into their Gateway API equivalents. For example, the nginx.ingress.kubernetes.io/enable-cors annotation was translated into a CORS filter. But upon closer inspection, the nginx.ingress.kubernetes.io/proxy-{read,send}-timeout and nginx.ingress.kubernetes.io/proxy-body-size annotations do not map perfectly. The logs show the reason for these omissions, as well as the reasoning behind the translation.
```
┌─ WARN ────────────────────────────────────────
│ Unsupported annotation nginx.ingress.kubernetes.io/configuration-snippet
│ source: INGRESS-NGINX
│ object: Ingress: my-ns/my-ingress
└─
┌─ INFO ────────────────────────────────────────
│ Using case-insensitive regex path matches. You may want to change this.
│ source: INGRESS-NGINX
│ object: HTTPRoute: my-ns/my-ingress-my-host-example-com
└─
┌─ WARN ────────────────────────────────────────
│ ingress-nginx only supports TCP-level timeouts; i2gw has made a best-effort translation to Gateway API timeouts.request. Please verify that this meets your needs. See documentation: https://gateway-api.sigs.k8s.io/guides/http-timeouts/
│ source: INGRESS-NGINX
│ object: HTTPRoute: my-ns/my-ingress-my-host-example-com
└─
┌─ WARN ────────────────────────────────────────
│ Failed to apply my-ns.my-ingress.metadata.annotations."nginx.ingress.kubernetes.io/proxy-body-size" from my-ns/my-ingress: Most Gateway API implementations have reasonable body size and buffering defaults
│ source: STANDARD_EMITTER
│ object: HTTPRoute: my-ns/my-ingress-my-host-example-com
└─
┌─ WARN ────────────────────────────────────────
│ Gateway API does not support configuring URL normalization (RFC 3986, Section 6). Please check if this matters for your use case and consult implementation-specific details.
│ source: STANDARD_EMITTER
└─
```

There is a warning that Ingress2Gateway does not support the nginx.ingress.kubernetes.io/configuration-snippet annotation. You will have to check your Gateway API implementation's documentation to see if there is a way to achieve equivalent behavior.

The tool also notified us that Ingress-NGINX regex matches are case-insensitive prefix matches, which is why there is a match pattern of (?i)/users/(\d+).*. Most organizations will want to change this behavior to an exact, case-sensitive match by removing the leading (?i) and the trailing .* from the path pattern.
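Applied to the generated HTTPRoute, that change reduces the match to an exact, case-sensitive pattern (fragment only; the surrounding fields are unchanged):

```yaml
    matches:
    - path:
        type: RegularExpression
        value: /users/(\d+)
```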
Ingress2Gateway made a best-effort translation from the nginx.ingress.kubernetes.io/proxy-{send,read}-timeout annotations to a 10-second request timeout in our HTTPRoute. If requests for this service should be much shorter, say 3 seconds, you can make the corresponding changes to your Gateway API manifests.

Also, nginx.ingress.kubernetes.io/proxy-body-size does not have a Gateway API equivalent, and was thus not translated. However, most Gateway API implementations have reasonable defaults for maximum body size and buffering, so this might not be a problem in practice. Further, some emitters might offer support for this annotation through implementation-specific extensions. For example, adding the --emitter agentgateway, --emitter envoy-gateway, or --emitter kgateway flag to the previous ingress2gateway print command would have resulted in additional implementation-specific configuration in the generated Gateway API manifests that attempted to capture the body size configuration.

We also see a warning about URL normalization. Gateway API implementations such as Agentgateway, Envoy Gateway, Kgateway, and Istio have some level of URL normalization, but the behavior varies across implementations and is not configurable through standard Gateway API. You should check and test the URL normalization behavior of your Gateway API implementation to ensure it is compatible with your use case.

To match Ingress-NGINX default behavior, Ingress2Gateway also added a listener on port 80 and an HTTP request redirect filter to redirect HTTP traffic to HTTPS. If you do not want to serve plain HTTP traffic at all, remove the listener on port 80 and the corresponding HTTPRoute.

Caution: Always thoroughly review the generated output and logs.

After manually applying these changes, the Gateway API manifests might look as follows.
```yaml
---
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  annotations:
    gateway.networking.k8s.io/generator: ingress2gateway-dev
  name: nginx
  namespace: my-ns
spec:
  gatewayClassName: nginx
  listeners:
  - hostname: my-host.example.com
    name: my-host-example-com-https
    port: 443
    protocol: HTTPS
    tls:
      certificateRefs:
      - group: ""
        kind: Secret
        name: my-secret
---
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  annotations:
    gateway.networking.k8s.io/generator: ingress2gateway-dev
  name: my-ingress-my-host-example-com
  namespace: my-ns
spec:
  hostnames:
  - my-host.example.com
  parentRefs:
  - name: nginx
    port: 443
  rules:
  - backendRefs:
    - name: website-service
      port: 80
    filters:
    - cors:
        allowCredentials: true
        allowHeaders:
        - DNT
        # …
        allowMethods:
        - GET
        # …
        allowOrigins:
        - '*'
        maxAge: 1728000
      type: CORS
    matches:
    - path:
        type: RegularExpression
        value: /users/(\d+)
    name: rule-0
    timeouts:
      request: 3s
```

4. Verify

Now that you have Gateway API manifests, you should thoroughly test them in a development cluster. In this case, you should at least double-check that your Gateway API implementation's maximum body size defaults are appropriate for you and verify that a three-second timeout is enough.

After validating behavior in a development cluster, deploy your Gateway API configuration alongside your existing Ingress. We strongly suggest that you then gradually shift traffic using weighted DNS, your cloud load balancer, or the traffic-splitting features of your platform. This way, you can quickly recover from any misconfiguration that made it through your tests. Finally, when you have shifted all your traffic to your Gateway API controller, delete your Ingress resources and uninstall your Ingress controller.

Conclusion

The Ingress2Gateway 1.0 release is just the beginning, and we hope that you use Ingress2Gateway to safely migrate to Gateway API.
As we approach the March 2026 Ingress-NGINX retirement, we invite the community to help us increase configuration coverage, expand testing, and improve UX.

Resources about Gateway API

The scope of Gateway API can be daunting. Here are some resources to help you work with Gateway API:

- Listener sets allow application developers to manage gateway listeners.
- gwctl gives you a comprehensive view of your Gateway resources, such as attachments and linter errors.
- Gateway API Slack: #sig-network-gateway-api on Kubernetes Slack
- Ingress2Gateway Slack: #sig-network-ingress2gateway on Kubernetes Slack
- GitHub: kubernetes-sigs/ingress2gateway
-
Running Agents on Kubernetes with Agent Sandbox
on March 20, 2026 at 6:00 pm
The landscape of artificial intelligence is undergoing a massive architectural shift. In the early days of generative AI, interacting with a model was often treated as a transient, stateless function call: a request that spun up, executed for perhaps 50 milliseconds, and terminated. Today, the world is witnessing AI v2 eating AI v1. The ecosystem is moving from short-lived, isolated tasks to deploying multiple, coordinated AI agents that run constantly. These autonomous agents need to maintain context, use external tools, write and execute code, and communicate with one another over extended periods.

As platform engineering teams look for the right infrastructure to host these new AI workloads, one platform stands out as the natural choice: Kubernetes. However, mapping these unique agentic workloads to traditional Kubernetes primitives requires a new abstraction. This is where the new Agent Sandbox project (currently in development under SIG Apps) comes into play.

The Kubernetes advantage (and the abstraction gap)

Kubernetes is the de facto standard for orchestrating cloud-native applications precisely because it solves the challenges of extensibility, robust networking, and ecosystem maturity. However, as AI evolves from short-lived inference requests to long-running, autonomous agents, we are seeing the emergence of a new operational pattern.

AI agents, by contrast, are typically isolated, stateful, singleton workloads. They act as a digital workspace or execution environment for an LLM. An agent needs a persistent identity and a secure scratchpad for writing and executing (often untrusted) code. Crucially, because these long-lived agents are expected to be mostly idle except for brief bursts of activity, they require a lifecycle that supports mechanisms like suspension and rapid resumption.
While you could theoretically approximate this by stringing together a StatefulSet of size 1, a headless Service, and a PersistentVolumeClaim for every single agent, managing this at scale becomes an operational nightmare. Because of these unique properties, traditional Kubernetes primitives don’t perfectly align.

Introducing Kubernetes Agent Sandbox

To bridge this gap, SIG Apps is developing agent-sandbox. The project introduces a declarative, standardized API specifically tailored for singleton, stateful workloads like AI agent runtimes. At its core, the project introduces the Sandbox CRD. It acts as a lightweight, single-container environment built entirely on Kubernetes primitives, offering:

- Strong isolation for untrusted code: When an AI agent generates and executes code autonomously, security is paramount. The Sandbox custom resource natively supports different runtimes, like gVisor or Kata Containers. This provides the kernel and network isolation required for multi-tenant, untrusted execution.
- Lifecycle management: Unlike traditional web servers optimized for steady, stateless traffic, an AI agent operates as a stateful workspace that may be idle for hours between tasks. Agent Sandbox supports scaling these idle environments to zero to save resources, while ensuring they can resume exactly where they left off.
- Stable identity: Coordinated multi-agent systems require stable networking. Every Sandbox is given a stable hostname and network identity, allowing distinct agents to discover and communicate with each other seamlessly.

Scaling agents with extensions

Because the AI space is moving incredibly quickly, we built an Extensions API layer that enables even faster iteration and development. Starting a new pod adds about a second of overhead. That’s perfectly fine when deploying a new version of a microservice, but when an agent is invoked after being idle, a one-second cold start breaks the continuity of the interaction.
It forces the user or the orchestrating service to wait for the environment to provision before the model can even begin to think or act. SandboxWarmPool solves this by maintaining a pool of pre-provisioned Sandbox pods, effectively eliminating cold starts. Users or orchestration services can simply issue a SandboxClaim against a SandboxTemplate, and the controller immediately hands over a pre-warmed, fully isolated environment to the agent.

Quick start

Ready to try it yourself? You can install the Agent Sandbox core components and extensions directly into your learning or sandbox cluster, using your chosen release. We recommend the latest release, as the project is moving fast.

```shell
# Replace "vX.Y.Z" with a specific version tag (e.g., "v0.1.0") from
# https://github.com/kubernetes-sigs/agent-sandbox/releases
export VERSION="vX.Y.Z"

# Install the core components:
kubectl apply -f https://github.com/kubernetes-sigs/agent-sandbox/releases/download/${VERSION}/manifest.yaml

# Install the extensions components (optional):
kubectl apply -f https://github.com/kubernetes-sigs/agent-sandbox/releases/download/${VERSION}/extensions.yaml

# Install the Python SDK (optional):
# Create a virtual Python environment
python3 -m venv .venv
source .venv/bin/activate
# Install from PyPI
pip install k8s-agent-sandbox
```

Once installed, you can try out the Python SDK for AI agents or deploy one of the ready-to-use examples to see how easy it is to spin up an isolated agent environment.

The future of agents is cloud native

Whether it’s a 50-millisecond stateless task or a multi-week, mostly-idle collaborative process, extending Kubernetes with primitives designed specifically for isolated stateful singletons allows us to leverage all the robust benefits of the cloud-native ecosystem. The Agent Sandbox project is open source and community-driven.
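For context on the isolation runtimes mentioned above (gVisor, Kata Containers): in plain Kubernetes, these are surfaced through a RuntimeClass object. A minimal sketch, assuming nodes have gVisor's `runsc` containerd handler configured (how a Sandbox itself selects its runtime may differ; check the project documentation):

```yaml
apiVersion: node.k8s.io/v1
kind: RuntimeClass
metadata:
  name: gvisor
# "runsc" must match a runtime handler configured in the
# container runtime (containerd/CRI-O) on the nodes.
handler: runsc
```

A workload then opts in by setting `runtimeClassName: gvisor` in its Pod spec.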
If you are building AI platforms, developing agentic frameworks, or are interested in Kubernetes extensibility, we invite you to get involved:

- Check out the project on GitHub: kubernetes-sigs/agent-sandbox
- Join the discussion in the #sig-apps and #agent-sandbox channels on the Kubernetes Slack.
-
Securing Production Debugging in Kubernetes
on March 18, 2026 at 6:00 pm
During production debugging, the fastest route is often broad access such as cluster-admin (a ClusterRole that grants administrator-level access), shared bastions/jump boxes, or long-lived SSH keys. It works in the moment, but it comes with two common problems: auditing becomes difficult, and temporary exceptions have a way of becoming routine.

This post offers my recommendations for good practices applicable to existing Kubernetes environments with minimal tooling changes:

- Least privilege with RBAC
- Short-lived, identity-bound credentials
- An SSH-style handshake model for cloud native debugging

A good architecture for securing production debugging workflows is a just-in-time secure shell gateway (often deployed as an on-demand pod in the cluster). It acts as an SSH-style “front door” that makes temporary access actually temporary. Engineers authenticate with short-lived, identity-bound credentials and establish a session to the gateway, which uses the Kubernetes API and RBAC to control what they can do, such as pods/log, pods/exec, and pods/portforward. Sessions expire automatically, and both the gateway logs and Kubernetes audit logs capture who accessed what and when, without shared bastion accounts or long-lived keys.

1) Using an access broker on top of Kubernetes RBAC

RBAC controls who can do what in Kubernetes. Many Kubernetes environments rely primarily on RBAC for authorization, although Kubernetes also supports other authorization modes such as Webhook authorization. You can enforce access directly with Kubernetes RBAC, or put an access broker in front of the cluster that still relies on Kubernetes permissions under the hood. In either model, Kubernetes RBAC remains the source of truth for what the Kubernetes API allows and at what scope.

An access broker adds controls that RBAC does not cover well.
For example, it can decide whether a request is auto-approved or requires manual approval, whether a user can run a command, and which commands are allowed in a session. It can also manage group membership so that you grant permissions to groups instead of individual users. Kubernetes RBAC can allow actions such as pods/exec, but it cannot restrict which commands run inside an exec session.

With that model, Kubernetes RBAC defines the allowed actions for a user or group (for example, an on-call team in a single namespace). I recommend that you only define access rules that grant rights to groups or to ServiceAccounts, never to individual users. The broker or identity provider then adds or removes users from that group as needed. The broker can also enforce extra policy on top, such as which commands are permitted in an interactive session and which requests can be auto-approved versus requiring manual approval. That policy can live in a JSON or XML file and be maintained through code review, so updates go through a formal pull request and are reviewed like any other production change.
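As an illustration only, such a reviewed policy file might look like the following. There is no standard schema for broker policies; every field name below is hypothetical and would depend on your broker:

```json
{
  "group": "oncall-payments",
  "autoApprove": ["pods/log", "pods/portforward"],
  "requireApproval": ["pods/exec"],
  "allowedExecCommands": ["cat", "ls", "curl", "jq"],
  "maxSessionTTL": "30m"
}
```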
Example: a namespaced on-call debug Role

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: oncall-debug
  namespace: <namespace>
rules:
# Discover what's running
- apiGroups: [""]
  resources: ["pods", "events"]
  verbs: ["get", "list", "watch"]
# Read logs
- apiGroups: [""]
  resources: ["pods/log"]
  verbs: ["get"]
# Interactive debugging actions
- apiGroups: [""]
  resources: ["pods/exec", "pods/portforward"]
  verbs: ["create"]
# Understand rollout/controller state
- apiGroups: ["apps"]
  resources: ["deployments", "replicasets"]
  verbs: ["get", "list", "watch"]
# Optional: allow kubectl debug ephemeral containers
- apiGroups: [""]
  resources: ["pods/ephemeralcontainers"]
  verbs: ["update"]
```

Bind the Role to a group (rather than individual users) so membership can be managed through your identity provider:

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: oncall-debug
  namespace: <namespace>
subjects:
- kind: Group
  name: oncall-<team-name>
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: oncall-debug
  apiGroup: rbac.authorization.k8s.io
```

2) Short-lived, identity-bound credentials

The goal is to use short-lived, identity-bound credentials that clearly tie a session to a real person and expire quickly. These credentials can include the user’s identity and the scope of what they’re allowed to do. They’re typically signed using a private key that stays with the engineer, such as a hardware-backed key (for example, a YubiKey), so they cannot be forged without access to that key.

You can implement this with Kubernetes-native authentication (for example, client certificates or an OIDC-based flow), or have the access broker from the previous section issue short-lived credentials on the user’s behalf. In many setups, Kubernetes still uses RBAC to enforce permissions based on the authenticated identity and groups/claims.
If you use an access broker, it can also encode additional scope constraints in the credential and enforce them during the session, such as which cluster or namespace the session applies to and which actions (or approved commands) are allowed against pods or nodes. In either case, the credentials should be signed by a certificate authority (CA), and that CA should be rotated on a regular schedule (for example, quarterly) to limit long-term risk.

Option A: short-lived OIDC tokens

A lot of managed Kubernetes clusters already give you short-lived tokens. The main thing is to make sure your kubeconfig refreshes them automatically instead of copying a long-lived token into the file. For example:

```yaml
users:
- name: oncall
  user:
    exec:
      apiVersion: client.authentication.k8s.io/v1
      command: cred-helper
      args: ["--cluster=prod", "--ttl=30m"]
```

Option B: short-lived client certificates (X.509)

If your API server (or your access broker from the previous section) is set up to trust a client CA, you can use short-lived client certificates for debugging access. The idea is:

- The private key is created and kept on the engineer’s machine (ideally hardware-backed, like a non-exportable key in a YubiKey/PIV token).
- A short-lived certificate is issued (often via the CertificateSigningRequest API, or your access broker from the previous section, with a TTL).
- RBAC maps the authenticated identity to a minimal Role.

This is straightforward to operationalize with the Kubernetes CertificateSigningRequest API. Generate a key and CSR locally:

```shell
# Generate a private key.
# This could instead be generated within a hardware token;
# OpenSSL and several similar tools include support for that.
openssl genpkey -algorithm Ed25519 -out oncall.key
openssl req -new -key oncall.key -out oncall.csr \
  -subj "/CN=user/O=oncall-payments"
```

Create a CertificateSigningRequest with a short expiration:

```yaml
apiVersion: certificates.k8s.io/v1
kind: CertificateSigningRequest
metadata:
  name: oncall-<user>-20260218
spec:
  request: <base64-encoded oncall.csr>
  signerName: kubernetes.io/kube-apiserver-client
  expirationSeconds: 1800  # 30 minutes
  usages:
  - client auth
```

After the CSR is approved and signed, you extract the issued certificate and use it together with the private key to authenticate, for example via kubectl.

3) Use a just-in-time access gateway to run debugging commands

Once you have short-lived credentials, you can use them to open a secure shell session to a just-in-time access gateway, often exposed over SSH and created on demand. If the gateway is exposed over SSH, a common pattern is to issue the engineer a short-lived OpenSSH user certificate for the session. The gateway trusts your SSH user CA, authenticates the engineer at connection time, and then applies the approved session policy before making Kubernetes API calls on the user’s behalf. OpenSSH certificates are separate from Kubernetes X.509 client certificates, so these are usually treated as distinct layers.

The resulting session should also be scoped so it cannot be reused outside of what was approved. For example, the gateway or broker can limit it to a specific cluster and namespace, and optionally to a narrower target such as a pod or node. That way, even if someone tries to reuse the access, it will not work outside the intended scope. After the session is established, the gateway executes only the allowed actions and records what happened for auditing.
Example: Namespace-scoped role bindings

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: jit-debug
  namespace: <namespace>
  annotations:
    kubernetes.io/description: >
      Colleagues performing semi-privileged debugging, with access
      provided just in time and on demand.
rules:
- apiGroups: [""]
  resources: ["pods", "pods/log"]
  verbs: ["get", "list", "watch"]
- apiGroups: [""]
  resources: ["pods/exec"]
  verbs: ["create"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: jit-debug
  namespace: <namespace>
subjects:
- kind: Group
  name: jit:oncall:<namespace>  # mapped from the short-lived credential (cert/OIDC)
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: jit-debug
  apiGroup: rbac.authorization.k8s.io
```

These RBAC objects, and the rules they define, allow debugging only within the specified namespace; attempts to access other namespaces are not allowed.

Example: Cluster-scoped role binding

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: jit-cluster-read
rules:
- apiGroups: [""]
  resources: ["nodes", "namespaces"]
  verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: jit-cluster-read
subjects:
- kind: Group
  name: jit:oncall:cluster
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: ClusterRole
  name: jit-cluster-read
  apiGroup: rbac.authorization.k8s.io
```

These RBAC rules grant cluster-wide read access (for example, to nodes and namespaces) and should be used only for workflows that truly require cluster-scoped resources. Finer-grained restrictions like “only this pod/node” or “only these commands” are typically enforced by the access gateway/broker during the session, but Kubernetes also offers other options, such as ValidatingAdmissionPolicy for restricting writes and webhook authorization for custom authorization across verbs.
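As a sketch of that last point, a ValidatingAdmissionPolicy can act as a defense-in-depth guard that blocks workload writes from just-in-time debug identities even if a binding accidentally grants them. The group prefix and resource scope below are illustrative, and a ValidatingAdmissionPolicyBinding is also required to put the policy into effect:

```yaml
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingAdmissionPolicy
metadata:
  name: jit-debug-read-only
spec:
  failurePolicy: Fail
  matchConstraints:
    resourceRules:
    - apiGroups: ["apps"]
      apiVersions: ["v1"]
      operations: ["CREATE", "UPDATE", "DELETE"]
      resources: ["deployments", "replicasets"]
  validations:
  # Reject the request if the authenticated user belongs to any
  # just-in-time debug group (illustrative "jit:oncall:" prefix).
  - expression: "!request.userInfo.groups.exists(g, g.startsWith('jit:oncall:'))"
    message: "just-in-time debug sessions are read-only for workloads"
```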
In environments with stricter access controls, you can add an extra, short-lived session mediation layer to separate session establishment from privileged actions. Both layers are ephemeral, use identity-bound expiring credentials, and produce independent audit trails. The mediation layer handles session setup and forwarding, while the execution layer performs only RBAC-authorized Kubernetes actions. This separation can reduce exposure by narrowing responsibilities, scoping credentials per step, and enforcing end-to-end session expiry.

References

- Authorization
- Using RBAC Authorization
- Authenticating
- Certificates and Certificate Signing Requests
- Issue a Certificate for a Kubernetes API Client Using a CertificateSigningRequest
- Role Based Access Control Good Practices

Disclaimer: The views expressed in this post are solely those of the author and do not reflect the views of the author’s employer or any other organization.
-
The Invisible Rewrite: Modernizing the Kubernetes Image Promoter
on March 17, 2026 at 12:00 am
Every container image you pull from registry.k8s.io got there through kpromo, the Kubernetes image promoter. It copies images from staging registries to production, signs them with cosign, replicates signatures across more than 20 regional mirrors, and generates SLSA provenance attestations. If this tool breaks, no Kubernetes release ships. Over the past few weeks, we rewrote its core from scratch, deleted 20% of the codebase, made it dramatically faster, and nobody noticed. That was the whole point.

A bit of history

The image promoter started in late 2018 as an internal Google project by Linus Arver. The goal was simple: replace the manual, Googler-gated process of copying container images into k8s.gcr.io with a community-owned, GitOps-based workflow. Push to a staging registry, open a PR with a YAML manifest, get it reviewed and merged, and automation handles the rest. KEP-1734 formalized this proposal.

In early 2019, the code moved to kubernetes-sigs/k8s-container-image-promoter and grew quickly. Over the next few years, Stephen Augustus consolidated multiple tools (cip, gh2gcs, krel promote-images, promobot-files) into a single CLI called kpromo, and the repository was renamed to promo-tools. Adolfo Garcia Veytia (Puerco) added cosign signing and SBOM support. Tyler Ferrara built vulnerability scanning. Carlos Panato kept the project in a healthy and releasable state. In total, 42 contributors made about 3,500 commits across more than 60 releases.

It worked. But by 2025 the codebase carried the weight of seven years of incremental additions from multiple SIGs and subprojects. The README said it plainly: you will see duplicated code, multiple techniques for accomplishing the same thing, and several TODOs.

The problems we needed to solve

Production promotion jobs for Kubernetes core images regularly took over 30 minutes and frequently failed with rate limit errors.
The core promotion logic had grown into a monolith that was hard to extend and difficult to test, making new features like provenance or vulnerability scanning painful to add. On the SIG Release roadmap, two work items had been sitting for a while: “Rewrite artifact promoter” and “Make artifact validation more robust”. We had discussed these at SIG Release meetings and KubeCons, and the open research spikes on project board #171 captured eight questions that needed answers before we could move forward.

One issue to answer them all

In February 2026, we opened issue #1701 (“Rewrite artifact promoter pipeline”) and answered all eight spikes in a single tracking issue. The rewrite was deliberately phased so that each step could be reviewed, merged, and validated independently. Here is what we did:

- Phase 1: Rate Limiting (#1702). Rewrote rate limiting to properly throttle all registry operations with adaptive backoff.
- Phase 2: Interfaces (#1704). Put registry and auth operations behind clean interfaces so they can be swapped out and tested independently.
- Phase 3: Pipeline Engine (#1705). Built a pipeline engine that runs promotion as a sequence of distinct phases instead of one large function.
- Phase 4: Provenance (#1706). Added SLSA provenance verification for staging images.
- Phase 5: Scanner and SBOMs (#1709). Added vulnerability scanning and SBOM support. Flipped the default to the new pipeline engine. At this point we cut v4.2.0 and let it soak in production before continuing.
- Phase 6: Split Signing from Replication (#1713). Separated image signing from signature replication into their own pipeline phases, eliminating the rate limit contention that caused most production failures.
- Phase 7: Remove Legacy Pipeline (#1712). Deleted the old code path entirely.
- Phase 8: Remove Legacy Dependencies (#1716). Deleted the audit subsystem, deprecated tools, and e2e test infrastructure.
- Phase 9: Delete the Monolith (#1718).
Removed the old monolithic core and its supporting packages. Thousands of lines were deleted across phases 7 through 9.

Each phase shipped independently. v4.3.0 followed the next day with the legacy code fully removed. With the new architecture in place, a series of follow-up improvements landed: parallelized registry reads (#1736), retry logic for all network operations (#1742), per-request timeouts to prevent pipeline hangs (#1763), HTTP connection reuse (#1759), local registry integration tests (#1746), the removal of deprecated credential file support (#1758), a rework of attestation handling to use cosign’s OCI APIs and the removal of deprecated SBOM support (#1764), and a dedicated promotion record predicate type registered with the in-toto attestation framework (#1767). These would have been much harder to land without the clean separation the rewrite provided. v4.4.0 shipped all of these improvements and enabled provenance generation and verification by default.

The new pipeline

The promotion pipeline now has seven clearly separated phases:

Setup -> Plan -> Provenance -> Validate -> Promote -> Sign -> Attest

| Phase | What it does |
| --- | --- |
| Setup | Validate options, prewarm TUF cache. |
| Plan | Parse manifests, read registries, compute which images need promotion. |
| Provenance | Verify SLSA attestations on staging images. |
| Validate | Check cosign signatures; exit here for dry runs. |
| Promote | Copy images server-side, preserving digests. |
| Sign | Sign promoted images with keyless cosign. |
| Attest | Generate promotion provenance attestations using a dedicated in-toto predicate type. |

Phases run sequentially, so each one gets exclusive access to the full rate limit budget. No more contention. Signature replication to mirror registries is no longer part of this pipeline and runs as a dedicated periodic Prow job instead.

Making it fast

With the architecture in place, we turned to performance.

Parallel registry reads (#1736): The plan phase reads 1,350 registries.
We parallelized this, and the plan phase dropped from about 20 minutes to about 2 minutes.

Two-phase tag listing (#1761): Instead of checking all 46,000 image groups across more than 20 mirrors, we first check only the source repositories. About 57% of images have no signatures at all because they were promoted before signing was enabled. We skip those entirely, cutting API calls roughly in half.

Source check before replication (#1727): Before iterating all mirrors for a given image, we check whether the signature exists on the primary registry first. In steady state, where most signatures are already replicated, this reduced the work from about 17 hours to about 15 minutes.

Per-request timeouts (#1763): We observed intermittent hangs where a stalled connection blocked the pipeline for over 9 hours. Every network operation now has its own timeout, and transient failures are retried automatically.

Connection reuse (#1759): We started reusing HTTP connections and auth state across operations, eliminating redundant token negotiations. This closed a long-standing request from 2023.

By the numbers

Here is what the rewrite looks like in aggregate:

- Over 40 PRs merged, 3 releases shipped (v4.2.0, v4.3.0, v4.4.0).
- Over 10,000 lines added and over 16,000 lines deleted, a net reduction of about 5,000 lines (a 20% smaller codebase).
- Performance drastically improved across the board.
- Robustness improved with retry logic, per-request timeouts, and adaptive rate limiting.
- 19 long-standing issues closed.

The codebase shrank by a fifth while gaining provenance attestations, a pipeline engine, vulnerability scanning integration, parallelized operations, retry logic, integration tests against local registries, and a standalone signature replication mode.

No user-facing changes

This was a hard requirement. The kpromo cip command accepts the same flags and reads the same YAML manifests. The post-k8sio-image-promo Prow job continued working throughout.
The promotion manifests in kubernetes/k8s.io did not change. Nobody had to update their workflows or configuration.

We caught two regressions early in production. One (#1731) caused a registry key mismatch that made every image appear as “lost”, so that nothing was promoted. Another (#1733) set the default thread count to zero, blocking all goroutines. Both were fixed within hours. The phased release strategy (v4.2.0 with the new engine, v4.3.0 with legacy code removed) gave us a clear rollback path that we fortunately never needed.

What comes next

Signature replication across all mirror registries remains the most expensive part of the promotion cycle. Issue #1762 proposes eliminating it entirely by having archeio (the registry.k8s.io redirect service) route signature tag requests to a single canonical upstream instead of per-region backends. Another option would be to move signing closer to the registry infrastructure itself. Both approaches need further discussion with the SIG Release and infrastructure teams, but either one would remove thousands of API calls per promotion cycle and simplify the codebase even further.

Thank you

This project has been a community effort spanning seven years. Thank you to Linus, Stephen, Adolfo, Carlos, Ben, Marko, Lauri, Tyler, Arnaud, and many others who contributed code, reviews, and planning over the years. The SIG Release and Release Engineering communities provided the context, the discussions, and the patience for a rewrite of infrastructure that every Kubernetes release depends on. If you want to get involved, join us in #release-management on the Kubernetes Slack or check out the repository.
-
Announcing the AI Gateway Working Group
on March 9, 2026 at 6:00 pm
The community around Kubernetes includes a number of Special Interest Groups (SIGs) and Working Groups (WGs) facilitating discussions on important topics between interested contributors. Today, we’re excited to announce the formation of the AI Gateway Working Group, a new initiative focused on developing standards and best practices for networking infrastructure that supports AI workloads in Kubernetes environments.

What is an AI Gateway?

In a Kubernetes context, an AI Gateway refers to network gateway infrastructure (including proxy servers, load balancers, etc.) that generally implements the Gateway API specification with enhanced capabilities for AI workloads. Rather than defining a distinct product category, AI Gateways describe infrastructure designed to enforce policy on AI traffic, including:

- Token-based rate limiting for AI APIs.
- Fine-grained access controls for inference APIs.
- Payload inspection enabling intelligent routing, caching, and guardrails.
- Support for AI-specific protocols and routing patterns.

Working group charter and mission

The AI Gateway Working Group operates under a clear charter with the mission to develop proposals for Kubernetes Special Interest Groups (SIGs) and their sub-projects. Its primary goals include:

- Standards Development: Create declarative APIs, standards, and guidance for AI workload networking in Kubernetes.
- Community Collaboration: Foster discussions and build consensus around best practices for AI infrastructure.
- Extensible Architecture: Ensure composability, pluggability, and ordered processing for AI-specific gateway extensions.
- Standards-Based Approach: Build on established networking foundations, layering AI-specific capabilities on top of proven standards.
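To make the first goal above concrete: token-based rate limiting differs from ordinary request-rate limiting in that each call is charged by the number of LLM tokens it consumes, not counted as one request. The following Go sketch is purely illustrative (the tokenBucket type and its methods are hypothetical, not an API defined by the working group):

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// tokenBucket enforces a budget of LLM tokens per client, refilled over
// time, rather than counting requests. Hypothetical illustration only.
type tokenBucket struct {
	mu         sync.Mutex
	capacity   float64   // maximum token budget
	tokens     float64   // tokens currently available
	refillRate float64   // tokens restored per second
	last       time.Time // last time the budget was refilled
}

func newTokenBucket(capacity, refillRate float64) *tokenBucket {
	return &tokenBucket{capacity: capacity, tokens: capacity, refillRate: refillRate, last: time.Now()}
}

// Allow charges `cost` tokens (e.g. the prompt's token count) against
// the budget and reports whether the request fits.
func (b *tokenBucket) Allow(cost float64) bool {
	b.mu.Lock()
	defer b.mu.Unlock()
	now := time.Now()
	b.tokens += now.Sub(b.last).Seconds() * b.refillRate
	if b.tokens > b.capacity {
		b.tokens = b.capacity
	}
	b.last = now
	if cost > b.tokens {
		return false // over budget: the gateway would return 429
	}
	b.tokens -= cost
	return true
}

func main() {
	// A 1,000-token budget refilling at 100 tokens per second.
	bucket := newTokenBucket(1000, 100)
	fmt.Println(bucket.Allow(800)) // true: fits within the budget
	fmt.Println(bucket.Allow(800)) // false: budget nearly exhausted
}
```

The key design point is that a single large prompt can exhaust the budget that dozens of small requests would fit into, which is exactly what request-count limiting fails to capture for inference traffic.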
Active proposals

WG AI Gateway currently has several active proposals that address key challenges in AI workload networking.

Payload Processing

The payload processing proposal addresses the critical need for AI workloads to inspect and transform full HTTP request and response payloads. This enables:

AI Inference Security
- Guard against malicious prompts and prompt injection attacks.
- Content filtering for AI responses.
- Signature-based detection and anomaly detection for AI traffic.

AI Inference Optimization
- Semantic routing based on request content.
- Intelligent caching to reduce inference costs and improve response times.
- RAG (Retrieval-Augmented Generation) system integration for context enhancement.

The proposal defines standards for declarative payload processor configuration, ordered processing pipelines, and configurable failure modes, all essential for production AI workload deployments.

Egress gateways

Modern AI applications increasingly depend on external inference services, whether for specialized models, failover scenarios, or cost optimization. The egress gateways proposal aims to define standards for securely routing traffic outside the cluster. Key features include:

External AI Service Integration
- Secure access to cloud-based AI services (OpenAI, Vertex AI, Bedrock, etc.).
- Managed authentication and token injection for third-party AI APIs.
- Regional compliance and failover capabilities.

Advanced Traffic Management
- Backend resource definitions for external FQDNs and services.
- TLS policy management and certificate authority control.
- Cross-cluster routing for centralized AI infrastructure.

User Stories We’re Addressing
- Platform operators providing managed access to external AI services.
- Developers requiring inference failover across multiple cloud providers.
- Compliance engineers enforcing regional restrictions on AI traffic.
- Organizations centralizing AI workloads on dedicated clusters.
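The “ordered processing pipelines” and “configurable failure modes” described in the payload processing proposal can be sketched roughly as follows. Every name here (PayloadProcessor, FailureMode, the example processors) is hypothetical, chosen only to illustrate the concepts; the proposal itself defines the real, declarative API shape.

```go
package main

import (
	"fmt"
	"strings"
)

// PayloadProcessor inspects or transforms a request body.
// Hypothetical interface for illustration.
type PayloadProcessor interface {
	Name() string
	Process(body string) (string, error)
}

// FailureMode controls what happens when a processor errors.
type FailureMode int

const (
	FailClosed FailureMode = iota // reject the request on processor error
	FailOpen                      // skip the failed processor and continue
)

// promptGuard rejects payloads matching a naive injection pattern.
type promptGuard struct{}

func (promptGuard) Name() string { return "prompt-guard" }
func (promptGuard) Process(body string) (string, error) {
	if strings.Contains(strings.ToLower(body), "ignore previous instructions") {
		return "", fmt.Errorf("possible prompt injection")
	}
	return body, nil
}

// redactor masks a known secret marker before the payload leaves the gateway.
type redactor struct{}

func (redactor) Name() string { return "redactor" }
func (redactor) Process(body string) (string, error) {
	return strings.ReplaceAll(body, "sk-secret", "[REDACTED]"), nil
}

// runPipeline applies processors in their declared order under one failure mode.
func runPipeline(body string, mode FailureMode, procs ...PayloadProcessor) (string, error) {
	for _, p := range procs {
		out, err := p.Process(body)
		if err != nil {
			if mode == FailClosed {
				return "", fmt.Errorf("%s: %w", p.Name(), err)
			}
			continue // fail-open: keep the current body and move on
		}
		body = out
	}
	return body, nil
}

func main() {
	out, err := runPipeline("my key is sk-secret", FailClosed, promptGuard{}, redactor{})
	fmt.Println(out, err) // the secret is redacted, no error

	_, err = runPipeline("please ignore previous instructions", FailClosed, promptGuard{}, redactor{})
	fmt.Println(err != nil) // true: fail-closed rejects the request
}
```

The ordering matters: a guardrail placed before a redactor sees the raw payload, while the reverse order would inspect an already-sanitized body, which is why the proposal treats processor order as part of the declarative configuration.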
Upcoming events

KubeCon + CloudNativeCon Europe 2026, Amsterdam

AI Gateway working group members will be presenting at KubeCon + CloudNativeCon Europe in Amsterdam, discussing problems at the intersection of AI and networking, including the working group’s active proposals and how AI gateways relate to the Model Context Protocol (MCP) and agent networking patterns. The session will showcase how AI Gateway working group proposals enable the infrastructure needed for next-generation AI deployments and communication patterns, and will include the initial designs, early prototypes, and emerging directions shaping the WG’s roadmap. For more details, see the session: AI’m at the Gate! Introducing the AI Gateway Working Group in Kubernetes.

Get involved

The AI Gateway Working Group represents the Kubernetes community’s commitment to standardizing AI workload networking. As AI becomes increasingly integral to modern applications, we need robust, standardized infrastructure that can support the unique requirements of inference workloads while maintaining the security, observability, and reliability standards that Kubernetes users expect.

Our proposals are currently in active development, with implementations beginning across various gateway projects. We’re working closely with SIG Network on Gateway API enhancements and collaborating with the broader cloud-native community to ensure our standards meet real-world production needs.

Whether you’re a gateway implementer, platform operator, AI application developer, or simply interested in the intersection of Kubernetes and AI, we’d love your input. The working group follows an open contribution model: you can review our proposals, join our weekly meetings, or start discussions on our GitHub repository.

To learn more:
- Visit the working group’s umbrella GitHub repository.
- Read the working group’s charter.
- Join the weekly meeting on Thursdays at 2 PM EST.
- Connect with the working group on Slack in #wg-ai-gateway (visit https://slack.k8s.io/ for an invitation).
- Join the AI Gateway mailing list.

The future of AI infrastructure in Kubernetes is being built today. Join us to learn how you can contribute and help shape the future of AI-aware gateway capabilities in Kubernetes.
