The Kubernetes-Native AI Gateway

Unified control plane for LLM traffic — model routing, team token budgets, guardrails, and observability via Kubernetes-style CRDs.
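As a rough illustration of how routing, team token budgets, and guardrails might combine on a single request, here is a minimal sketch. It is not the gateway's actual implementation or CRD schema: the names `TeamBudget`, `Gateway`, `routes`, and `blocked_terms`, and the order of the checks, are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class TeamBudget:
    """Hypothetical per-team token budget for one accounting window."""
    limit: int      # tokens allowed in the window
    used: int = 0   # tokens consumed so far

    def try_consume(self, tokens: int) -> bool:
        """Reserve tokens if the budget allows it; return False otherwise."""
        if tokens < 0:
            raise ValueError("tokens must be non-negative")
        if self.used + tokens > self.limit:
            return False
        self.used += tokens
        return True

    @property
    def percent_used(self) -> float:
        """Share of the budget consumed, as a percentage."""
        return 100.0 * self.used / self.limit if self.limit else 0.0


@dataclass
class Gateway:
    """Hypothetical policy engine: team routing + guardrails + budgets."""
    routes: dict[str, str]             # team name -> model name (assumed mapping)
    budgets: dict[str, TeamBudget]     # team name -> budget
    blocked_terms: tuple[str, ...] = ()  # lowercase substrings a guardrail rejects

    def handle(self, team: str, prompt: str, tokens: int) -> str:
        """Return a short decision string for one request."""
        if team not in self.routes or team not in self.budgets:
            return "rejected: unknown team"
        # Guardrails run before budgeting so blocked requests cost no tokens.
        if any(term in prompt.lower() for term in self.blocked_terms):
            return "blocked: guardrail"
        if not self.budgets[team].try_consume(tokens):
            return "throttled: budget exceeded"
        return f"routed: {self.routes[team]}"
```

In this sketch, a request is only routed once it passes the guardrail check and fits in the team's remaining budget; a real gateway would presumably load these policies from its custom resources rather than from in-memory dictionaries.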

Requests / min

—

Across all teams

Tokens / min

—

Global throttling

Guardrail blocks

—

Last 60 minutes

P95 latency

—

Ingress → Model

Live routing simulation

Policy: teams + guardrails + tracing

Status

Health: Healthy

Ready: Ready

Last Refresh

Gateway Info

Version: 0.4.1

Provider: AWS EKS

Observability: Tracing On

Active Models

Token Limits

Global Daily Limit: 85% Used

Live Policy Simulation