The Kubernetes-Native AI Gateway

Unified control plane for LLM traffic: model routing, team token budgets, guardrails, and observability, all via Kubernetes-style CRDs.

Requests / min (across all teams)
Tokens / min (global throttling)
Guardrail blocks (last 60 minutes)
P95 latency (Ingress → Model)
Live routing simulation
Policy: teams + guardrails + tracing
K8S INGRESS → POLICY ENGINE (Teams • Budgets • RBAC) → GUARDRAILS (PII • Injection • Schema) → MODEL ROUTER
Status
Health: Healthy
Ready: Ready
Last Refresh
Gateway Info
Version: 0.4.1
Provider: AWS EKS
Observability: Tracing On
Active Models
Name            Status
gpt-4-turbo     Active
claude-3        Active
llama-3.1-70b   Shadow
Token Limits
Global Daily Limit: 85% Used

Live Policy Simulation

Visualize how your configuration affects traffic flow and enforcement.

INGRESS → TEAMS (Budgets/RBAC) → GUARD → LLM
Legend: in-flight, blocked/redacted, approved, team policy

Teams & Token Budgets

Create teams, assign budgets, and apply rate limits.
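
A team budget check can be pictured roughly as follows. This is a minimal sketch, not the gateway's implementation: `TeamBudget`, `try_consume`, and the field names are hypothetical, and the actual CRD schema is not shown on this page.

```python
from dataclasses import dataclass


@dataclass
class TeamBudget:
    """Hypothetical per-team daily token budget (names are illustrative)."""
    team: str
    daily_limit: int   # maximum tokens the team may use per day
    used: int = 0      # tokens consumed so far today

    def try_consume(self, tokens: int) -> bool:
        """Record usage and return True if the request fits the remaining budget."""
        if tokens < 0:
            raise ValueError("tokens must be non-negative")
        if self.used + tokens > self.daily_limit:
            return False  # over budget: the caller should reject the request
        self.used += tokens
        return True

    @property
    def percent_used(self) -> float:
        """Share of the daily limit already consumed, as a percentage."""
        return 100.0 * self.used / self.daily_limit


budget = TeamBudget(team="search", daily_limit=1_000)
accepted = budget.try_consume(850)   # fits: 850 <= 1000
rejected = budget.try_consume(200)   # would reach 1050 > 1000, so it is refused
```

A refused request leaves `used` unchanged, so a later, smaller request can still succeed within the same day.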

Model Registry

Add upstream models, providers, and per-model controls.
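
The sketch below illustrates one possible reading of the registry, assuming that a "Shadow" model receives a mirrored copy of traffic while only "Active" models serve responses. That interpretation, the `ModelEntry` type, the `plan_route` function, and the provider strings are all assumptions made for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class ModelEntry:
    """Hypothetical registry entry; field names are illustrative."""
    name: str
    provider: str   # placeholder value, not taken from the page
    status: str     # "active" or "shadow"


def plan_route(registry: list[ModelEntry], requested: str) -> tuple[str, list[str]]:
    """Return (serving model, shadow models that get a mirrored copy)."""
    active = {m.name for m in registry if m.status == "active"}
    if requested not in active:
        raise LookupError(f"model {requested!r} is not active")
    shadows = [m.name for m in registry if m.status == "shadow"]
    return requested, shadows


# Model names mirror the Active Models panel; providers are placeholders.
registry = [
    ModelEntry("gpt-4-turbo", "provider-a", "active"),
    ModelEntry("claude-3", "provider-b", "active"),
    ModelEntry("llama-3.1-70b", "provider-c", "shadow"),
]

serving, mirrored = plan_route(registry, "claude-3")
```

Requesting a shadow model directly raises `LookupError` in this sketch, which keeps shadow traffic from ever reaching callers.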

Guardrails

Toggle enforcement modules and configure thresholds.

PII Redaction (WASM)
Toxicity Filter
Prompt Injection
Secrets Detector
JSON Schema Enforcement
Semantic Caching (Redis)
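
To make the PII Redaction module listed above more concrete, here is a simplified sketch in Python. The gateway's module is labeled as WASM, so this is not its code; the two patterns (email addresses and US-style SSNs) and the `redact` helper are illustrative assumptions, and real PII detection covers far more cases.

```python
import re

# Deliberately simple patterns for illustration only.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
SSN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def redact(text: str) -> tuple[str, int]:
    """Replace matches with placeholders; return (redacted text, match count)."""
    text, n_email = EMAIL.subn("[EMAIL]", text)
    text, n_ssn = SSN.subn("[SSN]", text)
    return text, n_email + n_ssn


clean, hits = redact("Mail jane@example.com, SSN 123-45-6789.")
```

Returning the match count alongside the text lets a caller decide whether to forward the redacted prompt or block it once a threshold is exceeded.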

Rate Limiting
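
A common way to implement limits of this kind is a token bucket. The sketch below assumes that approach; the page does not say which algorithm the gateway uses, and `TokenBucket` with its parameters is hypothetical. Time is passed in explicitly so the behavior is deterministic.

```python
class TokenBucket:
    """Illustrative token bucket: refills continuously up to a fixed capacity."""

    def __init__(self, rate_per_min: float, capacity: float, now: float = 0.0):
        self.rate = rate_per_min / 60.0   # refill rate in units per second
        self.capacity = capacity
        self.level = capacity             # start with a full bucket
        self.last = now                   # timestamp of the last refill, in seconds

    def allow(self, cost: float, now: float) -> bool:
        """Refill for the elapsed time, then admit the request if enough remains."""
        elapsed = max(0.0, now - self.last)
        self.level = min(self.capacity, self.level + elapsed * self.rate)
        self.last = max(self.last, now)
        if cost > self.level:
            return False
        self.level -= cost
        return True


bucket = TokenBucket(rate_per_min=600, capacity=100)   # 10 units per second
first = bucket.allow(100, now=0.0)    # drains the full bucket
second = bucket.allow(50, now=1.0)    # only 10 units refilled so far
third = bucket.allow(50, now=5.0)     # 50 units available after 5 seconds
```

The `cost` can be a request count or a token count, so the same structure covers both requests-per-minute and tokens-per-minute limits.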

Observability

Enable tracing/metrics/logging exports and sampling.

OpenTelemetry Tracing
Prometheus Metrics
Request/Response Logs
PII-safe Log Redaction
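
Trace sampling is often made deterministic by hashing the trace ID, so every component reaches the same keep-or-drop decision for a given request. The sketch below assumes that approach; `should_sample` is a hypothetical helper, not an OpenTelemetry API, and the gateway's actual sampler is not described here.

```python
import hashlib


def should_sample(trace_id: str, ratio: float) -> bool:
    """Keep roughly `ratio` of traces, decided deterministically per trace ID."""
    if not 0.0 <= ratio <= 1.0:
        raise ValueError("ratio must be between 0 and 1")
    digest = hashlib.sha256(trace_id.encode("utf-8")).digest()
    # Map the first 8 bytes of the hash to a value in [0, 1).
    bucket = int.from_bytes(digest[:8], "big") / 2**64
    return bucket < ratio


kept = sum(should_sample(f"trace-{i}", 0.25) for i in range(10_000))
fraction_kept = kept / 10_000   # close to 0.25 for well-distributed IDs
```

Because the decision depends only on the trace ID and the ratio, lowering the ratio later keeps a subset of the traces that the higher ratio would have kept.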
LIVE PREVIEW: k8s-gateway.yaml (Valid CRD)