The Kubernetes operator that manages AI agents as first-class resources.
No randomness. No surprises. Every action tracked, every cost controlled, every tool permitted.
Website · Docs · Quick Start · Community
Non-root pods. Read-only filesystem. All capabilities dropped. Tool allow/deny lists with per-tool trust levels. MCP server auth enforced at admission. Output validation on every pipeline step - regex, JSON schema, and LLM-as-judge semantic grading. Prompt injection defense built in. Secrets always in Kubernetes Secrets, never inlined. Not configuration you opt into. How every agent runs.
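To make the allow/deny and output-validation ideas concrete, here is a minimal Python sketch. It is not kubeswarm's API: `ToolPolicy`, `permits`, and `validate_output` are hypothetical names, and both the default-deny behavior and the rule that deny overrides allow are assumptions this README does not confirm.

```python
import re
from dataclasses import dataclass, field


@dataclass
class ToolPolicy:
    """Hypothetical allow/deny list; not the operator's actual schema."""
    allow: set = field(default_factory=set)
    deny: set = field(default_factory=set)

    def permits(self, tool: str) -> bool:
        # Assumption: deny takes precedence over allow.
        if tool in self.deny:
            return False
        # Assumption: anything not explicitly allowed is rejected.
        return tool in self.allow


def validate_output(text: str, pattern: str) -> bool:
    # Regex validation for one pipeline step: the whole output must match.
    return re.fullmatch(pattern, text) is not None


policy = ToolPolicy(allow={"web_search", "read_file"}, deny={"shell_exec"})
```

In this sketch a tool that appears in neither list is refused, which matches the "every tool permitted" framing above but remains a guess about the real default.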
SwarmBudget enforces token limits with a hard stop. Tasks rejected before tokens are spent, not after. Per-action token tracking in the audit trail. Spend attribution across agents, tools, and pipeline steps. Daily limits, per-call ceilings, thinking token caps for reasoning models. Budget alerts via Slack, email, or webhook before you hit the wall. You set the ceiling. The operator enforces it.
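The "rejected before tokens are spent" behavior can be pictured as an admission check that runs ahead of the model call. The following is a sketch under stated assumptions, not SwarmBudget's implementation: `TokenBudget`, `admit`, `record`, and the idea of checking an up-front token estimate are all hypothetical.

```python
from dataclasses import dataclass


class BudgetExceeded(Exception):
    """Raised when a task would push spending past a configured limit."""


@dataclass
class TokenBudget:
    daily_limit: int        # hypothetical field: total tokens allowed per day
    per_call_ceiling: int   # hypothetical field: maximum tokens for one call
    used_today: int = 0

    def admit(self, estimated_tokens: int) -> None:
        # Check happens before any model call, so nothing is spent on rejection.
        if estimated_tokens > self.per_call_ceiling:
            raise BudgetExceeded("per-call ceiling exceeded")
        if self.used_today + estimated_tokens > self.daily_limit:
            raise BudgetExceeded("daily limit exceeded")

    def record(self, actual_tokens: int) -> None:
        # Per-action tracking: add what the call actually consumed.
        self.used_today += actual_tokens
```

A caller would run `admit` first, make the model call only if no exception is raised, and then `record` the real usage.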
Pipeline DAGs with step dependencies and output validation. LLM-routed dispatch that picks the best agent per request. Agent-to-agent delegation with the advisor pattern. SwarmRegistry indexes capabilities so teams resolve agents at runtime, not hardcoded names. Event triggers - cron, webhook, chain - all as CRDs. Artifact handoff between steps via S3 or GCS. Context compression so agents run indefinitely without hitting context limits.
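As an illustration of how step dependencies determine run order in a pipeline DAG, the sketch below uses Python's standard-library `graphlib`. The step names and the `execution_order` helper are hypothetical, and the operator's own scheduler may work differently.

```python
from graphlib import CycleError, TopologicalSorter


def execution_order(steps: dict) -> list:
    """Return step names in an order that respects dependencies.

    `steps` maps each step to the set of steps it depends on.
    Raises CycleError if the dependencies form a cycle.
    """
    return list(TopologicalSorter(steps).static_order())


# Hypothetical pipeline: each step lists the steps that must finish first.
steps = {
    "research": set(),
    "draft": {"research"},
    "review": {"draft"},
    "publish": {"draft", "review"},
}
order = execution_order(steps)
```

Rejecting cyclic definitions up front is one plausible reason to model pipelines as DAGs: a cycle would otherwise leave steps waiting on each other forever.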
Semantic health checks that ask the model, not just the container. Catches degraded models, broken tool connections, and prompt corruption that HTTP liveness misses. Structured audit trail with causal chain tracing across agents - every tool call, every token, every decision attributed and queryable via CLI. OTel metrics and traces. Notifications on budget thresholds, failures, and degradation via Slack, email, or webhook. |
KEDA-based autoscaling from queue depth. Zero replicas when idle, instant scale-up on demand. Multi-provider - Anthropic, OpenAI, Google Gemini, Ollama, or any provider via gRPC plugin. Change one field to switch. Pluggable backends for everything - queue, vector store, artifact storage. No bundled infrastructure, no vendor SDK dependency. Vector memory so agents recall findings across tasks via pgvector or Qdrant. Bring your own everything. |
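Scaling from queue depth can be approximated by dividing pending work by a per-replica target and clamping the result. This is a simplified sketch, not KEDA's algorithm: `desired_replicas` and its parameters are hypothetical, and real autoscalers add activation thresholds, cooldowns, and stabilization windows.

```python
import math


def desired_replicas(queue_depth: int, tasks_per_replica: int, max_replicas: int) -> int:
    """Estimate replica count from the number of queued tasks."""
    if queue_depth <= 0:
        # Idle queue: scale to zero.
        return 0
    # Round up so a partially filled "slot" still gets a replica,
    # then cap at the configured maximum.
    return min(max_replicas, math.ceil(queue_depth / tasks_per_replica))
```

With a target of 5 tasks per replica, 12 queued tasks would call for 3 replicas in this model, and an empty queue for none.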
SwarmPolicy enforces namespace-wide guardrails that agent authors cannot weaken. Token ceilings, tool restrictions, model allowlists, output validation requirements - all set by platform teams, all enforced by the operator. Audit-then-warn-then-enforce rollout. Agents re-evaluated on policy change. Compliance visible in kubectl. The LimitRange for AI agents. Platform teams set the rules. Agent authors work within them. No exceptions. |
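One way to read the audit-then-warn-then-enforce rollout is that the same checks run in every mode, and only the final mode blocks an agent. The sketch below assumes exactly that; `NamespacePolicy`, `evaluate`, the field names, and the mode strings are hypothetical and not taken from the SwarmPolicy CRD.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NamespacePolicy:
    max_tokens_per_call: int          # hypothetical ceiling set by the platform team
    allowed_models: frozenset         # hypothetical model allowlist
    mode: str = "enforce"             # assumed values: "audit", "warn", "enforce"


def evaluate(policy: NamespacePolicy, requested_tokens: int, model: str):
    """Return (admitted, violations) for one agent spec."""
    violations = []
    if requested_tokens > policy.max_tokens_per_call:
        violations.append("token ceiling exceeded")
    if model not in policy.allowed_models:
        violations.append("model not in allowlist")
    # Assumption: only "enforce" blocks; "audit" and "warn" just report.
    admitted = not violations or policy.mode != "enforce"
    return admitted, violations
```

Because violations are computed the same way in every mode, a platform team could watch the audit results before switching the policy to block.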
- kubeswarm - Core operator and runtime
- helm-charts - Production Helm chart
- kubeswarm-cli - Run pipelines locally without a cluster
- kubeswarm-docs - Documentation
- kubeswarm-cookbook - Pipeline examples