@kubeswarm

kubeswarm.io

kubeswarm

LLM agents, reconciled.

The Kubernetes operator that manages AI agents as first-class resources.
No randomness. No surprises. Every action tracked, every cost controlled, every tool permitted.

Website · Docs · Quick Start · Community


Secure

Non-root pods. Read-only filesystem. All capabilities dropped. Tool allow/deny lists with per-tool trust levels. MCP server auth enforced at admission. Output validation on every pipeline step - regex, JSON schema, and LLM-as-judge semantic grading. Prompt injection defense built in. Secrets always live in Kubernetes Secrets, never inlined. None of this is configuration you opt into. It is how every agent runs.

Cost-controlled

SwarmBudget enforces token limits with a hard stop. Tasks rejected before tokens are spent, not after. Per-action token tracking in the audit trail. Spend attribution across agents, tools, and pipeline steps. Daily limits, per-call ceilings, thinking token caps for reasoning models. Budget alerts via Slack, email, or webhook before you hit the wall. You set the ceiling. The operator enforces it.
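To make "rejected before tokens are spent" concrete, here is a minimal sketch of a pre-admission budget check. It is not the kubeswarm implementation: the `Budget` type, its fields, and the `Admit` method are hypothetical names chosen for illustration, and the sketch assumes the caller can estimate a task's token cost up front.

```go
package main

import (
	"errors"
	"fmt"
)

// Budget is a hypothetical in-memory stand-in for a SwarmBudget.
// The field names are illustrative and not taken from the kubeswarm API.
type Budget struct {
	DailyLimit     int // maximum tokens per day
	PerCallCeiling int // maximum tokens for any single call
	SpentToday     int // tokens already reserved or consumed today
}

var (
	ErrPerCallCeiling = errors.New("estimate exceeds per-call ceiling")
	ErrDailyLimit     = errors.New("estimate would exceed daily limit")
)

// Admit rejects a task up front when its estimated cost does not fit the
// budget. On success it reserves the estimate against the daily limit.
func (b *Budget) Admit(estimatedTokens int) error {
	if estimatedTokens > b.PerCallCeiling {
		return ErrPerCallCeiling
	}
	if b.SpentToday+estimatedTokens > b.DailyLimit {
		return ErrDailyLimit
	}
	b.SpentToday += estimatedTokens
	return nil
}

func main() {
	b := &Budget{DailyLimit: 10000, PerCallCeiling: 4000, SpentToday: 7000}
	fmt.Println(b.Admit(2000)) // fits: 7000 + 2000 <= 10000
	fmt.Println(b.Admit(5000)) // rejected: above the per-call ceiling
	fmt.Println(b.Admit(2000)) // rejected: 9000 + 2000 > 10000
}
```

The point of the sketch is the ordering: the check runs and the reservation is made before any model call, so a rejected task costs nothing.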

Orchestrated

Pipeline DAGs with step dependencies and output validation. LLM-routed dispatch that picks the best agent per request. Agent-to-agent delegation with the advisor pattern. SwarmRegistry indexes capabilities so teams resolve agents at runtime instead of hardcoding names. Event triggers - cron, webhook, chain - all as CRDs. Artifact handoff between steps via S3 or GCS. Context compression so agents run indefinitely without hitting context limits.
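As an illustration of what "pipeline DAGs with step dependencies" implies for execution order, the sketch below orders steps with Kahn's topological sort. It is an assumption about how such ordering could work, not a description of the operator's scheduler. The `Order` function and the step names are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
	"sort"
)

// Order returns the steps so that every step appears after all of its
// dependencies. deps maps a step name to the names it depends on.
// It returns an error for unknown dependencies and for cycles.
func Order(deps map[string][]string) ([]string, error) {
	indegree := make(map[string]int, len(deps))
	dependents := make(map[string][]string)
	for step, ds := range deps {
		indegree[step] = len(ds)
		for _, d := range ds {
			if _, ok := deps[d]; !ok {
				return nil, fmt.Errorf("step %q depends on unknown step %q", step, d)
			}
			dependents[d] = append(dependents[d], step)
		}
	}

	// Start with the steps that have no dependencies, sorted for stable output.
	var ready []string
	for step, n := range indegree {
		if n == 0 {
			ready = append(ready, step)
		}
	}
	sort.Strings(ready)

	order := make([]string, 0, len(deps))
	for len(ready) > 0 {
		step := ready[0]
		ready = ready[1:]
		order = append(order, step)

		// Completing a step may unlock the steps that depend on it.
		var unlocked []string
		for _, next := range dependents[step] {
			indegree[next]--
			if indegree[next] == 0 {
				unlocked = append(unlocked, next)
			}
		}
		sort.Strings(unlocked)
		ready = append(ready, unlocked...)
	}

	// Any step never reaching indegree zero is part of a cycle.
	if len(order) != len(deps) {
		return nil, errors.New("pipeline has a dependency cycle")
	}
	return order, nil
}

func main() {
	order, err := Order(map[string][]string{
		"fetch":     nil,
		"summarize": {"fetch"},
		"classify":  {"fetch"},
		"report":    {"summarize", "classify"},
	})
	fmt.Println(order, err) // [fetch classify summarize report] <nil>
}
```

In this example `classify` and `summarize` both depend only on `fetch`, so a real scheduler could run them in parallel. The sketch simply lists them in a stable order.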

Observable

Semantic health checks that ask the model, not just the container. Catches degraded models, broken tool connections, and prompt corruption that HTTP liveness misses. Structured audit trail with causal chain tracing across agents - every tool call, every token, every decision attributed and queryable via CLI. OTel metrics and traces. Notifications on budget thresholds, failures, and degradation via Slack, email, or webhook.

Scalable

KEDA-based autoscaling from queue depth. Zero replicas when idle, instant scale-up on demand. Multi-provider - Anthropic, OpenAI, Google Gemini, Ollama, or any provider via gRPC plugin. Change one field to switch providers. Pluggable backends for everything - queue, vector store, artifact storage. No bundled infrastructure, no vendor SDK dependency. Vector memory so agents recall findings across tasks via pgvector or Qdrant. Bring your own everything.

Governed

SwarmPolicy enforces namespace-wide guardrails that agent authors cannot weaken. Token ceilings, tool restrictions, model allowlists, output validation requirements - all set by platform teams, all enforced by the operator. Audit-then-warn-then-enforce rollout. Agents re-evaluated on policy change. Compliance visible in kubectl. The LimitRange for AI agents. Platform teams set the rules. Agent authors work within them. No exceptions.
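The audit-then-warn-then-enforce rollout can be pictured as one evaluation function whose result is interpreted differently per stage. The sketch below is a guess at those semantics, not the SwarmPolicy API: the `Policy`, `AgentSpec`, and `Evaluate` names and all fields are hypothetical. It assumes that audit and warn both admit the agent while reporting violations, and that only enforce rejects.

```go
package main

import "fmt"

// Mode is a hypothetical rollout stage for a namespace policy.
type Mode string

const (
	Audit   Mode = "audit"   // record violations only
	Warn    Mode = "warn"    // surface violations, still admit
	Enforce Mode = "enforce" // reject on any violation
)

// Policy holds illustrative guardrails set by a platform team.
type Policy struct {
	Mode          Mode
	MaxTokens     int
	AllowedModels []string
	DeniedTools   []string
}

// AgentSpec holds the illustrative settings an agent author requests.
type AgentSpec struct {
	Model     string
	MaxTokens int
	Tools     []string
}

func contains(list []string, s string) bool {
	for _, v := range list {
		if v == s {
			return true
		}
	}
	return false
}

// Evaluate lists every guardrail the agent violates and reports whether the
// agent is admitted under the policy's current rollout mode.
func Evaluate(p Policy, a AgentSpec) (violations []string, admitted bool) {
	if a.MaxTokens > p.MaxTokens {
		violations = append(violations,
			fmt.Sprintf("maxTokens %d exceeds ceiling %d", a.MaxTokens, p.MaxTokens))
	}
	if !contains(p.AllowedModels, a.Model) {
		violations = append(violations,
			fmt.Sprintf("model %q is not on the allowlist", a.Model))
	}
	for _, t := range a.Tools {
		if contains(p.DeniedTools, t) {
			violations = append(violations, fmt.Sprintf("tool %q is denied", t))
		}
	}
	admitted = len(violations) == 0 || p.Mode != Enforce
	return violations, admitted
}

func main() {
	policy := Policy{MaxTokens: 8000, AllowedModels: []string{"model-a"}, DeniedTools: []string{"shell"}}
	agent := AgentSpec{Model: "model-b", MaxTokens: 16000, Tools: []string{"search", "shell"}}
	for _, mode := range []Mode{Audit, Warn, Enforce} {
		policy.Mode = mode
		violations, admitted := Evaluate(policy, agent)
		fmt.Printf("%s: %d violations, admitted=%t\n", mode, len(violations), admitted)
	}
}
```

Because the violation list is the same in every mode, a platform team can see what enforce would reject before switching it on, and re-running the same function after a policy change is what re-evaluation amounts to in this sketch.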



Get started in 2 minutes

License

Pinned

  1. kubeswarm (Public)

    Kubernetes operator that manages AI agents as first-class resources

    Go 1

  2. kubeswarm-cli (Public)

    Local development CLI for running AI agent pipelines without a Kubernetes cluster

    Go

  3. helm-charts (Public)

    Helm charts for kubeswarm - AI agents as Kubernetes-native resources

    Go Template
