Sandbox runtime options

Each project gets a dedicated sandbox where the agent can execute commands against a persistent filesystem. Because agent behavior is non-deterministic, the sandbox must be isolated from other projects and from the host system.

Why isolation matters

The agent executes arbitrary commands, and its behavior cannot be fully predicted. Sandboxes need isolation for several reasons: to prevent one project from accessing another project’s data, to contain faults so a misbehaving sandbox doesn’t affect others, and to protect the host system from escape attempts. The options below represent different trade-offs between operational overhead and the size of the shared attack surface between the sandbox and the host.
Option          | Isolation model                                      | What it requires
Default runtime | Shared kernel, Linux namespaces                      | Nothing beyond standard EKS
Sysbox          | User-namespace remapping, filesystem virtualization  | RuntimeClass configuration on nodes
MicroVM         | Separate kernel per container                        | Hardware virtualization support

Option 1: Default runtime

The default runtime is standard container isolation using Linux namespaces and cgroups; all containers share the host kernel. This is the fastest path to a working deployment: a standard EKS cluster with managed node groups works out of the box with no additional configuration.

The trade-off is a large attack surface. All containers on a node share the kernel, so a kernel vulnerability is exploitable from any container. Isolation depends on namespace and cgroup boundaries, which can be bypassed through misconfigured capabilities, seccomp profiles, or mounted host paths, and information about the host is visible through /proc and /sys.
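With the default runtime, the sandbox's effective isolation therefore depends on how the pod itself is configured. As an illustration only (the pod name, image, and values below are placeholders, not part of any particular deployment), a sandbox pod under this option would typically drop capabilities, keep the default seccomp profile, and avoid host path mounts:

  apiVersion: v1
  kind: Pod
  metadata:
    name: sandbox-example               # placeholder name
  spec:
    automountServiceAccountToken: false # the sandbox has no need for cluster API credentials
    containers:
      - name: sandbox
        image: sandbox-image:latest     # placeholder image
        securityContext:
          runAsNonRoot: true
          allowPrivilegeEscalation: false
          capabilities:
            drop: ["ALL"]               # misconfigured capabilities are a common escape path
          seccompProfile:
            type: RuntimeDefault        # keep the runtime's default syscall filter in place
    # no hostPath volumes: a mounted host path bypasses namespace isolation entirely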

Option 2: Sysbox

Sysbox provides stronger isolation through user-namespace remapping and filesystem virtualization. User-namespace remapping means that root inside the container is an unprivileged user on the host, which closes many privilege escalation paths that exist in default containers. Sysbox also virtualizes /proc and /sys, preventing containers from discovering information about the host or other containers.

Containers still share the host kernel, so kernel vulnerabilities remain a risk: Sysbox reduces the attack surface but does not eliminate it.

What it requires:
  • A RuntimeClass resource pointing to the sysbox-runc handler
  • Nodes with Sysbox installed (via custom AMI or node bootstrap)
Karpenter is often used alongside Sysbox because it can provision Sysbox-configured nodes on demand. However, Karpenter is not required. A static node group or Cluster Autoscaler works as long as the nodes have Sysbox installed.
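The RuntimeClass itself is a small cluster-scoped object, and sandbox pods opt into it by name. A minimal sketch, assuming Sysbox-enabled nodes carry a label the scheduler can match on (the label used here is illustrative, not something Sysbox defines):

  apiVersion: node.k8s.io/v1
  kind: RuntimeClass
  metadata:
    name: sysbox-runc
  handler: sysbox-runc          # must match the handler name configured in containerd on the node
  scheduling:
    nodeSelector:
      sandbox-runtime: sysbox   # illustrative label applied to Sysbox-enabled nodes
  ---
  # A sandbox pod selects the runtime by name:
  apiVersion: v1
  kind: Pod
  metadata:
    name: sandbox-example       # placeholder
  spec:
    runtimeClassName: sysbox-runc
    containers:
      - name: sandbox
        image: sandbox-image:latest   # placeholder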

Option 3: MicroVM

MicroVM runtimes run each container in a lightweight virtual machine with its own kernel, so each sandbox is isolated at the hardware virtualization layer. The hypervisor enforces memory and CPU isolation between VMs, and a compromised sandbox kernel cannot affect other sandboxes or the host because the boundary is enforced by hardware, not by kernel data structures. The hypervisor itself also has a much smaller attack surface than the Linux kernel.

Kata Containers is the most common way to run MicroVMs in Kubernetes. It integrates with containerd via RuntimeClass and supports multiple hypervisor backends, including QEMU, Firecracker, and Cloud Hypervisor.

What it requires:
  • Bare metal EC2 instances (e.g., i3.metal, m5.metal) to expose hardware virtualization (KVM)
  • Kata Containers (or another MicroVM runtime) installed and configured
  • A RuntimeClass resource pointing to the Kata handler
Resource overhead is higher than container-based isolation because each sandbox runs a full (lightweight) VM.
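As with Sysbox, sandbox pods reach the MicroVM runtime through a RuntimeClass. A minimal sketch, assuming a handler named kata; the actual handler name and the per-VM overhead values depend on how Kata is installed and which hypervisor backend it uses:

  apiVersion: node.k8s.io/v1
  kind: RuntimeClass
  metadata:
    name: kata
  handler: kata                 # handler name comes from the Kata install; treat it as an assumption here
  overhead:
    podFixed:                   # accounts for the guest kernel and VMM; values are illustrative
      memory: "160Mi"
      cpu: "250m"
  scheduling:
    nodeSelector:
      sandbox-runtime: kata     # illustrative label for the bare metal nodes that have Kata installed

Declaring overhead lets the scheduler reserve the per-VM cost described above when placing sandbox pods, so runtime-equipped nodes are not overcommitted.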

Elasticity

New projects create new sandbox pods. When node capacity is exhausted, these pods remain pending until the cluster scales. With the default runtime, any node can run sandbox pods, so whichever cluster autoscaler is already in place handles this automatically. With Sysbox or MicroVM runtimes, sandbox pods can only schedule on nodes that have the runtime installed. The node pool for these workloads must be configured to scale independently, either through Cluster Autoscaler (scaling a dedicated node group) or Karpenter (provisioning runtime-configured nodes on demand).
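In practice this scheduling constraint is expressed with ordinary Kubernetes primitives: the runtime-equipped node pool carries a label and a taint, and sandbox pods select and tolerate them. A sketch, reusing the illustrative label and RuntimeClass names from the examples above (none of them are fixed names):

  apiVersion: v1
  kind: Pod
  metadata:
    name: sandbox-example           # placeholder
  spec:
    runtimeClassName: sysbox-runc   # or kata, depending on the option chosen above
    nodeSelector:
      sandbox-runtime: sysbox       # only nodes in the runtime-equipped pool carry this label
    tolerations:
      - key: sandbox-runtime        # the matching taint keeps other workloads off these nodes
        operator: Exists
        effect: NoSchedule
    containers:
      - name: sandbox
        image: sandbox-image:latest # placeholder

A pending pod with this selector is what Cluster Autoscaler (scaling the dedicated node group) or Karpenter (provisioning a matching node) reacts to.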