Sandbox Security Model — Research & Recommendations
Status: Phase 1 Implemented (security/sandbox-hardening branch) · Phase 2 Pending (infra team)
Date: 2025-02-18 (research) · 2025-02-19 (Phase 1 implementation)
Author: Agent-assisted research
Scope: infrastrActure sandbox container security, orchestration, and base image selection
Stable Baseline: v1.0.0 (tag df14030, Docker image infrastrActure:v1.0.0)
1. Current Architecture
What We Have Today
The infrastrActure creates sandbox containers using the Docker Engine API directly via dockerode. Each sandbox:
- Runs a sandbox-mcp-base image (custom)
- Gets a unique host port (range 14000–14999, ~1000 concurrent sandboxes)
- Has configurable memory (default 512MB) and CPU limits (default 1 core)
- Has idle timeout (30 min) and max lifetime (24h)
- Lifecycle: pending → creating → running → paused → stopped → destroyed
- State tracked in PostgreSQL (sandboxes table)
- Container managed via DockerService.createContainer() + performContainerAction()
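The lifecycle above can be enforced with a small transition table. The sketch below is illustrative only: the document lists the states in order but not the allowed edges, so the edges in ALLOWED_TRANSITIONS and the canTransition helper are assumptions, not the current implementation.

```typescript
// Sandbox lifecycle states, as listed in the document.
type SandboxState =
  | "pending"
  | "creating"
  | "running"
  | "paused"
  | "stopped"
  | "destroyed";

// Hypothetical transition table: which target states are reachable from each state.
// The edges are assumed for illustration; only the state names come from the source.
const ALLOWED_TRANSITIONS: Record<SandboxState, readonly SandboxState[]> = {
  pending: ["creating", "destroyed"],
  creating: ["running", "destroyed"],
  running: ["paused", "stopped", "destroyed"],
  paused: ["running", "stopped", "destroyed"],
  stopped: ["running", "destroyed"],
  destroyed: [], // terminal state
};

// Returns true when moving from `from` to `to` is permitted by the table.
function canTransition(from: SandboxState, to: SandboxState): boolean {
  return ALLOWED_TRANSITIONS[from].includes(to);
}
```

Checking the transition before updating the sandboxes row would keep the database state and the container state from drifting apart when an action is requested out of order.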
Security Gaps Identified
| Gap | Current State | Risk Level | Status |
|---|---|---|---|
| No seccomp profile | Containers use Docker default seccomp | Medium | ⏳ Open |
| No capability dropping | | High | ✅ Fixed: CapDrop ALL + minimal CapAdd |
| No gVisor/Sysbox runtime | Standard runc — shared kernel | Medium | ⏳ Code ready, needs install on nodes |
| No network isolation | | High | ✅ Fixed: Per-tenant bridge, Internal, no ICC |
| No readonly rootfs | | Low | ✅ Fixed: restricted profile uses readonly + tmpfs |
| No PID limits | | Medium | ✅ Fixed: 64/256/1024 per profile |
| No user namespace remapping | Root in container = potential host root | High | ⏳ Sysbox handles this (Phase 2) |
| Docker socket not isolated | infrastrActure has Docker socket access | Medium | ⏳ Architectural (manager needs socket) |
Implementation Summary (Phase 1)
The following diagram shows the security hardening applied to every sandbox container as of the Phase 1 implementation:
Files changed: src/services/sandbox.ts (profiles, create flow), src/services/docker.ts (HostConfig fields, network CRUD)
2. Container Runtime Options
2.1 runc (Current — Docker Default)
What it is: The standard OCI container runtime. Uses Linux namespaces and cgroups for isolation.
Pros:
- Zero overhead — maximum performance
- Universal compatibility
- Well-understood security model
- Default on all Docker installations
Cons:
- Shared kernel — a kernel exploit = host compromise
- Default capabilities are generous (13+ capabilities)
- No syscall interception beyond seccomp profiles
Verdict: Sufficient for most enterprise use cases if properly hardened with seccomp + capability dropping + user namespace remapping.
2.2 gVisor (runsc) — Google's Application Kernel
What it is: A user-space kernel written in Go that intercepts all syscalls from containers. The container never directly touches the host kernel. OCI-compatible runtime that replaces runc.
How it works:
- Sentry (user-space kernel) handles syscalls
- Gofer handles filesystem operations
- Application → Sentry → limited host kernel calls
Installation:
# On the Docker host (mcp-server worker node)
# Assumes the gVisor apt repository is already configured on this host
sudo apt-get install -y runsc
sudo runsc install # Adds the runsc runtime to /etc/docker/daemon.json
sudo systemctl restart docker
# Usage
docker run --runtime=runsc myimage
Pros:
- Strongest syscall isolation without VMs
- Each container gets its own application kernel
- Supports ~380 of ~450 Linux syscalls (covers 99% of real workloads)
- Compatible with Docker, Kubernetes, dockerode
- Well-tested at Google scale (Cloud Run, GKE Sandbox)
Cons:
- ~5-15% performance overhead on syscall-heavy workloads
- Some syscalls not supported (affects a few niche apps)
- Cannot run Docker-in-Docker or privileged operations inside container
- Requires installation on every worker node
Verdict: Best isolation-to-overhead ratio. Recommended for sandboxes where agents run arbitrary code. The syscall limitation actually benefits security — agents don't need raw socket access or kernel module loading.
2.3 Sysbox (sysbox-runc) — Docker/Nestybox "VM-like" Containers
What it is: An enhanced OCI runtime (forked from runc) that makes containers behave like lightweight VMs. Acquired by Docker in 2022. Uses Linux user namespaces + partial procfs/sysfs virtualization.
Key capability: Containers can run systemd, Docker, Kubernetes, buildx inside them — without privileged mode, without special images, without Docker socket mounting.
How it works:
- Root in container → unprivileged user on host (always, automatically)
- Virtualizes /proc and /sys inside the container
- Hides host info from container
- Locks container's initial mounts
Installation:
# Debian/Ubuntu
wget https://downloads.nestybox.com/sysbox/releases/v0.6.7/sysbox-ce_0.6.7-0.linux_amd64.deb
sudo dpkg -i sysbox-ce_0.6.7-0.linux_amd64.deb
# Usage
docker run --runtime=sysbox-runc -it ubuntu:22.04
# Inside: you can run systemd, docker, apt install anything
Pros:
- Full VM-like environment: agents get root inside the container, mapped to an unprivileged user on the host
- Docker-in-Docker works natively (agent can build/run containers)
- systemd works (agent can manage services)
- User namespace isolation is automatic and mandatory
- No need for privileged containers ever
- Works alongside other runtimes (runc, gVisor) on same host
- 2x density compared to actual VMs, near-native performance
- Apache 2.0 license, backed by Docker
Cons:
- Weaker isolation than gVisor (still shares kernel, just uses user namespaces)
- Not yet as battle-tested at scale as gVisor
- Requires Linux kernel 5.12+ for best features
- 4+ CPU, 4GB RAM minimum recommended per host
Verdict: Best choice for "full capabilities with no limitations" sandbox environments. This is the closest to the OpenClaw-style full-agent-workspace model. The agent gets a real Linux environment where it can install packages, run Docker, and manage services, while root inside the container stays mapped to an unprivileged host user. The kernel is still shared (see Cons), so host risk is reduced rather than eliminated.
2.4 Runtime Comparison Matrix
| Feature | runc (hardened) | gVisor (runsc) | Sysbox |
|---|---|---|---|
| Isolation Level | Medium | High | Medium-High |
| Performance | Native | 85-95% | ~98% |
| Docker-in-Docker | ❌ (needs privileged) | ❌ | ✅ Native |
| systemd support | ❌ | ❌ | ✅ Native |
| Arbitrary code exec | ⚠️ Risky | ✅ Safe | ✅ Safe |
| Agent installs packages | ✅ | ✅ (limited) | ✅ Full |
| Kernel exploit risk | High | Very Low | Low |
| Host requirement | None | runsc binary | sysbox binary |
| K8s compatible | ✅ | ✅ | ✅ |
| Docker Swarm compatible | ✅ | ✅ | ✅ |
Recommendation
Tiered approach:
- Default sandboxes: Use Sysbox (sysbox-runc), which gives agents full capabilities (Docker-in-Docker, systemd, package installation) with automatic user namespace isolation. This is the "enterprise agent workspace" runtime.
- High-security sandboxes: Use gVisor (runsc) for executing untrusted/arbitrary code from unknown sources where maximum isolation is needed.
- Fallback: Hardened runc with seccomp + CapDrop: ALL + user namespace remapping, for hosts where neither Sysbox nor gVisor is installed.
The runtime should be configurable per sandbox via the SandboxCreateOptions:
interface SandboxCreateOptions {
// ... existing fields
runtime?: 'sysbox-runc' | 'runsc' | 'runc'; // default: 'sysbox-runc'
securityProfile?: 'full' | 'restricted' | 'untrusted';
}
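A minimal sketch of how the tiered recommendation could resolve to a concrete runtime on a given host. The resolveRuntime function and its return shape are hypothetical; the set of installed runtimes is assumed to be supplied by the caller (for example, derived from the Runtimes field of the Docker Engine info response), and runc is assumed to always be present.

```typescript
type SandboxRuntime = "sysbox-runc" | "runsc" | "runc";

interface RuntimeChoice {
  runtime: SandboxRuntime;
  fellBack: boolean; // true when the requested runtime was unavailable
}

// Hypothetical helper: pick the requested runtime if the host has it,
// otherwise fall back to hardened runc (assumed to exist on every Docker host).
function resolveRuntime(
  requested: SandboxRuntime | undefined,
  installed: ReadonlySet<SandboxRuntime>,
): RuntimeChoice {
  const wanted: SandboxRuntime = requested ?? "sysbox-runc"; // recommended default
  if (installed.has(wanted)) {
    return { runtime: wanted, fellBack: false };
  }
  return { runtime: "runc", fellBack: true };
}
```

Callers handling untrusted code may prefer to reject the request when fellBack is true instead of silently running under a weaker runtime.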
3. Orchestration Model
3.1 Direct Docker API via dockerode (Current Approach)
What it is: Create/start/stop/remove containers directly using the Docker Engine API. No orchestrator involved.
Pros:
- Simplest model — full control over container lifecycle
- No additional infrastructure needed
- Works on single Docker host
- Port allocation controlled by our code
- Fast container creation (~1-2s)
- Perfect for ephemeral, per-user sandbox containers
Cons:
- No built-in restart policies (we handle this in code)
- No automatic rescheduling on node failure
- Single host = single point of failure
- No built-in load balancing across hosts
Verdict: Correct approach for sandboxes. Sandboxes are ephemeral, per-user, short-lived containers. They don't need orchestrator features like rolling updates, replicas, or service discovery. Our PostgreSQL-backed state tracking + cleanup workers handle lifecycle management better than Swarm/K8s for this use case.
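To illustrate the kind of decision a cleanup worker makes, the sketch below applies the 30-minute idle timeout and 24-hour maximum lifetime stated in Section 1. The SandboxRow shape and the expiryReason function are assumptions for illustration; the actual columns of the sandboxes table are not shown in this document.

```typescript
const IDLE_TIMEOUT_MS = 30 * 60 * 1000; // 30 min idle timeout
const MAX_LIFETIME_MS = 24 * 60 * 60 * 1000; // 24 h max lifetime

// Hypothetical row shape; real column names may differ.
interface SandboxRow {
  id: string;
  createdAt: Date;
  lastActivityAt: Date;
}

type ExpiryReason = "idle" | "max-lifetime" | null;

// Returns why a sandbox should be cleaned up, or null if it should keep running.
// Max lifetime is checked first so a busy sandbox still expires after 24 h.
function expiryReason(row: SandboxRow, now: Date): ExpiryReason {
  const ageMs = now.getTime() - row.createdAt.getTime();
  if (ageMs >= MAX_LIFETIME_MS) return "max-lifetime";
  const idleMs = now.getTime() - row.lastActivityAt.getTime();
  if (idleMs >= IDLE_TIMEOUT_MS) return "idle";
  return null;
}
```

Keeping the decision a pure function of the row and the current time makes it easy to unit test separately from the Docker and PostgreSQL calls.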
3.2 Docker Swarm Services API
What it is: Create containers as Swarm Services with declarative state management.
How it would work:
# Instead of docker.createContainer(), use docker.createService()
docker service create --name sandbox-user123-abc \
--replicas 1 \
--constraint node.labels.role==sandbox \
--limit-cpu 1 --limit-memory 512M \
--publish mode=host,target=8080,published=14001 \
--secret db_password \
sandbox-mcp-base
Pros:
- Built-in restart policies and desired state reconciliation
- Multi-node scheduling with placement constraints
- Native secrets management
- Overlay network isolation between services
- Health monitoring built-in
- Rolling updates for base image changes
Cons:
- Overhead for ephemeral containers — Swarm reconciliation loop adds latency
- Port allocation is harder: the routing mesh assigns ports, or we must use mode=host
- Not designed for per-user ephemeral workloads: services are long-running by design
- More complex cleanup: must docker service rm instead of docker rm
- Separate code path: dockerode exposes the services API (createService(), getService()), but DockerService is built around the container API and would need a parallel implementation
- Slower creation (~3-5s vs ~1-2s for direct API)
Verdict: Not recommended for sandbox management. Swarm Services are designed for long-running, replicated workloads — not ephemeral per-user containers. The overhead and complexity don't provide value for sandboxes. However, Swarm's overlay networks and secrets management are valuable features we can use independently.
3.3 Kubernetes
What it is: Full container orchestration platform with pods, deployments, namespaces, RBAC, network policies.
How it would work:
- Each sandbox = a Kubernetes Pod
- Per-user isolation via Kubernetes Namespaces
- Network policies for inter-sandbox isolation
- ResourceQuotas for resource limits
- RuntimeClass for gVisor/Sysbox selection
Pros:
- Industry-standard orchestration
- Namespace-level isolation
- Network Policies for fine-grained network control
- RuntimeClass for per-pod runtime selection
- Horizontal Pod Autoscaler for scaling
- Extensive ecosystem
Cons:
- Massive infrastructure overhead — minimum 3 nodes for HA control plane
- Complexity explosion — K8s is an entire platform, not just container management
- Not self-hosted friendly — enterprise users would need K8s expertise
- Overkill for our scale — we manage hundreds of sandboxes, not thousands of microservices
- Different API paradigm — would require rewriting DockerService entirely
- etcd dependency — additional stateful service to manage
Verdict: Not recommended for current deployment model. The infrastrActure is designed for self-hosted deployments on 1-3 Docker nodes. Kubernetes adds enormous operational complexity with minimal benefit at this scale. Consider Kubernetes only if scaling to 1000+ concurrent sandboxes or deploying to cloud-managed K8s (EKS/GKE/AKS).
3.4 Orchestration Recommendation
Stay with Direct Docker API for sandbox management, but enhance it:
- Network isolation: Create a dedicated Docker network per sandbox (or per tenant) instead of using the default bridge
- Secrets: Use Docker Swarm secrets for the infrastrActure itself (already done), but pass sandbox secrets via environment variables at container creation
- Multi-node (future): When scaling beyond one node, use Docker contexts to manage multiple Docker hosts and implement our own placement logic based on resource availability
- Health monitoring: Add periodic health checks from the infrastrActure to sandbox containers, auto-restart or destroy unhealthy ones
4. Base Image Selection
Requirements for Full-Capability Agent Sandboxes
The user's requirement is: "sandbox environments can allow an agent to have full capabilities with no limitations but perfect access — like OpenClaw but more managed and secure"
This means the base image needs:
- Full package manager (apt/apk)
- Common development tools pre-installed
- Language runtimes (Python, Node.js, Go, etc.)
- Git, curl, wget, ssh-client
- Text editors (nano, vim)
- Build tools (gcc, make)
- Docker CLI (for Docker-in-Docker with Sysbox)
- MCP server framework pre-installed
Image Options
Option A: Ubuntu 22.04 LTS (Recommended)
FROM ubuntu:22.04
# Non-interactive installs
ENV DEBIAN_FRONTEND=noninteractive
# Base system
RUN apt-get update && apt-get install -y \
curl wget git vim nano less \
build-essential gcc g++ make cmake \
python3 python3-pip python3-venv \
docker.io \
openssh-client rsync \
jq yq tree htop \
ca-certificates gnupg \
&& rm -rf /var/lib/apt/lists/*
# Node.js 20 LTS via NodeSource (bundles npm; the distro nodejs/npm packages are omitted to avoid conflicts)
RUN curl -fsSL https://deb.nodesource.com/setup_20.x | bash - \
&& apt-get install -y nodejs \
&& rm -rf /var/lib/apt/lists/*
# MCP sandbox server
COPY sandbox-mcp-server /usr/local/bin/
EXPOSE 8080
CMD ["sandbox-mcp-server", "--transport", "http", "--port", "8080"]
Size: ~800MB–1.2GB
Pros: Full apt ecosystem, maximum compatibility, LTS support, systemd works with Sysbox
Cons: Larger image, slower pull
Option B: Debian Slim
FROM debian:bookworm-slim
# Similar to Ubuntu but ~30% smaller
Size: ~600MB–900MB
Pros: Smaller than Ubuntu, same apt ecosystem
Cons: Fewer pre-installed tools, less tested with Sysbox docs
Option C: Alpine (NOT Recommended for Sandboxes)
FROM alpine:3.19
Size: ~200MB–400MB
Pros: Tiny base image
Cons: musl libc (breaks many tools), limited apk packages, no systemd, many agent tools fail
Option D: Wolfi/Chainguard (Security-First)
FROM cgr.dev/chainguard/wolfi-base
Size: ~300MB–500MB
Pros: Zero-CVE base, minimal attack surface, apk package manager
Cons: Limited ecosystem, experimental for agent workloads
Image Recommendation
Use Ubuntu 22.04 LTS as the sandbox-mcp-base image:
- Maximum agent compatibility — every tool, language, and library works
- Sysbox official support — Nestybox reference images are Ubuntu-based
- Enterprise familiarity — teams know Ubuntu, can customize
- LTS = 5-year security updates — until April 2027
- systemd support — with Sysbox, enables full service management
- Docker-in-Docker — with Sysbox, agent can build and run containers
Build variant images for specific use cases:
- sandbox-mcp-python: Python-focused (conda, jupyter, data science tools)
- sandbox-mcp-node: Node.js-focused (pnpm, yarn, TypeScript, build tools)
- sandbox-mcp-full: Everything (large but complete; the default)
- sandbox-mcp-minimal: Just the MCP server + shell (for restricted sandboxes)
5. Security Hardening — Implementation Plan
5.1 Immediate Changes (DockerService.createContainer)
Add these to the HostConfig when creating sandbox containers:
HostConfig: {
// Existing
PortBindings,
Binds,
// NEW: Security options
SecurityOpt: ['no-new-privileges'],
CapDrop: ['ALL'],
CapAdd: ['CHOWN', 'SETUID', 'SETGID', 'DAC_OVERRIDE', 'FOWNER', 'NET_BIND_SERVICE'],
PidsLimit: 256,
// NEW: Resource limits (from sandbox options)
Memory: parseMem(memoryLimit), // e.g. 512MB
NanoCpus: cpuLimit * 1e9, // e.g. 1 CPU = 1e9
// NEW: Runtime selection
Runtime: runtime ?? 'runc', // 'sysbox-runc' | 'runsc' | 'runc'
// NEW: Network isolation
NetworkMode: sandboxNetworkId, // Per-tenant or per-sandbox network
// NEW: Read-only root filesystem (with tmpfs for writable dirs)
ReadonlyRootfs: securityProfile === 'untrusted',
Tmpfs: securityProfile === 'untrusted' ? { '/tmp': 'rw,noexec,nosuid,size=256m' } : undefined,
}
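The HostConfig fragment above calls parseMem and multiplies cpuLimit by 1e9, but neither conversion is defined in this document. The following is one possible sketch, not the project's actual helper: it assumes binary units (so 512MB means 512 × 1024 × 1024 bytes, matching how the Docker CLI treats the m suffix), and toNanoCpus is a hypothetical name.

```typescript
// Multipliers for the unit suffixes this sketch accepts (binary units assumed).
const UNITS: Record<string, number> = {
  b: 1,
  k: 1024,
  kb: 1024,
  m: 1024 ** 2,
  mb: 1024 ** 2,
  g: 1024 ** 3,
  gb: 1024 ** 3,
};

// Converts a string such as "512MB" or "1g" into bytes for HostConfig.Memory.
function parseMem(limit: string): number {
  const match = /^(\d+(?:\.\d+)?)\s*([a-z]*)$/i.exec(limit.trim());
  if (!match) throw new Error(`Invalid memory limit: ${limit}`);
  const unit = match[2].toLowerCase() || "b"; // no suffix means bytes
  const factor = UNITS[unit];
  if (factor === undefined) throw new Error(`Unknown memory unit: ${match[2]}`);
  return Math.floor(Number(match[1]) * factor);
}

// Converts a CPU count (e.g. 0.5 or 1) into the integer NanoCpus value.
function toNanoCpus(cpus: number): number {
  return Math.round(cpus * 1e9);
}
```

Rounding NanoCpus matters because fractional CPU limits such as 0.3 do not multiply to an exact integer in floating point, and the Docker API expects an integer.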
5.2 Network Isolation
Create an isolated Docker network for each tenant's sandboxes:
async createSandboxNetwork(tenantId: string): Promise<string> {
const network = await this.docker.createNetwork({
Name: `sandbox-net-${tenantId}`,
Driver: 'bridge',
Internal: true, // No internet access by default
Options: { 'com.docker.network.bridge.enable_icc': 'false' },
Labels: { 'mcp.sandbox.tenant': tenantId },
});
return network.id;
}
5.3 Sandbox Security Profiles
type SecurityProfile = 'full' | 'restricted' | 'untrusted';
const SECURITY_PROFILES = {
full: {
// Agent gets maximum capability (with Sysbox runtime)
runtime: 'sysbox-runc',
capDrop: [], // Sysbox handles isolation
readonlyRootfs: false,
networkAccess: true,
pidsLimit: 1024,
},
restricted: {
// Standard sandbox — limited capabilities
runtime: 'runc',
capDrop: ['ALL'],
capAdd: ['CHOWN', 'SETUID', 'SETGID', 'DAC_OVERRIDE'],
readonlyRootfs: false,
networkAccess: true,
pidsLimit: 256,
},
untrusted: {
// Maximum isolation — arbitrary code execution
runtime: 'runsc',
capDrop: ['ALL'],
readonlyRootfs: true,
networkAccess: false,
pidsLimit: 64,
},
};
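As a rough sketch of how a profile entry might be translated into Docker HostConfig fields, assuming the profile shape shown above: ProfileSettings, HostConfigFragment, and toHostConfig are hypothetical names, and networkAccess is deliberately left unmapped because network selection is handled by the per-tenant network logic in Section 5.2.

```typescript
// Shape of one entry in SECURITY_PROFILES (capAdd is optional, as in the source).
interface ProfileSettings {
  runtime: "sysbox-runc" | "runsc" | "runc";
  capDrop: string[];
  capAdd?: string[];
  readonlyRootfs: boolean;
  networkAccess: boolean; // not mapped here; decided when choosing the network
  pidsLimit: number;
}

// Subset of Docker HostConfig fields that a profile controls.
interface HostConfigFragment {
  Runtime: string;
  CapDrop: string[];
  CapAdd: string[];
  ReadonlyRootfs: boolean;
  PidsLimit: number;
  SecurityOpt: string[];
  Tmpfs?: Record<string, string>;
}

// Hypothetical mapping from a profile to HostConfig fields.
function toHostConfig(p: ProfileSettings): HostConfigFragment {
  return {
    Runtime: p.runtime,
    CapDrop: [...p.capDrop],
    CapAdd: [...(p.capAdd ?? [])],
    ReadonlyRootfs: p.readonlyRootfs,
    PidsLimit: p.pidsLimit,
    SecurityOpt: ["no-new-privileges"],
    // A read-only root filesystem needs a writable scratch area.
    ...(p.readonlyRootfs
      ? { Tmpfs: { "/tmp": "rw,noexec,nosuid,size=256m" } }
      : {}),
  };
}
```

The result can be spread into the HostConfig passed to createContainer alongside the port bindings and resource limits.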
6. Future Architecture: Multi-Node Sandbox Cluster
For scaling beyond a single Docker host:
Each node:
- Registered in PostgreSQL with capacity/runtime info
- Connected to infrastrActure via Docker context (SSH)
- Reports resource usage periodically
- infrastrActure selects node based on: available resources, required runtime, placement constraints
This avoids Kubernetes complexity while providing multi-node capability using existing Docker infrastructure.
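The node selection described above could look roughly like the following. This is a sketch under assumptions: the NodeInfo and PlacementRequest shapes are hypothetical (the node registry schema is not defined yet), and the "most free memory wins" tie-break is one simple spread strategy among several possible ones.

```typescript
type SandboxRuntime = "sysbox-runc" | "runsc" | "runc";

// Hypothetical view of a registered node, as it might be read from PostgreSQL.
interface NodeInfo {
  id: string;
  runtimes: SandboxRuntime[]; // runtimes installed on the node
  freeMemoryBytes: number;
  freeCpus: number;
}

interface PlacementRequest {
  runtime: SandboxRuntime;
  memoryBytes: number;
  cpus: number;
}

// Picks a node that has the required runtime and enough free resources,
// preferring the one with the most free memory. Returns null if none fit.
function selectNode(nodes: NodeInfo[], req: PlacementRequest): NodeInfo | null {
  const candidates = nodes.filter(
    (n) =>
      n.runtimes.includes(req.runtime) &&
      n.freeMemoryBytes >= req.memoryBytes &&
      n.freeCpus >= req.cpus,
  );
  if (candidates.length === 0) return null;
  return candidates.reduce((best, n) =>
    n.freeMemoryBytes > best.freeMemoryBytes ? n : best,
  );
}
```

Because nodes report usage only periodically, a real implementation would likely also need to reserve capacity at placement time so that concurrent requests do not oversubscribe the same node.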
7. Comparison with OpenClaw / Industry Solutions
| Feature | OpenClaw | Docker Sandbox (ours) | E2B | Daytona |
|---|---|---|---|---|
| Isolation | Firecracker microVMs | Sysbox/gVisor containers | Firecracker microVMs | Docker containers |
| Docker-in-Docker | ✅ | ✅ (Sysbox) | ❌ | ✅ |
| Boot time | ~150ms | ~1-2s | ~300ms | ~5-10s |
| Full OS | ✅ | ✅ (Sysbox) | Limited | ✅ |
| Self-hosted | ❌ (SaaS) | ✅ | ❌ (SaaS) | ✅ |
| Max isolation | Very High (VM) | High (gVisor) | Very High (VM) | Medium (runc) |
| Enterprise ready | Yes | Yes (with hardening) | Yes | Partial |
| Cost | Per-minute billing | Infrastructure only | Per-minute billing | Free + infra |
Our advantage: Fully self-hosted, no SaaS dependency, configurable runtime per sandbox, integrated with infrastrActure's tool orchestration. Enterprise customers keep data on their infrastructure.
8. Action Items
Phase 1: Immediate Security Hardening (This Sprint)
- Remove DB_PASSWORD from Dockerfile ENV (build warning fix)
- Add security options to DockerService.createContainer() for sandboxes
- Add PID limits and memory/CPU limits to HostConfig
- Create per-tenant sandbox networks
- Add securityProfile to SandboxCreateOptions
Implemented: Commit 715a8b7 on security/sandbox-hardening branch. Three security profiles (standard, restricted, permissive) with CapDrop ALL, no-new-privileges, PID limits, per-tenant bridge networks (Internal + no ICC), and configurable container runtimes. Note that these implemented profile names differ from the full/restricted/untrusted names proposed in Section 5.3.
Phase 2: Runtime Installation (Next Sprint)
- Install Sysbox on sandbox worker node(s) — requires infra team
- Install gVisor (runsc) on sandbox worker node(s) — requires infra team
- Add runtime selection to SandboxService: code ready, runtime field passes through to Docker API
- Build sandbox-mcp-full image (Ubuntu 22.04 based)
- Build sandbox-mcp-minimal image (for restricted use)
- Test Docker-in-Docker with Sysbox sandboxes
Phase 3: Multi-Node (Future)
- Node registry in PostgreSQL
- Docker context management for multiple hosts
- Resource-aware placement logic
- Health monitoring across nodes