Sandbox Security Model — Research & Recommendations

Status: Phase 1 Implemented (security/sandbox-hardening branch) · Phase 2 Pending (infra team)
Date: 2025-02-18 (research) · 2025-02-19 (Phase 1 implementation)
Author: Agent-assisted research
Scope: infrastrActure sandbox container security, orchestration, and base image selection
Stable Baseline: v1.0.0 (tag df14030, Docker image infrastrActure:v1.0.0)


1. Current Architecture

What We Have Today

The infrastrActure creates sandbox containers using the Docker Engine API directly via dockerode. Each sandbox:

  • Runs a sandbox-mcp-base image (custom)
  • Gets a unique host port (range 14000–14999, ~1000 concurrent sandboxes)
  • Has configurable memory (default 512MB) and CPU limits (default 1 core)
  • Has idle timeout (30 min) and max lifetime (24h)
  • Lifecycle: pending → creating → running → paused → stopped → destroyed
  • State tracked in PostgreSQL (sandboxes table)
  • Container managed via DockerService.createContainer() + performContainerAction()
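
As a rough illustration of the port-per-sandbox scheme, the sketch below picks the first free port in the documented range. The function name and the in-memory set of used ports are assumptions for illustration only; the real allocator presumably consults the sandboxes table and would need a uniqueness guarantee there to avoid races.

```typescript
// Hypothetical helper: the range matches the one listed above, everything
// else (name, in-memory set) is an illustrative assumption.
const PORT_MIN = 14000;
const PORT_MAX = 14999;

function allocatePort(inUse: ReadonlySet<number>): number | null {
  for (let port = PORT_MIN; port <= PORT_MAX; port++) {
    if (!inUse.has(port)) return port;
  }
  // All 1000 ports in the range are taken.
  return null;
}
```

Returning null rather than throwing lets the caller decide whether to queue the request or reject it when the range is exhausted.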

Security Gaps Identified

| Gap | Current State | Risk Level | Status |
|---|---|---|---|
| No seccomp profile | Containers use Docker default seccomp | Medium | ⏳ Open |
| No capability dropping | All default capabilities retained | High | Fixed: CapDrop ALL + minimal CapAdd |
| No gVisor/Sysbox runtime | Standard runc, shared kernel | Medium | ⏳ Code ready, needs install on nodes |
| No network isolation | All sandboxes on default bridge network | High | Fixed: Per-tenant bridge, Internal, no ICC |
| No readonly rootfs | Container filesystem fully writable | Low | Fixed: restricted profile uses readonly + tmpfs |
| No PID limits | Unlimited processes inside container | Medium | Fixed: 64/256/1024 per profile |
| No user namespace remapping | Root in container = potential host root | High | ⏳ Sysbox handles this (Phase 2) |
| Docker socket not isolated | infrastrActure has Docker socket access | Medium | ⏳ Architectural (manager needs socket) |

Implementation Summary (Phase 1)

The following diagram shows the security hardening applied to every sandbox container as of the Phase 1 implementation:

Files changed: src/services/sandbox.ts (profiles, create flow), src/services/docker.ts (HostConfig fields, network CRUD)


2. Container Runtime Options

2.1 runc (Current — Docker Default)

What it is: The standard OCI container runtime. Uses Linux namespaces and cgroups for isolation.

Pros:

  • Zero overhead — maximum performance
  • Universal compatibility
  • Well-understood security model
  • Default on all Docker installations

Cons:

  • Shared kernel — a kernel exploit = host compromise
  • Default capabilities are generous (13+ capabilities)
  • No syscall interception beyond seccomp profiles

Verdict: Sufficient for most enterprise use cases if properly hardened with seccomp + capability dropping + user namespace remapping.


2.2 gVisor (runsc) — Google's Application Kernel

What it is: A user-space kernel written in Go that intercepts all syscalls from containers. The container never directly touches the host kernel. OCI-compatible runtime that replaces runc.

How it works:

  • Sentry (user-space kernel) handles syscalls
  • Gofer handles filesystem operations
  • Application → Sentry → limited host kernel calls

Installation:

# On the Docker host (mcp-server worker node)
# Assumes the gVisor apt repository and signing key have already been added
sudo apt-get install -y runsc
sudo runsc install  # Adds runtime to /etc/docker/daemon.json
sudo systemctl restart docker

# Usage
docker run --runtime=runsc myimage

Pros:

  • Strongest syscall isolation without VMs
  • Each container gets its own application kernel
  • Supports ~380 of ~450 Linux syscalls (covers 99% of real workloads)
  • Compatible with Docker, Kubernetes, dockerode
  • Well-tested at Google scale (Cloud Run, GKE Sandbox)

Cons:

  • ~5-15% performance overhead on syscall-heavy workloads
  • Some syscalls not supported (affects a few niche apps)
  • Cannot run Docker-in-Docker or privileged operations inside container
  • Requires installation on every worker node

Verdict: Best isolation-to-overhead ratio. Recommended for sandboxes where agents run arbitrary code. The syscall limitation actually benefits security — agents don't need raw socket access or kernel module loading.


2.3 Sysbox (sysbox-runc) — Docker/Nestybox "VM-like" Containers

What it is: An enhanced OCI runtime (forked from runc) that makes containers behave like lightweight VMs. Acquired by Docker in 2022. Uses Linux user namespaces + partial procfs/sysfs virtualization.

Key capability: Containers can run systemd, Docker, Kubernetes, buildx inside them — without privileged mode, without special images, without Docker socket mounting.

How it works:

  • Root in container → unprivileged user on host (always, automatically)
  • Virtualizes /proc and /sys inside the container
  • Hides host info from container
  • Locks container's initial mounts

Installation:

# Debian/Ubuntu
wget https://downloads.nestybox.com/sysbox/releases/v0.6.7/sysbox-ce_0.6.7-0.linux_amd64.deb
sudo dpkg -i sysbox-ce_0.6.7-0.linux_amd64.deb

# Usage
docker run --runtime=sysbox-runc -it ubuntu:22.04
# Inside: you can run systemd, docker, apt install anything

Pros:

  • Full VM-like environment: agents get root inside the container, mapped to an unprivileged user on the host
  • Docker-in-Docker works natively (agent can build/run containers)
  • systemd works (agent can manage services)
  • User namespace isolation is automatic and mandatory
  • No need for privileged containers ever
  • Works alongside other runtimes (runc, gVisor) on same host
  • 2x density compared to actual VMs, near-native performance
  • Apache 2.0 license, backed by Docker

Cons:

  • Weaker isolation than gVisor (still shares kernel, just uses user namespaces)
  • Not yet as battle-tested at scale as gVisor
  • Requires Linux kernel 5.12+ for best features
  • 4+ CPU, 4GB RAM minimum recommended per host

Verdict: Best choice for "full capabilities with no limitations" sandbox environments. This is the closest to the OpenClaw-style full-agent-workspace model. The agent gets a real Linux environment where it can install packages, run Docker, and manage services, while root inside the container maps to an unprivileged host user. The kernel is still shared, so this reduces host risk rather than eliminating it.


2.4 Runtime Comparison Matrix

| Feature | runc (hardened) | gVisor (runsc) | Sysbox |
|---|---|---|---|
| Isolation Level | Medium | High | Medium-High |
| Performance | Native | 85-95% | ~98% |
| Docker-in-Docker | ❌ (needs privileged) | ❌ | ✅ Native |
| systemd support | | | ✅ Native |
| Arbitrary code exec | ⚠️ Risky | ✅ Safe | ✅ Safe |
| Agent installs packages | | ✅ (limited) | ✅ Full |
| Kernel exploit risk | High | Very Low | Low |
| Host requirement | None | runsc binary | sysbox binary |
| K8s compatible | ✅ | ✅ | ✅ |
| Docker Swarm compatible | | | |

Recommendation

Tiered approach:

  1. Default sandboxes: Use Sysbox (sysbox-runc) — gives agents full capabilities (Docker-in-Docker, systemd, package installation) with automatic user namespace isolation. This is the "enterprise agent workspace" runtime.
  2. High-security sandboxes: Use gVisor (runsc) — for executing untrusted/arbitrary code from unknown sources where maximum isolation is needed.
  3. Fallback: Hardened runc with seccomp + CapDrop: ALL + user namespace remapping, for hosts where neither Sysbox nor gVisor is installed.

The runtime should be configurable per sandbox via the SandboxCreateOptions:

interface SandboxCreateOptions {
  // ... existing fields
  runtime?: 'sysbox-runc' | 'runsc' | 'runc';  // default: 'sysbox-runc'
  securityProfile?: 'full' | 'restricted' | 'untrusted';
}
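
The tiered approach could be resolved at creation time roughly as sketched below. The profile-to-runtime mapping and the rule that untrusted sandboxes never downgrade to a shared-kernel runtime are assumptions for illustration, as is the idea that the set of installed runtimes comes from the Runtimes field of the Docker Engine info endpoint.

```typescript
type Runtime = 'sysbox-runc' | 'runsc' | 'runc';
type SecurityProfile = 'full' | 'restricted' | 'untrusted';

// Assumed preference per profile, following the tiered recommendation.
const PREFERRED_RUNTIME: Record<SecurityProfile, Runtime> = {
  full: 'sysbox-runc',
  restricted: 'runc',
  untrusted: 'runsc',
};

function resolveRuntime(
  profile: SecurityProfile,
  installed: ReadonlySet<string>,
  requested?: Runtime,
): Runtime {
  const wanted = requested ?? PREFERRED_RUNTIME[profile];
  if (installed.has(wanted)) return wanted;
  // Assumption: failing is safer than silently weakening isolation
  // for code from unknown sources.
  if (profile === 'untrusted') {
    throw new Error(`Runtime ${wanted} is not installed; refusing to downgrade`);
  }
  // Hardened runc is the documented fallback for other profiles.
  return 'runc';
}
```

For example, resolveRuntime('full', new Set(['runc'])) falls back to 'runc', while the same call with the 'untrusted' profile throws.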

3. Orchestration Model

3.1 Direct Docker API via dockerode (Current Approach)

What it is: Create/start/stop/remove containers directly using the Docker Engine API. No orchestrator involved.

Pros:

  • Simplest model — full control over container lifecycle
  • No additional infrastructure needed
  • Works on single Docker host
  • Port allocation controlled by our code
  • Fast container creation (~1-2s)
  • Perfect for ephemeral, per-user sandbox containers

Cons:

  • No built-in restart policies (we handle this in code)
  • No automatic rescheduling on node failure
  • Single host = single point of failure
  • No built-in load balancing across hosts

Verdict: Correct approach for sandboxes. Sandboxes are ephemeral, per-user, short-lived containers. They don't need orchestrator features like rolling updates, replicas, or service discovery. Our PostgreSQL-backed state tracking + cleanup workers handle lifecycle management better than Swarm/K8s for this use case.
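
A cleanup worker of this kind mostly reduces to a pure expiry check over the tracked timestamps. The sketch below uses the documented 30-minute idle timeout and 24-hour maximum lifetime; the field names and the function itself are hypothetical and not taken from the actual sandboxes schema.

```typescript
// Hypothetical shape; timestamps are epoch milliseconds.
interface SandboxTimes {
  createdAt: number;
  lastActivityAt: number;
}

const IDLE_TIMEOUT_MS = 30 * 60 * 1000;      // 30 min
const MAX_LIFETIME_MS = 24 * 60 * 60 * 1000; // 24 h

type ExpiryReason = 'idle' | 'max-lifetime' | null;

function expiryReason(s: SandboxTimes, now: number): ExpiryReason {
  // Max lifetime is checked first so it is the reported reason when both apply.
  if (now - s.createdAt >= MAX_LIFETIME_MS) return 'max-lifetime';
  if (now - s.lastActivityAt >= IDLE_TIMEOUT_MS) return 'idle';
  return null;
}
```

Keeping the decision separate from the Docker calls makes the policy easy to unit test without a running daemon.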


3.2 Docker Swarm Services API

What it is: Create containers as Swarm Services with declarative state management.

How it would work:

# Instead of docker.createContainer(), use docker.createService()
docker service create --name sandbox-user123-abc \
  --replicas 1 \
  --constraint node.labels.role==sandbox \
  --limit-cpu 1 --limit-memory 512M \
  --publish mode=host,target=8080,published=14001 \
  --secret db_password \
  sandbox-mcp-base
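
For comparison with the container path, the same service expressed as a Docker Engine API service spec might look roughly like the sketch below. The field names follow the Engine API's service-create body; the builder function and its parameter names are hypothetical, and the secret is omitted because the API needs a SecretID that is not known here.

```typescript
// Hypothetical parameter bag for illustration.
interface ServiceParams {
  name: string;
  image: string;
  cpus: number;
  memoryBytes: number;
  targetPort: number;
  publishedPort: number;
}

function buildServiceSpec(p: ServiceParams) {
  return {
    Name: p.name,
    TaskTemplate: {
      ContainerSpec: { Image: p.image },
      Resources: {
        Limits: { NanoCPUs: Math.round(p.cpus * 1e9), MemoryBytes: p.memoryBytes },
      },
      Placement: { Constraints: ['node.labels.role==sandbox'] },
    },
    Mode: { Replicated: { Replicas: 1 } },
    EndpointSpec: {
      // PublishMode 'host' bypasses the routing mesh, like mode=host on the CLI.
      Ports: [{
        Protocol: 'tcp',
        TargetPort: p.targetPort,
        PublishedPort: p.publishedPort,
        PublishMode: 'host',
      }],
    },
  };
}
```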

Pros:

  • Built-in restart policies and desired state reconciliation
  • Multi-node scheduling with placement constraints
  • Native secrets management
  • Overlay network isolation between services
  • Health monitoring built-in
  • Rolling updates for base image changes

Cons:

  • Overhead for ephemeral containers — Swarm reconciliation loop adds latency
  • Port allocation is harder — routing mesh assigns ports, or must use mode=host
  • Not designed for per-user ephemeral workloads — services are long-running by design
  • More complex cleanup — must docker service rm instead of docker rm
  • Separate code path: dockerode exposes the services API (docker.createService()), but DockerService would need parallel create/inspect/remove logic for services alongside containers
  • Slower creation (~3-5s vs ~1-2s for direct API)

Verdict: Not recommended for sandbox management. Swarm Services are designed for long-running, replicated workloads — not ephemeral per-user containers. The overhead and complexity don't provide value for sandboxes. However, Swarm's overlay networks and secrets management are valuable features we can use independently.


3.3 Kubernetes

What it is: Full container orchestration platform with pods, deployments, namespaces, RBAC, network policies.

How it would work:

  • Each sandbox = a Kubernetes Pod
  • Per-user isolation via Kubernetes Namespaces
  • Network policies for inter-sandbox isolation
  • ResourceQuotas for resource limits
  • RuntimeClass for gVisor/Sysbox selection
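
To make the mapping concrete, a sandbox Pod manifest could be built roughly as follows. This is only a sketch: the builder, the label, and the assumption that a RuntimeClass with the given name already exists in the cluster are all illustrative.

```typescript
// Hypothetical options; runtimeClass must name a RuntimeClass that the
// cluster operator has created (for example one whose handler is runsc).
interface PodOptions {
  sandboxId: string;
  tenantNamespace: string;
  runtimeClass: string;
  image: string;
}

function buildSandboxPod(o: PodOptions) {
  return {
    apiVersion: 'v1',
    kind: 'Pod',
    metadata: {
      name: `sandbox-${o.sandboxId}`,
      namespace: o.tenantNamespace, // per-user isolation via namespaces
      labels: { app: 'sandbox' },
    },
    spec: {
      runtimeClassName: o.runtimeClass,
      restartPolicy: 'Never', // sandboxes are ephemeral
      containers: [{
        name: 'sandbox',
        image: o.image,
        ports: [{ containerPort: 8080 }],
        resources: { limits: { cpu: '1', memory: '512Mi' } },
      }],
    },
  };
}
```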

Pros:

  • Industry-standard orchestration
  • Namespace-level isolation
  • Network Policies for fine-grained network control
  • RuntimeClass for per-pod runtime selection
  • Horizontal Pod Autoscaler for scaling
  • Extensive ecosystem

Cons:

  • Massive infrastructure overhead — minimum 3 nodes for HA control plane
  • Complexity explosion — K8s is an entire platform, not just container management
  • Not self-hosted friendly — enterprise users would need K8s expertise
  • Overkill for our scale — we manage hundreds of sandboxes, not thousands of microservices
  • Different API paradigm — would require rewriting DockerService entirely
  • etcd dependency — additional stateful service to manage

Verdict: Not recommended for current deployment model. The infrastrActure is designed for self-hosted deployments on 1-3 Docker nodes. Kubernetes adds enormous operational complexity with minimal benefit at this scale. Consider Kubernetes only if scaling to 1000+ concurrent sandboxes or deploying to cloud-managed K8s (EKS/GKE/AKS).


3.4 Orchestration Recommendation

Stay with Direct Docker API for sandbox management, but enhance it:

  1. Network isolation: Create a dedicated Docker network per sandbox (or per tenant) instead of using the default bridge
  2. Secrets: Use Docker Swarm secrets for the infrastrActure itself (already done), but pass sandbox secrets via environment variables at container creation
  3. Multi-node (future): When scaling beyond one node, use Docker contexts to manage multiple Docker hosts and implement our own placement logic based on resource availability
  4. Health monitoring: Add periodic health checks from the infrastrActure to sandbox containers, auto-restart or destroy unhealthy ones
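
The health-monitoring item could be driven by a small decision function like the sketch below. The thresholds (three consecutive failed checks before a restart, at most two restarts before destroying) are assumptions chosen for illustration, not values from the implementation.

```typescript
type HealthAction = 'none' | 'restart' | 'destroy';

// Assumed thresholds; tune to the actual check interval.
const FAILURES_BEFORE_ACTION = 3;
const MAX_RESTARTS = 2;

function nextHealthAction(consecutiveFailures: number, restartsSoFar: number): HealthAction {
  // Tolerate transient failures below the threshold.
  if (consecutiveFailures < FAILURES_BEFORE_ACTION) return 'none';
  // Restart a limited number of times, then give up and destroy.
  return restartsSoFar < MAX_RESTARTS ? 'restart' : 'destroy';
}
```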

4. Base Image Selection

Requirements for Full-Capability Agent Sandboxes

The user's requirement is: "sandbox environments can allow an agent to have full capabilities with no limitations but perfect access — like OpenClaw but more managed and secure"

This means the base image needs:

  • Full package manager (apt/apk)
  • Common development tools pre-installed
  • Language runtimes (Python, Node.js, Go, etc.)
  • Git, curl, wget, ssh-client
  • Text editors (nano, vim)
  • Build tools (gcc, make)
  • Docker CLI (for Docker-in-Docker with Sysbox)
  • MCP server framework pre-installed

Image Options

Option A: Ubuntu 22.04 LTS

FROM ubuntu:22.04

# Non-interactive installs
ENV DEBIAN_FRONTEND=noninteractive

# Base system
RUN apt-get update && apt-get install -y \
    curl wget git vim nano less \
    build-essential gcc g++ make cmake \
    python3 python3-pip python3-venv \
    docker.io \
    openssh-client rsync \
    jq yq tree htop \
    ca-certificates gnupg \
    && rm -rf /var/lib/apt/lists/*

# Node.js 20 LTS from NodeSource (bundles npm, so the distro nodejs/npm
# packages are not installed above to avoid a package conflict)
RUN curl -fsSL https://deb.nodesource.com/setup_20.x | bash - \
    && apt-get install -y nodejs \
    && rm -rf /var/lib/apt/lists/*

# MCP sandbox server
COPY sandbox-mcp-server /usr/local/bin/
EXPOSE 8080
CMD ["sandbox-mcp-server", "--transport", "http", "--port", "8080"]

Size: ~800MB–1.2GB
Pros: Full apt ecosystem, maximum compatibility, LTS support, systemd works with Sysbox
Cons: Larger image, slower pull

Option B: Debian Slim

FROM debian:bookworm-slim
# Similar to Ubuntu but ~30% smaller

Size: ~600MB–900MB
Pros: Smaller than Ubuntu, same apt ecosystem
Cons: Fewer pre-installed tools, less tested with Sysbox docs

Option C: Alpine

FROM alpine:3.19

Size: ~200MB–400MB
Pros: Tiny base image
Cons: musl libc (breaks many tools), limited apk packages, no systemd, many agent tools fail

Option D: Wolfi/Chainguard (Security-First)

FROM cgr.dev/chainguard/wolfi-base

Size: ~300MB–500MB
Pros: Zero-CVE base, minimal attack surface, apk package manager
Cons: Limited ecosystem, experimental for agent workloads

Image Recommendation

Use Ubuntu 22.04 LTS as the sandbox-mcp-base image:

  1. Maximum agent compatibility — every tool, language, and library works
  2. Sysbox official support — Nestybox reference images are Ubuntu-based
  3. Enterprise familiarity — teams know Ubuntu, can customize
  4. LTS = 5-year security updates — until April 2027
  5. systemd support — with Sysbox, enables full service management
  6. Docker-in-Docker — with Sysbox, agent can build and run containers

Build variant images for specific use cases:

  • sandbox-mcp-python — Python-focused (conda, jupyter, data science tools)
  • sandbox-mcp-node — Node.js-focused (pnpm, yarn, TypeScript, build tools)
  • sandbox-mcp-full — Everything (large but complete — the default)
  • sandbox-mcp-minimal — Just the MCP server + shell (for restricted sandboxes)

5. Security Hardening — Implementation Plan

5.1 Immediate Changes (DockerService.createContainer)

Add these to the HostConfig when creating sandbox containers:

HostConfig: {
  // Existing
  PortBindings,
  Binds,
  
  // NEW: Security options
  SecurityOpt: ['no-new-privileges'],
  CapDrop: ['ALL'],
  CapAdd: ['CHOWN', 'SETUID', 'SETGID', 'DAC_OVERRIDE', 'FOWNER', 'NET_BIND_SERVICE'],
  PidsLimit: 256,
  
  // NEW: Resource limits (from sandbox options)
  Memory: parseMem(memoryLimit),       // e.g. 512MB
  NanoCpus: cpuLimit * 1e9,            // e.g. 1 CPU = 1e9
  
  // NEW: Runtime selection
  Runtime: runtime ?? 'runc',          // 'sysbox-runc' | 'runsc' | 'runc'; runc until the Phase 2 runtimes are installed
  
  // NEW: Network isolation
  NetworkMode: sandboxNetworkId,       // Per-tenant or per-sandbox network
  
  // NEW: Read-only root filesystem (with tmpfs for writable dirs)
  ReadonlyRootfs: securityProfile === 'untrusted',
  Tmpfs: securityProfile === 'untrusted' ? { '/tmp': 'rw,noexec,nosuid,size=256m' } : undefined,
}
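
The HostConfig above relies on a parseMem helper that is not shown. A minimal sketch of what it might look like follows, assuming that "MB" and "GB" are binary units (so 512MB is 512 × 1024 × 1024 bytes, matching how Docker's CLI treats 512m); the accepted suffixes and the toNanoCpus helper are likewise assumptions.

```typescript
// Assumed unit table: binary multiples, case-insensitive suffixes.
const UNIT_BYTES: Record<string, number> = {
  b: 1,
  kb: 1024,
  mb: 1024 * 1024,
  gb: 1024 * 1024 * 1024,
};

function parseMem(limit: string): number {
  const match = /^(\d+(?:\.\d+)?)\s*(b|kb|mb|gb)$/i.exec(limit.trim());
  if (!match) throw new Error(`Invalid memory limit: ${limit}`);
  return Math.round(Number(match[1]) * UNIT_BYTES[match[2].toLowerCase()]);
}

// Docker expresses CPU limits in billionths of a CPU.
function toNanoCpus(cpus: number): number {
  return Math.round(cpus * 1e9);
}
```

Rejecting unknown suffixes up front avoids passing NaN to the Docker API, which would otherwise surface as a less obvious error at container creation.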

5.2 Network Isolation

Create an isolated Docker network for each tenant's sandboxes:

async createSandboxNetwork(tenantId: string): Promise<string> {
  const network = await this.docker.createNetwork({
    Name: `sandbox-net-${tenantId}`,
    Driver: 'bridge',
    Internal: true,  // No internet access by default
    Options: { 'com.docker.network.bridge.enable_icc': 'false' },
    Labels: { 'mcp.sandbox.tenant': tenantId },
  });
  return network.id;
}

5.3 Sandbox Security Profiles

type SecurityProfile = 'full' | 'restricted' | 'untrusted';

const SECURITY_PROFILES = {
  full: {
    // Agent gets maximum capability (with Sysbox runtime)
    runtime: 'sysbox-runc',
    capDrop: [],  // Sysbox handles isolation
    readonlyRootfs: false,
    networkAccess: true,
    pidsLimit: 1024,
  },
  restricted: {
    // Standard sandbox — limited capabilities
    runtime: 'runc',
    capDrop: ['ALL'],
    capAdd: ['CHOWN', 'SETUID', 'SETGID', 'DAC_OVERRIDE'],
    readonlyRootfs: false,
    networkAccess: true,
    pidsLimit: 256,
  },
  untrusted: {
    // Maximum isolation — arbitrary code execution
    runtime: 'runsc',
    capDrop: ['ALL'],
    readonlyRootfs: true,
    networkAccess: false,
    pidsLimit: 64,
  },
};
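
One way to apply a profile is to translate it into the security-related HostConfig fields from section 5.1, as sketched below. The ProfileSpec and HostConfigFragment types and the function name are hypothetical; the tmpfs options are copied from the HostConfig example, and no-new-privileges is applied to every profile as it is there.

```typescript
// Hypothetical types mirroring the profile fields used above.
interface ProfileSpec {
  runtime: string;
  capDrop: string[];
  capAdd?: string[];
  readonlyRootfs: boolean;
  pidsLimit: number;
}

interface HostConfigFragment {
  Runtime: string;
  SecurityOpt: string[];
  CapDrop: string[];
  CapAdd: string[];
  PidsLimit: number;
  ReadonlyRootfs: boolean;
  Tmpfs?: Record<string, string>;
}

function profileToHostConfig(p: ProfileSpec): HostConfigFragment {
  const config: HostConfigFragment = {
    Runtime: p.runtime,
    SecurityOpt: ['no-new-privileges'],
    CapDrop: p.capDrop,
    CapAdd: p.capAdd ?? [],
    PidsLimit: p.pidsLimit,
    ReadonlyRootfs: p.readonlyRootfs,
  };
  // A read-only rootfs still needs a writable scratch directory.
  if (p.readonlyRootfs) {
    config.Tmpfs = { '/tmp': 'rw,noexec,nosuid,size=256m' };
  }
  return config;
}
```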

6. Future Architecture: Multi-Node Sandbox Cluster

For scaling beyond a single Docker host:

Each node:

  • Registered in PostgreSQL with capacity/runtime info
  • Connected to infrastrActure via Docker context (SSH)
  • Reports resource usage periodically
  • infrastrActure selects node based on: available resources, required runtime, placement constraints

This avoids Kubernetes complexity while providing multi-node capability using existing Docker infrastructure.
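
The placement step could start as simply as the sketch below: filter nodes that are healthy, offer the required runtime, and have enough free resources, then pick the least loaded. The record shapes and the "most free memory, then most free CPUs" tie-break are assumptions for illustration; the actual node registry schema is not defined yet.

```typescript
// Hypothetical node record as it might be read from PostgreSQL.
interface SandboxNode {
  id: string;
  runtimes: string[];
  freeMemoryBytes: number;
  freeCpus: number;
  healthy: boolean;
}

interface PlacementRequest {
  runtime: string;
  memoryBytes: number;
  cpus: number;
}

function selectNode(nodes: SandboxNode[], req: PlacementRequest): SandboxNode | null {
  const candidates = nodes.filter(
    (n) =>
      n.healthy &&
      n.runtimes.includes(req.runtime) &&
      n.freeMemoryBytes >= req.memoryBytes &&
      n.freeCpus >= req.cpus,
  );
  if (candidates.length === 0) return null;
  // Least-loaded first: most free memory, ties broken by free CPUs.
  return candidates.reduce((best, n) =>
    n.freeMemoryBytes > best.freeMemoryBytes ||
    (n.freeMemoryBytes === best.freeMemoryBytes && n.freeCpus > best.freeCpus)
      ? n
      : best,
  );
}
```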


7. Comparison with OpenClaw / Industry Solutions

| Feature | OpenClaw | Docker Sandbox (ours) | E2B | Daytona |
|---|---|---|---|---|
| Isolation | Firecracker microVMs | Sysbox/gVisor containers | Firecracker microVMs | Docker containers |
| Docker-in-Docker | | ✅ (Sysbox) | | |
| Boot time | ~150ms | ~1-2s | ~300ms | ~5-10s |
| Full OS | | ✅ (Sysbox) | | Limited |
| Self-hosted | ❌ (SaaS) | ✅ | ❌ (SaaS) | |
| Max isolation | Very High (VM) | High (gVisor) | Very High (VM) | Medium (runc) |
| Enterprise ready | Yes | Yes (with hardening) | Yes | Partial |
| Cost | Per-minute billing | Infrastructure only | Per-minute billing | Free + infra |

Our advantage: Fully self-hosted, no SaaS dependency, configurable runtime per sandbox, integrated with infrastrActure's tool orchestration. Enterprise customers keep data on their infrastructure.


8. Action Items

Phase 1: Immediate Security Hardening (This Sprint)

  • Remove DB_PASSWORD from Dockerfile ENV (build warning fix)
  • Add security options to DockerService.createContainer() for sandboxes
  • Add PID limits and memory/CPU limits to HostConfig
  • Create per-tenant sandbox networks
  • Add securityProfile to SandboxCreateOptions

Implemented: Commit 715a8b7 on security/sandbox-hardening branch. Three security profiles (standard, restricted, permissive) with CapDrop ALL, no-new-privileges, PID limits, per-tenant bridge networks (Internal + no ICC), and configurable container runtimes.

Phase 2: Runtime Installation (Next Sprint)

  • Install Sysbox on sandbox worker node(s) — requires infra team
  • Install gVisor (runsc) on sandbox worker node(s) — requires infra team
  • Add runtime selection to SandboxService — code ready, runtime field passes through to Docker API
  • Build sandbox-mcp-full image (Ubuntu 22.04 based)
  • Build sandbox-mcp-minimal image (for restricted use)
  • Test Docker-in-Docker with Sysbox sandboxes

Phase 3: Multi-Node (Future)

  • Node registry in PostgreSQL
  • Docker context management for multiple hosts
  • Resource-aware placement logic
  • Health monitoring across nodes

References