Running Claude Code with full permissions inside a Docker container is a terrible idea. I did it anyway for about a week, then built something better.
Anthropic has an internal platform — people have been calling it Antspace since it got reverse-engineered from the Claude Code source — that runs AI coding tasks in isolated environments. It's part of a vertical stack they're building internally: intent goes in, code comes out, and the agent never touches the host machine.
I wanted that. Not the whole platform-as-a-service thing, just the core idea: give Claude Code a prompt, let it run with zero permission restrictions, stream the output back, grab any files it created, and destroy everything when it's done. On a single Linux box sitting in my office.
The result is about 3,200 lines of Go and 860 lines of TypeScript. It boots a fresh Linux VM in ~4 seconds, runs Claude Code inside it, and tears it down when the task finishes. Three ways to use it: a CLI, a REST API with a web dashboard, and an MCP server so Claude Code on other machines can delegate tasks to it.
This first post is about why I built it this way. Parts 2 and 3 get into the actual implementation.
## The container problem
`--dangerously-skip-permissions` — that's the flag that tells Claude Code to stop asking before it runs shell commands or writes files. It just does whatever it thinks it needs to. For autonomous tasks, you need this: Claude can't ask for confirmation when there's nobody watching.
The question is where you let it run.
Docker is the obvious first thought. Fast startup, everyone knows it, easy to orchestrate. But containers share the host kernel. Every container on the machine issues syscalls to the same Linux kernel, and a kernel vulnerability is a vulnerability in every container on the host. The isolation boundary is the container runtime, not hardware — and that surface area is big.
For most workloads this is fine. Running a web server in Docker? No worries. But running an AI agent that can execute arbitrary shell commands with root-level permissions? That's a different threat model. A container escape gives you the host. And you've just given the thing inside the container permission to try anything.
Anthropic's own approach to sandboxing Claude Code uses OS-level primitives — bubblewrap on Linux, Seatbelt on macOS — for filesystem and network isolation. They report an 84% reduction in permission prompts internally. That's smart for the normal use case where Claude is helping you write code in your own project. But I wanted something more aggressive: full isolation where even a kernel exploit can't reach the host.
## Why Firecracker
Firecracker is what AWS built for Lambda and Fargate. Each MicroVM is a real KVM-backed virtual machine with its own guest kernel, its own memory space, and hardware-enforced isolation via Intel VT-x or AMD-V. The attack surface is KVM plus Firecracker's deliberately minimal device model — a surface AWS has spent years keeping small.
The trade-off is boot time. Containers start in under a second. Firecracker VMs take about 4 seconds on my hardware once you account for the guest kernel boot, systemd init, and the agent process starting up. For tasks that typically run 20-120 seconds, 4 seconds of overhead is nothing.
Each VM also copies a 4GB rootfs image. Sparse copies make this fast (<1 second), but it does use disk. On a machine with a 1TB NVMe, I'm not losing sleep over it.
The hardware is an AMD Ryzen 5 5600GT with 30GB of RAM. Nothing exotic. About $400 worth of parts sitting under my desk. Each VM gets 2GB of RAM by default, so I can run roughly 12-13 VMs concurrently before the host runs out of memory.
## Talking to a VM without a network
This was my favourite bit to figure out.
The obvious way to communicate with a process inside a VM is SSH. Set up keys, open a port, connect over the network. But SSH means key management, an open network port inside the VM, and another service to configure. If the guest's network breaks during a task, you've lost your control channel.
vsock (`AF_VSOCK`, address family 40) is a kernel-level host-guest communication channel. It doesn't touch the network stack: no IP addresses, no ports, no keys. Firecracker exposes the guest's vsock as a Unix domain socket on the host side — you connect to the socket, send `CONNECT <port>\n`, and you're talking directly to a process inside the VM.
```go
func Connect(jailID string, port int) (net.Conn, error) {
	socketPath := fmt.Sprintf("/srv/jailer/firecracker/%s/root/vsock.sock", jailID)
	conn, err := net.Dial("unix", socketPath)
	if err != nil {
		return nil, err
	}
	fmt.Fprintf(conn, "CONNECT %d\n", port)
	// Firecracker replies "OK <port>\n" once the guest accepts.
	if _, err := bufio.NewReader(conn).ReadString('\n'); err != nil {
		conn.Close()
		return nil, err
	}
	return conn, nil
}
```
On the guest side, Go's standard library doesn't support `AF_VSOCK` — address family 40 doesn't exist in the `net` package. So the guest agent uses raw syscalls:
```go
fd, err := syscall.Socket(40, syscall.SOCK_STREAM, 0) // AF_VSOCK = 40
if err != nil {
	return err
}
// Manually construct struct sockaddr_vm (16 bytes):
// u16 family, u16 reserved, u32 port, u32 cid, then zero padding.
sa := [16]byte{}
*(*uint16)(unsafe.Pointer(&sa[0])) = 40           // family
*(*uint32)(unsafe.Pointer(&sa[4])) = uint32(port) // port (9001)
*(*uint32)(unsafe.Pointer(&sa[8])) = 0xFFFFFFFF   // cid: VMADDR_CID_ANY
syscall.RawSyscall(syscall.SYS_BIND, uintptr(fd), uintptr(unsafe.Pointer(&sa[0])), 16)
syscall.RawSyscall(syscall.SYS_LISTEN, uintptr(fd), 5, 0)
```
Yeah, that's `unsafe.Pointer` and manual struct layout. Not the prettiest Go you'll ever write. But it works, it's fast, and the whole vsock layer is about 160 lines shared between both binaries.
The wire protocol is dead simple — length-prefixed JSON frames:
```go
func WriteFrame(w io.Writer, v interface{}) error {
	data, err := json.Marshal(v)
	if err != nil {
		return err
	}
	if err := binary.Write(w, binary.BigEndian, uint32(len(data))); err != nil {
		return err
	}
	_, err = w.Write(data)
	return err
}
```
Each operation (ping, exec, write files, read file) opens a new connection, sends one request, reads the response, and closes. Connection-per-request. Not fancy, but vsock connections are local and effectively instant, so there's no reason to complicate things with multiplexing.
## The shape of the thing
The whole system is two Go binaries — the orchestrator (runs on the host) and the agent (runs inside each VM).
```mermaid
graph TD
    subgraph "Host — orchestrator binary"
        API["REST API + WebSocket :8080"]
        MCP["MCP Server :8081"]
        VM["VM Manager"]
        NET["TAP + iptables"]
        TASK["Task Runner"]
        STREAM["Pub/Sub Hub"]
        VSOCK["vsock Client"]
    end
    subgraph "Guest — agent binary"
        AGENT["Guest Agent vsock:9001"]
        CLAUDE["Claude Code"]
    end
    API --> TASK
    MCP --> TASK
    TASK --> VM
    VM --> NET
    TASK --> VSOCK
    VSOCK --> AGENT
    AGENT --> CLAUDE
    TASK --> STREAM
    STREAM --> API
```
The orchestrator is a single 14MB binary with the React dashboard embedded via `//go:embed`. Copy it to a server, run it with sudo, done. Seven Go dependencies total — `chi` for routing, `netlink` for TAP devices, `go-iptables` for firewall rules, `mcp-go` for the MCP protocol, and a few others.
The agent is a 2.5MB static binary compiled with `CGO_ENABLED=0`. It ships inside the VM's rootfs and starts via systemd on boot. Within about a second of the VM coming up, the agent is listening on vsock port 9001 and ready to accept commands.
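The systemd side of that is tiny. A plausible sketch of the unit — paths and names are my assumptions, not the project's actual file:

```ini
[Unit]
Description=VM guest agent (vsock port 9001)
After=local-fs.target

[Service]
ExecStart=/usr/local/bin/agent
Restart=on-failure

[Install]
WantedBy=multi-user.target
```

Since vsock needs no network, the unit doesn't have to wait for `network-online.target` — the agent can be accepting commands before the guest even has an IP.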
They share exactly one file — `internal/agent/protocol.go` — which defines the wire protocol types and framing functions. Everything else is independent.
## What a task looks like
You give it a prompt. It does the rest.
- Generate a task ID and VM name
- Copy the base rootfs image (sparse, <1 second)
- Inject network config into the rootfs
- Create a TAP device and iptables rules for internet access
- Launch Firecracker via the jailer
- Poll vsock until the agent responds (~1 second)
- Inject credentials and files via vsock
- Run Claude Code with streaming output
- Collect any files Claude created
- Destroy the VM
From the CLI it looks like this:
```sh
sudo ./bin/orchestrator task run \
  --prompt "Write a Python script that generates Fibonacci numbers" \
  --ram 2048 \
  --vcpus 2 \
  --timeout 120
```
Output streams to your terminal in real time. When it's done:
```
=== Task Complete ===
ID: a3bfca80
Status: completed
Exit: 0
Cost: $0.0582
Files: [fibonacci.py]
```
The VM is gone. The rootfs is deleted. The TAP device and iptables rules are cleaned up. All that's left is the result files in `/opt/firecracker/results/a3bfca80/`.
Or you use the MCP server, and Claude Code on your laptop delegates the task to a VM on the box under your desk. Claude spawning Claude. That bit is properly cool, and I'll get into it in Part 3.
## Why Go
Quick aside on this because people always ask.
Go produces static binaries. The agent needs to be a single file with zero dependencies that runs inside a minimal Debian guest — CGO_ENABLED=0 makes this trivial. The orchestrator needs to manage concurrent VMs, and goroutines are a natural fit for that. Syscall support is first-class, which matters when you're doing raw vsock operations. And it compiles in about 2 seconds, which is nice when you're iterating.
```makefile
build-agent:
	CGO_ENABLED=0 GOOS=linux GOARCH=amd64 go build -o bin/agent -ldflags="-s -w" ./cmd/agent
```
That `-ldflags="-s -w"` strips the symbol table (`-s`) and DWARF debug info (`-w`), dropping the agent binary from ~3.5MB to ~2.5MB. Every byte counts when you're baking it into a rootfs that gets copied for every VM.
Part 2 gets into the actual build — the rootfs, the networking (including a fun bug with Ubuntu's UFW that had me staring at iptables rules for an embarrassing amount of time), the guest agent, and the streaming pipeline that gets Claude's output from inside a VM to your browser.