Modal vs E2B vs Daytona for agent sandboxes
Where to run the agent when you do not trust it
If you let an agent run shell commands, you want it off your laptop and out of your prod VPC. Three platforms own that space in 2026. They solve different problems.
- 01 Problem: Agent needs shell access. Untrusted code execution requires an isolation boundary (not on your laptop).
- 02 Choice: Modal / E2B / Daytona. Three different isolation and lifecycle models (microVM, gVisor, or OCI container).
- 03 Risk: Secrets inside the sandbox. Isolation stops lateral movement, not exfiltration (inject at exec time).
TL;DR · the answer, in twenty seconds
What: Modal, E2B, and Daytona each give agents an isolated execution environment, but their isolation models, startup characteristics, and secret-injection stories differ in ways that matter for production use.
Pick: E2B for fast interactive REPL loops where a session-scoped sandbox is fine. Modal for batch jobs and GPU workloads where you need sub-second cold starts and fine-grained pricing. Daytona for agents that need a persistent, full dev environment with filesystem state across runs.
Lesson: Sandbox isolation stops the blast radius. It does not solve the credential problem. An agent in a Firecracker microVM can still exfiltrate an API key it received at startup. Isolation and secret hygiene are separate concerns.
The premise of autonomous coding agents is that they run code. Not mock code, not sandboxed-in-the-marketing-sense code. Actual shell commands. File writes. Package installs. curl to arbitrary endpoints. If you build that loop on your development machine, or inside a VPC where your production database is reachable, you are one confused agent decision away from an incident.
The Replit case from mid-2025 is the canonical example: an agent deleted a production database. The technical postmortem pointed to missing guardrails, but the structural problem was simpler. The agent had credentials and filesystem access it should not have had, and no sandbox boundary stopped it from using them. That failure mode has not gone away. It has scaled. GitGuardian's 2026 State of Secrets Sprawl found that AI-assisted commits leak secrets roughly twice as often as human-only commits. The agents are running more code, and that code is touching more infrastructure.
Three platforms have become the primary targets for teams that want to run agents in genuinely isolated environments: Modal, E2B, and Daytona. They serve different use cases and differ substantially on isolation model, lifecycle, secret handling, and cost. Picking the wrong one is not catastrophic, but it wastes time and money, and some of the differences are security-relevant.
What to know in 60 seconds
- All three run code outside your machine and outside your prod VPC. That is the baseline.
- Modal runs containers under gVisor and targets batch compute, serverless functions, and GPU workloads. Cold starts are under 500ms for many configurations. Pricing is per-millisecond of actual compute time.
- E2B is purpose-built for code interpreters and agent REPLs. Each sandbox is a single-session Firecracker microVM. The API is designed around the assumption that an LLM is calling it. Startup is fast (typically under 300ms for a warmed template). Pricing is per sandbox-second.
- Daytona targets persistent dev environments via a workspace SDK. The isolation model is OCI-container-based. Workspaces persist across sessions, which is useful for agents that need filesystem continuity. Pricing is closer to a dev-machine rental than a per-request model.
- None of them solve secret injection by default. You have to think about that separately.
- Egress controls vary significantly. Modal lets you configure network policies. E2B sandboxes have full internet access by default. Daytona inherits whatever the host network policy allows.
Isolation models: what "sandboxed" actually means here
Isolation is not a binary property. The relevant question is: if the agent executes hostile code inside the sandbox, what can it reach?
Modal runs containers under gVisor, Google's open-source application kernel. gVisor intercepts syscalls in userspace rather than passing them to the host kernel. It does not give you a full VM boundary, but it blocks the most common container-escape syscalls and limits what a compromised process can do to the host. Modal provisions a fresh container per cold start unless you configure keep-warm workers. Invocations that reuse a warm worker share a process but not necessarily the same file state, depending on whether you mount persistent volumes.
E2B runs each sandbox in a Firecracker microVM. Firecracker is Amazon's open-source VMM, used in Lambda and Fargate. A microVM has a real kernel boundary: the guest runs its own kernel, so a container breakout inside the guest does not reach the host. The attack surface is smaller than a shared-kernel container. E2B sandboxes are single-session by design. When the sandbox times out or you call sandbox.kill(), the environment is destroyed. There is no residual state unless you explicitly copy files out before shutdown.
Daytona wraps workspaces in OCI containers, typically running on a dedicated host (their cloud or your own). The isolation model is standard container isolation, which means a kernel shared with the host and whatever other containers are on the same node. Daytona's security model leans on workspace separation and access controls rather than VMM-level isolation. For agents that need a full persistent dev environment, this is usually acceptable. For agents running fully untrusted third-party code, the shallower isolation is worth factoring in.
The practical ranking for isolation strength: E2B (Firecracker) > Modal (gVisor) > Daytona (OCI). Startup latency is a separate axis, and E2B tends to win there as well for fresh-environment scenarios.
Startup time matters more than people expect
Interactive agent loops are latency-sensitive. If your agent needs to run a Python snippet to verify a hypothesis, a 10-second cold start breaks the feedback loop. The platforms have different startup profiles.
E2B is optimized for this case. They maintain a pool of pre-initialized sandbox templates. A fresh sandbox from a warmed template starts in 300ms or less. Custom templates (where you install packages or configure a specific runtime) take longer on first build but cache after that. For an agent that spins up many short-lived sandboxes in sequence, E2B's model is well-matched.
Modal keeps warm workers running between invocations if you configure keep_warm (named min_containers in recent SDK releases). With no warming, a Modal function cold-starts in around 400-600ms for a Python container with common dependencies. GPU containers take longer, particularly if they load large model weights. Modal's tradeoff is that warm workers cost money even when idle. If your agent workload is bursty, you end up choosing between cold-start latency and idle compute cost.
Daytona workspaces start once and persist. The relevant latency is not cold-start but workspace-reconnect time, which is typically seconds rather than milliseconds. For a workflow where the agent returns to the same environment across a multi-hour session, this is fine. For a workflow where you spin up hundreds of isolated environments for parallel agent runs, Daytona's workspace model is not the right shape.
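To make the latency argument concrete, here is a rough back-of-the-envelope sketch. The 0.3 s and 0.5 s startup figures are taken from the numbers quoted above, and 10 s is the slow cold start used as the example at the top of this section. The function name and the 2-second task duration are illustrative assumptions, not measurements.

```python
def startup_share(startup_s: float, work_s: float) -> float:
    """Fraction of each step's wall-clock time spent waiting for the sandbox to start."""
    return startup_s / (startup_s + work_s)

# Assumed task shape: the agent runs a 2-second snippet in a fresh sandbox each step.
WORK_S = 2.0

for label, startup_s in [
    ("warmed template, ~0.3 s", 0.3),
    ("cold container, ~0.5 s", 0.5),
    ("slow cold start, 10 s", 10.0),
]:
    share = startup_share(startup_s, WORK_S)
    print(f"{label}: {share:.0%} of each step is startup")
```

The shorter the unit of work, the more the startup term dominates, which is why short interactive loops are far more sensitive to this number than batch jobs.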
Network egress and why you should care
Sandboxed does not mean network-isolated unless you configure it that way.
An agent running inside a Modal function has full outbound internet access by default. Modal lets you configure network policies, including blocking outbound access for a function or sandbox, but nothing is restricted until you opt in. If your agent is running code that wants to exfiltrate data via DNS or HTTPS and you have left the defaults in place, the sandbox does not stop it.
E2B sandboxes have full internet access and no built-in egress filtering. This is a deliberate design choice: the platform assumes you want the agent to be able to install packages, call APIs, and perform web research. If you need to constrain that, you need to do it at the network layer outside the sandbox, or use E2B's enterprise tier which supports custom networking configurations.
Daytona workspaces inherit the network configuration of the host. On Daytona's cloud, this is an open internet connection. In self-hosted Daytona, you can configure the host's network policy to restrict egress. This is more operationally complex, but it is the path if you need to say "this agent cannot reach our internal systems and cannot exfiltrate to arbitrary endpoints."
For high-sensitivity workloads, none of the three restricts egress by default, and fine-grained rules such as per-domain allowlists are not a push-button feature. You are doing network policy work yourself, whether through platform settings, cloud provider tooling, or by running the platform in a controlled environment.
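As an illustration of what "network policy work" means in practice, here is a minimal sketch of the decision an egress proxy sitting outside the sandbox might make for each outbound request. The allowlist contents and the function name are hypothetical. This is only the decision logic; it protects nothing unless it runs at a network layer the sandboxed code cannot bypass.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist: the only hosts this agent task should ever reach.
ALLOWED_HOSTS = {"pypi.org", "files.pythonhosted.org", "api.github.com"}

def egress_allowed(url: str, allowed_hosts: set[str] = ALLOWED_HOSTS) -> bool:
    """Return True only for HTTPS URLs whose host is exactly on the allowlist."""
    parts = urlsplit(url)
    if parts.scheme != "https":
        return False
    # Normalize case and a trailing dot so "PyPI.org." matches "pypi.org".
    host = (parts.hostname or "").lower().rstrip(".")
    # Exact match only: "pypi.org.evil.example" must not pass.
    return host in allowed_hosts

print(egress_allowed("https://pypi.org/simple/requests/"))
print(egress_allowed("https://pypi.org.evil.example/upload"))
```

Exact host matching is deliberately strict. Suffix or substring matching is the usual way allowlists like this get bypassed.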
Secret injection: the part that actually trips teams up
Isolation stops the blast radius. It does not automatically protect the secrets the agent needs to do its job. If you inject an AWS credential at sandbox startup as an environment variable, and the agent writes it to a file inside the sandbox, and that file gets included in an artifact the sandbox returns, you have leaked the credential. The sandbox wall is intact; the data went through a legitimate channel.
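One way to catch the artifact leak described above is to scan whatever the sandbox returns for the exact values you injected, before the orchestrator accepts it. The sketch below is a minimal version under stated assumptions: the function name and the demo key are made up, artifacts are assumed to be files in a local directory, and only the raw value and its standalone base64 form are checked. An agent that transforms the value in any other way will get past it.

```python
import base64
import tempfile
from pathlib import Path

def find_leaked_secrets(artifact_dir: Path, secrets: dict[str, str]) -> list[tuple[str, str]]:
    """Return (relative_file, secret_name) pairs where an injected value appears."""
    hits = []
    for path in sorted(artifact_dir.rglob("*")):
        if not path.is_file():
            continue
        data = path.read_bytes()
        for name, value in secrets.items():
            raw = value.encode()
            if not raw:
                continue  # an empty value would match every file
            # Raw value, plus the value base64-encoded on its own.
            needles = (raw, base64.b64encode(raw))
            if any(needle in data for needle in needles):
                hits.append((path.relative_to(artifact_dir).as_posix(), name))
    return hits

# Demo with a throwaway directory standing in for the sandbox's returned artifacts.
with tempfile.TemporaryDirectory() as tmp:
    out = Path(tmp)
    (out / "clean.txt").write_text("tests passed\n")
    (out / "report.txt").write_text("debug: MY_API_KEY=sk-demo-123\n")
    hits = find_leaked_secrets(out, {"MY_API_KEY": "sk-demo-123"})

print(hits)
```

The scan reports the secret's name and the file, never the value, so the report itself does not become a second copy of the credential.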
Each platform has its own approach to secrets.
Modal ships Modal Secrets as a first-class concept. You create a Secret object in Modal's dashboard or CLI, give it a name, and attach it to a function or container. The secret values appear as environment variables inside the function. Critically, you reference the secret by name in code rather than putting the value in code. The secret is fetched from Modal's encrypted store at invocation time. This is better than hardcoding, but the value still arrives as a plaintext environment variable inside the container. If the code running inside that container writes os.environ to a log file, or an LLM response includes the value it received in context, the credential is now in a place you did not intend. Modal Secrets are organizational-level objects: you can share them across projects, which is convenient and occasionally dangerous.
E2B handles secrets via environment variables passed at sandbox creation time. The E2B Python SDK call looks like sandbox = Sandbox(template="base", env_vars={"MY_API_KEY": os.environ["MY_API_KEY"]}). There is no dedicated secrets store inside E2B. You are expected to fetch the secret from whatever store you use (AWS Secrets Manager, GCP Secret Manager, your CI secrets, etc.) in the orchestrating process, then pass it into the sandbox at startup. This is the right pattern: the orchestrator holds long-lived credentials, the sandbox receives only what it needs for this session, and the sandbox is destroyed when the session ends. The weakness is that the secret still arrives as a plaintext env var inside the sandbox. If you are concerned about the agent reading and retransmitting its own environment, you need controls inside the sandbox to prevent that.
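The orchestrator-side half of that pattern can be sketched without any platform SDK. Below, a local child process stands in for the sandbox, and the helper name, variable names, and demo values are all hypothetical. The environment is built from an explicit dictionary instead of being inherited, so the child sees only what this task was granted. With a real platform, the same dictionary would go to the sandbox creation call.

```python
import os
import subprocess
import sys

def run_with_scoped_env(code: str, secrets: dict[str, str], timeout_s: float = 30.0) -> str:
    """Run `code` in a child Python process that sees only the given variables."""
    result = subprocess.run(
        [sys.executable, "-c", code],
        env=dict(secrets),     # replaces the parent's environment instead of extending it
        capture_output=True,
        text=True,
        timeout=timeout_s,     # bounded lifetime: the child cannot idle forever
        check=True,
    )
    return result.stdout

# Stand-in for a long-lived credential that must stay in the orchestrator.
os.environ["ORCHESTRATOR_ONLY_TOKEN"] = "long-lived-demo-value"

probe = (
    "import os; "
    "print(sorted(k for k in os.environ if k in ('MY_API_KEY', 'ORCHESTRATOR_ONLY_TOKEN')))"
)
visible = run_with_scoped_env(probe, {"MY_API_KEY": "demo-value"})
print(visible.strip())
```

This sketch assumes macOS or Linux. On Windows a few system variables would have to be passed through for the child interpreter to start.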
Daytona handles workspace secrets through workspace-level environment configuration. Secrets can be set per-workspace and persist across workspace restarts, which fits the persistent-environment model but is a weaker isolation story than E2B's session-scoped approach. A secret set on a Daytona workspace is present for the lifetime of the workspace. If the workspace persists for weeks, the secret is inside the workspace for weeks.
The practical hierarchy: E2B's session-scoped injection model is the most disciplined about secret lifetime. Modal Secrets are convenient and broadly available, which is useful and occasionally a vector. Daytona's persistent secrets match the persistent-workspace use case but require more careful rotation discipline.
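Secret lifetime can also be bounded on the orchestrator side, whichever platform runs the code. The sketch below shows a grant that refuses to hand out its value after a time limit. The class and function names are hypothetical, and this is bookkeeping only: it limits how long the orchestrator keeps injecting a value, but real expiry has to be enforced by the issuer, for example through short-lived credentials, because code that has already read the value can keep a copy.

```python
import time
from dataclasses import dataclass, field
from typing import Callable

@dataclass(frozen=True)
class SecretGrant:
    name: str
    value: str = field(repr=False)   # keep the value out of logs and tracebacks
    expires_at: float = 0.0          # seconds on the same clock used to issue the grant

    def read(self, now: Callable[[], float] = time.monotonic) -> str:
        """Return the value, or refuse once the grant has expired."""
        if now() >= self.expires_at:
            raise PermissionError(f"grant for {self.name} has expired")
        return self.value

def issue_grant(name: str, value: str, ttl_s: float,
                now: Callable[[], float] = time.monotonic) -> SecretGrant:
    """Create a grant that is valid for ttl_s seconds from now."""
    return SecretGrant(name=name, value=value, expires_at=now() + ttl_s)

# Demo with a fake clock so the expiry is deterministic.
clock = [100.0]
fake_now = lambda: clock[0]

grant = issue_grant("MY_API_KEY", "demo-value", ttl_s=30.0, now=fake_now)
value_while_valid = grant.read(now=fake_now)

clock[0] = 131.0  # 31 seconds later
try:
    grant.read(now=fake_now)
    expired = False
except PermissionError:
    expired = True

print(value_while_valid, expired)
```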
Persistence: state across runs
This dimension drives the use-case split more than any other.
Modal is stateless by default. A function invocation gets an empty container filesystem. Persist data by mounting a Modal Volume or writing to cloud storage. The function itself holds no state between invocations unless you use a shared Volume. For agent workflows that are self-contained per invocation, this is clean. For agent workflows that build up workspace state over time (installing packages, writing files, modifying configs), stateless invocations mean re-running setup on every call.
E2B sandboxes are stateless by design. Each sandbox starts from a template. Data written inside the sandbox dies with the sandbox unless you explicitly call the file-read API to extract it before shutdown. E2B provides a filesystem API that lets you read files out of the sandbox or write files into it. For an interactive REPL loop where the agent generates some output, reads it, and the session ends, this works well. For workflows that need to remember which packages are installed, or which files were created, you are managing that state externally and re-injecting it at sandbox startup.
Daytona workspaces persist. Files survive between agent sessions. Installed packages survive. Git state survives. This is the defining advantage for full-dev-environment use cases. An agent that is working on a coding task over multiple hours, or across multiple sessions, can return to the same workspace and pick up where it left off. The cost is that the sandbox is less disposable: if the agent corrupts the workspace state, you deal with a degraded environment rather than simply spinning up a fresh one.
Cost model in practice
Pricing structures differ enough that the "cheapest" option depends entirely on your workload shape.
Modal charges per millisecond of CPU (and per millisecond of GPU for GPU instances). If your agent runs a 2-second Python function, you pay for 2 seconds. Idle time between invocations costs nothing unless you are running keep-warm workers. For bursty batch workloads where many invocations run and finish quickly, Modal's pricing is hard to beat. For workloads where you need a long-running worker idling between agent calls, the math changes.
E2B charges per sandbox-second. A sandbox that exists for 30 seconds costs 30 sandbox-seconds regardless of whether the agent is actively executing code for all 30 seconds. If your agent creates a sandbox, runs a 2-second operation, and then lets the sandbox sit idle for 28 more seconds while it thinks about the next step, you pay for 30 seconds. The design pushes you toward tight sandbox lifecycles: create, use, destroy. That discipline is good for secret hygiene too.
Daytona charges more like infrastructure rental. You pay for the workspace compute (CPU/RAM/storage) for as long as the workspace is running. For workloads that need long-lived environments, this can be cheaper than spinning up fresh sandboxes constantly. For workloads that only need the environment for short windows, it is more expensive.
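The difference between the billing models is easiest to see as billable time, not as prices. The sketch below reuses the example from the text, one 2-second operation inside a sandbox that exists for 30 seconds. The function names are illustrative, and no rates are included because they vary by plan and change over time.

```python
def billed_seconds_per_compute(active_s: list[float]) -> float:
    """Compute-time billing: pay only while code is actually executing."""
    return sum(active_s)

def billed_seconds_per_lifetime(created_at: float, destroyed_at: float) -> float:
    """Lifetime billing: pay for every second the sandbox or workspace exists."""
    return destroyed_at - created_at

# One 2-second operation inside a sandbox that lives for 30 seconds.
compute_billed = billed_seconds_per_compute([2.0])
lifetime_billed = billed_seconds_per_lifetime(0.0, 30.0)
idle_fraction = 1 - compute_billed / lifetime_billed

print(f"compute-time model: {compute_billed:.0f} s billed")
print(f"lifetime model:     {lifetime_billed:.0f} s billed")
print(f"share of lifetime spent idle: {idle_fraction:.0%}")
```

Multiply each figure by the current rate for your plan to get a price. The ratio between the two is what the workload shape controls.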
What actually gets missed
The sandbox-as-security discussion tends to focus on the isolation layer and skip two things that matter more in practice.
First: exfiltration through legitimate channels. A Firecracker VM gives you a strong isolation boundary. It does not prevent an agent from making an HTTPS call to an attacker-controlled endpoint using credentials it received legitimately. If you inject a secret into a sandbox and the agent uses that secret in a network call that goes somewhere unexpected, the sandbox did not fail. The secret moved through a legitimate channel. Blocking this requires egress filtering or, better, not giving the agent more credential scope than it needs for the specific operation.
Second: the host-side trust problem. When you call e2b.Sandbox() or modal.Function.lookup() from your orchestrating process, that orchestrating process has your real credentials. The orchestrator is not sandboxed. If your orchestrator is a long-running server that holds credentials in memory, and that server is compromised, the sandboxes are irrelevant. OX Security's 2026 MCP ecosystem analysis found that most agent orchestration systems concentrate significant credential access in the orchestrating process, regardless of how isolated the downstream execution environments are. The sandbox is the right place to run the untrusted code. The orchestrator is the right place to hold credentials. Those are different layers with different trust models, and conflating them is where most real-world agent security failures live.
A checklist to paste into your agent setup PR
## Sandbox security baseline
- [ ] Agent code runs in Modal, E2B, or Daytona -- not on dev laptop, not in prod VPC
- [ ] Sandbox created fresh per task (E2B/Modal) or workspace is isolated per project (Daytona)
- [ ] Secrets NOT hardcoded in container image or workspace config
- [ ] Secrets injected at invocation time from an external store (AWS Secrets Manager, GCP Secret Manager, etc.)
- [ ] Secret scope is minimal -- only the specific key(s) this agent task needs
- [ ] Sandbox egress reviewed -- confirm outbound network access is appropriate for workload
- [ ] Sandbox lifetime is bounded -- no idle sandboxes accumulating runtime cost and secret exposure
- [ ] Output artifacts scanned before returning to orchestrator (no credential values in generated files)
- [ ] Orchestrator itself runs with minimal credentials -- not the same set it injects into sandboxes
- [ ] Audit log exists for which sandbox ran which task with which secrets attached
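The last checklist item can be sketched as a tamper-evident log in which each entry's HMAC covers the previous entry's HMAC, so that edits or deletions in the middle break the chain. This is a minimal illustration with made-up field names, not the format of any particular tool. It records secret names only, never values, and it assumes the HMAC key lives somewhere the agent cannot read.

```python
import hashlib
import hmac
import json

def _mac(key: bytes, body: dict) -> str:
    """HMAC-SHA256 over a canonical JSON encoding of the entry body."""
    payload = json.dumps(body, sort_keys=True).encode()
    return hmac.new(key, payload, hashlib.sha256).hexdigest()

def append_entry(log: list[dict], key: bytes, sandbox_id: str,
                 task: str, secret_names: list[str]) -> dict:
    """Append an entry chained to the previous one."""
    body = {
        "sandbox_id": sandbox_id,
        "task": task,
        "secrets": sorted(secret_names),        # names only, never values
        "prev": log[-1]["mac"] if log else "",  # link to the previous entry
    }
    entry = {**body, "mac": _mac(key, body)}
    log.append(entry)
    return entry

def verify_log(log: list[dict], key: bytes) -> bool:
    """Check every entry's MAC and its link to the entry before it."""
    prev_mac = ""
    for entry in log:
        body = {k: v for k, v in entry.items() if k != "mac"}
        if body.get("prev") != prev_mac:
            return False
        if not hmac.compare_digest(_mac(key, body), entry["mac"]):
            return False
        prev_mac = entry["mac"]
    return True

key = b"demo-key-kept-outside-the-sandbox"
log: list[dict] = []
append_entry(log, key, "sbx-001", "run unit tests", ["MY_API_KEY"])
append_entry(log, key, "sbx-002", "build docs", [])

intact = verify_log(log, key)
log[0]["task"] = "something else"   # simulate tampering with an old entry
tampered_ok = verify_log(log, key)
print(intact, tampered_ok)
```

A chain like this detects modification, not truncation of the most recent entries. Catching truncation means storing the latest MAC somewhere separate.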
What this means for your stack
Running agents in a real sandbox is a prerequisite, not a complete solution. Modal, E2B, and Daytona each handle the execution isolation well for their target use case. The gap they leave is the credential story: secrets arrive at sandbox startup as plaintext env vars, the sandbox has no mechanism to prevent the agent from reading and retransmitting them, and the orchestrator side that holds long-lived credentials is outside the sandbox entirely.
A local secret broker closes that gap by sitting between your credential store and the agent invocation. The broker fetches only what a specific task needs, injects the value into the child process at exec time, and records each grant to an append-only audit log. hasp is one working implementation. Install it with curl -fsSL https://gethasp.com/install.sh | sh, run hasp setup, and connect a project. The orchestrator then hands the next sandbox invocation a reference instead of a raw key. Source-available (FCL-1.0), local-first, macOS and Linux, no account.
The sandbox handles isolation. The broker handles credential scope and audit trail. Neither replaces the other.
Sources · cited above, in one place
- Anthropic Security advisories and Claude Code release notes
- GitGuardian State of Secrets Sprawl report
- AWS Secrets Manager Documentation
- Google Cloud Secret Manager Documentation
- Azure Key Vault Documentation
- GitGuardian Labs Secrets-in-code research blog
- OX Security AppSec research, including MCP ecosystem analysis