GUIDE · CONCEPT 10 min ·

Blast radius doctrine for AI agentsDesign for the smallest credible damage.

When an agent goes sideways, blast radius determines the damage ceiling. Most teams only think about this after the ceiling falls on them.

TL;DR· the answer, in twenty seconds

What: Blast radius is the worst-case damage surface when an agent credential leaks or an agent action goes wrong. It has three independent dimensions: credential scope, process scope, and action reversibility.

Fix: Design each agent task with the narrowest credential that works, default actions to dry-run, and separate staging from production access at the credential level, not just the config level.

Lesson: Agents make mistakes. The design question is how much damage one mistake can cause before a human can intervene. Blast radius sets that ceiling, and it is set at design time.

An AI agent that deletes a table cannot delete a table it has no permission to touch. An agent that spins up GPUs cannot run a $82,000 bill if its key has a cost cap. The constraint is not magic. It is just geometry: the blast radius of any failure is bounded by the surface the agent can reach.

Most teams do not think about this surface until after something goes wrong. They grant the agent whatever credential is convenient, run it in the same shell session that has production access, and default every tool to apply mode. Then the agent makes one bad call, and the question "how much damage can this do?" matters more than the original task.

Blast radius doctrine is the discipline of asking that question before the call happens.

The short version

  • Blast radius has three dimensions: credential scope, process scope, and action reversibility. They are independent. Reducing one does not reduce the others.
  • Read-only credentials cannot cause write damage. Admin credentials can reach the whole account.
  • A dry-run default catches more errors than a confirmation prompt, because confirmation prompts get clicked through.
  • The Replit production database delete, the $82K GCP bill, and the Claude Code February 2026 credential leak all had large blast radii that a single upstream constraint would have capped.
  • The doctrine has real limits. Tightening too far creates pressure to work around the controls.

Three dimensions of blast radius

Credential blast radius

A credential grants the agent a surface it can reach. The wider that surface, the more damage any one action can cause.

The useful spectrum runs from narrowest to widest:

  • A read-only token scoped to one S3 bucket. Worst case: the agent reads data it should not. No writes, no cross-resource access.
  • A database user with write access to one schema. Worst case: bad data written. Recoverable if you have backups.
  • A GCP service account with editor on one project. Worst case: every resource in that project is reachable. Compute, storage, IAM, all of it.
  • A root AWS key or a GitHub token with repo:*. Worst case is the whole account.

Two other axes matter as much as scope. First: is the credential rotatable? A 30-day expiry means a leaked key has a bounded lifetime. A root account password has none. Second: is the credential for staging or production? A staging credential that looks exactly like a production credential in your shell history is a rotation accident away from disaster.

The February 2026 Knostic disclosure of Claude Code's settings.local.json leak is a credential blast radius story. The file captured every environment variable the agent's child process could see, which in most developer shells meant the full set: Stripe keys, OpenAI keys, GitHub tokens, database URLs. The blast radius was not one credential. It was the entire dev shell.

Process blast radius

A credential grants a surface. The process scope determines how long the agent holds it and what else runs alongside it.

Three process postures, from safest to most common:

  • Per-task: credentials are injected for one command, in a child process, and vanish when it exits. Nothing persists to the session or the filesystem.
  • Per-session: credentials arrive at session start and stay for the entire session. One bad call in hour three has the same access as the first call in hour one.
  • Per-user: the agent inherits the full ambient shell environment. Every key in ~/.zshrc, every export from the last six months of development, available to every command.

The per-user posture is the default for most agent setups today, because it requires no extra work. That is precisely the problem.

Action blast radius

Even a well-scoped credential can cause serious damage if the action it takes is irreversible.

Three axes determine how much an action can damage:

  • Dry-run vs. apply. terraform plan has zero action blast radius. terraform apply has the full credential blast radius. Default to apply, require an explicit flag for dry-run, and every agent mistake is live.
  • Staging vs. production. Staging exists to absorb mistakes. If your agent can reach production directly, staging is cosmetic. The agent does not know the difference between the two unless the credential difference makes it impossible to ignore.
  • Reversible vs. not. Deleting a row you can restore from a point-in-time snapshot is painful. Dropping a table with cascade in a database where the backup job has been silently failing for three weeks is not recoverable. Same credential scope. Completely different action blast radius.

The combination of wide credential scope, per-session process posture, and apply-default action posture is what produces catastrophic outcomes. No single dimension causes the incident. All three together remove every checkpoint between the agent's decision and the damage.

Where the doctrine breaks down

Read-only credentials, per-task process scope, and dry-run defaults are not always feasible. Useful agents write things. Useful CI pipelines deploy to production. If you tighten to absurdity, engineers route around the controls.

Four places the doctrine runs into real friction:

Legitimate workflows need write access. An agent that opens pull requests needs write access to the repo. An agent that manages infrastructure needs apply. You cannot reduce credential scope to read-only and still have a working pipeline. The goal is the smallest scope that covers the real task, not zero scope.

Per-task credential injection adds overhead. If every agent command requires fetching a fresh credential from a vault, the overhead accumulates. Teams that feel the friction find ways to cache. The cache becomes the new ambient credential. You have added process without reducing exposure.

Dry-run parity is not guaranteed. Terraform plan and apply use the same code path, mostly. Some Kubernetes operators have dry-run modes that differ from apply in subtle ways, producing plans that look safe and applies that break. Treat dry-run as evidence, not proof.

Staging and production drift. If your staging environment has different data volumes, different IAM configurations, and different third-party integrations than production, a successful staging run is weak evidence about production behavior. The agent that works correctly in staging and fails in production is common, not an edge case.

"Smallest credible blast radius" means smallest given the actual task requirements. An agent that does nothing has zero blast radius and zero value. The doctrine is a constraint on design, not a target to minimize to absurdity.

Three real incidents through the blast-radius lens

The Replit production database delete (mid-2024)

A Replit agent deleted a production database during a development session. The credential blast radius was full production database write access: the agent's connection string connected to the live database, not a development copy. The action blast radius was a DROP TABLE (or equivalent) without a confirmation step. No dry-run mode, no sandbox, no automatic backup at the moment of deletion.

A 90-second pause before destructive DDL would have surfaced the action for review. A separate credential for the development session, scoped only to the development database, would have made the production database unreachable. Either change alone caps the blast radius. Neither was in place.

Post-mortems from the incident circulated on Hacker News in mid-2024 and were covered in subsequent AI agent safety discussions.

The $82,000 GCP bill

A developer's agent, running with a GCP service account that had project-level editor permissions and no billing cap configured, entered a loop that provisioned GPU instances repeatedly. The credential blast radius was the full project: compute, storage, networking, all reachable with the same key. The action blast radius was "spin up expensive compute," which is not irreversible but accrues cost faster than any human review cycle.

A billing alert at $500 would have triggered before the bill reached four figures. A service account scoped to one Compute Engine zone with a quota policy would have capped the compute spend. The incident is real and circulates in cloud cost engineering discussions; the exact organization has not been publicly named.

Claude Code February 2026 credential leak

The Knostic disclosure in February 2026 found that Claude Code's settings.local.json captured environment variables from every agent session and wrote them to .claude/ in the project directory. When developers ran npm publish, the file shipped in the tarball. Knostic found it in roughly 1 in 13 npm packages they scanned.

The credential blast radius was every secret in the developer's shell: API keys for every provider, database connection strings, OAuth tokens. The process blast radius was the entire agent session. Nothing scoped credentials to individual tasks. The values that leaked were not "the key the agent needed for this task." They were the full ambient environment.

Anthropic patched the recording behavior in late February 2026. The files already on disk in repos that have not been opened since still hold whatever they captured.

Checklist for your next agent deployment

## Blast radius review

Credential scope
- [ ] Agent credential is the narrowest scope that covers the actual task
- [ ] Read-only where the task does not require writes
- [ ] Credential is project-scoped, not account-root
- [ ] Staging credential is distinct from production credential (different key, not just different config)
- [ ] Credential has an expiry or rotation schedule
- [ ] Credential is not inherited from the ambient shell; it is injected at task start

Process scope
- [ ] Agent receives credentials per-task, not per-session, where feasible
- [ ] Credentials are injected into child processes and do not persist to the filesystem
- [ ] Agent does not inherit the full developer shell environment

Action scope
- [ ] Destructive or apply-mode actions require an explicit flag or review step
- [ ] Agent can reach staging but not production without a separate explicit credential
- [ ] Irreversible actions (DROP, DELETE without soft-delete, public publish) are flagged before execution
- [ ] Cost-generating actions (compute provisioning, external API calls) have caps or quotas

Review triggers
- [ ] Any action with action blast radius > "one record, recoverable" gets a human checkpoint
- [ ] Billing alerts set below acceptable loss threshold
- [ ] Audit log records every credential grant with timestamp and task context

What this means for your stack

The three dimensions (credential scope, process scope, action reversibility) interact multiplicatively. An agent with a read-only credential, per-task process scope, and a dry-run default has near-zero blast radius on almost every task. An agent with an admin credential, per-user process scope, and apply defaults has the maximum blast radius on every task. Most real setups land somewhere in the middle and drift toward wider over time because wider is easier to configure.

The practical move is to audit your current agent setup against the checklist above and identify which dimension is widest. Fix that one first. You do not need to solve all three at once.

hasp is one working implementation of the process-scope and credential-scope controls. curl -fsSL https://gethasp.com/install.sh | sh, hasp setup, connect a project, and the next agent session receives a scoped reference injected at task start rather than the full ambient environment. Source-available (FCL-1.0), local-first, macOS and Linux, no account.

The doctrine holds regardless of tooling. Blast radius is determined upstream, at design time, not at incident time. By the time the agent makes the bad call, the damage ceiling is already set.

Sources· cited above, in one place

NEXT STEP~90 seconds

Stop handing the agent your real keys.

hasp keeps secrets in one local encrypted vault, brokers them into the child process at exec, and never lets the agent read the value.

  • Local, encrypted vault — no account, no cloud, no telemetry by default.
  • Brokered run — agent gets a reference, the child process gets the value.
  • Pre-commit + pre-push hooks catch managed values before they ship.
  • Append-only HMAC audit log answers "did the agent touch the prod token?" in seconds.
→ okvault unlocked · binding ./api
→ okgrant once · pid 88421
→ okagent never read

macOS & Linux. Source-available (FCL-1.0, converts to Apache 2.0). No account.

Browse all clusters· eight threads, one index