GUIDE · CONCEPT 9 min · PUB JUNE 11, 2026

A stolen agent token should be useless
Bind it, do not just shorten it

Your coding agent's credentials are bearer tokens, and whoever holds the bytes holds the access. Sender-constrained tokens fix the replay half of that problem: the token is bound to a key the agent proves it holds on every call, so a copy lifted from a log or a poisoned package is worthless without the key.

For: Engineers who issue API tokens to AI coding agents, already rotate and scope them, but still hand out plain bearer credentials.

HASP CONCEPT · FLOW

01 Bearer (just bytes): Copy the token, replay it anywhere. No holder check.
02 Bind (cnf / jkt): Token tied to a key the agent holds. RFC 9449 / 8705.
03 Replay (401, inert): Stolen copy fails without the key. Proof per request.

TL;DR · the answer, in twenty seconds

What happened: The 2026 standards wave for agent identity, NIST's NCCoE concept paper in February and the IETF's draft-klrc-aiagent-auth in March, converged on one fix for credential theft: bind the token to a key the agent holds, so a stolen copy cannot be replayed.

The minimum fix: For any token your agent sends to an HTTP API, turn on sender-constraining (mTLS-bound per RFC 8705 or DPoP per RFC 9449). For raw secrets with no token endpoint, broker them into the process at runtime instead of exporting them as environment variables.

The lesson: A credential has three properties: how long it lives, what it can do, and whether a copy still works. The corpus already beat lifetime and scope to death. Binding is the leg that turns a leaked token into a non-event.

Most of the credentials your coding agent holds are bearer tokens, and a bearer token is bytes. Whoever copies the bytes gets the access. That is the whole security model of an Authorization: Bearer header, and it is the reason a leaked key is an incident and not a typo.

The agent-secrets corpus has spent a year on two ways to shrink the damage. Make the token short-lived so the stolen copy expires. Make it least-privilege so the stolen copy can touch less. Both help. Neither changes the fact that inside the token's window, anyone holding the string is the principal. A third property fixes that, and almost nobody ships it: binding.

Three properties of a credential, and the one nobody ships

A credential answers three separate questions. How long is it good for. What is it allowed to do. And does a copy of it still work for whoever picked the copy up.

The first two are well-trodden. Workload identity federation mints a token that expires in an hour instead of a year, so a leak has a clock running against it. Authorizing the action, not the secret, shrinks what the token can reach, so a leak buys less. Read those if you have not; they are the first two legs of the stool.

This is the third leg. A short-lived, least-privilege token is still a bearer token. Exfiltrate it during its sixty-minute life and you can replay it from your own laptop against the same API, with the same scope, until it expires or someone revokes it. Lifetime and scope bound the blast radius of a replay. They do not stop the replay. Binding does: it ties the token to a key the holder must prove control of on every request, so the copied string is dead weight without the key.

Why a bearer token loses the moment it leaks

The failure is structural, not a bug you can patch. An Authorization: Bearer sk_live_... header carries no proof of who is sending it. The resource server reads the bytes, checks they are valid and unexpired, and serves the request. It has no way to ask "are you the workload this was issued to," because the token says nothing about a holder. Possession is the entire claim.

Coding agents leak that string through plenty of channels. The agent loads it into an environment variable, where any tool call that runs printenv or reads /proc/self/environ can scoop it. It pastes it into a curl command in a shell the agent drives. It follows an instruction buried in a fetched web page or a code comment and posts the value to an attacker's endpoint, the confused-deputy problem SANS spent a whole post on. And the supply chain reaches in from the other side: on June 9 a worm called Miasma, a TeamPCP variant of Mini Shai-Hulud, shipped inside seventy-three packages distributed through GitHub, built to fire the moment an AI coding agent opened the project and to sweep whatever credentials it found.

Every one of those paths ends with the token's bytes somewhere they should not be. With a bearer token, that is the end of the story:

# A bearer token is portable. Lift the string from a log,
# a package, or a prompt-injected tool call, and replay it.
curl https://api.example.com/v1/charges \
  -H "Authorization: Bearer sk_live_51H4xExfiltrated..."
# 200 OK. The server only checked the bytes, and you have the bytes.

Shorten the token's life and you shrink the window. Scope it down and you shrink the reach. The replay still works inside those limits, because nothing in the request proves the sender is the agent the token belongs to.

The usual objection is that short-lived tokens already solve this. They do not. "Rotate every 90 days" is a policy about keys that sit unused, and it has nothing to say about a credential an attacker grabbed from a running agent and used within seconds. An autonomous agent makes a request every few seconds for an hour at a stretch, so its token is live and valuable for that whole stretch. The Miasma worm did not wait for a rotation window; it read the value and exfiltrated it in the same process where the agent was working. A sixty-minute token is a sixty-minute replay window. The only number that makes a stolen bearer token safe is zero seconds of usable life after it leaves the holder, and lifetime cannot get you there. Binding can.

Binding in practice: prove you hold the key

Sender-constrained tokens close that gap. The idea is old and the standards are stable; agents are the new reason to care. When the token is issued, it records a public key, and from then on the resource server accepts the token only from someone who can prove control of the matching private key on that request. Steal the token without the key and the proof fails. The bytes are inert.

Two mechanisms carry this in production. Certificate-bound access tokens (RFC 8705) tie the token to the client's mTLS certificate: the TLS handshake itself is the proof, and the token's cnf claim holds the certificate thumbprint. DPoP (RFC 9449) does it at the HTTP layer for clients that cannot run mTLS, by binding the token to a key and requiring a signed proof header per request.
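For the certificate-bound case, the resource-server check can be sketched in a few lines. This is a minimal illustration, assuming the server already has the DER bytes of the client certificate from the TLS handshake and the decoded token claims. The function names and the stand-in certificate bytes are hypothetical. The claim name x5t#S256 and the hash construction follow RFC 8705.

```python
import base64
import hashlib
import hmac


def cert_thumbprint(der_cert: bytes) -> str:
    """Base64url (unpadded) SHA-256 of the DER certificate, as in RFC 8705."""
    digest = hashlib.sha256(der_cert).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


def token_bound_to_cert(token_claims: dict, der_cert: bytes) -> bool:
    """True only if the token's cnf claim names the certificate that was presented."""
    expected = token_claims.get("cnf", {}).get("x5t#S256", "")
    actual = cert_thumbprint(der_cert)
    # Constant-time comparison of the two thumbprints.
    return hmac.compare_digest(expected.encode("utf-8"), actual.encode("utf-8"))


# Hypothetical stand-ins: real input would be the DER bytes of an X.509 certificate.
agent_cert = b"stand-in for the agent's DER-encoded certificate"
other_cert = b"stand-in for some other client's certificate"
claims = {"sub": "agent:build-runner", "cnf": {"x5t#S256": cert_thumbprint(agent_cert)}}

print(token_bound_to_cert(claims, agent_cert))
print(token_bound_to_cert(claims, other_cert))
```

The token alone carries no secret here. The comparison only passes when the TLS layer has already proven control of the certificate's private key, which is why a copied token presented over someone else's connection fails.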

The token carries the binding in a confirmation claim:

{
  "sub": "agent:build-runner",
  "scope": "charges:read",
  "cnf": { "jkt": "0ZcOCORZNYy-DWpqq30jZyJGHTN0d2HglBV3uiguA4I" }
}

The jkt is the thumbprint of the agent's public key. Every call then ships a fresh proof signed by the matching private key and bound to the method and URL it targets:

curl https://api.example.com/v1/charges \
  -H "Authorization: DPoP eyJhbGciOiJFUzI1NiIsInR5cCI6ImF0K2p3dCJ9..." \
  -H "DPoP: eyJ0eXAiOiJkcG9wK2p3dCIsImFsZyI6IkVTMjU2Iiwiandr..."
# The DPoP proof asserts htm=GET, htu=https://api.example.com/v1/charges,
# a unique jti, and a fresh iat, signed by the bound private key.

Now lift the access token out of a log and the replay fails. You can copy the access token, but you cannot sign a valid proof for it without the private key, and the server rejects the call with 401 invalid_token. The thing worth stealing is no longer the string in the header. It is the private key, and that key stays in the process: it never lands in an environment variable and never travels in a request body.
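The server-side checks behind that rejection can be sketched as follows. This is a simplified illustration, not a complete RFC 9449 validator: it assumes the proof's signature has already been verified against the public key embedded in the proof header, and it leaves out the ath (access token hash) check. The function names, the placeholder key coordinates, and the in-memory jti set are assumptions made for the example. The thumbprint follows RFC 7638 for an EC key.

```python
import base64
import hashlib
import hmac
import json
import time


def jwk_thumbprint(jwk: dict) -> str:
    """RFC 7638 SHA-256 thumbprint of an EC public key (required members only)."""
    required = {k: jwk[k] for k in ("crv", "kty", "x", "y")}
    canonical = json.dumps(required, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(canonical.encode("utf-8")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


def check_dpop_claims(token_claims, proof_header, proof_claims,
                      method, url, seen_jti, now=None, max_age=60):
    """Return None if the proof fits this request and token, else a reason string.

    Signature verification is assumed to have happened before this call.
    """
    now = time.time() if now is None else now
    if proof_header.get("typ") != "dpop+jwt":
        return "wrong typ"
    jkt = jwk_thumbprint(proof_header["jwk"])
    if not hmac.compare_digest(jkt.encode(), token_claims["cnf"]["jkt"].encode()):
        return "key does not match cnf.jkt"
    if proof_claims.get("htm") != method:
        return "htm mismatch"
    if proof_claims.get("htu") != url:
        return "htu mismatch"
    if abs(now - proof_claims.get("iat", 0)) > max_age:
        return "stale iat"
    jti = proof_claims.get("jti")
    if not jti or jti in seen_jti:
        return "jti missing or replayed"
    seen_jti.add(jti)
    return None


# Hypothetical keys: x and y are placeholders, not real curve points.
agent_jwk = {"kty": "EC", "crv": "P-256", "x": "agent-x-placeholder", "y": "agent-y-placeholder"}
thief_jwk = {"kty": "EC", "crv": "P-256", "x": "thief-x-placeholder", "y": "thief-y-placeholder"}
token = {"sub": "agent:build-runner", "scope": "charges:read",
         "cnf": {"jkt": jwk_thumbprint(agent_jwk)}}
url = "https://api.example.com/v1/charges"
seen = set()

proof = {"htm": "GET", "htu": url, "iat": time.time(), "jti": "proof-1"}
ok = check_dpop_claims(token, {"typ": "dpop+jwt", "jwk": agent_jwk}, proof, "GET", url, seen)
replayed = check_dpop_claims(token, {"typ": "dpop+jwt", "jwk": agent_jwk}, proof, "GET", url, seen)

# A thief holding only the token has to sign with their own key, so the thumbprint differs.
forged = {"htm": "GET", "htu": url, "iat": time.time(), "jti": "proof-2"}
stolen = check_dpop_claims(token, {"typ": "dpop+jwt", "jwk": thief_jwk}, forged, "GET", url, seen)

print(ok, replayed, stolen)
```

The thumbprint comparison is what makes the copied token inert: a thief can produce a well-formed proof, but only over a key whose thumbprint is not the one the token names. The jti set and iat window separately stop a captured proof from being reused.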

Delegation gets the same treatment. When one agent hands work to a sub-agent, OAuth token exchange (RFC 8693) lets the parent trade its broad token for a narrower one, rebound to the sub-agent's key, before passing anything down:

curl https://idp.example.com/token \
  -d grant_type=urn:ietf:params:oauth:grant-type:token-exchange \
  -d subject_token="$PARENT_TOKEN" \
  -d subject_token_type=urn:ietf:params:oauth:token-type:access_token \
  -d scope="charges:read" \
  -d audience="https://api.example.com"
# subject_token_type is required by RFC 8693; it says what subject_token is.
# The sub-agent receives charges:read bound to its own key,
# not a copy of the parent's full grant.

The sub-agent gets only what it needs, bound to itself. A leak from the sub-agent does not expose the parent's authority, because the sub-agent never held it.

What the 2026 standards settled, and what they punted

The year's identity work landed here. Read the primary documents, not the launch posts.

NIST's NCCoE published a concept paper on February 5, Accelerating the Adoption of Software and AI Agent Identity and Authorization. Its premise is that agents should be first-class identities instead of anonymous automation running under a shared key. It leans on existing standards rather than new ones: OAuth 2.0/2.1, OpenID Connect, SPIFFE/SPIRE, zero-trust architecture from SP 800-207. It also points at NIST IR 8587, whose title is the whole problem statement for this article: protecting tokens and assertions from forgery, theft, and misuse. The paper names the part it could not close, multi-hop delegation, and leaves it open.

A month later the IETF got a draft on the table. draft-klrc-aiagent-auth-00 landed March 2, with authors from AWS, Zscaler, Ping Identity, OpenAI, and Defakto Security. It composes WIMSE workload identifiers, SPIFFE IDs, and OAuth into one framework, and it reaches for proof of possession, mTLS and DPoP, to bind agent tokens. The draft is Informational, not standards-track, and its honesty is the tell worth quoting: the Security Considerations section reads, in full, "TODO Security."

There is a companion draft worth knowing, WIMSE Applicability for AI Agents, which takes the workload-identity model and stretches it to fit how agents behave: instances that live for one task and vanish, and agents that spawn other agents mid-run. Those two facts are why a static key fits agents so badly. The identity has to be as short-lived and as composable as the workload, and a key minted once and reused forever is neither.

Read these documents together and the shape is clear. The industry has a working answer for "a stolen token should not be replayable." It does not yet have an agreed answer for "this token should not have been allowed to do that in the first place." Binding is close to settled. Authorization across a chain of agents is a live argument, with token attenuation drafts and delegation-splicing threads still in flight. Take the part that is ready. Do not wait for the part that is not.

Binding without a token endpoint

None of the OAuth machinery touches the biggest pile of secrets. Most of what a coding agent loads is not an OAuth token at all. It is a DATABASE_URL with the password inline, a STRIPE_SECRET_KEY, a GitHub PAT, a dozen third-party API keys. There is no token endpoint to call, no cnf claim to set, no DPoP proof to sign. These are the purest bearer credentials you own, and they live in plaintext in the agent's environment for the whole session.

The binding principle still applies; it just moves off the wire and onto the host. The on-disk version of "prove you hold the key" is "make the secret usable only inside the live process that needs it, and never let it exist as a portable string the agent can copy out." That is process-tree scoping: a broker injects the value into the child process at exec time, the agent's own context never sees the bytes, and a value lifted from a log or a tool call does not correspond to anything an attacker can use elsewhere. Pair it with an audit trail and you also get the answer to "which process used this, when," which a leaked environment variable can never give you.
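The exec-time injection step can be sketched in a few lines. This is an illustration of the general idea only, not a description of how any particular broker (hasp included) is implemented. The variable name HYPOTHETICAL_DB_PASSWORD, the placeholder value, and the run_with_secret helper are all assumptions made for the example.

```python
import os
import subprocess
import sys


def run_with_secret(argv, name, value):
    """Run argv with `name` set in the child's environment only.

    The parent's own environment is copied, never modified, so the value
    does not appear in os.environ of the process that calls this.
    """
    child_env = dict(os.environ)
    child_env[name] = value
    return subprocess.run(argv, env=child_env, capture_output=True,
                          text=True, check=True)


# Hypothetical stand-in for a value fetched from a vault at run time.
secret = "placeholder-not-a-real-password"

# The child uses the secret but reports only its length, never the value.
child = [sys.executable, "-c",
         "import os; print(len(os.environ['HYPOTHETICAL_DB_PASSWORD']))"]
result = run_with_secret(child, "HYPOTHETICAL_DB_PASSWORD", secret)

print("child saw a value of length", result.stdout.strip())
print("present in parent environment:", "HYPOTHETICAL_DB_PASSWORD" in os.environ)
```

This sketch narrows where the value exists but does not fully isolate it: on Linux, other processes running as the same user can typically still read a child's environment through /proc. A real broker needs stronger boundaries than this, plus the audit trail the paragraph above describes.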

The two halves rhyme. On the wire, you bind the token to a key so a copy is inert. On the host, you bind the secret to a process so a copy is inert. Same goal, same payoff: the thing an attacker can exfiltrate stops being the thing that grants access.

What this means for your stack

For any token your agent sends to an HTTP API, turn on sender-constraining. If the client can do mTLS, use certificate-bound tokens; if it cannot, use DPoP. Either way a token lifted from a log or a poisoned dependency stops working the moment it leaves the agent, because the request needs a key the leak did not carry. This is configuration, not research. The RFCs are stable and the libraries exist.

The pattern underneath is one you can apply past the tokens that have an OAuth flow: bind capability to the live workload, not to a portable string. A credential should make the holder prove they are the holder on every use, whether the proof is a DPoP signature on a request or a process boundary the secret never crosses. Lifetime and scope decide how bad a leak is. Binding decides whether the leak is usable at all.

hasp is one working implementation of that pattern for the secrets no token endpoint covers. It brokers each value into the process that needs it and keeps it out of the agent's context, so the string the agent could leak is not the string that grants access. Source-available (FCL-1.0), local-first, macOS and Linux, no account.

The test holds whichever mechanism you reach for. Assume the credential leaked an hour ago. If the answer to "can the person who has the copy use it" is no, you have bound it. If the answer is "only until it expires," you have shortened it, and shortening is not the same as binding. Most agent setups have done the second and called it done. The 2026 standards wave is telling you to do the first.


NEXT STEP · ~90 seconds

Stop handing the agent your real keys.

hasp keeps secrets in one local encrypted vault, brokers them into the child process at exec, and never lets the agent read the value.

Local, encrypted vault — no account, no cloud, no telemetry by default.
Brokered run — agent gets a reference, the child process gets the value.
Pre-commit + pre-push hooks catch managed values before they ship.
Append-only HMAC audit log answers "did the agent touch the prod token?" in seconds.


→ ok  vault unlocked · binding ./api

→ ok  grant once · pid 88421

→ ok  agent never read

macOS & Linux. Source-available (FCL-1.0, converts to Apache 2.0). No account.