GUIDE · COMPLIANCE · 10 min

NIST AI security, translated for devs

What the frameworks mean when agents are in the stack.

NIST published three documents that now cover AI security. None of them were written with autonomous agents in mind. That gap is your problem to close.

TL;DR · the answer, in twenty seconds

What: NIST AI RMF (AI 100-1), SSDF (SP 800-218), and the adversarial ML taxonomy (AI 100-2 E2023) are the three documents compliance teams reference. Each was written for models, not agents, and none covers the runtime layer your agent actually runs in.

Fix: Map MEASURE to audit logs on every tool call, MANAGE to explicit scopes and sandboxed execution, SSDF PO.5 to AI-generated code review gates, and AI 100-2 threat categories to your MCP server trust model.

Lesson: The frameworks are generic enough to extend to agents if you do the translation work yourself. The controls are sound. The framing is just a few years behind where the technology landed.

Your compliance team forwards a PDF. Subject line: "Action needed re: AI security posture." The attachment is NIST AI 100-1, a 64-page risk management framework published in January 2023. A second email arrives the same afternoon with SP 800-218, the Secure Software Development Framework. Both documents use the word "model" where you would say "agent."

This is the translation problem. NIST built these frameworks for AI as a product: a trained model with a fixed input/output interface, evaluated by a vendor, deployed as a static artifact. They did not anticipate a runtime that spawns subprocesses, calls external tools over MCP, reads your filesystem, writes to your codebase, and takes actions that are hard to reverse. The controls are real and enforceable. The framing just lags.

The gap matters in both directions. Dismissing NIST as irrelevant to agents is wrong: the underlying risk principles hold. Applying the controls naively is also wrong, because the agent runtime exposes surfaces the documents never mention. Work through each document and extend the controls to fit what you actually shipped.

What to know in 60 seconds

  • NIST AI RMF gives you four functions: GOVERN, MAP, MEASURE, MANAGE. MEASURE and MANAGE translate directly to agent observability and capability bounding.
  • NIST SSDF (SP 800-218) is a secure software development framework. Two practices, PO.5 and PW.4, apply cleanly to AI-generated code pipelines.
  • NIST AI 100-2 E2023 is the adversarial ML taxonomy. The 2026 update added prompt injection and agent-tool-use categories explicitly.
  • All three documents treat "AI" as "the model." None covers the broker layer, MCP servers, or agent persistence between sessions.
  • The translation work is yours. NIST will not catch up to autonomous agents before your next compliance review.

Three NIST docs you will get handed

AI 100-1 (January 2023). The AI Risk Management Framework. Four functions: GOVERN (policy and accountability), MAP (context and risk identification), MEASURE (analysis and evaluation), MANAGE (response and mitigation). Not a checklist. A vocabulary and a process. Your compliance team will ask you to show that your team does something that maps to each function.

SP 800-218 (February 2022). The Secure Software Development Framework. Predates the current AI coding wave by three years, so every reference to "software" in it now implicitly includes software written by an agent. Four groups: Prepare the Organization (PO), Protect the Software (PS), Produce Well-Secured Software (PW), Respond to Vulnerabilities (RV). PO and PW are the ones with agent-specific bite.

AI 100-2 E2023 (March 2023, updated 2026). Adversarial Machine Learning: A Taxonomy and Terminology. The 2026 update added sections covering prompt injection and agent-tool-use attack patterns. It now maps onto the threat model for any agent that accepts external input and calls tools.

Map each to dev work

AI RMF MEASURE: instrument every tool call

NIST defines MEASURE as "employing quantitative, qualitative, or mixed-method tools, techniques, and methodologies to analyze, assess, benchmark, and monitor AI risk." In model terms that means accuracy metrics and drift detection. For an agent, it means audit logs.

An agent that calls bash, reads from disk, writes to a config file, or issues an API call is taking an action. If you cannot answer "what did the agent do between 2pm and 4pm on Tuesday, and in what order," you are not doing MEASURE. The MEASURE function in the AI RMF requires you to be able to analyze and monitor. That requires a log.

What a compliant MEASURE implementation looks like for agents:

  • Every tool call logged with timestamp, tool name, inputs, and caller context.
  • Every credential access (if any) logged with the same granularity.
  • Logs written to an append-only sink the agent process cannot delete.
  • A query path so a human can reconstruct the session in under five minutes.

opentelemetry-sdk covers the instrumentation side for most runtimes. The gap is almost always upstream: nobody defined what "monitor AI risk" means concretely for their deployment, so nothing got instrumented.
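The logging requirements listed above can be sketched in a few lines of plain Python. This is a minimal illustration, not a production design: the `audited` decorator, the `read_session` helper, and the `agent_audit.jsonl` path are all hypothetical names. Opening the file in append mode does not make the sink tamper-proof, so the "agent cannot delete" property still has to come from file permissions or a remote collector.

```python
import functools
import json
import time
from pathlib import Path
from typing import Any, Callable

# Hypothetical location; in practice this should be a sink the agent
# process cannot rewrite or delete.
AUDIT_LOG = Path("agent_audit.jsonl")


def audited(tool_name: str, caller: str, log_path: Path = AUDIT_LOG) -> Callable:
    """Wrap a tool so every call is recorded before the tool runs."""

    def decorator(fn: Callable[..., Any]) -> Callable[..., Any]:
        @functools.wraps(fn)
        def wrapper(*args: Any, **kwargs: Any) -> Any:
            record = {
                "ts": time.time(),  # timestamp
                "tool": tool_name,  # tool name
                "caller": caller,   # caller context, e.g. a session id
                "inputs": {
                    "args": [repr(a) for a in args],
                    "kwargs": {k: repr(v) for k, v in kwargs.items()},
                },
            }
            # One JSON object per line, appended in call order.
            with log_path.open("a", encoding="utf-8") as sink:
                sink.write(json.dumps(record) + "\n")
            return fn(*args, **kwargs)

        return wrapper

    return decorator


def read_session(log_path: Path = AUDIT_LOG) -> list[dict]:
    """Query path: return every record in the order it was written."""
    with log_path.open(encoding="utf-8") as sink:
        return [json.loads(line) for line in sink if line.strip()]
```

Writing the record before the tool runs means a call that crashes the process still leaves a trace. Inputs are stored as `repr` strings here for simplicity; a real deployment would need a redaction step so the log does not become a second copy of your secrets.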

AI RMF MANAGE: bound what the agent can do

MANAGE covers "responses to, and recovery from, identified risks." For a model, that might mean a rollback to an earlier version or a rate limit. For an agent, it means the agent cannot take actions you have not authorized in advance.

The Replit incident from mid-2025, where an agent deleted a production database, is the clean example. The agent had full database access because the developer gave it the same connection string they used locally. Nothing in the runtime said "you may read but not drop tables." MANAGE, applied to agents, requires that boundary to exist before the agent runs.

Four controls that satisfy MANAGE:

Scopes at grant time. When the agent receives a credential or a tool permission, the grant specifies what actions are allowed. Read-only access is a scope. Write-to-a-specific-path is a scope. "Full access" is not a scope, it is the absence of one.
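A scope can be as small as a frozen record that the runtime consults before every action. The sketch below assumes a filesystem-style tool; the `Grant` class and its fields are hypothetical, and a real runtime would also need to resolve symlinks before trusting a path.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath


@dataclass(frozen=True)
class Grant:
    """A permission handed to the agent. All field names are illustrative."""

    actions: frozenset[str]  # e.g. {"read"} or {"read", "write"}
    path_prefix: str         # the only subtree this grant covers

    def allows(self, action: str, path: str) -> bool:
        target = PurePosixPath(path)
        root = PurePosixPath(self.path_prefix)
        # Refuse ".." outright rather than trying to normalise it.
        if ".." in target.parts:
            return False
        inside = target == root or root in target.parents
        return action in self.actions and inside


# Read-only access to one directory is a scope.
config_reader = Grant(actions=frozenset({"read"}), path_prefix="/srv/app/config")
```

The grant is immutable once issued, so widening access means issuing a new grant, which is an event you can log.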

Dry-run mode for destructive operations. Any tool call that writes, deletes, or deploys should have a dry-run path you can enable in review. The agent shows you the diff before it executes.
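One way to build that path is to make dry-run the default and have the tool return a diff either way. This is a sketch for a file-writing tool only; `write_file` is a hypothetical helper, and database or deploy tools would need their own notion of a preview.

```python
import difflib
from pathlib import Path


def write_file(path: Path, new_text: str, dry_run: bool = True) -> str:
    """Return a unified diff; only touch disk when dry_run is False."""
    old_text = path.read_text(encoding="utf-8") if path.exists() else ""
    diff = "".join(
        difflib.unified_diff(
            old_text.splitlines(keepends=True),
            new_text.splitlines(keepends=True),
            fromfile=f"{path} (current)",
            tofile=f"{path} (proposed)",
        )
    )
    if not dry_run:
        path.write_text(new_text, encoding="utf-8")
    return diff
```

Defaulting `dry_run` to `True` means a caller has to opt in to the destructive path, so a forgotten flag fails safe.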

Session ceilings. Cap how long a credential is valid and how many tool calls an agent can make per session. Ceiling controls do not substitute for scopes; they back them up.
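A ceiling is a counter and a clock checked on every tool call. The sketch below is illustrative: `SessionCeiling` and `CeilingExceeded` are hypothetical names, and the limits are whatever your policy sets.

```python
import time
from dataclasses import dataclass, field
from typing import Optional


class CeilingExceeded(RuntimeError):
    """Raised when the session runs out of time or tool calls."""


@dataclass
class SessionCeiling:
    max_calls: int
    ttl_seconds: float
    started_at: float = field(default_factory=time.monotonic)
    calls: int = 0

    def check(self, now: Optional[float] = None) -> None:
        """Call before every tool invocation; raises once a limit is hit."""
        now = time.monotonic() if now is None else now
        if now - self.started_at > self.ttl_seconds:
            raise CeilingExceeded("session credential expired")
        if self.calls >= self.max_calls:
            raise CeilingExceeded("tool-call budget exhausted")
        self.calls += 1
```

The `now` parameter exists so the expiry logic can be tested without sleeping; production callers would leave it unset.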

Rollback targets. If the agent's action touches infrastructure or data, document the rollback procedure before the session starts. Writing this down takes five minutes. Using it at 2am takes thirty seconds you will not have otherwise.

Compliance teams want to see evidence that you have MANAGE controls, not just a policy statement. "We use dry-run mode for database migrations" is evidence. "We take AI security seriously" is not.

SSDF PO.5: AI-generated code needs the same gates as human code

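The practice in question is "Define and Use Criteria for Software Security Checks." (SP 800-218 version 1.1 lists it as PO.4 and uses PO.5 for "Implement and Maintain Secure Environments for Software Development," so cite the identifier from the edition your auditor works from.) The practice requires that you define when security checks happen and what they check, and that you apply those checks consistently to all code before it ships.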

The gap: most teams apply this practice to human-written code and implicitly exempt AI-generated code because it "came from the model." That exemption is not in the practice. If your CI pipeline runs SAST on PRs, it should run on AI-generated PRs. If you require a security review before merging to main, that requirement applies to an agent's commit.

GitGuardian's 2026 State of Secrets Sprawl found that AI-assisted commits leak secrets at roughly twice the rate of baseline. Part of that gap is that review gates treat AI output differently. PO.5 says they should not.

Concretely: add a rule to your branch protection that applies to all authors, including the agent's bot account. If the agent opens PRs, those PRs go through SAST, secret scanning, and whatever human review your policy requires. Do not give the agent a fast-path to merge.
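The "no fast-path" rule is easy to state as code: the merge decision never looks at who the author is. The sketch below is a policy illustration, not a real branch-protection API; the check names and the `PullRequest` shape are assumptions.

```python
from dataclasses import dataclass, field

# Assumed check names; substitute whatever your pipeline reports.
REQUIRED_CHECKS = frozenset({"sast", "secret-scan", "human-review"})


@dataclass
class PullRequest:
    author: str
    author_is_bot: bool
    passed_checks: set[str] = field(default_factory=set)


def missing_checks(pr: PullRequest) -> set[str]:
    """Checks still outstanding. author_is_bot is deliberately never consulted."""
    return set(REQUIRED_CHECKS) - pr.passed_checks


def can_merge(pr: PullRequest) -> bool:
    return not missing_checks(pr)
```

The `author_is_bot` field is kept on the record so the audit trail shows who opened the PR, but no branch in the gate depends on it.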

SSDF PW.4: the agent's training data is someone else's code

PW.4 is "Reuse Existing, Well-Secured Software When Feasible Instead of Duplicating Functionality." The practice covers vetting third-party and open source code before integrating it, checking licenses, and tracking provenance.

When an agent generates code, it draws from training data. You do not know which code in that data influenced the output. You do not know the license. You do not know whether the pattern the agent reproduced came from a copyleft project, a restrictively licensed commercial project, or something with no license at all.

This is not a reason to avoid AI-generated code. It is a reason to apply PW.4 to it: run license-aware scanning tools (FOSSology, FOSSA, Snyk's license checks) on AI-generated files before they go into a product that ships to customers. Most teams do this for node_modules. Very few do it for src/ files an agent wrote.

The compliance answer to "how do you apply PW.4 to AI-generated code?" is: you treat it the same as code from any external source. You scan it. You track it. You flag anything with a license that conflicts with your distribution model.
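As a rough illustration of "scan it and flag conflicts," the sketch below checks source text for SPDX tags that are not on an allowlist and for one well-known license phrase. It is a toy pre-filter under stated assumptions: the allowlist is an invented policy, the function names are hypothetical, compound SPDX expressions are only matched on their first identifier, and reproduced code that carries no license text at all will pass straight through. A dedicated scanner is still the real control.

```python
import re
from pathlib import Path
from typing import Iterable

# Assumed distribution policy; adjust to your own legal guidance.
ALLOWED_SPDX = {"MIT", "Apache-2.0", "BSD-3-Clause"}

SPDX_TAG = re.compile(r"SPDX-License-Identifier:\s*([A-Za-z0-9.+-]+)")

# Phrases that suggest license text was reproduced verbatim.
TEXT_MARKERS = {"GNU General Public License": "GPL (text match)"}


def license_findings(text: str) -> list[str]:
    """Return human-readable findings for one file's contents."""
    findings = [
        f"SPDX tag not on allowlist: {lic}"
        for lic in SPDX_TAG.findall(text)
        if lic not in ALLOWED_SPDX
    ]
    findings += [
        f"license text marker: {label}"
        for phrase, label in TEXT_MARKERS.items()
        if phrase in text
    ]
    return findings


def scan_files(paths: Iterable[Path]) -> dict[str, list[str]]:
    """Map each flagged file to its findings; clean files are omitted."""
    report = {}
    for path in paths:
        found = license_findings(path.read_text(encoding="utf-8", errors="replace"))
        if found:
            report[str(path)] = found
    return report
```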

AI 100-2 E2023: three threat categories that now have names

The 2026 update to AI 100-2 explicitly names three threat classes relevant to agent deployments.

Indirect prompt injection. The attack inserts instructions into content the agent reads, not into the user's prompt. An agent that summarizes emails, reads files from a shared drive, or fetches web pages is reading attacker-controlled content. The content tells the agent to take a different action. The MCP GitHub prompt-injection data heist documented by Snyk researchers in early 2026 is the concrete example: malicious content in a repository caused an agent to exfiltrate data through tool calls the user never authorized.

The control: treat all content the agent reads from external sources as untrusted input. Apply the same skepticism you apply to SQL query parameters. The agent should not act on instructions embedded in content it fetches from outside the session.
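One way to enforce that in the runtime, rather than hoping the model complies, is taint tracking: once external content enters the session, privileged tools require explicit user approval. The sketch below assumes a hypothetical `Session` object and an invented list of privileged tool names.

```python
from dataclasses import dataclass, field

# Assumed names for tools that can move data out or change shared state.
PRIVILEGED_TOOLS = frozenset({"send_email", "git_push", "http_post"})


class BlockedByPolicy(PermissionError):
    """Raised when a privileged call is attempted in a tainted session."""


@dataclass
class Session:
    tainted_by: list[str] = field(default_factory=list)

    def ingest(self, source: str, content: str) -> str:
        """Record that external content entered the context; return it as data."""
        self.tainted_by.append(source)
        return content

    def authorize(self, tool: str, user_approved: bool = False) -> None:
        """Call before dispatching a tool; raises if policy forbids it."""
        if tool in PRIVILEGED_TOOLS and self.tainted_by and not user_approved:
            raise BlockedByPolicy(
                f"{tool} blocked: session has read untrusted content from "
                f"{', '.join(self.tainted_by)}"
            )
```

The check never inspects the content itself. It does not try to detect an injection; it assumes any fetched content might contain one and narrows what the agent can do afterwards.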

Data poisoning. Training data attacks corrupt the model's behavior at a statistical level. For an enterprise deploying a fine-tuned model on proprietary data, this means the fine-tuning dataset is an attack surface. You need the same provenance controls on your training data that you apply to your dependency tree.
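A basic provenance control for a fine-tuning set is a hash manifest: record a digest for every file when the dataset is reviewed, then refuse to train if anything differs. The sketch below uses only the standard library; the function names are hypothetical, and it detects changes after review, not poisoned data that was present at review time.

```python
import hashlib
from pathlib import Path


def sha256_file(path: Path) -> str:
    """Stream a file through SHA-256 so large datasets do not load into memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(data_dir: Path) -> dict[str, str]:
    """Map each file's relative path to its digest."""
    return {
        p.relative_to(data_dir).as_posix(): sha256_file(p)
        for p in sorted(data_dir.rglob("*"))
        if p.is_file()
    }


def changed_files(data_dir: Path, manifest: dict[str, str]) -> list[str]:
    """Files added, removed, or modified since the manifest was recorded."""
    current = build_manifest(data_dir)
    return sorted(
        name
        for name in current.keys() | manifest.keys()
        if current.get(name) != manifest.get(name)
    )
```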

Evasion (jailbreaks). Users or attackers craft inputs that cause the model to ignore its safety constraints. For most development teams, the mitigation is not to rely on model-level safety constraints as your primary access control. The model saying "no" is a soft boundary. Capability bounds set in the runtime are a hard boundary.

Where NIST has not caught up

NIST's model of AI security assumes one discrete thing: the trained model. The risk surface is the model's behavior, measured against its training goals. Controls live at training time, eval time, and inference time.

Agent deployments add a layer NIST does not name: the runtime broker. The broker is the process that routes the agent's tool calls, holds credentials, manages session state, and decides what the agent can touch. In a typical Claude Code or Codex setup, that broker is informal, implicit, and often just "your shell environment." Credentials sit in environment variables. Sessions persist state to disk. Tool calls go to whatever the agent decides to call.

AI 100-1 does not tell you to audit the broker. AI 100-2 does not have a threat category for "agent reads an MCP server that was published without signature verification." OX Security reported roughly 7,000 MCP servers in the wild with around 150 million downloads in early 2026, with no signature requirement. That is a supply chain surface the NIST taxonomy does not have words for yet.

The SSDF covers supply chain risks for software components (practice PS.3), but it is not written to handle the case where the agent dynamically loads a tool server at runtime from an unauthenticated source.

You will not find controls in NIST that say "verify the MCP server's identity before the agent calls it." You will not find guidance on the agent's session state file and whether it captures environment variables. Those gaps exist because the documents were written before the runtime they now nominally cover.
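Until there is a standard to point at, one stopgap is a deny-by-default pin list: the runtime loads a tool server only if its artifact matches a digest you reviewed. This is a sketch of that idea, not an MCP feature. The `verify_server` function and the server name are hypothetical, and how you obtain the artifact bytes depends on how the server is distributed.

```python
import hashlib
import hmac


def verify_server(name: str, artifact: bytes, pinned: dict[str, str]) -> bool:
    """True only if the server is on the pin list and its digest matches."""
    expected = pinned.get(name)
    if expected is None:
        return False  # unknown servers are rejected, not trusted by default
    actual = hashlib.sha256(artifact).hexdigest()
    return hmac.compare_digest(actual, expected)


# Example pin list, built at review time from the artifact you inspected.
reviewed_artifact = b"contents of the reviewed server package"
PINNED = {"docs-search": hashlib.sha256(reviewed_artifact).hexdigest()}
```

A pinned digest proves the artifact is the one you reviewed. It says nothing about whether that artifact is safe, so the review itself is still the control.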

The thing NIST gets right that most devs skip

The AI RMF GOVERN function requires that someone in your organization own the AI risk posture and that there is a documented accountability chain. Most teams ship an AI-assisted product without ever writing down who is responsible if the agent takes a damaging action.

This sounds like bureaucracy. It determines your incident response timeline. When the agent deletes the wrong data or exfiltrates credentials, "who owns this" needs an answer in under thirty seconds. If you have not written that down, the GOVERN function is the first control you are missing, not the last.

Fill in the accountability chain before the compliance review asks. You do not want it to be a blank form at 11pm during an incident.

A mapping you can hand to a compliance team

NIST Control            Agent-specific implementation
-----------             ----------------------------
AI RMF GOVERN           Named owner, incident response contact, documented agent scope
AI RMF MAP              Threat model updated to include tool-call surface and MCP servers
AI RMF MEASURE          Append-only audit log: every tool call, timestamp, inputs, caller
AI RMF MANAGE           Explicit scopes per grant, dry-run mode, session ceiling, rollback plan
SSDF PO.5               AI-generated code goes through same SAST + review gates as human code
SSDF PW.4               License scan runs on AI-generated source files before ship
SSDF PS.3               MCP server provenance check before agent loads it at runtime
AI 100-2 Injection      External content treated as untrusted; agent cannot act on embedded instructions
AI 100-2 Poisoning      Training / fine-tuning data has provenance controls
AI 100-2 Evasion        Model safety constraints backed by runtime capability bounds, not relied on alone

Hand this table to the compliance team with the specific implementation notes filled in. "We use dry-run mode for database writes, enabled via the --dry-run flag in our deployment script" is the kind of specificity that closes a finding. A policy statement does not.

What this means for your stack

The controls above are implementable without vendor-specific tooling. Append-only logs, scoped credentials, branch protection that applies to bot accounts: all of these run on what you already have. The harder part is the audit trail for credential access at the agent runtime layer, because most tools do not log that at the granularity a MEASURE compliance review expects.

hasp is one working implementation of that audit layer. Run curl -fsSL https://gethasp.com/install.sh | sh, then hasp setup, and connect a project. From that point, every credential access the agent makes goes into an HMAC-chained audit log at ~/.hasp/audit.jsonl, verifiable with hasp audit --verify. Scopes are set at grant time with a 24-hour ceiling. The agent gets a reference, not the value. Source-available (FCL-1.0), local-first, macOS and Linux, no account.
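For readers unfamiliar with the term, an HMAC-chained log is one where each record's MAC covers the previous record's MAC, so editing or removing an earlier entry breaks every MAC after it. The sketch below illustrates the general technique only. It is not hasp's file format or verification logic, and every name in it is hypothetical.

```python
import hashlib
import hmac
import json

GENESIS = "0" * 64  # stand-in "previous MAC" for the first record


def _mac(key: bytes, prev_mac: str, entry: dict) -> str:
    """MAC over the previous record's MAC plus this entry's canonical JSON."""
    payload = prev_mac.encode() + json.dumps(entry, sort_keys=True).encode()
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def append_entry(chain: list[dict], key: bytes, entry: dict) -> None:
    prev = chain[-1]["mac"] if chain else GENESIS
    chain.append({"entry": entry, "mac": _mac(key, prev, entry)})


def verify_chain(chain: list[dict], key: bytes) -> bool:
    """Recompute every MAC in order; any edit or mid-chain deletion fails."""
    prev = GENESIS
    for record in chain:
        expected = _mac(key, prev, record["entry"])
        if not hmac.compare_digest(record["mac"], expected):
            return False
        prev = record["mac"]
    return True
```

A chain like this detects modification and removal of records in the middle, but not truncation of the most recent entries, so the latest MAC has to be anchored somewhere the writer cannot reach. The key also has to stay out of the agent's hands, or the agent can rewrite the chain.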

The durable takeaway is framework-agnostic. NIST's controls are sound. The translation from "model" to "agent" is work you have to do, and the runtime broker layer is where the translation gets hardest. Map the controls to the actual execution surface your agent touches, not to the surface the document imagined.

NEXT STEP · ~90 seconds

Stop handing the agent your real keys.

hasp keeps secrets in one local encrypted vault, brokers them into the child process at exec, and never lets the agent read the value.

  • Local, encrypted vault — no account, no cloud, no telemetry by default.
  • Brokered run — agent gets a reference, the child process gets the value.
  • Pre-commit + pre-push hooks catch managed values before they ship.
  • Append-only HMAC audit log answers "did the agent touch the prod token?" in seconds.
→ ok  vault unlocked · binding ./api
→ ok  grant once · pid 88421
→ ok  agent never read

macOS & Linux. Source-available (FCL-1.0, converts to Apache 2.0). No account.
