MCP's trust model has sharp edgesWhat every MCP user inherits. What to do about it.
MCP shipped a useful protocol. It also shipped four trust assumptions that attackers can use directly. Seven thousand servers are in the wild. None require a signature.
-
01
Trust
Raw secrets in config
env vars and connection strings passed as-is
by spec design -
02
Vector
Any server reads them
no isolation between MCP server processes
all installed servers -
03
Impact
7,000+ servers, 0 sigs
OX Security: 150M downloads, no signature req
early 2026
TL;DR· the answer, in twenty seconds
What: The MCP specification (through v0.6) passes raw environment variables and connection strings to servers as configuration, allows servers to poison tool descriptions in flight, and includes a sampling feature that lets servers request LLM completions back through the client (a ready-made data exfiltration channel). No signature requirement exists for server distribution.
Minimum steps: Audit every installed MCP server's source before trusting it with a credential-bearing environment. Do not install from unsigned or unreviewed sources. Keep MCP server configurations in a separate directory outside your repo. Treat every tool description as potentially attacker-controlled.
Lesson: Any protocol that passes configuration as ambient state and delegates trust decisions to the user will produce credential exposure at scale. Explicit, time-scoped grants beat ambient access by default.
In early 2026 OX Security counted roughly 7,000 MCP servers in the wild with around 150 million cumulative downloads. None of them require a cryptographic signature to install. That is not a scandal. It is the predictable outcome of shipping a v1 protocol that trusted the ecosystem to self-police.
Anthropic built MCP to solve a real problem: LLM clients need a standard way to call external tools. The spec is thoughtful in places. It is also a product of the "ship it and iterate" philosophy, which means the first version carries assumptions that look reasonable at low adoption and look dangerous at 7,000 servers.
This article is not a complaint about Anthropic. It is a read of what the MCP spec actually specifies, where those specifications produce attack surface, and what you inherit when you install an MCP server today.
What to know in 60 seconds
- MCP server configuration passes raw environment variables and connection strings to server processes. Any installed server can read them.
- Tool descriptions are returned dynamically by servers. A compromised or malicious server can return a poisoned description that manipulates the LLM's behavior.
- The
samplingcapability lets MCP servers request LLM completions back through the client. That reverse channel can carry data out of the conversation. - Server distribution has no signature requirement. There is no equivalent of npm's provenance or PyPI's Trusted Publisher for MCP.
- These are protocol-level design choices, not implementation bugs. Patching one server does not close them.
The four trust assumptions in the spec
Configuration delivers raw secrets
The MCP spec specifies that servers receive their configuration via environment variables and connection strings set at startup. The transport guidance reads, in substance:
Servers SHOULD accept configuration through environment variables, command-line arguments, or both.
There is no sandboxing primitive in the spec. When you configure an MCP server with DATABASE_URL=postgres://user:pass@host/db, that string lives in the server process's environment for the lifetime of the session. If you install multiple MCP servers (the common case in Claude Code), they share the same shell environment by default. Every server sees every variable.
That is not a coding mistake in a particular server. It is the transport the spec recommends.
Compare the failure mode from the February 2026 Claude Code settings.local.json incident: Knostic found environment variables captured to disk in roughly 1 in 13 npm packages. The MCP version of the same failure does not require a state file. The server just reads os.environ at startup.
Tool descriptions are attacker-controlled text
MCP clients call tools/list on each server and receive a list of tool names and descriptions. The client, or the LLM reasoning about which tool to call, reads those descriptions as trusted guidance. The spec requires that descriptions be strings. It does not require that they be static, bounded in length, or vetted.
A server that wants to manipulate the model's behavior can return any description it chooses. The MCP GitHub prompt-injection incident documented by Snyk researchers in early 2026 demonstrated this concretely: a GitHub MCP server processed a repository that had planted instructions in a README. Those instructions surfaced in tool outputs and were treated as LLM instructions downstream. The tool description vector is similar but more direct: the malicious text arrives in the tools/list response before any file is read.
Conventional prompt injection requires the attacker to get text into the model's context. Tool description poisoning works because the attacker controls the server, which the client treats as trusted by default.
Sampling creates a reverse data channel
The MCP spec defines sampling: the ability for MCP servers to send sampling/createMessage requests back through the client to the LLM. The intended use case is agentic servers that need to reason mid-task.
The spec's framing is that servers can request LLM sampling through clients, and the client decides whether to honor sampling requests.
The key word is "decides." The spec leaves that decision to the client implementation. Claude Code's implementation, as of this writing, prompts the user before honoring a sampling request. Not all clients do. And even when the user is prompted, the prompt typically shows the server name, not the full content of what the server is asking the LLM to process.
The data exfiltration scenario: a malicious server asks the client to send a completion request containing whatever files or environment data the server has access to. The completion response comes back through the client. The server logs it. The user saw a dialog that said "MCP server X wants to run a query."
This is not theoretical. It is an architectural property of sampling as specified. The channel exists whether or not any specific server exploits it.
No signature requirement on distribution
OX Security's early 2026 analysis put the population at around 7,000 MCP servers with roughly 150 million total downloads. Their finding on signatures: there is no signing requirement. A server on npm or PyPI is indistinguishable from a first-party Anthropic server at the protocol level.
The npm ecosystem added provenance attestation in 2023 and Trusted Publisher support in 2024. PyPI launched Trusted Publisher in 2023. Neither of these is mandatory, but the tooling exists and adoption is growing. MCP has no equivalent. A server called mcp-postgres-official in an npm search result gets the same protocol treatment as @anthropic-ai/mcp-server-filesystem.
This matters because the MCP trust model is transitive. When you add an MCP server, you give it access to everything in the configuration block you hand it, plus anything it can read from the shared environment. A typosquat that replaces a legitimate server gets that access without breaking the protocol.
Where the spec gets the trust model right
The MCP spec is not uniformly permissive. A few design decisions show awareness of the risk:
The spec requires explicit user consent before a client exposes resource types to a server. The roots negotiation (where the client tells the server which filesystem paths it can access) is a reasonable sandboxing primitive. It does not cover environment variables, but it does limit filesystem scope.
The spec also distinguishes between client-initiated and server-initiated flows at the transport level. A server cannot call a client tool uninstructed; it can only respond to requests and send sampling requests. That is a meaningful constraint compared to unrestricted bidirectional RPCs.
These are real mitigations. They do not close the four vectors above, but they show the spec was not written without security thinking.
What actually changes your risk posture
The four vectors are protocol-level. You cannot patch them without either patching your client or changing how you configure servers. Here is what changes the picture:
Start by isolating server environments. Do not install MCP servers into the same shell that holds your production secrets. Use a wrapper script that exports only the variables each server needs:
#!/bin/sh
# mcp-postgres-launcher.sh
exec env -i \
PATH="$PATH" \
DATABASE_URL="$MCP_POSTGRES_URL" \
mcp-server-postgres "$@"
Configure the server's launch command to call the wrapper, not the binary directly. The server gets one credential. It does not inherit your STRIPE_KEY, OPENAI_API_KEY, or AWS_SECRET_ACCESS_KEY.
A broker like hasp automates this at the session level. The MCP server receives a scoped reference rather than a raw environment variable, so the first design flaw (any installed server reading the full shared environment) no longer applies. The server cannot read what is not there.
Treat tool descriptions as attacker-controlled text. Before approving any tool call the model proposes, read the tool name and check it against what the server is supposed to do. MCP clients that show a "proposed tool call" dialog before execution give you a review window. Clients that auto-approve do not.
Disable sampling on servers that do not need it. If your MCP client supports per-server capability restrictions, turn sampling off for servers whose job is tool execution, not reasoning. A filesystem server does not need to ask the LLM anything.
Read the source before installing. There is no signed package to trust. For servers with no readable source, treat installation as equivalent to running an arbitrary binary with your shell environment.
Pin server versions. npm install mcp-server-x without a pinned version follows the latest tag. A supply chain attack that publishes a new patch version gets picked up on the next install. Pin with exact versions in package-lock.json or the equivalent for your package manager.
# Pin in package.json
npm install --save-exact mcp-server-postgres@2.1.4
What gets missed when people talk about MCP security
The conversation tends to focus on prompt injection through tools: a malicious document in a repository poisons the model's context. That is a real vector. It gets attention because it is novel and involves the LLM itself.
The credential theft vectors are quieter. An MCP server that reads process.env at startup and sends the contents to a remote endpoint does not involve any LLM reasoning. It does not trigger a prompt injection detection. It looks like normal server startup. The ambient secrets model is the problem, not prompt injection specifically.
The other missed point: client diversity matters more than server diversity. The MCP spec permits client implementations to add guardrails: sampling consent dialogs, tool description display, capability restrictions. Claude Code's implementation differs from Cursor's implementation differs from a minimal client built with the SDK. The spec does not mandate these guardrails. When you install a new MCP client, you should check which guardrails it implements, not only which servers it supports.
A checklist for your current MCP setup
MCP server audit checklist
- [ ] List all installed MCP servers and their source repos
- [ ] Confirm source is readable for each (no binary-only distributions)
- [ ] Confirm versions are pinned in package.json / requirements.txt
- [ ] Check each server's launch config: does it inherit the full shell env?
- [ ] Create per-server wrapper scripts that export only the needed variables
- [ ] Confirm your MCP client shows tool descriptions before approval
- [ ] Confirm your MCP client prompts for sampling consent (not auto-approve)
- [ ] Disable sampling capability for servers that do not need it
- [ ] Audit server update cadence: know what changed before upgrading
- [ ] Remove any MCP server you no longer actively use
Run this any time you add a new server or upgrade your MCP client.
What this means for your stack
The MCP trust model is ambient-by-default: secrets live in the environment, servers read what they can reach, sampling sends data through the client without content inspection. That model works at small scale when you know every server you install. It does not work at 7,000 servers with 150 million downloads and no signature requirement.
The fix is not to stop using MCP. It is to stop treating MCP server configurations as inheriting your full shell environment by default. Each server should receive exactly the credentials it needs, injected at exec time, revoked when the session ends. An audit record of what each server was handed, and when, closes the loop.
hasp is one working implementation of that model. curl -fsSL https://gethasp.com/install.sh | sh, hasp setup, connect a project, and the next MCP session receives a scoped reference instead of a raw credential. Source-available (FCL-1.0), local-first, macOS and Linux, no account.
The spec will improve. Version 1 protocols always do. Until the signing requirement exists and the isolation primitives are mandatory, you are the enforcement layer. Treat every MCP server like a third-party binary that reads your environment, because that is exactly what it is.
Sources· cited above, in one place
- OX Security AppSec research, including MCP ecosystem analysis
- Snyk Security Labs MCP prompt-injection and supply-chain research
- Knostic Research on AI code editor secret leakage (Claude Code, Cursor)
- Anthropic Security advisories and Claude Code release notes
- Model Context Protocol Specification
- Functional Source License FCL-1.0 text
Stop handing the agent your real keys.
hasp keeps secrets in one local encrypted vault, brokers them into the child process at exec, and never lets the agent read the value.
- Local, encrypted vault — no account, no cloud, no telemetry by default.
- Brokered run — agent gets a reference, the child process gets the value.
- Pre-commit + pre-push hooks catch managed values before they ship.
- Append-only HMAC audit log answers "did the agent touch the prod token?" in seconds.
macOS & Linux. Source-available (FCL-1.0, converts to Apache 2.0). No account.