Audit an MCP server before you install it
Five tests. Twenty minutes. Real teeth.
Adding an MCP server in 2026 looks like running `npm install`. The trust model is closer to running a binary your coworker emailed you. Twenty minutes of audit work pays for itself the first time a server tries something you did not expect.
- 01 Source: Read it. Unpack the npm tarball or clone the repo. Read every tool handler and every outbound HTTP call. Before any install.
- 02 Sandbox: Isolate. Temp dir, restricted user, network blocked by default. Boot the server cold and watch what it tries. No real creds yet.
- 03 Trap: Trip wires. Fake OAuth token, an interactsh URL for callbacks, mitmproxy on every outbound socket. Run the listed tools. Scopes vs reality.
TL;DR: the answer, in twenty seconds
What this is: A five-step audit you can run on any MCP server before you wire it into Claude Code, Cursor, or Codex CLI. Source read, isolated run, fake-credential test, traffic capture, scope diff.
The minimum fix: Never install an MCP server straight into your real client profile. Run it cold in a sandbox first, with fake credentials and an interactsh URL for any callback host, and inspect every outbound request.
The lesson: The MCP install UX is friendlier than the trust model warrants. Anyone can publish a server. The spec assumes you vetted it. Closing that gap is your job, not the client's.
The Model Context Protocol gave us a one-line install story and a trust model that has not caught up. `claude mcp add` is three words. The thing it installs runs in your shell, with your environment, and gets to call tools the client exposes to it. Treating that like `npm install some-utility` is how the bad days start.
This guide gives you a repeatable audit. Five steps, roughly twenty minutes for a small server, longer for a bloated one. Run it once per server, before you ever hand it a real token.
Why the audit is not optional
The MCP spec is explicit that the client trusts the server, the server trusts the client, and humans are the ones who break ties when the two disagree. Read the authorization section and you will see the word "MUST" attached to consent. Most of the install UX skips straight past that.
Three things changed the risk profile in 2026:
- The catalog grew. Public MCP server lists carry thousands of entries. Anyone can publish. There is no central reviewer.
- The clients got friendlier. Claude Code, Cursor, and Codex CLI all ship one-command install flows that wire a new server into your profile in seconds.
- Real incidents shipped. Snyk and others have published research on supply-chain risk in MCP servers (prompt injection inside tool descriptions, malicious callback URLs, tools that request more than they advertise without telling you).
If you would not run a random npm package as root on your laptop, do not install a random MCP server into the same profile that holds your AWS keys.
Step 1: read the source
For an npm-distributed server, do not install. Fetch the tarball and unpack it:
npm pack @vendor/mcp-server-name
tar -xzf vendor-mcp-server-name-*.tgz
cd package
For a repo-based server, clone and check out the exact tag the docs tell you to install:
git clone https://github.com/vendor/mcp-server-name.git
cd mcp-server-name
git checkout v1.4.2
Now read. You are looking for five things:
- Every tool handler. Map each one to what it actually does. A tool called `search_notes` that opens a network socket deserves a second pass.
- Every outbound HTTP call. Grep for `fetch(`, `axios`, `got`, `requests.`, `http.Client`, whatever the language uses. Note every host.
- Every environment read. `process.env`, `os.environ`, `std::env::var`. The server should declare its config keys in the README. Anything it reads beyond that is interesting.
- Every filesystem write. `fs.write`, `open(..., "w")`. Servers should not be persisting state under your home dir unless the README says so.
- Every place an LLM string lands. Tool descriptions, parameter schemas, error messages. Anything that gets sent back to the client model is a prompt-injection surface.
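The checklist above can be turned into a first-pass sweep. The sketch below is a starting point, not a complete ruleset: the function names `sweep_one` and `audit_sweep` are made up for this example, the HTTP, environment, and filesystem patterns come from the list above, and the tool-registration patterns are an assumption based on names commonly seen in TypeScript MCP servers. It only tells you where to start reading.

```shell
# Sweep an unpacked MCP server for call sites worth reading by hand.
# Patterns are illustrative assumptions; extend them for the server's language.

sweep_one() {
  # $1 = label, $2 = extended regex, $3 = directory to search
  printf '== %s ==\n' "$1"
  # "|| true" keeps a category with no hits from aborting the sweep
  grep -rnE --exclude-dir=node_modules --exclude-dir=.git -e "$2" "$3" || true
}

audit_sweep() {
  dir="${1:-.}"
  sweep_one "outbound HTTP"     'fetch\(|axios|got\(|requests\.|http\.Client' "$dir"
  sweep_one "environment reads" 'process\.env|os\.environ|std::env::var'      "$dir"
  sweep_one "filesystem writes" 'fs\.write|open\([^)]*"w"'                    "$dir"
  # Assumed registration names; adjust to whatever SDK the server uses
  sweep_one "tool registration" 'registerTool|\.tool\(|setRequestHandler'     "$dir"
}
```

Run it as `audit_sweep package` from the directory where you unpacked the tarball. An empty category is not proof of absence: dynamic imports and bundled or minified code will slip past a grep.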
You do not need to read every line. You need to read enough to know what the server claims to do and confirm the code only does that.
If the package is a compiled binary with no source, that is your audit result. Stop. Pick a different server.
Step 2: run it isolated
Boot the server in a sandbox before it touches your real profile. The setup looks different on macOS and Linux, but the goal is the same: separate user, separate home, no inherited environment, network blocked by default.
On Linux, firejail does this in one command:
firejail \
  --noprofile \
  --private \
  --private-tmp \
  --net=none \
  --env=PATH=/usr/bin \
  node ./dist/index.js
On macOS, use a fresh user account or run it inside a container. `docker run --rm -it --network=none --user=1000:1000 -v "$PWD:/app" -w /app node:22 node ./dist/index.js` works for most Node servers.
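One way to make the "no inherited environment" goal explicit, whichever sandbox you pick, is to start the process from an empty environment. The wrapper below is a minimal sketch: `run_clean` is a name invented for this example, and the `PATH` value is an assumption about where `node` lives on your machine.

```shell
run_clean() {
  # Start the command with an empty environment plus a fixed PATH and a
  # throwaway HOME, so nothing from the calling shell reaches the server.
  scratch="$(mktemp -d)"
  env -i PATH=/usr/bin:/bin HOME="$scratch" "$@"
}
```

Something like `run_clean node ./dist/index.js` then boots the server without your AWS keys, tokens, or shell history paths in its environment. It does not replace the sandbox: filesystem and network isolation still come from firejail or the container.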
Watch what happens. A well-behaved server prints a startup banner, listens on stdio, and waits. A server that complains at boot about not reaching a host is telling you something useful.
Now drop the `--net=none` and try again, but route the traffic through a logger (next step). The point of this round is to learn which hosts the server contacts at boot, before any tool call.
Step 3: hand it a fake credential
The audit gets sharper here. Every credential the server will accept should be set to a marker value you can spot later in logs and outbound traffic.
For an OAuth-based server, generate a fake bearer token that follows the right shape but is not valid:
export PROVIDER_TOKEN="ghp_$(openssl rand -hex 20)"
export PROVIDER_API_KEY="sk-test-$(openssl rand -hex 12)"
For callback URLs, register an interactsh hostname and feed it as the callback target:
interactsh-client -v
# copy the hostname it prints, then:
export PROVIDER_CALLBACK_URL="https://<your-interactsh-id>.oast.fun/callback"
Now run the listed tools, one at a time, against the sandboxed server. Drive them with the MCP inspector or a small script that opens a stdio session and posts tool calls.
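If you would rather script the session than use the inspector, the stdio transport is newline-delimited JSON-RPC, so a few `printf` lines are enough for a rough probe. This is a sketch under assumptions: the helper names are invented here, the `protocolVersion` string should be whatever the server documents, and the `query` argument in the usage note is a guess at what a tool like `search_notes` might accept.

```shell
mcp_probe_messages() {
  # One JSON-RPC message per line: initialize, the initialized
  # notification, then a request for the tool catalog.
  # The protocolVersion value is an assumption; match the server's docs.
  printf '%s\n' \
    '{"jsonrpc":"2.0","id":1,"method":"initialize","params":{"protocolVersion":"2025-06-18","capabilities":{},"clientInfo":{"name":"audit-probe","version":"0.0.1"}}}' \
    '{"jsonrpc":"2.0","method":"notifications/initialized"}' \
    '{"jsonrpc":"2.0","id":2,"method":"tools/list"}'
}

mcp_call_message() {
  # $1 = request id, $2 = tool name, $3 = JSON object of arguments
  printf '{"jsonrpc":"2.0","id":%s,"method":"tools/call","params":{"name":"%s","arguments":%s}}\n' "$1" "$2" "$3"
}
```

A session might look like `{ mcp_probe_messages; mcp_call_message 3 search_notes '{"query":"test"}'; sleep 5; } | node ./dist/index.js`. The `sleep` keeps stdin open long enough for replies to arrive, since many servers exit as soon as the pipe closes.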
You are watching for three behaviors:
- The server uses the fake token only against the host its README named.
- The server does not POST your token, your callback, or any env var to a host you did not authorize.
- Any interactsh callback shows up on the URL the server told you it would use, not somewhere else.
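Because the fake credentials are unique marker values, checking for them can be a plain text search. The helper below is a small sketch that assumes you saved the server's stderr, or exported your traffic capture, to a text file. The name `scan_for_markers` and the file layout are assumptions of this example.

```shell
scan_for_markers() {
  # $1 = log or capture file; remaining args = marker values to look for.
  # Prints every line (with its line number) where a marker appears.
  file="$1"
  shift
  for marker in "$@"; do
    # -F treats the marker as a literal string, never as a regex
    grep -nF -- "$marker" "$file" | sed 's/^/[marker seen] /' || true
  done
}
```

Call it as `scan_for_markers capture.txt "$PROVIDER_TOKEN" "$PROVIDER_API_KEY"`. A hit is not automatically a leak: the token is supposed to reach the host the README named, so read each matching line for the destination.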
The classic confused-deputy pattern (a proxy server that forwards your token to a third-party API without checking the audience claim) shows up in this step. We wrote about it yesterday. The audit is how you catch it before it matters.
Step 4: capture every outbound request
Read-the-source catches the obvious. Traffic capture catches what the code does in practice, including anything dynamic.
mitmproxy in front of the sandboxed server is the cleanest setup:
mitmweb --listen-host 127.0.0.1 --listen-port 8080 &
HTTPS_PROXY=http://127.0.0.1:8080 \
HTTP_PROXY=http://127.0.0.1:8080 \
NODE_TLS_REJECT_UNAUTHORIZED=0 \
node ./dist/index.js
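Drive the tools again. The mitmweb UI on port 8081 shows every request, including headers and body. If the UI stays empty, first confirm the proxy is actually in the path: Node's built-in `fetch` and `https` do not read `HTTPS_PROXY` on their own in most releases, so whether these variables are honored depends on the HTTP library the server uses. Look for:
- Hosts that did not appear in your README and source review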
- Tokens or env vars in request bodies, query strings, or headers
- Calls fired before any tool was invoked (these often signal telemetry the README did not disclose)
- Suspicious user-agent strings or custom headers that look like fingerprinting
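The first item on that list reduces to a set difference once you have the observed hosts in a file. The sketch below assumes two plain-text files with one hostname per line, however you produce them. `undeclared_hosts` and both file names are hypothetical.

```shell
undeclared_hosts() {
  # $1 = file of hosts you approved from the README, one per line
  # $2 = file of hosts seen in the capture, one per line
  # Prints observed hosts that are missing from the approved list.
  # -x matches whole lines only, -F matches literally, -v inverts.
  sort -u "$2" | grep -vxFf "$1" || true
}
```

Running `undeclared_hosts approved-hosts.txt observed-hosts.txt` with nothing printed means every contacted host was one you expected. Anything it prints goes into your questions for the vendor.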
For servers that pin their CA and refuse to proxy, `tshark` on a loopback interface or a network namespace with `tcpdump -A` will at least show you the hosts and TLS SNI, even if the bodies are encrypted. That is enough to spot a server that calls home.
If you find one undeclared call, that is a question to ask the vendor. If you find five, you have your audit result.
Step 5: diff the scopes
The last step matters more than people give it credit for. Every credential the server accepts should have a stated minimum scope. The audit confirms the server only uses that scope.
For an OAuth server, issue a test token with the minimum scopes the docs claim:
gh auth refresh --scopes "repo:status,read:user"
Now drive the server's full tool catalog against that token. If any tool fails with a "scope insufficient" error, the docs were wrong, or the tool is doing more than it advertised. Either way, you found out before production.
For an API-key-based server, the same shape works with the provider's read-only key, scoped key, or limited-permissions service account. Run the catalog. Note every permission denied. Decide whether you are willing to grant the broader scope the server actually needs.
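The scope diff itself can be mechanical once you have two lists: the scopes the docs claim and the scopes the failing calls asked for. How you collect the second list depends on the provider, and some report the required scopes in the error response. The helper names below are invented for this sketch.

```shell
normalize_scopes() {
  # Split a comma-separated scope string into sorted, unique lines
  printf '%s\n' "$1" | tr ',' '\n' | sed 's/^ *//; s/ *$//' | sort -u
}

extra_scopes() {
  # $1 = scopes the docs claim, $2 = scopes the tools turned out to need
  # Prints scopes that were needed but never documented.
  documented="$(mktemp)"
  normalize_scopes "$1" > "$documented"
  normalize_scopes "$2" | grep -vxFf "$documented" || true
  rm -f "$documented"
}
```

With the scopes from the example above, `extra_scopes 'repo:status,read:user' 'read:user, repo, repo:status'` prints `repo`, which is the broader grant you then have to decide on.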
A useful artifact: write your audit results to a markdown file in the repo where you keep your team's MCP allowlist. One row per server, with the date you audited, the version, the source-review notes, the outbound hosts you observed, and the minimum scope you confirmed. The first time a server updates and someone wants to bump the pin, the diff is right there.
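One possible shape for that file is a markdown table with a row per audited version. The column set below follows the fields named above. The helper names and the sample values in the usage note are assumptions, not a required format.

```shell
audit_header() {
  # Column headings plus the markdown separator row
  printf '| Server | Version | Audited | Outbound hosts | Minimum scope | Notes |\n'
  printf '|---|---|---|---|---|---|\n'
}

audit_row() {
  # $1 server, $2 version, $3 audit date, $4 outbound hosts,
  # $5 minimum scope confirmed, $6 source-review notes
  printf '| %s | %s | %s | %s | %s | %s |\n' "$1" "$2" "$3" "$4" "$5" "$6"
}
```

For example, `audit_row '@vendor/mcp-server-name' 'v1.4.2' "$(date +%F)" 'api.github.com' 'repo:status,read:user' 'no undeclared calls' >> mcp-allowlist.md` appends one row. Keeping a row per version means a pin bump shows up in review as a diff against the previous row.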
What to do with a server that fails
A failure does not have to be fatal. Three responses, in order of preference:
- Fork it. If the offending behavior is one undeclared telemetry call, you can patch it out and pin your fork. The MCP server surface is small enough that this is feasible.
- Wrap it. Put a thin proxy in front that strips outbound calls you did not allow. Slower than running the upstream directly, but real.
- Drop it. Most servers have alternatives. The friction of finding one is lower than the friction of explaining a leak later.
The audit also tells you when to stop trusting a server you already approved. If a routine upgrade introduces a new outbound host, that is a re-audit, not a routine version bump.
What this means for your stack
The minimum action is to stop installing MCP servers straight into your real client profile. Pull the package, read it, run it cold in a sandbox with fake credentials, capture every outbound request, and confirm the scopes match the docs. Twenty minutes is a small price for the questions you do not have to answer at 3 a.m.
The architectural pattern that closes the rest of the gap is a runtime broker between your real credentials and any tool that asks for them. The broker holds the value, hands the process a reference, and records every grant in an append-only log. An MCP server that gets a reference instead of a value cannot leak what it never saw.
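To make the pattern concrete, here is a deliberately toy sketch of the idea, not how any particular broker is implemented. The caller only ever names a reference, the value is resolved at exec time and placed in the child's environment, and each grant is appended to a log. The plaintext `name=value` vault file, the log format, and the `broker_exec` name are all assumptions of this example. A real broker would encrypt the vault and protect the log.

```shell
broker_exec() {
  # $1 = vault file of name=value lines, $2 = grant log,
  # $3 = reference name, $4 = env var the child expects,
  # remaining args = the command to run.
  vault="$1"; log="$2"; ref="$3"; var="$4"
  shift 4
  # Resolve the reference to its value; the caller never handles it
  value="$(sed -n "s/^${ref}=//p" "$vault" | head -n 1)"
  [ -n "$value" ] || { printf 'unknown reference: %s\n' "$ref" >&2; return 1; }
  # Record the grant (never the value) before handing it over
  printf '%s grant ref=%s var=%s cmd=%s\n' \
    "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$ref" "$var" "$1" >> "$log"
  env "$var=$value" "$@"
}
```

A call such as `broker_exec vault.txt grants.log prod-token PROVIDER_TOKEN node ./dist/index.js` leaves one log line per launch, so "which process was handed the prod token, and when" becomes a grep.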
hasp is one working implementation. Run `curl -fsSL https://gethasp.com/install.sh | sh`, then `hasp setup`, then bind the project where your agent runs MCP servers. The next time you wire one in, the server gets a scoped reference and the audit log answers "did that server read the prod token" in seconds. Source-available (FCL-1.0), local-first, macOS and Linux, no account.
The audit is the part that holds whether or not you ever install a broker. If your team treats MCP installs the way you treat new dependencies, with a twenty-minute review and a paper trail, most of the bad days never reach you. The trust model the spec assumes is the one you have to enforce. Nobody else is going to.
Sources: cited above, in one place
- Model Context Protocol: Specification
- MCP docs: Server and client implementation guides
- Snyk Security Labs: MCP prompt-injection and supply-chain research
- Anthropic security: Vulnerability disclosure and Trust Center
- ProjectDiscovery Interactsh: Out-of-band interaction detection
- TruffleHog: Secret scanner