GUIDE · INCIDENT 9 min ·

Comment and Control hijacks three AI coding agentsOne PR title. Claude Code, Gemini CLI, Copilot. No CVE.

A security-review bot that reads a pull request is reading attacker-controlled text. Aonan Guan showed that one crafted PR title, issue body, or hidden HTML comment was enough to make Claude Code Security Review, Google's Gemini CLI Action, and GitHub's Copilot Agent run shell commands and post their own API keys back to the thread. No CVE was assigned, and every vendor fix left the architecture in place.

TL;DR· the answer, in twenty seconds

What happened: A prompt injection placed in a GitHub PR title, issue body, or hidden HTML comment made three CI coding agents (Claude Code Security Review, Google's Gemini CLI Action, and GitHub's Copilot Agent) run shell commands and post their own API keys and tokens back into public comments, logs, or commits. Aonan Guan and two Johns Hopkins researchers disclosed the chain in April 2026. No CVE was assigned.

The minimum fix: Do not run a coding agent against untrusted PR or issue text while live secrets sit in the same runtime. Restrict the agent's shell tool (Anthropic now ships --disallowed-tools 'Bash(ps:*)'), drop GITHUB_TOKEN to read-only, and gate any agent that acts on fork PRs behind manual approval.

The lesson: This injection is not a bug a vendor can patch out. The agent is built to read the attacker's text. While execution tools and live credentials live in the same process that ingests untrusted input, the next comment field is the next payload.

A pull request is untrusted input, and three of the most-used AI coding agents treated it as a trusted prompt. In April 2026 the security engineer Aonan Guan, working with Johns Hopkins researchers Zhengyu Liu and Gavin Zhong, published Comment and Control: a set of prompt injections that made Claude Code Security Review, Google's Gemini CLI Action, and GitHub's Copilot Agent run shell commands and hand back their own API keys. The name plays on the command-and-control (C2) channel of a malware operation. The control channel here is a GitHub comment box.

There is no CVE. The reports went to vendors starting in October 2025. The fixes that shipped narrow the blast radius without changing the thing that made the attack work, and one vendor downgraded its own finding from Critical to "None" after paying a hundred-dollar bounty.

A bot that reads attacker text

The setup is the same for all three tools. You wire a coding agent into GitHub Actions so it reviews pull requests, triages issues, or fixes bugs on assignment. The agent needs two things to do that job: tools that act (a shell, git, the GitHub API) and credentials that authenticate (its own provider key, plus the workflow's GITHUB_TOKEN). Those tools and credentials live in the runner process. The agent's prompt is assembled from the event that triggered it, which means the PR title, the issue body, and the comments all flow into the model's context.

That is the failure point. To the model, the PR title and the system instructions are the same kind of thing: text in the context window. There is no boundary in the prompt that says "everything below this line came from a stranger and must not be obeyed." When a maintainer's instruction and an attacker's comment sit in the same buffer, the model has no reliable way to tell them apart. This is LLM01, prompt injection, the first entry on the OWASP list for LLM applications, and it has had that slot for two years.

Most prompt-injection demos end at "the model said something it should not have." Comment and Control ends at "the model read the credentials out of its own process and posted them where the attacker can read them." The difference is the tools and the secrets sharing the runtime.

Three agents, three payloads

Guan built a separate payload for each tool, shaped around how that tool folds untrusted text into its prompt.

Claude Code Security Review. This is Anthropic's GitHub Action that reviews diffs and posts findings as PR comments. Guan found the PR title went into the prompt with no sanitization. An attacker opens a pull request whose title breaks out of the surrounding instructions and tells the agent to run a command. The agent runs ps auxeww, which lists every process with its full environment, captures ANTHROPIC_API_KEY and GITHUB_TOKEN from the output, and posts the result back. It arrives formatted as a routine "security finding" in a PR comment or sits in the Actions log. The reviewer bot did the job of a reviewer bot. It reviewed, and it commented. The comment carried the keys.

Gemini CLI Action. Google's action places issue titles, bodies, and comments into the prompt under an "Additional Content" heading. Guan's payload opens a fresh fake heading right after it, a "Trusted Content Section," and writes new instructions there. Gemini reads the forged section as higher-authority context, concludes it is operating in a safe environment, and posts GEMINI_API_KEY as a public issue comment. The injection did not break Gemini's safety rules by force. It convinced the model the rules did not apply by impersonating the part of the prompt the model already trusted.

GitHub Copilot Agent. Copilot's SWE agent fixes issues when a user assigns one to it. Guan hid the payload in an invisible HTML comment inside the issue body, so a maintainer skimming the issue sees nothing unusual. When the agent picks up the issue, it runs ps auxeww | base64 and commits the encoded credentials to the branch it is working on. Copilot ships three runtime defenses: environment filtering that hides secrets from the agent, secret scanning on its output, and a network firewall. The ps read pulls the secret out of a sibling process the environment filter never covered. Base64 hides the result from a scanner tuned to match sk- prefixes. Committing to the working branch is not the outbound connection the firewall watches for. One process listing, and none of the three layers saw it go.

This GitHub Actions wiring is what puts the keys in reach. The workflow runs an agent on an event whose text an outsider controls:

# A pattern that hands attacker text to an agent holding live secrets.
on:
  pull_request_target:        # runs with repo secrets, even for fork PRs
  issue_comment:
    types: [created]

permissions:
  contents: write             # the agent can push
  pull-requests: write        # the agent can comment

jobs:
  review:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: some-vendor/coding-agent-action@v1
        env:
          PROVIDER_API_KEY: $
          # GITHUB_TOKEN is injected by Actions automatically

The pull_request_target trigger is the sharp edge. Unlike pull_request, it runs in the context of the base repository with access to its secrets, even when the PR comes from a fork. It exists so that workflows can label or comment on outside contributions. Point an agent with a shell at it and the fork author gets a prompt with your keys in scope.

Why the runtime defenses did not help

Copilot's three layers matter here. They are the answer most teams reach for when they ask "how do we let an agent run shell commands without handing it the machine," and the attack went through all of them.

Environment filtering assumes the secret you care about is in the agent's own environment block. It is, but it is also in every sibling process that inherited it, and ps auxeww reads those. Secret scanning assumes the credential leaves in a recognizable form. Base64 is not recognizable to a regex tuned for sk-ant- prefixes. The network firewall assumes exfiltration means an outbound connection. A git commit to the branch the agent already has write access to is not an outbound connection in the sense the firewall watches.

None of these defenses is wrong. Each one closes a real path. The lesson is that defenses bolted onto a runtime holding both the secrets and the attacker's instructions will keep losing to the next encoding or the next write path that reads like normal work. You are playing pattern-match whack-a-mole against an adversary who writes the input.

The vendor responses say it is architectural

Track the disclosure and the conclusion writes itself.

Guan reported the Claude Code issue to Anthropic on October 17, 2025. By November 25 Anthropic had rated it Critical at 9.4. On April 20, 2026, Anthropic downgraded the same finding to severity "None" and paid a hundred-dollar bounty, with the position that, in Anthropic's words, "the action is not designed to be hardened against prompt injection." The shipped mitigation adds --disallowed-tools 'Bash(ps:*)' to block the specific ps call and updates the action's documentation to warn operators. Blocking ps stops the one process listing in the proof of concept. It does not stop cat /proc/self/environ, env, or any other read of the secrets the runtime still holds.

Google confirmed the Gemini CLI issue and paid $1,337. The fix added guardrail prompts telling the model to distrust injected content. It did not change the threat model, which is that untrusted text and trusted instructions share one context window. A guardrail prompt is itself just more text in the same window the attacker is writing to.

GitHub confirmed the Copilot finding, paid $500, and classified it as a known architectural limitation rather than a bug. That phrasing is the most honest of the three. It is an architectural limitation, and naming it as one is more useful to defenders than a CVE would be, because it tells you the fix is not a version bump.

So: three vendors, three bounties between $100 and $1,337, no CVE, and three statements that amount to "working as designed, mind how you deploy it." That is the correct read. The agents are built to read the comment. The comment is the attack.

What to do if you run an agent in CI

You can keep an agent in your pipeline. You cannot keep it pointed at untrusted text with live secrets in reach and call that safe. The hardening below assumes you accept PRs or issues from people outside your org.

# Harder posture for an agent that touches outside contributions.
on:
  pull_request:               # NOT pull_request_target: no base-repo secrets on fork PRs
    types: [opened, synchronize]

permissions:
  contents: read              # default to read; grant write per-job only when needed
  pull-requests: read

jobs:
  review:
    # Require a human to approve the run for first-time / outside contributors.
    environment: agent-review   # a protected environment gates the job on approval
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: some-vendor/coding-agent-action@v1
        with:
          # Constrain the shell to the smallest tool surface the job needs.
          disallowed-tools: "Bash(ps:*) Bash(env:*) Bash(cat:/proc/*)"
        env:
          PROVIDER_API_KEY: $

The moves that matter, in order of how much they buy you:

  1. Drop pull_request_target for anything that runs an agent. Use pull_request, which does not expose base-repo secrets to fork PRs. If you must comment on fork PRs, split the privileged step into a separate workflow that handles the comment and never sees the diff. GitHub's own token-permissions guidance covers the scoping.
  2. Set permissions to read by default. A GITHUB_TOKEN scoped to contents: read cannot push the attacker's base64 commit. Grant write to the one job that needs it, not the whole workflow.
  3. Gate outside contributions behind a manual approval. A protected environment makes the agent job wait for a maintainer to click approve. It moves the human back in front of the dangerous step for the PRs you do not trust.
  4. Constrain the agent's shell. Block ps, env, and reads of /proc/*/environ. This is a speed bump, not a wall, but it raises the cost of the next variant.
  5. Rotate the keys that ran in any vulnerable workflow. If your agent ran on pull_request_target against public PRs before you read this, treat the provider key and any long-lived tokens as exposed. You cannot prove from the logs that nobody pulled them.

The first two changes close the demonstrated exfiltration paths. The rest reduce what the next injection can reach.

What this means for your stack

If you run any coding agent in CI on repositories that accept outside PRs or issues, audit the triggers this week. Replace pull_request_target with pull_request wherever an agent is in the job, set workflow permissions to read by default, and put a manual-approval gate in front of agent runs on untrusted contributions. Then rotate the provider keys and tokens that were ever in scope for one of those runs. None of that is optional once you accept that the comment box is an input channel an attacker controls.

The pattern that closes the category, rather than the day's payload, is to stop letting the agent's runtime hold the secret at all. If the process that reads the pull request sees a credential reference instead of the credential, and a broker injects the real value into a scoped child process for the one call that needs it, then ps, env, and /proc return a handle the attacker cannot use off-box. The exfiltration succeeds and exports nothing of value. The blast radius shrinks from "every secret the runner inherited" to "the one credential the brokered command was already using."

hasp is one working implementation of that broker. curl -fsSL https://gethasp.com/install.sh | sh, then hasp run wraps the command so the value lands in the child process at exec and never in the agent's environment. Source-available (FCL-1.0), local-first, macOS and Linux, no account.

Whatever you reach for, hold the threat model straight. Prompt injection through a comment is not a bug waiting on a patch. It is the agent doing its job on text a stranger wrote. Plan for the agent to be fooled, and make sure that when it is, there is nothing in its hands worth stealing.

Sources· cited above, in one place

NEXT STEP~90 seconds

Stop handing the agent your real keys.

hasp keeps secrets in one local encrypted vault, brokers them into the child process at exec, and never lets the agent read the value.

  • Local, encrypted vault — no account, no cloud, no telemetry by default.
  • Brokered run — agent gets a reference, the child process gets the value.
  • Pre-commit + pre-push hooks catch managed values before they ship.
  • Append-only HMAC audit log answers "did the agent touch the prod token?" in seconds.
→ okvault unlocked · binding ./api
→ okgrant once · pid 88421
→ okagent never read

macOS & Linux. Source-available (FCL-1.0, converts to Apache 2.0). No account.

Browse all clusters· eight threads, one index