MCP threat model
This page mirrors docs/security/mcp-threat-model.md from the
repository, rendered for the docs site so procurement reviewers can
deep-link without checking out the repo.
Purpose
Threat model for coverctl mcp serve, covering the trust boundaries around
MCP inputs and outputs and the hardening controls in place to reduce
prompt-injection-to-code-execution risk.
System boundary
- Entry point: MCP tool/resource requests over stdio.
- Primary component: internal/mcp/server.go.
- Input control surface: internal/mcp/sanitize.go (input validation/sanitisation for untrusted MCP fields).
- Output control surface: internal/mcp/sanitize_output.go (canonicalisation of user-controlled strings flowing back to the agent).
The Lethal Trifecta
Per Simon Willison’s framing, agents fail when three properties combine:
- Access to private data — coverctl reads coverage profiles.
- Exposure to untrusted content — coverage profiles can carry attacker-controlled strings (filenames in hostile PRs, weaponised test names, profile-derived paths).
- Ability to exfiltrate externally — the agent’s context window is the exfiltration channel; anything reaching the agent can be smuggled out via subsequent tool calls or replies.
coverctl breaks the Trifecta by hardening the boundaries on both ingress and egress of MCP traffic.
Input boundary controls
1) Path scoping and validation
Scoped path validation is applied to MCP path inputs before use. Rejected
inputs return a structured rejection response (passed=false, an explicit
error, a safe summary, and agent-actionable remediation).
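One way to picture scoped path validation is the sketch below. It is an illustration under stated assumptions, not the code in internal/mcp/sanitize.go: the helper name scopedPath, the /repo root, and the exact checks are hypothetical. It joins the untrusted input onto an allowed root and rejects anything that lexically resolves outside that root.

```go
package main

import (
	"errors"
	"fmt"
	"path/filepath"
	"strings"
)

// errOutsideScope is a hypothetical sentinel error for this sketch.
var errOutsideScope = errors.New("path escapes the allowed root")

// scopedPath resolves an untrusted, MCP-supplied path against root and
// rejects it if the cleaned result falls outside root. The check is purely
// lexical: it does not resolve symlinks, which a real control may also need.
func scopedPath(root, input string) (string, error) {
	if input == "" || strings.ContainsRune(input, 0) {
		return "", errors.New("empty path or NUL byte in path")
	}
	if filepath.IsAbs(input) {
		return "", errOutsideScope
	}
	joined := filepath.Join(root, input) // Join also cleans "." and ".." segments
	rel, err := filepath.Rel(root, joined)
	if err != nil {
		return "", err
	}
	if rel == ".." || strings.HasPrefix(rel, ".."+string(filepath.Separator)) {
		return "", errOutsideScope
	}
	return joined, nil
}

func main() {
	inputs := []string{"coverage.out", "pkg/../coverage.out", "../../etc/passwd", "/etc/passwd"}
	for _, in := range inputs {
		p, err := scopedPath("/repo", in)
		fmt.Printf("%q -> %q, err=%v\n", in, p, err)
	}
}
```

In this sketch the first two inputs resolve inside the root, while the traversal and the absolute path are rejected before any file is touched.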
2) Build-flag sanitisation
internal/mcp/sanitize.go blocks dangerous argument classes for
MCP-originated inputs:
- Dangerous long flags: --rootdir, --cov-config, --init-script, --require, --node-options, --manifest-path, --target-dir, --inspect, --experimental-loader, etc.
- Dangerous short prefixes: -D, -I, -P.
- Shell metacharacters and control characters in free-form arg inputs.
- Invalid tag and timeout formats.
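The first three argument classes above could be checked roughly as follows. This is a minimal sketch, not the actual denylist: the function name checkArg is hypothetical, the long-flag set is only the subset named on this page, and the exact set of shell metacharacters is an assumption.

```go
package main

import (
	"fmt"
	"strings"
)

// Subset of the long flags named on this page; the maintained list lives in
// internal/mcp/sanitize.go and may contain more entries.
var dangerousLongFlags = map[string]bool{
	"--rootdir": true, "--cov-config": true, "--init-script": true,
	"--require": true, "--node-options": true, "--manifest-path": true,
	"--target-dir": true, "--inspect": true, "--experimental-loader": true,
}

var dangerousShortPrefixes = []string{"-D", "-I", "-P"}

// shellMetachars is an assumed set for illustration only.
const shellMetachars = ";|&$`<>\\"

// checkArg returns a non-nil error if a single MCP-supplied argument falls
// into one of the blocked classes.
func checkArg(arg string) error {
	for _, r := range arg {
		if r < 0x20 || r == 0x7f {
			return fmt.Errorf("control character %U in argument", r)
		}
		if strings.ContainsRune(shellMetachars, r) {
			return fmt.Errorf("shell metacharacter %q in argument", r)
		}
	}
	// Compare only the flag name, so "--rootdir=/tmp" matches "--rootdir".
	name, _, _ := strings.Cut(arg, "=")
	if dangerousLongFlags[name] {
		return fmt.Errorf("dangerous flag %s", name)
	}
	if !strings.HasPrefix(arg, "--") {
		for _, p := range dangerousShortPrefixes {
			if strings.HasPrefix(arg, p) {
				return fmt.Errorf("dangerous short flag prefix %s", p)
			}
		}
	}
	return nil
}

func main() {
	for _, a := range []string{"-run=TestFoo", "--rootdir=/tmp", "-Dfoo=bar", "-v; rm -rf /"} {
		fmt.Printf("%q -> %v\n", a, checkArg(a))
	}
}
```

A denylist like this only covers flags someone has already identified as dangerous, which is why the residual-risk section calls for ongoing updates.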
3) Stable rejection schema
Every input rejection emits the same JSON shape:

    {
      "passed": false,
      "error_code": "INPUT_REJECTED_DANGEROUS_FLAG",
      "error": "rejected MCP input testArgs[0]=\"--rootdir=/tmp\": ...",
      "summary": "Rejected unsafe MCP input",
      "remediation": "Remove the rejected flag from testArgs. ..."
    }

The schema is append-only: new codes may land, but existing codes will not be renamed without a major-version bump. See the rejection schema reference for the full code table.
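A consumer or producer of this shape could model it with a struct like the one below. The field names and the INPUT_REJECTED_DANGEROUS_FLAG code come from the example above. The type name, the constructor, and the shortened error and remediation strings are assumptions made for this sketch.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// rejection mirrors the JSON fields shown on this page. No field uses
// omitempty, so "passed": false is always present in the output.
type rejection struct {
	Passed      bool   `json:"passed"`
	ErrorCode   string `json:"error_code"`
	Error       string `json:"error"`
	Summary     string `json:"summary"`
	Remediation string `json:"remediation"`
}

// newDangerousFlagRejection is a hypothetical constructor. The message
// texts are placeholders, not the exact strings coverctl emits.
func newDangerousFlagRejection(field, value string) rejection {
	return rejection{
		Passed:      false,
		ErrorCode:   "INPUT_REJECTED_DANGEROUS_FLAG",
		Error:       fmt.Sprintf("rejected MCP input %s=%q", field, value),
		Summary:     "Rejected unsafe MCP input",
		Remediation: "Remove the rejected flag from testArgs.",
	}
}

// JSON serialises the rejection. Marshalling a struct of strings and bools
// should not fail; the fallback keeps the result a rejection regardless.
func (r rejection) JSON() string {
	b, err := json.Marshal(r)
	if err != nil {
		return `{"passed":false}`
	}
	return string(b)
}

func main() {
	fmt.Println(newDangerousFlagRejection("testArgs[0]", "--rootdir=/tmp").JSON())
}
```

Because the schema is append-only, a client that switches on error_code should treat unknown codes as a generic rejection instead of failing to parse.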
Output boundary controls
MCP responses flow back into the agent’s context. If a coverage profile contains attacker-controlled strings, those strings become a new prompt-injection vector: the return-trip half of the Lethal Trifecta. coverctl closes this with output canonicalisation.
- File paths in tool outputs (files[].file, improved[].file, regressed[].file, items[].name, domainDeltas keys, domains[].domain) are restricted to [A-Za-z0-9._/-]. Any other character is replaced with ?. Paths longer than 256 characters are truncated.
- Free-form strings (summary, error, warnings[]) have control characters (NUL, CR, LF, tabs) replaced with a single space, backticks rewritten to single quotes, and length capped at 1024 bytes.
- Sanitisation is idempotent and applied at every handler that emits user-controlled strings: check, report, compare, debt. Helpers live in internal/mcp/sanitize_output.go.
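The two output rules can be sketched as a pair of small functions. The names sanitizePath and sanitizeFreeForm are hypothetical, and several details are assumptions not stated on this page: replacement happens before truncation, each control character becomes its own space, and the 1024-byte cap backs off to a rune boundary so the result stays valid UTF-8.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
	"unicode"
	"unicode/utf8"
)

const (
	maxPathLen       = 256  // path cap from this page
	maxFreeFormBytes = 1024 // free-form cap from this page
)

// unsafePathChars matches anything outside the allowed path alphabet.
var unsafePathChars = regexp.MustCompile(`[^A-Za-z0-9._/-]`)

// sanitizePath replaces every disallowed character with '?' and truncates.
// After replacement the string is pure ASCII, so byte slicing is safe.
func sanitizePath(p string) string {
	p = unsafePathChars.ReplaceAllString(p, "?")
	if len(p) > maxPathLen {
		p = p[:maxPathLen]
	}
	return p
}

// sanitizeFreeForm maps control characters to spaces and backticks to
// single quotes, then caps the length in bytes.
func sanitizeFreeForm(s string) string {
	s = strings.Map(func(r rune) rune {
		switch {
		case r == '`':
			return '\''
		case unicode.IsControl(r):
			return ' '
		}
		return r
	}, s)
	if len(s) > maxFreeFormBytes {
		cut := maxFreeFormBytes
		for cut > 0 && !utf8.RuneStart(s[cut]) {
			cut-- // do not split a multi-byte rune
		}
		s = s[:cut]
	}
	return s
}

func main() {
	fmt.Println(sanitizePath("src/evil name`.go"))
	fmt.Println(sanitizeFreeForm("ignore previous\ninstructions `rm -rf`"))
}
```

Both functions are idempotent in this sketch: their output contains only characters they leave unchanged and already fits the length cap, so applying them a second time is a no-op.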
Fail-closed behavior
Any failed sanitisation returns a rejection; tool execution does not proceed. Rejection responses are deterministic and machine-readable for CI/agent handling.
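Fail-closed handling can be pictured as a wrapper that validates every argument before the tool is allowed to run. The sketch below is illustrative only: guard, result, and rejectRootdir are hypothetical names, and the real handlers in internal/mcp/server.go may be structured differently.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// result is a simplified stand-in for an MCP tool response.
type result struct {
	Passed  bool
	Summary string
}

// rejectRootdir is a toy validator that blocks a single flag.
func rejectRootdir(arg string) error {
	if strings.HasPrefix(arg, "--rootdir") {
		return errors.New("dangerous flag --rootdir")
	}
	return nil
}

// guard wraps a tool so that it runs only if every argument validates.
// On the first failure it returns a rejection and never calls the tool.
func guard(validate func(string) error, tool func([]string) result) func([]string) result {
	return func(args []string) result {
		for i, a := range args {
			if err := validate(a); err != nil {
				return result{
					Passed:  false,
					Summary: fmt.Sprintf("rejected testArgs[%d]: %v", i, err),
				}
			}
		}
		return tool(args)
	}
}

func main() {
	run := guard(rejectRootdir, func(args []string) result {
		return result{Passed: true, Summary: "tool ran with " + strings.Join(args, " ")}
	})
	fmt.Println(run([]string{"-v"}))
	fmt.Println(run([]string{"-v", "--rootdir=/tmp"}))
}
```

The design point is that validation of all inputs completes before any side effect occurs, so a partially valid request never reaches the downstream toolchain.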
Explicit boundaries / non-goals
- CLI calls from a human terminal are not sanitised the way MCP inputs are; the operator is the trust boundary there.
- coverctl does not sandbox downstream language toolchains; it reduces attack surface by constraining MCP-supplied arguments and outputs.
Operational guidance
- Use MCP mode for agent workflows: coverctl mcp serve.
- Prefer local-first execution in trusted repos.
- Keep toolchain dependencies updated.
- Treat repeated MCP rejection spikes as an indicator of prompt-injection attempts or malformed agent prompts.
Residual risk
- New or unknown dangerous flags in third-party runners may emerge over time. Mitigation: maintain denylist updates in sanitize.go, keep tests current, and monitor rejection telemetry/logs.
- Output canonicalisation is a regex-based control; novel encoding vectors in coverage profiles may bypass it. Mitigation: the adversarial eval suite (internal/eval/) gates every release.
Code references
- internal/mcp/server.go
- internal/mcp/sanitize.go (input boundary)
- internal/mcp/sanitize_output.go (output boundary)
- internal/mcp/sanitize_test.go
- internal/mcp/sanitize_output_test.go
- internal/eval/scenarios/ (adversarial regression corpus)