Capframe has published what may be the most practical security resource yet for AI agent developers: a public leaderboard that scores 87 MCP servers on "agent authority hygiene," a framework that measures how safely these tools expose capabilities to autonomous agents. The leaderboard went live on Hacker News yesterday, and it is already sparking the kind of debate that only happens when people get an objective metric for something everyone was previously eyeballing.
How Capframe's Rule Engine Works
The grading system is refreshingly transparent. Every server starts at a perfect score of 100, which means a clean security surface. From there, it gets surgical: each Critical finding docks 10 points, each High costs 4, each Medium costs 2, and each Low costs 1. Every finding is deterministic, with no human judgment calls and no black boxes. The scoring rules are open-source, so the community can audit them, fork them, or build their own variants. This transparency is exactly what the MCP ecosystem needs right now as it races toward mainstream adoption.
Of the 87 servers scanned, 22 achieved the pristine A100 rating, with zero findings in any category. These include heavy hitters like @stripe/mcp (@0.3.3), @notionhq/notion-mcp-server (@2.2.1), @cloudflare/mcp-server-cloudflare (@0.2.0), and the official Elasticsearch, Kubernetes, and Linear integrations. Getting an A100 isn't trivial either: it means every tool in that server has proper input constraints, declared side effects, and no SSRF surfaces.
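The deduction scheme described above is simple enough to sketch in a few lines. This is an illustration of the arithmetic only, not Capframe's actual engine: the function name `hygiene_score` and the floor at zero are assumptions, and the letter-grade thresholds are left out because the article does not state them.

```python
# Point deductions per severity, as described in the article.
PENALTIES = {"critical": 10, "high": 4, "medium": 2, "low": 1}


def hygiene_score(findings):
    """Return 100 minus the summed penalties for a list of severity labels."""
    total = sum(PENALTIES[severity.lower()] for severity in findings)
    # Assumption: the score is floored at 0; the source does not say.
    return max(0, 100 - total)


print(hygiene_score([]))        # 100: no findings
print(hygiene_score(["HIGH"]))  # 96: a single HIGH finding
```

A single HIGH finding yields 96, which matches the B96 scores reported for several servers.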
The Dirty End of the Spectrum
On the other end, several notable servers are dragging down the ecosystem's security reputation. OpenZeppelin Stellar Contracts MCP earned a B96 with a HIGH finding for SSRF surface exposure: its stellar-non-fungible tool accepts unconstrained URL parameters that could let an agent probe internal endpoints such as cloud metadata services (think 169.254.169.254). That's the kind of vulnerability that doesn't get noticed until someone's AI agent is exfiltrating AWS credentials in production. The server-slack MCP (@2025.4.25) and server-gmail-autoauth-mcp (@1.1.11) both scored B96 with HIGH findings flagged as "excessive agency": names in their tool definitions (slack_post_message, savePath) imply side effects that aren't declared in the schema. A policy synthesizer reading these tools can't generate safe rules because it cannot tell what they actually do to the system.
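To make the SSRF concern concrete, here is a minimal sketch of a guard a server could run before fetching a caller-supplied URL. It is not taken from Capframe or from any of the servers named above; `is_blocked_url` is a hypothetical helper. It only inspects IP literals and `localhost`, so a real guard would also need to resolve hostnames and re-check every resolved address.

```python
import ipaddress
from urllib.parse import urlsplit


def is_blocked_url(url):
    """Return True if the URL should be rejected before the tool fetches it."""
    try:
        parts = urlsplit(url)
        host = parts.hostname
    except ValueError:
        return True  # malformed URL
    if parts.scheme not in ("http", "https") or not host:
        return True
    if host == "localhost" or host.endswith(".localhost"):
        return True
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        # Not an IP literal. DNS resolution is deliberately omitted here.
        return False
    return (addr.is_private or addr.is_loopback or addr.is_link_local
            or addr.is_reserved or addr.is_unspecified)


print(is_blocked_url("http://169.254.169.254/latest/meta-data/"))  # True
print(is_blocked_url("https://example.com/docs"))                  # False
```

A runtime check like this complements schema constraints but does not replace them: the leaderboard finding concerns what the schema allows, while the guard limits what the server will actually fetch.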
The Unconstrained Input Epidemic
The most common finding across lower-scoring servers is embarrassingly simple: unconstrained string parameters with no maxLength. Tools like web_search_exa, search_astro_docs, and dozens of others accept unbounded text inputs, which creates an indirect injection attack surface. An attacker who can influence what gets passed to these tools, through prompt injection or other means, could stuff arbitrary payloads into a search query that the agent then submits blindly. The fix is not complicated. Capframe's own documentation recommends adding maxLength constraints to string properties, using enums for bounded inputs, or constraining with regex patterns. Most legitimate tool inputs fit under a few hundred characters anyway. This isn't a sophisticated attack vector; it's basic input validation that the MCP ecosystem seems to have largely overlooked in the rush to ship.
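The check and the fix can both be illustrated against a tool's JSON Schema. The sketch below is an assumption-laden approximation, not Capframe's rule: `unconstrained_strings` is a hypothetical helper that looks only at top-level properties, and the two schemas are invented examples rather than the real definitions of any tool named above. The limit of 256 is illustrative.

```python
def unconstrained_strings(input_schema):
    """Return names of top-level string properties with no maxLength, enum, or pattern."""
    flagged = []
    for name, prop in input_schema.get("properties", {}).items():
        if prop.get("type") != "string":
            continue
        if not any(key in prop for key in ("maxLength", "enum", "pattern")):
            flagged.append(name)
    return flagged


# Hypothetical schema with an unbounded string input.
loose = {"type": "object", "properties": {"query": {"type": "string"}}}

# The same tool with the recommended constraints applied.
tight = {
    "type": "object",
    "properties": {
        "query": {"type": "string", "maxLength": 256},
        "sort": {"type": "string", "enum": ["relevance", "date"]},
    },
}

print(unconstrained_strings(loose))  # ['query']
print(unconstrained_strings(tight))  # []
```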
Why This Matters for Agent Security
Model Context Protocol is rapidly becoming the de facto standard for connecting AI agents to external tools and data sources. But that growth brings a terrifying expansion of the attack surface. When your agent can invoke arbitrary tools across hundreds of MCP servers, you need a way to evaluate which ones are safe to trust with real authority. Capframe's leaderboard provides exactly that: a standardized, reproducible security assessment that developers can use during integration planning.
Key Takeaways
- 22 of 87 scanned MCP servers achieved perfect A100 scores with zero security findings
- Top performers include Stripe, Notion, Cloudflare, and the official Elasticsearch, Kubernetes, and Linear integrations
- Most common vulnerability: unconstrained string parameters enabling indirect injection attacks
- HIGH-severity findings include an SSRF surface in OpenZeppelin Stellar Contracts and excessive agency in the Slack and Gmail integrations
- The scoring engine is open-source, allowing community audit and customization
The Bottom Line
The MCP ecosystem has a security debt problem, but at least now there's a ledger. Capframe's leaderboard gives developers an objective benchmark instead of vibes-based trust assessments — and in the wild west of agentic AI, that's genuinely valuable. Fix those maxLength constraints, declare your side effects, and stop shipping servers that would fail a basic code review.