The Agent Passport Layer Has a Security Problem: It's Testing Itself

A new layer is solidifying in the AI agent stack, with a name that finally fits what it does: pre-action authorization. The pattern is clean: before an agent executes any tool call, a deterministic policy engine intercepts it, checks it against declarative rules, and signs an audit record. Model proposes; gateway disposes. This isn't theoretical anymore. It's shipping in production through systems like the Agent Passport System (APS), which uses Ed25519 identities, scoped delegation that narrows at every transfer point, and a three-signature action chain. NSA's June MCP advisory says the same thing from the defensive side: deny by default, scope everything, sign every message. Multiple independent implementations have converged on the same architecture.
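To make the pattern concrete, here is a minimal sketch of a deny-by-default gate in Python. It is not taken from APS or any other implementation: the `ToolCall` type, the `POLICY` table, and the `authorize` function are hypothetical names, and a SHA-256 digest stands in for the signature a real gateway would attach to the audit record.

```python
from dataclasses import dataclass
import hashlib
import json


@dataclass
class ToolCall:
    agent_id: str
    tool: str
    args: dict


# Hypothetical declarative policy: agent id -> tools it may call.
POLICY = {
    "agent-reporting": {"read_file", "search_docs"},
}


def authorize(call: ToolCall, policy: dict) -> dict:
    """Deny by default: allow only when an explicit rule matches."""
    allowed = call.tool in policy.get(call.agent_id, set())
    record = {
        "agent_id": call.agent_id,
        "tool": call.tool,
        "args": call.args,
        "decision": "allow" if allowed else "deny",
    }
    # Digest over a canonical encoding of the record.
    # Assumption: a real gateway would sign this rather than only hash it.
    encoded = json.dumps(record, sort_keys=True, separators=(",", ":")).encode()
    record["digest"] = hashlib.sha256(encoded).hexdigest()
    return record
```

The decision is a pure function of the call and the policy, so the same input always yields the same audit record, and an agent or tool the policy never mentions is denied without needing a rule of its own.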

Self-Attestation Isn't Security

Here's where it gets uncomfortable. Every implementation in this layer is being tested by the people who built it. OAP reports that social engineering succeeded 74.6% of the time against a bare model but 0% of the time across 879 attempts under their restrictive policy. Those are impressive numbers, until you read the limitations section. The attacks were self-selected and skew toward social engineering rather than protocol-level exploits, and the exercise was a bounty run by the spec authors themselves against a crowd that chose to participate. APS goes further and states outright in its own README that "A valid signature is not a valid claim," acknowledging that cryptographically perfect receipts must still be rejected for wrong claims, expired delegation, or revoked delegation. The team clearly understands the gap between signing something and trusting it. But their conformance suite verifies only byte-level interoperability: it proves that two implementations canonicalize identically, never that either one is actually secure against a determined attacker.
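The gap between a valid signature and a valid claim can be sketched in a few lines of Python. This is an illustration under stated assumptions, not APS's verifier: HMAC-SHA256 from the standard library stands in for Ed25519, and the receipt fields (`delegation_id`, `expires_at`, `action`, `scope`) are hypothetical.

```python
import hashlib
import hmac
import json


def canonical(payload: dict) -> bytes:
    return json.dumps(payload, sort_keys=True, separators=(",", ":")).encode()


def sign(payload: dict, key: bytes) -> str:
    # Assumption: HMAC-SHA256 is a stand-in for an Ed25519 signature.
    return hmac.new(key, canonical(payload), hashlib.sha256).hexdigest()


def verify_receipt(receipt: dict, key: bytes, now: int, revoked: set) -> str:
    payload, signature = receipt["payload"], receipt["signature"]
    # Step 1: the cryptographic check.
    if not hmac.compare_digest(sign(payload, key), signature):
        return "reject: bad signature"
    # Steps 2-4: a perfectly signed receipt can still carry an invalid claim.
    if payload["delegation_id"] in revoked:
        return "reject: delegation revoked"
    if now >= payload["expires_at"]:
        return "reject: delegation expired"
    if payload["action"] not in payload["scope"]:
        return "reject: action outside scope"
    return "accept"
```

Only the first check is cryptographic. The remaining three are where a receipt with a flawless signature gets rejected, and they are the checks a byte-level conformance suite never exercises.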

Conformance Proves Agreement, Not Resistance

The testing problem here has two flavors, and neither addresses what matters most. First: self-run adversarial evaluations tied to the implementation being graded. Second: byte-level conformance suites that prove systems agree with each other without proving either system resists attack. This is the same trap TLS implementations avoided by facing independent test suites rather than publishing their own interop results as proof of security. Payment terminals submit to PCI labs they don't control. The entire premise of a trust layer is that its trust must be externally verifiable. A passport you grade yourself isn't a credential — it's a name tag with a signature on it.
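A toy example shows how a conformance check can pass while the system under test stays exploitable. Everything here is hypothetical: two independently written canonicalizers agree byte for byte on a flat message, and a naive gateway that would pass that check still accepts a replayed nonce.

```python
import json


def canonicalize_a(msg: dict) -> bytes:
    return json.dumps(msg, sort_keys=True, separators=(",", ":")).encode()


def canonicalize_b(msg: dict) -> bytes:
    # Independent implementation. Assumption: flat messages whose
    # values are strings or integers only.
    parts = [f"{json.dumps(k)}:{json.dumps(v)}" for k, v in sorted(msg.items())]
    return ("{" + ",".join(parts) + "}").encode()


def conformant(msg: dict) -> bool:
    """Byte-level conformance: both implementations emit identical bytes."""
    return canonicalize_a(msg) == canonicalize_b(msg)


class NaiveGateway:
    """Accepts any well-formed message and never tracks nonces."""

    def accept(self, msg: dict) -> bool:
        return "nonce" in msg and "action" in msg


msg = {"nonce": "n-1", "action": "read_file", "agent": "agent-a"}
agree = conformant(msg)          # the implementations agree
gateway = NaiveGateway()
first = gateway.accept(msg)      # legitimate first use
replayed = gateway.accept(msg)   # same nonce again; a secure gateway would refuse
```

The conformance result says nothing about the replay. Only a test that sends the message a second time reveals the problem.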

The Slot Nobody's Filling

NIST's AI Agent Standards Initiative (February 2026) made identity one of three pillars. OWASP's Top 10 for Agentic Applications (2026) added ASI04 for agentic supply chain and ASI07 for insecure inter-agent communication. MCP moved to OAuth 2.1 with RFC 8707 resource-scoped tokens. Every one of these control surfaces will ship with vendor-attached test results claiming they're secure. What's missing is a neutral adversary — a third-party harness that takes any pre-action-authorization gateway, regardless of who built it, and attempts protocol-level bypass, scope-boundary escalation, delegation-chain abuse, and replay. It should score resistance, not self-attested compliance.
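One way to picture such a harness is as a function that treats the gateway as a black box and reports, per attack category, how many attempts were blocked. The sketch below is an assumption about shape, not an existing tool: `Attack`, `score_resistance`, and `example_gateway` are hypothetical names, and the gateway is modeled as a callable that returns `True` when it allows a request.

```python
from dataclasses import dataclass
from typing import Callable

Gateway = Callable[[dict], bool]  # True means the request was allowed


@dataclass
class Attack:
    category: str
    requests: list  # sent in order; the last one is the malicious step


def score_resistance(gateway: Gateway, attacks: list) -> dict:
    """Return {category: (blocked, total)} for a black-box gateway."""
    results = {}
    for attack in attacks:
        allowed = False
        for request in attack.requests:
            allowed = gateway(request)
        blocked, total = results.get(attack.category, (0, 0))
        # The attack counts as blocked only if the final request was denied.
        results[attack.category] = (blocked + (0 if allowed else 1), total + 1)
    return results


def example_gateway(request: dict) -> bool:
    # Hypothetical stateless gateway: checks scope, ignores nonces.
    return request["requested_scope"] in request["granted_scopes"]


valid = {"requested_scope": "read", "granted_scopes": ["read"], "nonce": "n-1"}
ATTACKS = [
    Attack("scope_escalation",
           [{"requested_scope": "admin", "granted_scopes": ["read"], "nonce": "n-2"}]),
    Attack("replay", [valid, dict(valid)]),  # same nonce sent twice
]
scores = score_resistance(example_gateway, ATTACKS)
```

Because the harness sees only allow or deny, it applies to any gateway regardless of who built it. In this toy run the stateless gateway blocks the escalation but lets the replay through, which is the kind of per-category result a resistance score would report.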

What Independent Testing Would Look Like

The author has been building the attacker's half for the layer below this — an Agent Security Harness running 474 adversarial tests against MCP and agent endpoints. It forges elevated OAuth scopes to verify rejection (AUTH-003), plants command-execution canaries in handshakes (MCP-017), and walks delegation chains looking for authority that should have narrowed but didn't. That last category is exactly what the passport layer needs: take a signed delegation, attempt to use it beyond its scope, score whether the gateway holds firm against an adversary who didn't write the gateway. APS's own model states authority can only decrease at each transfer point — a solid principle. Now prove it when someone who's spent three weeks looking for edge cases is on the other side.
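The delegation-chain walk can be sketched as a subset check between each link and its parent. This is not the harness's actual code: the chain format and the `widening_links` name are hypothetical, and the check assumes "authority can only decrease" means a child's scope must be a subset of its parent's.

```python
def widening_links(chain: list) -> list:
    """Return the indices of links whose scope exceeds the parent's scope."""
    violations = []
    for i in range(1, len(chain)):
        parent = set(chain[i - 1]["scope"])
        child = set(chain[i]["scope"])
        # Assumption: equal scope is tolerated; anything new is a violation.
        if not child <= parent:
            violations.append(i)
    return violations


chain = [
    {"holder": "user", "scope": ["read", "write", "pay"]},
    {"holder": "agent-a", "scope": ["read", "write"]},
    # "pay" was dropped one hop earlier, so it must not reappear here.
    {"holder": "agent-b", "scope": ["read", "pay"]},
]
violations = widening_links(chain)
```

Comparing each link only against the root would miss this case, because `pay` is within the user's original grant. Checking against the immediate parent is what catches authority that reappears after it was narrowed.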

Key Takeaways

Pre-action authorization is real infrastructure shipping in agent stacks across multiple implementations
Every current testing regime is self-attested or proves interoperability, not security
No independent adversarial conformance harness exists for this layer yet, and no one claims to have built one
NIST, OWASP, and MCP have all moved to lock down identity and delegation; the test infrastructure hasn't followed

The Bottom Line

The agent-identity layer is being built fast, but it's grading its own final exam. A passport proves who an agent is. It does not prove that identity can't be weaponized against you. Until someone other than the lock's maker gets to attack it, every pre-action authorization system in production is running on trust — and trust without adversarial verification is just hope with a signature attached.
