The Agent Identity Layer Has a Trust Problem Nobody's Talking About

A new layer is consolidating in the agent stack, and it finally has a name: pre-action authorization. Before an autonomous AI agent executes any tool call, a deterministic policy engine intercepts the call, checks it against declarative rules, and signs an audit record. The model proposes; the gateway disposes. It's clean architecture, it's shipping in production form via systems like the Agent Passport System (APS), and according to new research, it's genuinely effective at stopping social engineering attacks. So what's the problem? Nobody outside the teams building these implementations is testing whether they actually work against a real adversary.
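To make the intercept, check, sign loop concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the `POLICY` table, the `ToolCall` shape, the record fields, and the HMAC demo key are hypothetical and are not taken from APS or any other real gateway.

```python
import hashlib
import hmac
import json
import posixpath
from dataclasses import dataclass

# Hypothetical declarative policy: tool name -> constraints. Unlisted tools are denied.
POLICY = {
    "read_file": {"path_prefixes": ["/srv/public/"]},
    "get_time": {},
}

SIGNING_KEY = b"demo-key-not-for-production"  # assumption: symmetric demo key


@dataclass(frozen=True)
class ToolCall:
    agent_id: str
    tool: str
    args: dict


def evaluate(call: ToolCall) -> tuple[bool, str]:
    """Deterministic rule check: the same call always yields the same decision."""
    rule = POLICY.get(call.tool)
    if rule is None:
        return False, "tool not in policy (default deny)"
    prefixes = rule.get("path_prefixes")
    if prefixes is not None:
        # Normalize first so "../" segments cannot escape an allowed prefix.
        path = posixpath.normpath(str(call.args.get("path", "")))
        if not any(path.startswith(prefix) for prefix in prefixes):
            return False, "path outside allowed prefixes"
    return True, "allowed by policy"


def authorize(call: ToolCall) -> dict:
    """Evaluate the call, then sign an audit record of the decision."""
    allowed, reason = evaluate(call)
    record = {
        "agent_id": call.agent_id,
        "tool": call.tool,
        "args": call.args,
        "allowed": allowed,
        "reason": reason,
    }
    payload = json.dumps(record, sort_keys=True, separators=(",", ":")).encode()
    record["signature"] = hmac.new(SIGNING_KEY, payload, hashlib.sha256).hexdigest()
    return record
```

Calling `authorize(ToolCall("agent-1", "read_file", {"path": "/srv/public/a.txt"}))` returns a signed record with `allowed` set to `True`, while a path such as `/srv/public/../etc/passwd` or an unlisted tool is denied and still produces a signed record. Note what the signature covers: that the gateway made this decision, not that the decision was right.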

Self-Grading Isn't Security

The numbers look solid on paper. One implementation, OAP, reports that social engineering succeeded against an unprotected model 74.6% of the time but 0% of the time across 879 attempts once its restrictive policy was active. That's compelling data. But read the limitations section: the attackers "self-select and skew toward social engineering rather than protocol-level attacks; results may not generalize to APT-grade adversaries." It's a self-run bounty against a self-selected crowd. The APS project goes further and explicitly states in its own README that "a valid signature is not a valid claim," meaning a cryptographically perfect receipt can still be rejected if the underlying claim is wrong, expired, or revoked. The team clearly understands the gap between signing something correctly and trusting it legitimately.
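The "valid signature is not a valid claim" distinction is easy to show in a few lines. The sketch below is not APS's receipt format: the claim fields (`id`, `expires_at`), the `REVOKED_IDS` set, and the HMAC key are all hypothetical, chosen only to show that signature verification is one check among several.

```python
import hashlib
import hmac
import json

KEY = b"demo-key-not-for-production"  # assumption: symmetric demo key
REVOKED_IDS = {"receipt-002"}         # hypothetical revocation list


def sign(claim: dict) -> str:
    """Sign a deterministic JSON encoding of the claim."""
    payload = json.dumps(claim, sort_keys=True, separators=(",", ":")).encode()
    return hmac.new(KEY, payload, hashlib.sha256).hexdigest()


def verify_receipt(claim: dict, signature: str, now: int) -> tuple[bool, str]:
    """Accept a receipt only if the signature AND the claim itself both hold."""
    if not hmac.compare_digest(sign(claim), signature):
        return False, "bad signature"
    # From here on the signature is cryptographically fine; the claim may not be.
    if claim["expires_at"] <= now:
        return False, "expired"
    if claim["id"] in REVOKED_IDS:
        return False, "revoked"
    return True, "ok"
```

A receipt with id `receipt-002` that was signed correctly still fails with `"revoked"`, and one whose `expires_at` is in the past fails with `"expired"`. A verifier that stops after the first check would accept both.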

Conformance Tests Prove Agreement, Not Resistance

APS ships with a byte-level conformance suite that verifies two implementations canonicalize identically, which proves interoperability, not correctness. The README states plainly that this "does not replace dynamic test execution." So we have two kinds of testing in this critical security layer: self-run adversarial evaluations tied to the people who built the gateway, and conformance suites that prove systems agree but never that either one actually resists attack. This is like TLS implementations publishing their own interoperability tests as proof of security instead of facing external attack frameworks.
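Here is roughly what a byte-level agreement check looks like, and why passing it says nothing about resistance. This is a toy sketch, not the APS suite: `canon_a`, `canon_b`, and the test vectors are hypothetical, and the canonical form is simply sorted-key compact JSON.

```python
import json


def canon_a(value) -> bytes:
    """Implementation A: rely on the standard library's sorted, compact JSON."""
    return json.dumps(value, sort_keys=True, separators=(",", ":")).encode("utf-8")


def canon_b(value) -> bytes:
    """Implementation B: an independent hand-rolled encoder for the same form."""
    def enc(v) -> str:
        if isinstance(v, dict):
            items = (json.dumps(k) + ":" + enc(v[k]) for k in sorted(v))
            return "{" + ",".join(items) + "}"
        if isinstance(v, list):
            return "[" + ",".join(enc(x) for x in v) + "]"
        return json.dumps(v)  # strings, numbers, booleans, null
    return enc(value).encode("utf-8")


# Hypothetical vectors. The second is an over-scoped claim no gateway should honor.
VECTORS = [
    {"b": 1, "a": [True, None, "x"]},
    {"agent": "attacker", "scope": "admin:*"},
]


def conformance_report() -> list[bool]:
    """True per vector when both implementations emit identical bytes."""
    return [canon_a(v) == canon_b(v) for v in VECTORS]
```

Both implementations agree on every vector, including the over-scoped one. The suite passes, and it would pass just the same if both gateways went on to approve that claim. Agreement is a property of the encoders; resistance is a property of what happens next.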

The Slot Is Wide Open

NIST's AI Agent Standards Initiative (February 2026) made identity one of three pillars. OWASP's Top 10 for Agentic Applications 2026 added ASI04 for agentic supply chain risks and ASI07 for insecure inter-agent communication. MCP moved to OAuth 2.1 with RFC 8707 resource-scoped tokens. Every control surface shipping right now comes with a vendor's own test results attached. What this layer lacks is neutral third-party adversarial conformance — a harness that takes any pre-action-authorization gateway, regardless of builder, and attempts protocol-level bypass, scope-boundary escalation, delegation-chain abuse, and replay attacks against it.
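What would such a harness look like at its smallest? The sketch below is one guess, under heavy assumptions: the `Gateway` interface, the `Request` fields, and both toy gateways are invented for illustration, and only two of the four attack classes (replay and scope-boundary escalation) are probed. A real harness would need adapters for each vendor's actual protocol.

```python
from dataclasses import dataclass, field
from typing import Callable, Protocol


@dataclass(frozen=True)
class Request:
    agent_id: str
    scope: str
    nonce: str


class Gateway(Protocol):
    """Hypothetical minimal interface any gateway under test must expose."""
    def authorize(self, request: Request) -> bool: ...


@dataclass
class ToyGateway:
    granted: dict[str, set[str]]               # agent_id -> scopes it holds
    seen: set[str] = field(default_factory=set)

    def authorize(self, request: Request) -> bool:
        if request.nonce in self.seen:         # reject any reused nonce
            return False
        self.seen.add(request.nonce)
        return request.scope in self.granted.get(request.agent_id, set())


@dataclass
class ReplayableGateway:
    granted: dict[str, set[str]]

    def authorize(self, request: Request) -> bool:
        # Deliberately weak: no nonce tracking, so replays are accepted.
        return request.scope in self.granted.get(request.agent_id, set())


def probe_replay(gw: Gateway) -> bool:
    """Pass only if the first use is accepted and the identical replay is rejected."""
    req = Request("agent-1", "files:read", "n-1")
    first, second = gw.authorize(req), gw.authorize(req)
    return first and not second


def probe_scope_escalation(gw: Gateway) -> bool:
    """Pass only if a scope the agent was never granted is refused."""
    return not gw.authorize(Request("agent-1", "files:write", "n-2"))


def run_harness(make_gateway: Callable[[], Gateway]) -> dict[str, bool]:
    """Run each probe against a fresh gateway; True means the gateway held."""
    probes = {"replay": probe_replay, "scope_escalation": probe_scope_escalation}
    return {name: probe(make_gateway()) for name, probe in probes.items()}
```

Run against `ToyGateway({"agent-1": {"files:read"}})`, both probes pass. Run against `ReplayableGateway` with the same grants, the replay probe fails. The point of the design is that the harness only sees the `Gateway` interface, so whoever wrote the gateway has no say in which probes run.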

The Bottom Line

The agent identity layer is being built fast, and the design has converged correctly. What's missing isn't another gateway implementation — it's the independent adversary that proves any of them actually hold under real attack. A passport you grade yourself is a name tag, not proof of identity. Until someone builds the equivalent of PCI testing labs for pre-action authorization, every signature in this layer is just well-formed wishful thinking.
