Anthropic appears ready to ship one of the more interesting product experiments we've seen from a major lab in recent memory. References spotted inside Claude's settings point to a new AI Fluency surface where users can request a personal scorecard that scans their activity across Chat, Cowork, and Claude Code sessions, scores each against defined behavioral indicators, and spits out a structured report, all viewable directly from the settings panel. The feature turns Anthropic's February 2026 research into something you can actually run on your own conversations.
Where This Came From
The scorecard traces back to Anthropic's AI Fluency Index, published in February 2026 alongside academics Rick Dakan and Joseph Feller. That study analyzed roughly 9,830 anonymized Claude conversations to baseline how people collaborate with AI today. The researchers tracked 11 behaviors, things like whether users clarify goals before asking for help, iterate on outputs through follow-ups, or push back when the model's reasoning seems off. What they found: iteration and refinement were the strongest predictors of good outcomes, while users who relied heavily on polished artifacts and code tended to skip critical checking steps. The implication is worth noting: output that looks finished might actually make you less rigorous about checking it.
The 11 Indicators
The scorecard's indicators fall under three of the four competencies in Anthropic's '4D AI Fluency Framework': Delegation (including clarifying goals and consulting on approach), Description (including defining audience, specifying format, building iteratively, and providing examples), and Discernment (including checking facts, noticing flawed reasoning, and recognizing context). Each behavior gets marked as demonstrated [+] or partial [~], with supporting quotes pulled verbatim from your actual messages. The result lands as a fraction, say 7.5 out of 11, with guidance on which habits to strengthen; the half point suggests that a partial mark earns half credit. A single terse prompt can genuinely score multiple indicators at once: 'ELI5' signals both audience and format; 'less corporate' hints at both tone and target reader.
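To make the arithmetic behind a result like 7.5 out of 11 concrete, here is a minimal Python sketch. It is not Anthropic's implementation: the half-credit weight for a partial mark, the third "not observed" state, the function name, and the two placeholder indicator names are all assumptions made for illustration.

```python
from enum import Enum


class Mark(Enum):
    """Possible marks for one indicator (values are assumed point weights)."""
    DEMONSTRATED = 1.0  # shown as [+] in the report
    PARTIAL = 0.5       # shown as [~]; half credit is an assumption
    NOT_OBSERVED = 0.0  # assumed state for indicators with no evidence


def fluency_score(marks):
    """Return (points earned, number of indicators) for a dict of marks."""
    earned = sum(mark.value for mark in marks.values())
    return earned, len(marks)


# Hypothetical report: nine indicator names from the article plus two
# placeholders, since the article does not name all 11.
example_marks = {
    "clarifying goals": Mark.DEMONSTRATED,
    "consulting on approach": Mark.PARTIAL,
    "defining audience": Mark.DEMONSTRATED,
    "specifying format": Mark.DEMONSTRATED,
    "building iteratively": Mark.DEMONSTRATED,
    "providing examples": Mark.PARTIAL,
    "checking facts": Mark.NOT_OBSERVED,
    "noticing flawed reasoning": Mark.PARTIAL,
    "recognizing context": Mark.DEMONSTRATED,
    "placeholder indicator 10": Mark.DEMONSTRATED,
    "placeholder indicator 11": Mark.NOT_OBSERVED,
}

earned, total = fluency_score(example_marks)
print(f"{earned:g} out of {total}")  # prints "7.5 out of 11"
```

Six demonstrated marks and three partial marks add up to 6 + 1.5 = 7.5, which is one way a half-point total could arise.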
Product Feature Tracking
Beyond behavioral scoring, the system also logs deterministic feature usage from the last 30 days: projects, artifacts, web-search, research, connectors, skills, memory, and MCP tools. The sample visualization shows a user with 45 conversations across Chat (42), Cowork (2), and Claude Code (1), and zero use of memory, sports, weather, maps, recipes, subagents, or computer-use. This gives Anthropic visibility into which capabilities users actually adopt and which they ignore, a goldmine for product decisions if the company decides to act on it.
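A tally like the one in the sample visualization is straightforward counting, which is presumably what "deterministic" means here. The Python sketch below shows one way to do it under stated assumptions: the record layout, the function name, the reference date, and the sample data are hypothetical, and only the feature names and the 42/2/1 split come from the article.

```python
from collections import Counter
from datetime import date, timedelta

# Feature names taken from the article; the tracked set is otherwise assumed.
TRACKED_FEATURES = [
    "projects", "artifacts", "web-search", "research",
    "connectors", "skills", "memory", "MCP tools",
]


def usage_summary(conversations, today, window_days=30):
    """Count recent conversations per surface and list tracked features never used."""
    cutoff = today - timedelta(days=window_days)
    recent = [c for c in conversations if c["date"] >= cutoff]
    per_surface = Counter(c["surface"] for c in recent)
    used = Counter(f for c in recent for f in c["features"])
    unused = [f for f in TRACKED_FEATURES if used[f] == 0]
    return {"total": len(recent), "per_surface": dict(per_surface), "unused": unused}


def make(surface, count, day, features=()):
    """Build `count` identical hypothetical conversation records."""
    return [{"surface": surface, "date": day, "features": list(features)}
            for _ in range(count)]


today = date(2026, 3, 1)  # arbitrary reference date for the example
sample = (
    make("Chat", 42, today - timedelta(days=5), ["artifacts", "web-search"])
    + make("Cowork", 2, today - timedelta(days=10))
    + make("Claude Code", 1, today - timedelta(days=20))
    + make("Chat", 3, today - timedelta(days=60), ["memory"])  # outside the window
)

summary = usage_summary(sample, today)
print(summary["total"], summary["per_surface"])
print("unused:", summary["unused"])
```

The three older conversations fall outside the 30-day window, so their memory usage does not count and memory is reported as unused.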
The Bigger Play
This fits a broader push to position Claude as a skill people develop rather than just a tool they use. The Anthropic Academy, the AI Fluency course series, and partnerships with PayPal, GivingTuesday, and university programs all point toward an education-and-growth narrative Anthropic's betting on heavily. No timeline for rollout has surfaced yet, and it's unclear whether this lands for all tiers or rolls out to enterprise first.
Key Takeaways
- Scorecard tracks 11 behavioral indicators across Chat, Cowork, and Claude Code sessions
- Grounded in research from the February 2026 AI Fluency Index, which analyzed ~9,830 conversations
- Results presented as a fraction (e.g., 7.5/11) with specific guidance tied to your actual message history
- Also logs feature usage—projects, artifacts, web-search—to show what you actually use versus ignore
- No launch date confirmed; rollout scope (free vs paid vs enterprise) still unclear
The Bottom Line
This is a clever move by Anthropic—instead of just measuring model quality, they're grading the human side of the conversation. Whether users will embrace being scored on their prompting habits remains to be seen, but it signals something important: the lab thinks AI fluency matters enough to productize. That's worth watching as these models get more capable and the delta between power users and casual ones keeps widening.