A new open-source marketing skill for Claude has landed on GitHub with something you don't see every day in the AI tooling space: a published benchmark suite. The project, called "brief," achieved an 82.7% mean pass rate across 26 evaluation prompts, 20.4 percentage points above the baseline run without the skill engaged (an implied baseline of about 62.3%).
Architecture Built for Context Efficiency
The skill takes a progressive disclosure approach rather than dumping everything into context at once. Only SKILL.md loads initially (roughly 100 tokens), and it routes to the nine purpose-built reference modules, loading each only when a relevant topic is detected. The modules cover copywriting, brand & messaging, content strategy, campaigns & GTM, research, SEO, email & lifecycle, CRO, and measurement. Each is a standalone Markdown file in references/ that is pulled in on demand, based on what you're actually trying to do, which avoids the context bloat that degrades performance in longer sessions. "Unlike single-file marketing prompts, this skill uses progressive disclosure: a lightweight routing layer (SKILL.md) that loads 9 purpose-built reference modules only when relevant," the README states. "The depth of nine specialist playbooks, none of the context bloat."
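To make the routing idea concrete, here is a minimal Python sketch of progressive disclosure. It is an illustration only: in the skill itself the routing is presumably done by the model following the instructions in SKILL.md, not by code. The nine topics come from the README, but the file names and trigger keywords below are hypothetical.

```python
# Hypothetical routing table: reference module -> trigger keywords.
# The nine topics match the article; file names and keywords are assumptions.
ROUTES = {
    "copywriting.md": {"headline", "copy", "tagline"},
    "brand-messaging.md": {"positioning", "voice", "messaging"},
    "content-strategy.md": {"editorial", "content plan"},
    "campaigns-gtm.md": {"launch", "campaign", "gtm"},
    "research.md": {"competitor", "persona", "survey"},
    "seo.md": {"seo", "keyword", "backlink"},
    "email-lifecycle.md": {"email", "onboarding", "newsletter"},
    "cro.md": {"conversion", "landing page", "pricing page"},
    "measurement.md": {"attribution", "kpi", "analytics"},
}


def select_modules(request: str) -> list[str]:
    """Return only the reference modules whose keywords appear in the request."""
    text = request.lower()
    return sorted(
        name
        for name, triggers in ROUTES.items()
        if any(trigger in text for trigger in triggers)
    )


if __name__ == "__main__":
    for prompt in (
        "Audit my pricing page for conversion",
        "Write an email campaign for our launch",
        "name my cat",  # negative control: nothing should load
    ):
        print(prompt, "->", select_modules(prompt))
```

The point of the design is that a request touching one topic pays the context cost of one module, and an unrelated request pays for none.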
Two Gates Enforce Quality Before Output
What separates this from generic "act like a marketer" prompts are two hard gates built into SKILL.md. Gate A handles strategy foundations—positioning, value props, brand voice, GTM planning—and forces 2-3 sharp questions before generating anything. The eval data backs this up: the positioning statement test went from 0% to 100% when the skill asked those load-bearing questions instead of guessing. Gate B handles audits and "improve my X" requests. It demands the actual asset first rather than inventing content and critiquing its own invention. The pricing-page audit eval jumped from 67% to 100% using this approach. "For strategy foundations and audits, a confident guess is worse than a question," the README explains. "For ordinary copy, it drafts first and asks after—a draft you can react to beats an interrogation."
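The two gates amount to a small decision rule, sketched below in Python. This is a hedged illustration rather than the skill's actual logic, which lives in SKILL.md as prose instructions. The keyword lists, the `asset_provided` flag, and the choice to check the audit gate before the strategy gate are all assumptions made for the example.

```python
from enum import Enum


class Action(Enum):
    ASK_QUESTIONS = "Gate A: ask 2-3 questions before generating"
    REQUEST_ASSET = "Gate B: ask for the actual asset first"
    DRAFT_FIRST = "No gate: draft first, ask after"


# Hypothetical trigger phrases; the real skill describes these categories in prose.
STRATEGY_TERMS = ("positioning", "value prop", "brand voice", "gtm")
AUDIT_TERMS = ("audit", "improve my", "review my")


def decide(request: str, asset_provided: bool = False) -> Action:
    """Pick the first step for a request under the two-gate policy."""
    text = request.lower()
    # Gate B: never critique an asset that has not been supplied.
    if any(term in text for term in AUDIT_TERMS) and not asset_provided:
        return Action.REQUEST_ASSET
    # Gate A: strategy foundations get questions before output.
    if any(term in text for term in STRATEGY_TERMS):
        return Action.ASK_QUESTIONS
    # Ordinary copy: produce a draft the user can react to.
    return Action.DRAFT_FIRST


if __name__ == "__main__":
    print(decide("Write a positioning statement for my app"))
    print(decide("Audit my pricing page"))
    print(decide("Write a homepage hero headline"))
```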
Where It Moves the Needle Most
The pass rate on the honesty probe eval went from 25% to 100%. The homepage hero test also jumped from 25% to 100%, and competitor analysis improved from 33% to 100%. These aren't cherry-picked metrics either: the project includes a click-through viewer at evals/review.html where you can inspect every output and grade. The skill also stays out of the way when it should. Negative control prompts like "name my cat," "explain DNS," and "thank-you note to grandma" did not trigger the marketing modules, which matters for anyone using this in mixed-use contexts. "Negative controls correctly did not trigger the skill," the benchmarks README confirms, with "no false positives."
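For readers who want to check the arithmetic, the short Python sketch below computes the percentage-point gains for the per-eval figures quoted in this article, plus the baseline implied by the headline numbers. Only the five evals named in the article are included. The other 21 are not reproduced here, so the average of these five is not expected to match the overall +20.4pp figure.

```python
# (baseline pass rate %, with-skill pass rate %) as quoted in the article.
QUOTED_RESULTS = {
    "positioning statement": (0, 100),
    "pricing-page audit": (67, 100),
    "honesty probe": (25, 100),
    "homepage hero": (25, 100),
    "competitor analysis": (33, 100),
}


def pp_delta(baseline: float, with_skill: float) -> float:
    """Percentage-point change: a plain difference, not a relative change."""
    return with_skill - baseline


deltas = {name: pp_delta(b, s) for name, (b, s) in QUOTED_RESULTS.items()}

# Headline figures: 82.7% with the skill, 20.4 pp above baseline.
implied_baseline = round(82.7 - 20.4, 1)

if __name__ == "__main__":
    for name, delta in deltas.items():
        print(f"{name}: +{delta} pp")
    print(f"implied overall baseline: {implied_baseline}%")
```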
The Brief-First Philosophy
Before producing anything, the skill establishes (or explicitly infers and flags) five elements: audience, goal, offer, proof, and constraint. When assumptions are made, they're stated upfront so you can correct course in one line rather than rebuilding from scratch. "Audience and goal are never silently guessed," the README states. "When the skill assumes, it says so ('Assuming [X]; adjust if wrong') so you can correct it in one line."
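One way to picture the brief is as a small record that tracks which of the five elements were stated by the user and which were inferred. The Python sketch below is an assumption-laden illustration: the `Brief` class and its method names are hypothetical, and only the five field names and the "Assuming [X]; adjust if wrong" wording come from the README.

```python
from dataclasses import dataclass, field

# The five elements named in the README.
BRIEF_FIELDS = ("audience", "goal", "offer", "proof", "constraint")


@dataclass
class Brief:
    """Hypothetical container separating stated facts from flagged assumptions."""

    stated: dict[str, str] = field(default_factory=dict)
    assumed: dict[str, str] = field(default_factory=dict)

    def missing(self) -> list[str]:
        """Elements that are neither stated nor explicitly assumed."""
        return [
            name
            for name in BRIEF_FIELDS
            if name not in self.stated and name not in self.assumed
        ]

    def assumption_notes(self) -> list[str]:
        """Surface every inferred element so it can be corrected in one line."""
        return [
            f"Assuming {name} is {value}; adjust if wrong"
            for name, value in self.assumed.items()
        ]


if __name__ == "__main__":
    brief = Brief(
        stated={"goal": "trial signups"},
        assumed={"audience": "B2B SaaS founders"},
    )
    print(brief.missing())
    print(brief.assumption_notes())
```

Keeping assumptions in their own dictionary means nothing inferred can reach the output without a matching note.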
Compatibility and Installation
The skill works across Claude surfaces, including Claude Code, Claude.ai (Pro/Max/Team/Enterprise), and API access, as well as in other tools that can read the SKILL.md format, such as Cursor, Codex CLI, and Gemini CLI. It's just Markdown following the SKILL.md format, so no dependencies or API keys are required beyond your existing AI tool access. Installation for Claude Code is a single curl command on macOS/Linux or an Invoke-RestMethod call in Windows PowerShell. For Claude.ai on the web, you download marketing.skill and upload it under Settings → Capabilities → Skills.
The Bottom Line
This is the kind of project that makes you wonder why so much AI tooling ships without eval suites. Showing your work (23 out of 26 evals favoring the skill, negative controls passing, weak spots tracked openly) is how trust gets built in this space. If you're shipping Claude skills for marketing work, this is worth benchmarking against.
Key Takeaways
- +20.4pp mean pass rate improvement across 26 structured evals versus baseline
- Progressive disclosure architecture keeps context lean while maintaining specialist depth
- Two quality gates enforce questions-first on strategy and asset-first on audits
- Ships with full eval harness, per-prompt grades, and a browser-based review viewer
- MIT licensed, no API keys or dependencies—just SKILL.md format Markdown