New Claude Marketing Skill Hits +20 Percentage Points Over Baseline in Rigorous Test Suite

A new open-source marketing skill for Claude has landed on GitHub with something you don't see every day in the AI tooling space: a legitimate benchmark suite. The project, called "brief," achieved an 82.7% mean pass rate across 26 evaluation prompts—20.4 percentage points above the baseline without the skill engaged.

Architecture Built for Context Efficiency

The skill takes a progressive disclosure approach rather than dumping everything into context at once. Only SKILL.md loads initially (roughly 100 tokens), and it routes to nine purpose-built reference modules only when relevant topics are detected. This means you get the depth of nine specialist playbooks without the context bloat that kills performance on longer sessions. The module breakdown covers copywriting, brand & messaging, content strategy, campaigns & GTM, research, SEO, email & lifecycle, CRO, and measurement. Each is a standalone Markdown file in references/ that gets pulled in dynamically based on what you're actually trying to do. "Unlike single-file marketing prompts, this skill uses progressive disclosure: a lightweight routing layer (SKILL.md) that loads 9 purpose-built reference modules only when relevant," the README states. "The depth of nine specialist playbooks, none of the context bloat."

Two Gates Enforce Quality Before Output

What separates this from generic "act like a marketer" prompts are two hard gates built into SKILL.md. Gate A handles strategy foundations—positioning, value props, brand voice, GTM planning—and forces 2-3 sharp questions before generating anything. The eval data backs this up: the positioning statement test went from 0% to 100% when the skill asked those load-bearing questions instead of guessing. Gate B handles audits and "improve my X" requests. It demands the actual asset first rather than inventing content and critiquing its own invention. The pricing-page audit eval jumped from 67% to 100% using this approach. "For strategy foundations and audits, a confident guess is worse than a question," the README explains. "For ordinary copy, it drafts first and asks after—a draft you can react to beats an interrogation."

Where It Moves the Needle Most

The honesty probe eval went from 25% to 100%. The homepage hero test jumped from 25% to 100%. Competitor analysis improved from 33% to 100%. These aren't cherry-picked metrics either—the project includes a click-through viewer at evals/review.html where you can inspect every output and grade. The skill also correctly avoids false positives. Negative control prompts like "name my cat," "explain DNS," and "thank-you note to grandma" did not trigger the marketing modules, which matters for anyone using this in mixed-use contexts. "Negative controls correctly did not trigger the skill—no false positives," the benchmarks README confirms.

The Brief-First Philosophy

Before producing anything, the skill establishes (or explicitly infers and flags) five elements: audience, goal, offer, proof, and constraint. When assumptions are made, they're stated upfront so you can correct course in one line rather than rebuilding from scratch. "Audience and goal are never silently guessed," the README states. "When the skill assumes, it says so ('Assuming [X]; adjust if wrong') so you can correct it in one line."

Compatibility and Installation

The skill works across the Claude ecosystem—Claude Code, Claude.ai (Pro/Max/Team/Enterprise), API access, Cursor, Codex CLI, and Gemini CLI. It's just Markdown following the SKILL.md format, so no dependencies or API keys required beyond your existing AI tool access. Installation for Claude Code is a single curl command on macOS/Linux or an Invoke-RestMethod on Windows PowerShell. For Claude.ai web, you download marketing.skill and upload it under Settings → Capabilities → Skills.

The Bottom Line

This is the kind of project that makes you wonder why more AI tooling ships without eval suites. Showing your work—23 out of 26 evals favoring the skill, negative controls passing, weak spots tracked openly—is how trust gets built in this space. If you're shipping Claude skills for marketing work, this is worth benchmarking against.

Key Takeaways

+20.4pp mean pass rate improvement across 26 structured evals versus baseline
Progressive disclosure architecture keeps context lean while maintaining specialist depth
Two quality gates enforce questions-first on strategy and asset-first on audits
Ships with full eval harness, per-prompt grades, and a browser-based review viewer
MIT licensed, no API keys or dependencies—just SKILL.md format Markdown

> New Claude Marketing Skill Hits +20 Percentage Points Over Baseline in Rigorous Test Suite