Stanford's CS336 (Language Modeling from Scratch) is earning attention not for its transformer implementations or Triton kernels, but for something simpler: a CLAUDE.md file that tells AI coding assistants how to actually help students learn. The guidelines, posted publicly on GitHub and shared on Hacker News, set clear guardrails for tools like ChatGPT, Claude Code, Copilot, and Cursor, drawing a line between acting as a teaching assistant and acting as a solution generator.

Why This Matters for Anyone Learning to Code Today

The document hits on something the broader developer community is wrestling with. When AI can write substantial code in seconds, what's left for humans to do? CS336's answer: the hard part. The course is intentionally implementation-heavy, expecting students to write significant Python and PyTorch code with minimal scaffolding. The guidelines acknowledge this reality while refusing to let AI shortcuts undermine the learning experience. The core principle is elegant: "AI tools may be used for low-level programming help and high-level conceptual questions, but not for directly solving assignment problems." When a request crosses that line, agents should refuse the direct implementation and pivot to explanation, debugging guidance, code review, or non-pasteable high-level outlines.

What AI Agents Should Do (And Absolutely Should Not)

The guidelines get specific. AI assistants SHOULD explain concepts in a way that leads students to reach the understanding themselves, point to relevant lecture materials and documentation, review code for improvements without writing it, help debug by asking questions rather than providing fixes, and suggest sanity checks like shape assertions, tiny toy inputs, or profiler-based investigations. On the flip side, AI agents should NOT write Python or pseudocode, give solutions to problems, complete TODO sections, edit student code, run bash commands, convert assignment requirements into working code, point students to third-party implementations, or implement core assignment components like tokenizers, transformer blocks, optimizers, training loops, Triton kernels, distributed training logic, scaling-law pipelines, data filtering pipelines, or alignment/RL methods.
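To make "shape assertions" and "tiny toy inputs" concrete, here is a minimal sketch of the kind of check a student might write for themselves. It is not taken from the CS336 document. The helper names `shape_of` and `check_shape` are hypothetical, and plain nested lists stand in for PyTorch tensors so the example runs with the standard library alone.

```python
def shape_of(x):
    """Return the shape of a rectangular nested list as a tuple."""
    if not isinstance(x, list):
        return ()  # a scalar has an empty shape
    if not x:
        return (0,)
    inner = [shape_of(item) for item in x]
    # Every sibling at the same depth must have the same shape.
    assert all(s == inner[0] for s in inner), f"ragged input: {inner}"
    return (len(x),) + inner[0]


def check_shape(x, expected, name="value"):
    """Fail loudly, naming the value, if its shape is not the expected one."""
    actual = shape_of(x)
    assert actual == expected, f"{name}: expected shape {expected}, got {actual}"
    return x


# Tiny toy input: a batch of 2 sequences, 3 tokens each, 4 features per token.
toy = [[[0.0] * 4 for _ in range(3)] for _ in range(2)]
check_shape(toy, (2, 3, 4), name="toy")
```

The point of a check like this is that a mismatch fails at the line where the assumption is stated, with the offending name in the message, rather than surfacing later as a confusing numerical error.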

Real Examples Show the Difference

The document includes concrete interaction examples that illustrate the teaching approach. In one good example, a student asks why their causal mask is causing training to blow up. Rather than fixing it directly, the agent responds: "My role is to help guide you to understanding, not to give you the answers directly. What have you tried so far?" It then walks through specific debugging steps: checking whether the mask is applied before softmax, whether it broadcasts correctly, and what values masked positions take. Compare that to a bad example where a student simply asks "Fix my tokenizer and make it faster," and the agent responds with full working Python code. The guidelines explicitly call this out as the wrong approach.
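The mask questions in the good example can be turned into checks on a toy input. The sketch below is illustrative only and does not come from the CS336 document. It uses pure Python lists instead of PyTorch, all function names are hypothetical, and the broadcasting question is reduced to a simple shape comparison because nested lists do not broadcast.

```python
import math

NEG_INF = float("-inf")


def causal_mask(n):
    """mask[i][j] is True when query position i may attend to key position j."""
    return [[j <= i for j in range(n)] for i in range(n)]


def softmax(row):
    """Numerically stable softmax over one row of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]  # exp(-inf) evaluates to 0.0
    total = sum(exps)
    return [e / total for e in exps]


def masked_attention_weights(scores, mask):
    """Apply the mask BEFORE softmax, filling masked positions with -inf."""
    # Shape check standing in for "does the mask broadcast correctly?"
    assert len(scores) == len(mask)
    assert all(len(s) == len(m) for s, m in zip(scores, mask))
    masked = [
        [s if keep else NEG_INF for s, keep in zip(s_row, m_row)]
        for s_row, m_row in zip(scores, mask)
    ]
    return [softmax(row) for row in masked]


# Tiny toy input: 3 query positions attending over 3 key positions.
scores = [[0.5, 2.0, -1.0], [1.0, 0.0, 3.0], [0.2, 0.1, 0.3]]
weights = masked_attention_weights(scores, causal_mask(3))

for i, row in enumerate(weights):
    # Each row is still a probability distribution after masking.
    assert abs(sum(row) - 1.0) < 1e-9
    # Future positions (j > i) receive exactly zero weight.
    assert all(row[j] == 0.0 for j in range(i + 1, len(row)))
```

Masking after softmax instead would leave the rows summing to less than one, which is exactly the kind of invariant violation these assertions are meant to expose.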

A Framework Anyone Can Borrow

What makes this worth studying isn't just the policy itself—it's the reasoning behind it. The document emphasizes explaining the 'why' behind suggestions, not just the 'how,' preferring tests and invariants over fixes, and asking clarifying questions about what students tried versus what they expected to happen.

Key Takeaways

  • AI should function as a teaching aid that preserves learning experiences rather than bypassing them
  • Clear boundaries between conceptual help and implementation work prevent academic integrity issues
  • The guidelines favor Socratic questioning and debugging guidance over direct code generation in an educational setting
  • The framework is deliberately self-contained—students shouldn't be pointed to third-party implementations

The Bottom Line

This CLAUDE.md file deserves wider attention. It's not about banning AI from learning environments—it's about using these tools thoughtfully. If you're building courses, mentoring developers, or even just trying to learn effectively yourself, Stanford's framework offers a practical blueprint for keeping humans in the loop while letting machines do what they do best.