A new open-source project called Sub-Agent MCP landed on Hacker News this week, offering a production-ready way to implement hierarchical LLM delegation using the Model Context Protocol (MCP). The Python server acts as an intermediary between a "parent" LLM, such as Cursor's agent, and one or more specialized sub-agents, each defined in YAML with its own model, system prompt, and downstream MCP tool access.
How It Works
The architecture is elegant: at startup, Sub-Agent MCP reads agents from config/agents.yaml and registers each as an MCP tool named by its ID. When the parent LLM calls a tool (say, "researcher"), the server instantiates that sub-agent's LangChain runtime, connects to any configured downstream MCP servers, executes the reasoning loop, and returns the result as {"response": "..."}. The transport layer uses Streamable HTTP exclusively—no stdio or legacy SSE. Each agent can have a completely different LLM provider, model ID, temperature settings, and tool allowlist.
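The register-then-dispatch flow described above can be sketched in a few lines of plain Python. This is only an illustration of the pattern, not the project's code: `AgentSpec`, `make_tool`, `register_agents`, and `echo_runtime` are hypothetical names, and the stand-in runtime simply echoes its input where the real server would run a LangChain agent.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable


@dataclass
class AgentSpec:
    """Hypothetical in-memory form of one entry in config/agents.yaml."""
    id: str
    model_id: str
    system_prompt: str


# A runtime takes the agent spec and the parent's prompt and returns text.
Runtime = Callable[[AgentSpec, str], str]
Tool = Callable[[str], Dict[str, str]]


def make_tool(spec: AgentSpec, run_agent: Runtime) -> Tool:
    """Wrap one sub-agent as a callable that returns the {"response": ...} shape."""
    def tool(prompt: str) -> Dict[str, str]:
        return {"response": run_agent(spec, prompt)}
    return tool


def register_agents(specs: Iterable[AgentSpec], run_agent: Runtime) -> Dict[str, Tool]:
    """Build a registry with one tool per agent, keyed by the agent's ID."""
    return {spec.id: make_tool(spec, run_agent) for spec in specs}


def echo_runtime(spec: AgentSpec, prompt: str) -> str:
    """Stand-in for the real reasoning loop (an assumption for this sketch)."""
    return f"[{spec.id}/{spec.model_id}] {prompt}"


tools = register_agents(
    [
        AgentSpec("researcher", "model-a", "You research."),
        AgentSpec("writer", "model-b", "You write."),
    ],
    echo_runtime,
)
result = tools["researcher"]("find sources")
```

The parent only ever sees the tool names and the `{"response": ...}` payload; the model, prompt, and downstream tools stay behind the registry.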
Configuration Without Code
The YAML-based configuration is where this project shines for operators. Each agent gets its own base_uri, api_key (with ${ENV_VAR} substitution), model_id, system_prompt, mcp_servers list, and an optional tool_allowlist that restricts which downstream tools the sub-agent can call. A researcher agent might connect to filesystem and search MCP servers with a restricted allowlist, while a writer agent has no external MCP connections at all, so the same parent LLM sees two tools with very different capability surfaces. Pydantic validation enforces schema correctness on startup, so bad configs fail fast instead of surfacing later at runtime.
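To make the ${ENV_VAR} substitution and fail-fast validation concrete, here is a minimal sketch using only the standard library. It uses a dataclass as a stand-in for the project's Pydantic models, treats mcp_servers as a plain list of names, and assumes an unset variable should be an error; all three are assumptions, as are the names `substitute_env`, `AgentConfig`, and `load_agent`.

```python
import os
import re
from dataclasses import dataclass, field
from typing import List, Mapping, Optional

# Matches ${NAME} placeholders such as ${OPENAI_API_KEY}.
_ENV_PATTERN = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def substitute_env(value: str, env: Optional[Mapping[str, str]] = None) -> str:
    """Replace each ${NAME} with its value; raise if the variable is unset."""
    source = os.environ if env is None else env

    def _lookup(match: "re.Match[str]") -> str:
        name = match.group(1)
        if name not in source:
            raise ValueError(f"environment variable {name} is not set")
        return source[name]

    return _ENV_PATTERN.sub(_lookup, value)


@dataclass
class AgentConfig:
    """Field names follow the article; the types are simplified assumptions."""
    id: str
    base_uri: str
    api_key: str
    model_id: str
    system_prompt: str
    mcp_servers: List[str] = field(default_factory=list)
    tool_allowlist: Optional[List[str]] = None


def load_agent(raw: dict, env: Optional[Mapping[str, str]] = None) -> AgentConfig:
    """Validate one agent entry at load time so bad configs fail before serving."""
    required = ("id", "base_uri", "api_key", "model_id", "system_prompt")
    missing = [key for key in required if key not in raw]
    if missing:
        raise ValueError(f"agent config missing fields: {missing}")
    resolved = dict(raw, api_key=substitute_env(raw["api_key"], env))
    return AgentConfig(**resolved)  # unknown keys raise TypeError here


cfg = load_agent(
    {
        "id": "researcher",
        "base_uri": "https://llm.example/v1",  # placeholder URL
        "api_key": "${DEMO_KEY}",
        "model_id": "model-a",
        "system_prompt": "You research.",
    },
    env={"DEMO_KEY": "secret"},
)
```

Passing `env` explicitly keeps the example testable; in a real deployment the lookup would fall through to the process environment.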
Security Considerations
API keys never leak into tool descriptions exposed to the parent LLM—the llm.api_key field stays server-side only. Downstream MCP servers can optionally require bearer tokens via headers or dedicated bearer_token fields in the YAML, with environment variable substitution supported for credentials. The project also includes a Docker health check endpoint at /mcp, structured logging via structlog, and CI/CD through GitHub Actions that publishes versioned images to GHCR on git tags matching v0.x.y.
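One way to picture the secret-handling boundary is an explicit list of fields that may be shown to the parent, plus a helper that turns a bearer token into a standard Authorization header for downstream calls. This is a sketch under assumptions: the `description` field, the dictionary layout, and the helper names are hypothetical and not taken from the project.

```python
from typing import Dict

# Only these keys are ever copied into what the parent LLM sees (assumed list).
PUBLIC_FIELDS = ("id", "description")


def public_tool_metadata(agent: dict) -> dict:
    """Copy only explicitly public fields; nested llm.api_key is never included."""
    return {key: agent[key] for key in PUBLIC_FIELDS if key in agent}


def downstream_headers(server: dict) -> Dict[str, str]:
    """Merge configured headers with a Bearer header when a token is set."""
    headers = dict(server.get("headers", {}))
    token = server.get("bearer_token")
    if token:
        headers["Authorization"] = f"Bearer {token}"
    return headers


agent = {
    "id": "researcher",
    "description": "Finds and summarizes sources.",
    "llm": {"api_key": "secret-key", "model_id": "model-a"},
}
meta = public_tool_metadata(agent)
headers = downstream_headers({"bearer_token": "tok-123"})
```

Copying from an allowlist of public fields, instead of deleting known secret fields, means a newly added credential field stays private by default.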
Integration Options
Getting started takes minutes: Docker Compose brings up the Sub-Agent MCP server plus mock filesystem and search MCP servers for testing. Local Python installation works with uv sync --dev or pip install -e ".[dev]". Cursor integration requires adding the server URL to Cursor's MCP settings JSON config—after reloading, you get one tool per agent defined in agents.yaml. Any client supporting Streamable HTTP can connect; stdio transport is intentionally unsupported.
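For the Cursor step, the snippet below generates the kind of JSON entry Cursor's MCP settings expect for a URL-based server. The `mcpServers` key with a `url` field follows Cursor's documented format; the entry name `sub-agent-mcp`, the host, and the port are assumptions for illustration, so check the project's README for the actual address.

```python
import json


def cursor_mcp_config(name: str, url: str) -> str:
    """Render a Cursor MCP settings entry for a server reachable over HTTP."""
    return json.dumps({"mcpServers": {name: {"url": url}}}, indent=2)


# Host and port are placeholders; only the /mcp path comes from the article.
config_text = cursor_mcp_config("sub-agent-mcp", "http://localhost:8000/mcp")
print(config_text)
```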
Key Takeaways
- Parent LLMs stay lightweight by delegating specialized tasks to YAML-defined sub-agents instead of loading every downstream tool schema
- Each sub-agent runs as a LangChain 1.x agent with its own model, system prompt, and per-role MCP server connections
- Tool allowlists restrict which MCP tools each sub-agent can access—no need for one-size-fits-all permissions
- Docker images publish to GHCR on version tags (v0.1.2 format), not on every push
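The allowlist idea from the takeaways reduces to a small filter. The sketch below assumes that an omitted allowlist means no restriction and that an empty one blocks everything; the project may define these cases differently, and `filter_tools` is a hypothetical name.

```python
from typing import Iterable, List, Optional


def filter_tools(available: Iterable[str], allowlist: Optional[List[str]]) -> List[str]:
    """Keep only allowlisted tool names, preserving the server's original order."""
    if allowlist is None:
        # Assumption: no allowlist configured means every tool is exposed.
        return list(available)
    allowed = set(allowlist)
    return [name for name in available if name in allowed]


downstream = ["read_file", "write_file", "web_search"]
researcher_tools = filter_tools(downstream, ["read_file", "web_search"])
unrestricted_tools = filter_tools(downstream, None)
```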
The Bottom Line
Sub-Agent MCP fills a real gap in the MCP ecosystem: if you've been stuffing every tool into your agent's context and watching token counts explode, it lets you delegate narrowly without abandoning the protocol. The YAML-first config is operator-friendly, Pydantic validation catches mistakes early, and the explicit per-agent boundaries make multi-step workflows easier to debug. Worth adding to your MCP stack.