What if your coding assistant's best quality was what it wouldn't do? That's the question developer zserge explores with Socreates (yes, intentionally misspelled), a Socratic coding agent that will interrogate your code, challenge your assumptions, and roast your variable naming—but under no circumstances will it touch your keyboard. The project dropped May 25 on GitHub, and it's equal parts thought experiment and genuinely useful tool.
The Philosophy Behind the Madness
Traditional coding agents promise to make you a millionaire if you leave them running overnight with the right prompt. They also burn GPU cycles, inject unprompted security vulnerabilities, bloat your codebase, and leave emoji-laden comments that make you question your career choices. Socreates takes a different approach: it's a rubber duck with opinions. You write every line of code yourself; it just asks really annoying questions like 'What happens on line 247 if llm.Chat returns an error?' and 'Did you handle the error properly?' The agent acts as a brutally honest pair programmer who never lets you off easy.
Under the Hood: Four Tools, Zero Code Generation
Socreates runs on a surprisingly simple architecture. The agentic loop is essentially: user types something, LLM responds with text or tool calls, tools execute and feed results back, repeat until final answer. That's it—no sub-agents orchestrating parallel workflows, no elaborate dependency graphs. Just one loop, four tools, no dependencies. Those tools are list_files (filesystem traversal within the workspace), read_file (with line range support and continuation hints like '[150 more lines. Use start=501 to continue.]'), search (recursive regex matching without shell injection vulnerabilities), and run_command (the dangerous one—always asks for confirmation before executing go test or git diff). Path resolution prevents directory traversal attacks by resolving all paths relative to the workspace root.
Managing Context in a Token-Limited World
Token budgets keep zserge awake at night, and with good reason. Without compaction, long conversations exceed model context windows and start costing actual money. Socreates uses a two-pass approach: first, it trims historical tool outputs (keeping only the first 400 characters), then drops complete request/response turns from the head of the conversation history to stay under a ~16K token limit (~64KB of text). Interestingly, orphaned tool responses appear invalid for OpenAI and DeepSeek protocols—the model rejects references with missing IDs—so compaction becomes more aggressive than ideal. The session transcript is stored as JSONL in .socreates/session.json, allowing the agent to continue where it left off after restarts.
What Models Work Best?
Testing covered qwen, llama, gemma, and DeepSeek API with mixed results. Both Ollama (for local models) and OpenAI-compatible APIs support native tool calling—you send a tools array as JSON Schema and receive structured tool_calls in response. The system prompt explicitly forbids code snippets or pseudocode output: 'You NEVER write code. The developer types all code; you ask questions, spot issues, and verify correctness using tools.' Warnings about iteration limits help avoid scenarios where models keep thinking without ever giving a final answer. Tested with rlwrap for line editing support, the CLI stays deliberately minimal—no TUI, no blinking animations, just stdin/stdout.
Key Takeaways
- Socreates challenges decisions rather than writing code—think rubber duck debugging at scale
- Four tools handle file inspection and command execution; path traversal attacks are blocked
- Context compaction uses two-pass truncation to stay under ~16K token budgets
- Session state persists as JSONL in .socreates/session.json for continuity across restarts
The Bottom Line
Socreates is either the most honest coding assistant ever built or a elaborate prank on developers who miss typing code manually. Either way, it's a refreshing counterpoint to agents that promise to build your startup overnight while leaving you with unmaintainable garbage. Sometimes the best AI tool is one that shuts up and lets you do the work—except when you're about to make a stupid mistake.