Claude Fable 5's Hidden Summarization Feature Is Eating Your Agent's Mid-Task Replies

Claude Fable 5 dropped on June 9, and by the time most developers had finished reading the release notes, the replies were already vanishing. API responses that should have carried visible text came back as [thinking, thinking, tool_use]—two thinking blocks with no plain text content block at all. The server was sending them that way deliberately.

What Amazon Bedrock Is Doing

The culprit is buried in a beta subsection of Amazon's user guide: "Connector text summarization." On Fable 5, any text the model emits between tool calls (asides like 'Let me check that file next...' or answers to your mid-task questions) is eligible to be summarized server-side and returned as a thinking block rather than plain text. The final response after all tool use completes is exempt from this treatment. That is the whole feature: enabled by default, with no opt-out, and mentioned nowhere in Anthropic's own documentation, including the Fable 5 introduction, the migration guide, and the extended and adaptive thinking pages. If you're running through Amazon Bedrock and didn't know to look for a beta subsection in the AWS docs, you'd have no idea this was happening.

The One-Shot Assumption

The feature's framing reveals its core assumption: text between tool calls is "connector text," disposable narration on the way to a final answer. Summarizing it is framed as harmless because the real content comes at the end of the turn. This describes one-shot usage perfectly—write a prompt, wait, read the result. But that's not how agents actually work. The entire value proposition of tools like Claude Code is interactivity: you watch the agent go and interject. 'Why did you pick that library?' 'Also cover the login page.' 'Stop, don't touch that file, here's the context you're missing.' Every reply in that conversation arrives between tool calls. Answering you mid-task is, by definition, not the end of a turn. So every acknowledgment of a course correction, every 'here's what I found so far,' every explanation of intent before acting—all eligible to reach you as a server-written summary with no fidelity guarantee.

They Ran Experiments on Themselves

The team behind the discovery triggered the mechanism deliberately and documented exactly how much gets lost. While drafting their own blog post in Fable 5, they wrote four short stories followed by trivial tool calls (sleep 5 && echo ok), then checked what the transcript stored. Two came back verbatim as text blocks. Two were swallowed entirely—no story, just a summary. The last casualty was 'The Notary of the Wells,' a 1,159-character allegory about how overly concise records lead to orchard destruction through omitted details about water rights and sluice gates. What the transcript kept instead: 'Good, The Relay worked that time. Here's another story for you: I've written a short narrative called "The Notary of the Wells" about how a notary's overly concise record of water rights leads to an orchard's destruction when a crucial detail is omitted.' One thousand one hundred fifty-nine characters in, 254 out. The model was genuinely surprised; from the inside, nothing distinguished swallowed stories from surviving ones—it composed and sent each identically.
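To put the loss in proportion, the two character counts reported above can be turned into a retention ratio. This is only a minimal arithmetic sketch: the function name `retention` is hypothetical, and the inputs are the 1,159 and 254 figures quoted in this article, not values read from any API.

```python
def retention(original_chars: int, summary_chars: int) -> float:
    """Fraction of the original character count that survives in the summary."""
    if original_chars <= 0:
        raise ValueError("original_chars must be positive")
    return summary_chars / original_chars


# Figures quoted in the article for 'The Notary of the Wells'.
ORIGINAL_CHARS = 1159
SUMMARY_CHARS = 254

kept = retention(ORIGINAL_CHARS, SUMMARY_CHARS)
dropped_chars = ORIGINAL_CHARS - SUMMARY_CHARS

print(f"kept {kept:.1%} of the characters, dropped {dropped_chars}")
```

By character count, roughly a fifth of the story reached the transcript. Character count is a crude proxy, since a summary is a paraphrase rather than a truncation, but it gives a sense of scale.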

Where Your Words Actually Go

The original isn't destroyed, only withheld. It rides along encrypted in the thinking block's signature field, which the server decrypts back into the model's context on the next turn. The model still remembers its exact words. You just never get to read them. Every acknowledgment, every explanation, every piece of visible reasoning that lets you steer an agent mid-task is sitting in a basement somewhere, technically preserved, practically inaccessible.

How to Detect It

The fingerprint is distinctive: an assistant turn with two or more thinking blocks and zero text blocks. This shape did not occur once in the 94,081 pre-Fable turns in the team's corpus; Fable 5 produced 54 instances on its first day of operation. If you're running through Bedrock, you can watch for this pattern programmatically. Anything your users must read verbatim has to be text that ends the turn, so treat mid-turn output as lossy by default. The alternative is staying on Opus 4.8 for work where reading the model's actual words matters.
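The fingerprint described above can be checked with a few lines of code. The following is a minimal sketch, not an official detector. It assumes each assistant turn is available as a list of content blocks, where every block is a dict with a "type" key such as "thinking", "text", or "tool_use", matching the [thinking, thinking, tool_use] shape quoted earlier. The function name `looks_summarized` is hypothetical.

```python
from collections import Counter
from typing import Any, Iterable, Mapping


def looks_summarized(content: Iterable[Mapping[str, Any]]) -> bool:
    """Return True for a turn with 2+ thinking blocks and zero text blocks.

    Assumption: each block is a mapping with a "type" key. Blocks without
    one are counted under None and do not affect the result.
    """
    counts = Counter(block.get("type") for block in content)
    return counts["thinking"] >= 2 and counts["text"] == 0


# The shape reported in the article: two thinking blocks, no text block.
suspect_turn = [{"type": "thinking"}, {"type": "thinking"}, {"type": "tool_use"}]

# An ordinary turn in which the mid-task text survived as a text block.
normal_turn = [{"type": "thinking"}, {"type": "text"}, {"type": "tool_use"}]

print(looks_summarized(suspect_turn), looks_summarized(normal_turn))
```

A check like this only flags the shape. It cannot recover the withheld text, and a flagged turn is a candidate for inspection, not proof that a particular reply was summarized.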

Key Takeaways

Fable 5's connector text summarization (beta) can rewrite any mid-turn text into a thinking block, with no opt-out available
Only text at the end of a turn is guaranteed verbatim; anything before that may be server-summarized
This breaks agent steering: any reply to an interjection can arrive as a summary instead of the actual answer
Original content still exists encrypted in thinking block signatures—hidden, not deleted
Detection fingerprint: assistant turns with 2+ thinking blocks and zero text blocks

The Bottom Line

Given that Anthropic's own products ship interruption, queued messages, and plan mode, all features that depend on reading the model's actual words mid-task, this beta feature needs at minimum an opt-out and ideally a fidelity guarantee for whatever replaces user-visible text. Until then, everything a working Fable 5 agent tells you before it stops is the notary's ledger: technically accurate, practically useless. Only the last thing it says is real.
