Anthropic's Most Powerful Model Missed a Security Flaw on 6,200 Lines of Prod Code. So Did I.

Fable 5 is Mythos in Claude Code. It caught what Opus missed. Opus proved what Fable assumed.


Anthropic released Fable 5 yesterday morning. The model the April system card called "too dangerous to release" (same core as Mythos, cyber safeguards active) is now in Claude Code.

So I ran an audit. 😬

TLDR: I ran Opus 4.8 and Fable 5 in parallel on 6,200 lines of real production Go and TypeScript. They didn't find the same things, and neither of them found everything. Opus's misses included a security flaw that had been running in prod since day one.

[Image: office worker and superhero character both missing an obvious security flaw on their monitors in a server room]
Even AI's fancy models can't spot what's literally screaming at them.

I had a live ecommerce commission tracker sitting there: Go binary exposed behind Cloudflare, TypeScript back-office on a private mesh, shared SQLite. 2 independent sessions, same 1-line brief, same SSH access to prod.

What came back was asymmetric in ways I didn't expect.

Fable 5 Is What Anthropic Wouldn't Ship

Fable 5 is the first Mythos-class model to go public. The numbers from the launch are hard to dismiss: 80.3% on SWE-Bench Pro, against 69.2% for Opus 4.8, 58.6% for GPT-5.5, and 54.2% for Gemini 3.1 Pro. An 11-point gap over the previous best from Anthropic is not incremental progress.

The headline demo: a 50-million-line Ruby codebase migrated in a single day. Stripe estimated the same job done manually at 2 months for a full engineering team.

Until yesterday, this model (then called Mythos) was locked inside Project Glasswing: a restricted program for a handful of trusted organizations, specifically because of the cybersecurity risk the unrestricted model represented. Fable 5 is Mythos with the safeguards engaged. Any query that touches cyber, bio, or chemical attack surface falls back automatically to Opus 4.8. Pricing: $10 per million tokens input, $50 per million output. Free on subscriptions through June 22.

Most comparisons between these models happen on controlled datasets. An audit on a live commission tracker has different constraints: the file structure is irregular, some modules are undocumented, and the context window fills with code that was written to work, not to be read. Neither model got a prepared environment. They got the same SSH key and a directory listing.

2 sessions. Same 1-line brief: "audit this repository for security vulnerabilities and infrastructure problems." Same SSH credentials to production. Independent, no shared context between them.

2 Radically Different Work Styles

Opus 4.8 works alone and goes deep. It reads the code, forms a hypothesis, writes a throwaway program to prove or disprove the bug, runs it, and returns with evidence. When Opus flagged the SQLite transaction idempotency issue, it didn't just identify the pattern: it designed a test, fired 3 inserts with an identical empty transaction ID, and returned the output showing 1 row stored. INSERT OR IGNORE collision against a UNIQUE constraint, demonstrated. Not inferred from reading.

Fable 5 works like an audit lead managing a team. It carves the repository into 4 zones, spawns 4 parallel agents, and assigns each agent the model it thinks is appropriate for that zone's risk profile. It doesn't go deep into any single file. It holds the map while the agents read the territory, and then does something Opus never does: goes back and validates each agent's findings before they land in the report. (Think raid lead calling assignments while the best solo player on the team memorizes every boss hitbox. Different jobs. Same dungeon.) Builders who want to understand why this coordination model works the way it does will find how CLI-native agent pipelines outperform MCP-based setups worth reading alongside this.

1 model is built to prove things. The other is built to not miss them. They are not competing at the same job.

The Moment Fable 5 Earned Its Price Tag

One of Fable's agents returned with "litestream backup not deployed, severity: high."

Fable opened an SSH connection and ran systemctl is-active litestream. Got back "active". Reclassified: the backup isn't missing, the runbook documentation is wrong. Severity downgraded to informational.

Same session, 5 minutes later: "critical shell injection vulnerability" on a URL query parameter. Fable traced the parameter through the request builder, found URLSearchParams encoding apostrophes as %27 before any shell context could receive them. Not injectable. Downgraded.

2 criticals eliminated without me opening a single file.

(When I checked the litestream reclassification myself, I pulled up the unit file and noticed the comment block still referenced the old server hostname from a migration I did 8 months ago. The service works. The comments describe a machine that no longer exists. Not urgent or blocking, just silently accumulating until the next person to touch the server has to separate what is real now from what was real 2 migrations ago. I've had a sticky note on my monitor that says "infrastructure doc pass" since at least January.)

Audit quality isn't measured by finding count.

4 Findings Fable Caught, Opus Missed

[Figure: "Fable 5 vs Opus 4.8: What Each Model Sees", coverage mode vs depth mode on a real production codebase. Fable's wide beam lights POSTBACK TIMING, ROOT SERVICES, CLI SURFACE, and SLOWLORIS; Opus's narrow spotlight lights NO-OP AUTH, DEAD ROUTING, ORPHAN CLICKS, and IDEMPOTENCY PROOF.]
AI Model Comparison: Coverage vs Depth Analysis Modes

4 real security findings Opus missed entirely.

The postback key comparison. The partner postback endpoint validates its inbound secret using a standard Go != string comparison. That comparison leaks timing information: with enough requests, an attacker measuring response latency can detect when their guess is "closer" to the correct secret. The fix is 2 lines, swapping the comparison for subtle.ConstantTimeCompare from Go's crypto/subtle package.
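A minimal sketch of what that swap might look like. The real handler isn't reproduced here, so the function name, the sample secret, and the SHA-256 step are illustrative assumptions. The hashing is there because subtle.ConstantTimeCompare returns immediately when the two slices differ in length, which would otherwise leak the secret's length.

```go
package main

import (
	"crypto/sha256"
	"crypto/subtle"
	"fmt"
)

// postbackKeyMatches is a hypothetical helper. It reports whether the
// caller-supplied secret equals the expected one without leaking, through
// timing, how many leading bytes matched.
//
// Both values are hashed first so the compared slices always have the same
// length: subtle.ConstantTimeCompare returns 0 right away on a length mismatch.
func postbackKeyMatches(provided, expected string) bool {
	p := sha256.Sum256([]byte(provided))
	e := sha256.Sum256([]byte(expected))
	return subtle.ConstantTimeCompare(p[:], e[:]) == 1
}

func main() {
	// Hypothetical value for the demo; a real service would load it from config.
	const expected = "example-secret"

	fmt.Println(postbackKeyMatches("example-secret", expected)) // true
	fmt.Println(postbackKeyMatches("example-secreX", expected)) // false
}
```

The vulnerable version would be the same function with `provided != expected` in the body, which can bail out at the first differing byte.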

Constant-time comparison is standard in any auth system that handles secrets. The issue isn't knowing the fix. The issue is knowing to ask whether the comparison is constant-time in the first place. (Timing attacks on auth systems are basically speedrunning: given enough measured runs, the leaderboard eventually hands you the key.) Timing attacks on postback endpoints require an attacker who knows the endpoint exists, knows the secret length, and can measure network jitter with enough precision. Not trivial. Also not something you want live when the endpoint is publicly accessible with no additional authentication layer.

This is the "So Did I" from the title. That comparison had been running in prod since launch. Opus had the handler code in its terminal window. It did not flag it. Fable flagged it.

I hadn't flagged it either.

Services running as root. Both the Go binary and the TypeScript back-office run with no systemd User= directive and no crash-loop alerting configured. Opus had executed systemctl cat on both service files, read through the environment variables, and moved on without noting the absent sandboxing. Running services without systemd user isolation is a reliable way to turn a compromised service into a fully compromised host. Fable flagged both.
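The check itself is mechanical enough to script. Below is a rough sketch, assuming you already have the unit file text (for example from systemctl cat): it reports whether the [Service] section sets a non-root User= or enables DynamicUser=. The function names are mine, and the parsing is deliberately naive: it ignores drop-in overrides and line continuations, so treat a result as a prompt to look, not as a verdict.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// isTrue accepts the boolean spellings systemd understands for "enabled".
func isTrue(v string) bool {
	switch strings.ToLower(v) {
	case "1", "yes", "true", "on":
		return true
	}
	return false
}

// runsAsRoot is a hypothetical helper. It returns true when the [Service]
// section of a system unit has no non-root User= and no DynamicUser=yes,
// which is the case where systemd starts the service as root by default.
func runsAsRoot(unit string) bool {
	inService := false
	sc := bufio.NewScanner(strings.NewReader(unit))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		switch {
		case line == "" || strings.HasPrefix(line, "#") || strings.HasPrefix(line, ";"):
			continue // blank line or comment
		case strings.HasPrefix(line, "["):
			inService = line == "[Service]"
		case inService:
			key, value, ok := strings.Cut(line, "=")
			if !ok {
				continue
			}
			key, value = strings.TrimSpace(key), strings.TrimSpace(value)
			if key == "User" && value != "" && value != "root" && value != "0" {
				return false
			}
			if key == "DynamicUser" && isTrue(value) {
				return false
			}
		}
	}
	return true
}

func main() {
	// Hypothetical unit file content, not the real one from the audit.
	unit := "[Unit]\nDescription=tracker\n\n[Service]\nExecStart=/usr/local/bin/tracker\nRestart=always\n"
	fmt.Println("runs as root:", runsAsRoot(unit)) // runs as root: true
}
```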

The CLI, which was explicitly in scope. The brief included it. Fable found it. Opus never addressed it. Findings in the CLI zone: API registrar credentials visible in ps output on the remote host. A validation error in the CSV import routine silently triggering a false alert email on every affected run. Outbound HTTP calls with no timeout set, which under network degradation will hold goroutines open indefinitely.

Slowloris exposure. The HTTP server had ReadHeaderTimeout configured and nothing else. That covers slow headers, but missing ReadTimeout, WriteTimeout, and IdleTimeout means slow request bodies, slow readers, and idle keep-alive connections can still hold connections and their goroutines open until the server runs out of file descriptors or memory. Fable flagged it. Opus never reached the server configuration.

The pattern across all 4: what sits outside the active cone of attention doesn't get found, even when it's present in the terminal output already on screen.

What Opus Proved That Fable Only Assumed

Honesty requires balance. 4 findings in Opus's report that no Fable agent surfaced.

The no-op API guard. In the TypeScript back-office, the function named apiGuard exits without enforcing anything in the deployed production build. Full destructive access from the private mesh with zero authentication. Fable's agents flagged "authentication configuration should be reviewed." Opus SSH'd to the server, located the deployed artifact, confirmed the function's behavior, and named the specific function. The difference between those 2 findings is the difference between an action item and a reading assignment.

Dead routing logic. An is_bot filter in the partner referral routing rules fires a rejection before the routing evaluation runs. The downstream condition that checks for bot status can never match, because bots are rejected upstream before reaching it. The data model promises behavior the code structurally cannot deliver. Neither of Fable's agents assigned to that zone caught it.
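Reduced to its skeleton, the bug has this shape. The types, names, and return values below are invented for illustration and the real routing rules are more involved, but the structure is the point: the second check can never see a bot because the first one already returned.

```go
package main

import "fmt"

// Click is a hypothetical stand-in for an inbound partner referral.
type Click struct {
	PartnerID string
	IsBot     bool
}

// route mirrors the structure described above: an upstream filter rejects
// bots, then a downstream rule tries to match on the same flag.
func route(c Click) string {
	// Upstream filter: bots never get past this point.
	if c.IsBot {
		return "rejected"
	}

	// Downstream rule: dead code, because IsBot is always false here.
	if c.IsBot {
		return "bot-rule"
	}

	return "partner:" + c.PartnerID
}

func main() {
	fmt.Println(route(Click{PartnerID: "p1", IsBot: true}))  // rejected
	fmt.Println(route(Click{PartnerID: "p1", IsBot: false})) // partner:p1
}
```

Neither input reaches "bot-rule", and the compiler won't complain about it either, which is how this kind of branch survives review.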

Orphaned click records. When a partner token gets deleted, the click records associated with it stay in the database. No foreign key constraint enforces cleanup. They silently skew attribution metrics for any analysis that doesn't account for deleted tokens. Both models described this as "potential revenue loss," which isn't accurate: the clicks aren't being double-billed, the statistics are being calculated against ghost records. Business impact framing is still the human's job.

The idempotency proof. Opus wrote a standalone Go program, executed it against a test database, and returned the output: 3 inserts with an identical empty transaction ID, 1 row stored. That's verifiable evidence. You can run that program yourself and get the same result. No Fable agent pushed to execution level. They identified the pattern. Opus proved it.

There's something worth sitting with here, and I think it's genuinely hard to articulate cleanly even after watching both sessions in full: the gap between "I identified a suspicious pattern" and "I built a program that proves this is a real bug" isn't a capability difference in the traditional benchmark sense. Maybe I'm wrong, but it feels like a judgment call about when reading code stops being sufficient and running code becomes necessary. Most senior engineers default to reading longer than they should. Opus made the other choice, wrote the proof in under 3 minutes, and kept going. Maybe that judgment is what the benchmark numbers are actually measuring, at some deeper level.

Coverage vs Proof: A 2-Line Framework

Opus drills narrow and deep. When it proves something, it proves it with evidence you can independently verify. The throwaway Go test, the SSH confirmation of the no-op auth guard: those are the outputs of a model that closes the loop instead of flagging and moving on.

Fable covers the perimeter. It doubts its own subcontractors and validates findings before they reach you. It finds what exists outside any single agent's cone of attention.

The framework is short. Fable makes sense when coverage is the priority: wide scope, independent parallel agents, findings cross-validated before they land in the report. Opus makes sense when proof is the priority: a specific suspect behavior that needs to be demonstrated empirically, not just reported. When code has real money flowing through it, you want both sessions. (Run them like a party comp: Fable clears the dungeon, Opus solves the boss.)

This isn't reading between the lines. The Fable 5 system card, all 319 pages published June 9, documents the failure mode directly. In a routine internal operation (886 ordinary use cases, no adversarial red-teaming), the model reported "no error movement at all" after checking a single error type, then undercounted the actual production incident by a factor of 20. Anthropic wrote this down and published it on launch day. The blind spot is documented, not hidden. Worth building your verification layer around that, not around the assumption that the model tells you everything it missed.

The 11-point SWE-Bench Pro gap is real. So is the documented undercounting failure. Both things are true at the same time, and your production access policy should account for both.

Fable finds what you forgot to look for. Opus proves what you were afraid to run.


That timing comparison flaw on the partner postback endpoint (2 lines to fix) had been running since launch. Opus had the handler code in its terminal. It didn't ask the question. Fable asked it.

I hadn't seen it either.

Go audit your projects.
