Anthropic’s unreleased Claude Mythos Preview model has uncovered thousands of zero-day vulnerabilities across every major operating system and web browser, including a 27-year-old flaw in OpenBSD and a 16-year-old bug in FFmpeg, at a total compute cost the company pegged at under $20,000. The discovery, disclosed on April 7, 2026, has triggered an emergency meeting between US Treasury Secretary Scott Bessent, Federal Reserve Chair Jerome Powell and the chief executives of America’s largest banks, and it has reframed an industry argument that until now lived mostly at conference panels: defenders are not losing the AI race because they lack tools. They are losing because the economics of offense have collapsed.
That collapse is the story that matters. Not the model. The cost curve.
The 27-Year-Old Bug That Forced a Rethink
Mythos Preview did something no human research team had managed: it autonomously found and weaponized a remote code execution flaw in FreeBSD’s NFS server using a 20-gadget ROP chain split across multiple packets, according to Anthropic’s April 7 technical disclosure. The same model rebuilt source code from closed-source binaries to find vulnerabilities in unreleased software.
It also surfaced a memory corruption bug in a production memory-safe virtual machine monitor, the kind of finding that, in a normal year, would headline a Black Hat keynote. In April 2026 it was a footnote.
The model reproduced known vulnerabilities and wrote working exploits on the first attempt in 83% of cases. Expert human validators agreed with its severity ratings 89% of the time, and within one severity level 98% of the time.

Why Bessent and Powell Pulled the Bank CEOs Into a Room
On April 7, the same day Anthropic published its disclosure, Bessent and Powell convened the chief executives of Bank of America, Citigroup, Goldman Sachs, Morgan Stanley and Wells Fargo for a closed-door briefing on what Mythos meant for systemic risk in the financial sector, according to a Sullivan and Cromwell briefing memorandum filed April 23. JPMorgan’s Jamie Dimon was the only major banking CEO who could not attend.
The signal was unusual. Treasury and the Fed do not co-host cyber briefings unless they fear a contagion event.
Dimon used his annual letter to JPMorgan shareholders, released April 6, to flag the same concern in less coded language. AI, he wrote, will introduce “serious new risks” including deepfakes, misinformation and cybersecurity vulnerabilities, and the response will require “rigorous preparation in advance.”
The $20,000 Number Every CISO Should Stare At
Anthropic ran its zero-day sweep across roughly 1,000 open source repositories. Total compute spend: under $20,000. That is the structural break.
For the same money a mid-size enterprise spends on a single quarterly penetration test, a model can now scan a thousand codebases and produce working exploits. The asymmetry is no longer about talent scarcity. It is about cycle time and unit economics.
“Engineers with no formal security training have asked Mythos Preview to find remote code execution vulnerabilities overnight, and woken up the following morning to a complete, working exploit.”
That sentence, from Anthropic’s own writeup, is the line every chief information security officer reading this should print and tape to a monitor. It collapses the apprenticeship model that has structured offensive security since the 1990s.
The Patch Gap Nobody Wants to Name
Of the thousands of high and critical-severity vulnerabilities Mythos surfaced, fewer than 1% had been patched by the time Anthropic went public on April 7. That is not a maintainer failure. It is a throughput problem.
Patch pipelines were built for a world in which a single CVE took weeks of skilled human research to find. They were not built for a world in which one model can produce hundreds of confirmed bugs in a weekend.
The implication is uncomfortable. Even with disclosure handled responsibly, the window between discovery and remediation widens whenever the discovery rate outruns vendor response capacity. Anthropic acknowledges this directly, writing that “the transitional period may be tumultuous regardless” of careful release practices.
What “Project Glasswing” Is Actually Trying to Buy
To buy time, Anthropic restricted access through Project Glasswing, a defensive consortium that includes 12 founding members and roughly 40 additional organizations, per the company’s program page. Founding members include Amazon Web Services, Apple, Broadcom, Cisco, CrowdStrike, Google, JPMorgan Chase, the Linux Foundation, Microsoft, Nvidia and Palo Alto Networks.
Anthropic committed $100 million in Mythos usage credits to the consortium, plus $2.5 million to the Linux Foundation’s Alpha-Omega and OpenSSF projects and $1.5 million to the Apache Software Foundation. After the preview period, Mythos will list at $25 per million input tokens and $125 per million output tokens.
Inside “The Last Ones,” the 32-Step Test That Cracked
The UK AI Security Institute ran Mythos Preview through a battery of cyber range simulations, including a benchmark known as The Last Ones, a 32-step corporate network attack scenario estimated to take a human expert about 20 hours, according to the institute’s April evaluation report.
Mythos completed the full chain in 3 of 10 attempts. It averaged 22 of 32 steps. The next-best model, Claude Opus 4.6, averaged 16. On expert-level capture-the-flag tasks, Mythos cleared 73% of challenges, a benchmark no model had touched before April 2025.
- 3 of 10. Full 32-step network takeover completions by Mythos Preview.
- 22 of 32. Average steps Mythos reached before failure.
- 73%. Mythos success rate on expert capture-the-flag challenges.
- 16. Average steps reached by the next-best frontier model.
AISI cautioned that the ranges lack “active defenders and defensive tooling” and impose no penalty for triggering security alerts. In a real network, a noisy attacker draws fire. Mythos was tested in a world without return fire.
Where the Sovereign-Defense Pitch Comes In
That gap, between sterile range and live network, is the seam Israeli unicorn Dream Security is trying to widen. Co-founders Sebastian Kurz, the former Austrian chancellor, and Shalev Hulio, who previously co-founded NSO Group, used the Mythos disclosure to argue that bolt-on AI security is structurally inadequate. They want fully autonomous, on-premise defense systems with no third-party dependencies.
Dream raised a $100 million Series B in February 2025 led by Bain Capital Ventures, valuing the company at $1.1 billion in its funding announcement. The company reports more than $130 million in 2024 revenue, mostly from government and critical-infrastructure contracts.
Its core product is what Dream calls a Cyber Language Model, a family of models trained on logs, configurations, commands and alert text rather than general web data. The pitch is sovereignty: nothing leaves the customer premises, no model is shared, no inference touches a third-party cloud.
A Contradiction Sitting at the Heart of Glasswing
Project Glasswing’s roster reveals an awkward tension. JPMorgan Chase is a founding member, meaning the bank is leaning on Anthropic’s frontier model for defensive testing at the same moment Dimon’s annual letter argues for owning critical risk inside the firm.
The contradiction is not unique to JPMorgan. Every Glasswing partner that operates classified or regulated data faces the same question: does using a third-party frontier model for vulnerability discovery violate the sovereignty principles those firms apply elsewhere?
Mythos Preview itself was reportedly accessed by unauthorized users in mid-April after a third-party contractor’s credentials gave outsiders a path to the model’s location, per a UK government open letter to business leaders. The model designed to find vulnerabilities was, briefly, itself the vulnerability.
How the Numbers Stack Up
| Metric | Mythos Preview | Prior Frontier Models |
|---|---|---|
| Expert CTF success | 73% | Effectively 0% before April 2025 |
| 32-step network takeover | 3 of 10 full completions | None completed |
| Avg steps reached on TLO | 22 of 32 | 16 (Claude Opus 4.6) |
| First-attempt working exploit rate | 83% | Not previously measured at this scale |
| Severity rating agreement with humans | 89% exact, 98% within one level | Not previously measured at this scale |
The table understates one thing. AISI noted that Mythos’s progress on the 32-step test had not plateaued when evaluators cut it off at 100 million tokens of compute. Spend more, and the curve keeps climbing.
Frequently Asked Questions
What is Anthropic’s Claude Mythos Preview model?
Mythos Preview is an unreleased frontier model from Anthropic, disclosed on April 7, 2026, that demonstrated the ability to autonomously find and exploit zero-day software vulnerabilities at a level exceeding most human security researchers. Anthropic restricted access through a consortium called Project Glasswing rather than releasing it broadly.
How many vulnerabilities did Mythos find?
Anthropic reported that Mythos identified thousands of high and critical-severity zero-day vulnerabilities across every major operating system, every major web browser and other foundational software, scanning roughly 1,000 open source repositories for under $20,000 in compute cost. Fewer than 1% had been patched by the date of disclosure.
Why did the US Treasury convene bank CEOs over an AI model?
Treasury Secretary Scott Bessent and Federal Reserve Chair Jerome Powell briefed the chief executives of Bank of America, Citigroup, Goldman Sachs, Morgan Stanley and Wells Fargo on April 7, 2026, because Mythos-class capabilities raise systemic cyber risk for the financial system, not merely operational risk for individual firms.
Who can use Mythos Preview right now?
Access is limited to the 12 founding members of Project Glasswing, including Amazon Web Services, Apple, Broadcom, Cisco, CrowdStrike, Google, JPMorgan Chase, the Linux Foundation, Microsoft, Nvidia and Palo Alto Networks, plus roughly 40 additional critical-infrastructure organizations and open source maintainers.
What does Dream Security do differently?
Dream sells a sovereign, on-premise AI cybersecurity platform built around its Cyber Language Model, which is trained specifically on security data rather than general text. The company’s pitch is that no data, model or inference leaves the customer’s premises, an argument it positions as the alternative to relying on third-party frontier models.
The next several months will test whether the patch-and-disclose model holds. Vendors who absorbed thousands of newly surfaced bugs from Mythos in April now have to ship fixes faster than the next frontier model finds the next batch, and the next model is already in training. The cost curve has crossed; the response curve has not.




Leave a Comment