Anthropic's Mythos AI Model Raises Alarms Over Global Cybersecurity Vulnerabilities
TL;DR
Anthropic's Claude Mythos AI model has demonstrated unprecedented autonomous cyber capabilities, discovering thousands of zero-day vulnerabilities across every major operating system and web browser — over 99% of which remain unpatched. The model's restricted release through Project Glasswing, the White House standoff over government access, and the absence of any legal framework for governing offensive AI capabilities have exposed a structural gap in how democracies regulate frontier AI at the intersection of national security and private enterprise.
In early April 2026, Anthropic announced that its latest AI model, Claude Mythos, had autonomously discovered thousands of zero-day vulnerabilities — previously unknown software flaws — in every major operating system and every major web browser. The model wrote working exploits for 83.1% of those vulnerabilities on the first attempt. It chained four separate flaws into a single browser exploit that escaped both renderer and operating system sandboxes. It found a 17-year-old remote code execution bug in FreeBSD's NFS server that grants unauthenticated root access, and built a 20-gadget ROP chain to exploit it — a task Anthropic says would have taken expert penetration testers weeks.
Then the company refused to release the model publicly.
What followed has become the most consequential confrontation between a private AI company and the U.S. government to date: a Pentagon blacklisting, two federal court battles, and a meeting in the West Wing between Anthropic CEO Dario Amodei and White House Chief of Staff Susie Wiles. The dispute over Mythos has exposed the absence of any legal framework for governing AI systems with offensive military-grade cyber capabilities — and forced an improvised negotiation with global implications.
What Mythos Can Actually Do
The raw performance numbers are stark. On expert-level capture-the-flag (CTF) cybersecurity challenges — competitive exercises where participants find and exploit vulnerabilities — Mythos succeeded 73% of the time. Two years ago, frontier models could not complete beginner-level CTF tasks. Its predecessor, Claude Opus 4.6, managed a 31% success rate on the same expert challenges.
The UK's AI Safety Institute (AISI) independently verified these capabilities using a 32-step corporate network attack simulation called "The Last Ones" (TLO), designed to take a human security professional roughly 20 hours. Mythos completed the full attack chain in 3 of 10 attempts and averaged 22 of 32 steps. Claude Opus 4.6 averaged 16 steps. Mythos was the first model to complete the simulation at all.
Beyond structured benchmarks, Anthropic's internal red-team evaluation documented specific exploit capabilities that prior models could not approach. Mythos produced 181 working JavaScript shell exploits for Firefox vulnerabilities; Opus 4.6 managed two out of several hundred attempts. On Linux privilege escalation, Mythos autonomously exploited subtle race conditions and bypassed KASLR — a kernel-level protection designed to randomize memory addresses. Expert validators agreed with the model's own severity assessments in 89% of 198 manually reviewed cases, with 98% accurate within one severity level.
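The two validation figures describe different tolerances on the same ordinal scale: exact agreement, and agreement within one severity level. A minimal sketch of how such metrics are computed — the four-point scale and the sample labels below are illustrative, not Anthropic's data:

```python
# Compare model-assigned severity labels against expert validator labels.
# The scale and sample data are hypothetical; only the two metric
# definitions mirror the article (exact match, within one level).
SEVERITY = {"low": 0, "medium": 1, "high": 2, "critical": 3}

def agreement_rates(pairs):
    """pairs: list of (model_label, validator_label) tuples."""
    n = len(pairs)
    exact = sum(1 for m, v in pairs if m == v)
    within_one = sum(
        1 for m, v in pairs if abs(SEVERITY[m] - SEVERITY[v]) <= 1
    )
    return exact / n, within_one / n

reviews = [
    ("high", "high"),
    ("critical", "high"),   # off by one level: counts for within_one only
    ("medium", "medium"),
    ("low", "high"),        # off by two levels: counts for neither
]
exact, near = agreement_rates(reviews)
print(f"exact: {exact:.0%}, within one level: {near:.0%}")  # exact: 50%, within one level: 75%
```

The gap between the two numbers (89% vs. 98% in the reported review) is what you would expect when most disagreements are off by a single level rather than wildly wrong.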
The cost efficiency compounds the concern. Scanning OpenBSD for vulnerabilities cost under $20,000 in API credits and turned up several dozen findings. Individual complex exploits cost $1,000–$2,000 each.
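Those unit costs make campaign-level budgets easy to extrapolate. A back-of-the-envelope sketch using the article's prices — the workload (ten OS-scale scans, fifty complex exploits) is an assumption for illustration, not a reported figure:

```python
# Back-of-the-envelope cost model for AI-driven vulnerability discovery,
# using the per-scan upper bound and per-exploit range from the article.
# The workload numbers (10 codebases, 50 exploits) are illustrative.
SCAN_COST = 20_000              # upper bound for one OS-scale scan, USD
EXPLOIT_COST = (1_000, 2_000)   # reported range per complex exploit, USD

codebases = 10
exploits = 50

low = codebases * SCAN_COST + exploits * EXPLOIT_COST[0]
high = codebases * SCAN_COST + exploits * EXPLOIT_COST[1]
print(f"Estimated campaign cost: ${low:,} – ${high:,}")
# Estimated campaign cost: $250,000 – $300,000
```

At these prices, a broad multi-platform discovery campaign costs less than a single senior penetration tester's annual salary.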
The Patching Gap
The scale of Mythos's discovery capability has created what security professionals describe as an asymmetry between offense and defense that existing institutions cannot close quickly. At the time of Anthropic's announcement, over 99% of the vulnerabilities Mythos found had not been patched.
Shane Fry, CTO of RunSafe Security, put the problem plainly: "Vulnerability discovery is outpacing patching." Tal Kollender, founder of cybersecurity platform Remedio, was blunter: finding vulnerabilities faster than they can be fixed "does not make companies more secure." Patching remains manual — teams file tickets, coordinate across dependencies, test against regressions, and schedule maintenance windows. Kollender warned that defenders face "a race they're not yet equipped to win" for at least the next year.
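The asymmetry Fry and Kollender describe can be made concrete with a toy backlog model: whenever discoveries arrive faster than manual patching clears them, the unpatched count grows without bound. The weekly rates here are invented for illustration:

```python
# Toy model of the discovery-vs-patching asymmetry. The rates are
# hypothetical; the point is the direction of the backlog, not the
# specific numbers.
discovered_per_week = 500
patched_per_week = 120   # limited by manual triage, testing, windows

backlog = 0
for week in range(12):
    backlog += discovered_per_week - patched_per_week
print(f"Unpatched backlog after 12 weeks: {backlog}")
# Unpatched backlog after 12 weeks: 4560
```

The only levers that change the shape of this curve are raising the patch rate (automation) or throttling the discovery rate (access controls) — which is precisely the trade-off Project Glasswing attempts to manage.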
The Council on Foreign Relations reported that critical infrastructure worldwide — power grids, water treatment systems, hospital networks, financial clearing systems — faces "significantly delayed protection" because non-U.S. entities are not part of Anthropic's controlled-access program. AI scientist Dan Hendrycks observed that Mythos-class capabilities make it "much easier for non-state actors to take down critical infrastructure." Palo Alto Networks CEO Nikesh Arora warned of "a horde of [AI] agents methodically cataloguing every weakness" in infrastructure globally.
CISA and international partners have issued joint guidance urging critical infrastructure operators to strengthen AI integration in operational technology environments, but the guidance focuses on secure deployment of AI rather than defending against AI-powered attacks. No government agency has published an estimated remediation cost for the class of vulnerabilities Mythos can find.
The Responsible Scaling Policy and the Release Decision
Anthropic's own safety framework, the Responsible Scaling Policy (RSP), is supposed to govern decisions like whether to release a model with offensive cyber capabilities. The RSP classifies models into AI Safety Levels (ASLs), with higher levels triggering more stringent safeguards.
But the RSP's cyber thresholds have always been imprecise. In its February 2026 v3.0 update, Anthropic acknowledged running into a "zone of ambiguity" where models approach danger thresholds without clearly crossing them. The company also restructured some commitments — such as RAND Security Level 4 for model weight security — as "industry-wide recommendations" rather than unilateral obligations, arguing they only make sense if competitors adopt them too. The RAND report itself found that defending model weights against top-tier cyber actors is "currently not possible" and "will likely require assistance from the national security community."
Anthropic chose not to release Mythos publicly. Instead, it created Project Glasswing, a controlled-access program that extends Mythos to roughly 40 vetted organizations — including Amazon Web Services, Apple, Cisco, CrowdStrike, Google, JPMorgan Chase, the Linux Foundation, Microsoft, and NVIDIA. Anthropic committed up to $100 million in usage credits and $4 million in direct donations to open-source security organizations. The stated purpose is defensive: giving critical infrastructure operators a head start to scan and patch their systems before comparable capabilities become widely available.
Who made this decision? Anthropic has not published a detailed account of the internal governance process. CEO Dario Amodei has been the public voice of the decision. Anthropic co-founder Jack Clark confirmed at the Semafor World Economy summit that the company briefed senior government officials on Mythos capabilities even while simultaneously suing them, saying "the government has to know about this stuff." Anthropic briefed CISA and the Center for AI Standards and Innovation (CAISI) at NIST before the limited release. No independent external auditor has publicly confirmed reviewing Mythos's offensive cyber capabilities before deployment to Glasswing partners.
The White House Confrontation
The Mythos standoff did not begin with the model itself. In January 2026, Defense Secretary Pete Hegseth issued a memorandum directing all Department of Defense AI contracts to incorporate "any lawful use" language — which would have required Anthropic to remove its prohibition on using Claude for mass domestic surveillance and fully autonomous lethal decision-making. Amodei refused.
When Anthropic's deadline to comply passed on February 27, President Trump directed all federal agencies to cease using Anthropic technology. On March 5, the Pentagon designated Anthropic a Supply-Chain Risk to National Security — the first time this authority, designed for foreign adversaries, had been applied to a U.S. company.
Anthropic challenged the designation in court. U.S. District Judge Rita Lin blocked enforcement, finding it constituted "classic illegal First Amendment retaliation" and calling the Pentagon's rationale "Orwellian." A D.C. federal appeals court, however, left the blacklisting in place pending expedited review.
On April 17, both sides agreed to meet. Amodei sat down with Wiles and Treasury Secretary Scott Bessent in the West Wing. The administration reportedly sought access to Mythos for Treasury and other civilian agencies through Project Glasswing. The terms — whether the government wants model weights, API access, or something else — have not been disclosed.
Who Else Has These Capabilities?
A central question in evaluating the risk of Mythos is whether restricting it actually changes the threat landscape. The answer depends on how far behind other actors are.
The Council on Foreign Relations reported that OpenAI is "about six months behind Anthropic in building its own advanced AI model with comparable power." OpenAI itself announced that its own unreleased model posed similar risks and would also not be publicly released. On general reasoning and coding benchmarks, the gap between Mythos and competitors like Google's Gemini 3.1 Pro, OpenAI's GPT-5.4, and Meta's Llama 4 is narrow.
Open-weight models present a separate proliferation vector. A May 2025 paper analyzing open-weight AI cyber risks found that DeepSeek-R1 "achieved over 90% accuracy on the TACTL-183 benchmark, covering a wide range of cyber knowledge areas relevant to offense." The authors concluded that once model weights are public, "rate-limiting APIs, hardware enclaves...and monitoring of downstream use are bypassed." Safety alignments can be "trivially sidestepped" through accessible fine-tuning techniques.
Security researcher Bruce Schneier offered a skeptical framing. He acknowledged that Mythos demonstrates "increased sophistication in their cyberattack capabilities" and writes "effective exploits...without human involvement." But he cautioned that "finding for the purposes of fixing is easier for an AI than finding plus exploiting" and that the defensive advantage "is likely to shrink, as ever more powerful models become available to the general public." He was also critical of Anthropic's public messaging, calling it "very much a PR play" and noting that "lots of reporters are breathlessly repeating Anthropic's talking points, without engaging with them critically."
If comparable capabilities are six months away from multiple developers — and already partially present in open-weight models — then restricting Mythos specifically addresses a narrow and temporary window. Whether that window matters depends on how many vulnerabilities defenders can patch during it.
Does AI Actually Help Low-Skill Attackers?
The claim that AI provides meaningful "uplift" to less-skilled threat actors has become a standard assertion in policy discussions. The evidence is mixed.
Industry reports from Microsoft, Trend Micro, and Google's Threat Intelligence Group document concrete cases. Microsoft reported in April 2026 that threat actors are "embedding AI into how they plan, refine, and sustain cyberattacks." Google's Threat Intelligence Group documented what it described as "the first large-scale cyberattack executed with minimal human oversight" in September 2025. AI-generated phishing content and deepfake-assisted social engineering are now common enough that multiple security firms treat them as standard threat categories.
However, the AISI evaluation included an important caveat: Mythos's demonstrated successes came against systems with "weak security posture," and the test environments lacked "active defenders and defensive tooling." The institute could not assess performance against well-defended production environments. This distinction matters. An AI that can compromise a poorly configured network may still fail against an organization with competent security operations, intrusion detection systems, and incident response teams.
Academic research on this question remains thin. A systematic review of the literature finds over 53,000 published papers on AI and cybersecurity vulnerability since 2011, with output peaking at over 21,000 papers in 2025. But most are technical capability demonstrations or threat analyses, not controlled experiments measuring the marginal uplift AI provides to attackers of different skill levels.
The most honest assessment may be that AI compresses the skill and time requirements for certain attack stages — reconnaissance, phishing, initial exploit generation — while leaving the hardest parts of sophisticated attacks (maintaining persistence in well-defended environments, lateral movement under active monitoring, exfiltrating data without detection) still dependent on human expertise. The threat actor profile that benefits most may not be the script kiddie or the nation-state, but the organized criminal group that already has operational infrastructure and needs faster tooling.
The Legal Vacuum
No statute in the United States, European Union, or United Kingdom specifically assigns liability to an AI developer if a released model is used in a cyberattack.
The EU AI Act, fully applicable from August 2026, classifies AI systems by risk level and imposes obligations on providers of "high-risk" systems, including risk management, quality controls, and human oversight. Violations carry fines of up to €35 million or 7% of global annual turnover, whichever is higher. But the Act focuses on deployment contexts — employment screening, credit scoring, law enforcement — rather than on the offensive capabilities of general-purpose foundation models. The EU's AI Liability Directive, still in development, would shift the burden of proof in negligence claims involving AI but does not create strict liability for developers whose models are misused.
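The EU penalty structure is a maximum of two quantities, which matters for large providers: the fixed €35 million figure only binds when 7% of turnover is smaller. A minimal sketch — the turnover figures are hypothetical:

```python
def eu_ai_act_top_fine(global_annual_turnover_eur: float) -> float:
    # Top penalty tier of the EU AI Act: up to EUR 35M or 7% of
    # worldwide annual turnover, whichever is higher.
    return max(35_000_000, 0.07 * global_annual_turnover_eur)

# For a firm with EUR 2B turnover, the turnover-based figure dominates.
print(f"{eu_ai_act_top_fine(2_000_000_000):,.0f}")  # 140,000,000
# Below EUR 500M turnover, the fixed EUR 35M cap is the binding term.
print(f"{eu_ai_act_top_fine(100_000_000):,.0f}")    # 35,000,000
```

For a frontier-lab-scale company, in other words, the turnover percentage, not the headline €35 million, sets the practical exposure.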
In the United States, no comprehensive federal AI liability framework exists. President Trump's December 2025 executive order signaled intent to establish a "minimally burdensome national policy framework" but created no liability standards. State-level legislation in Colorado and California imposes developer duties around bias audits and safety protocols, but neither addresses cyberattack liability. The Congressional Research Service flagged the Anthropic dispute as raising "potential issues for Congress" but did not propose specific legislative remedies.
The UK has no AI-specific liability statute. General negligence and product liability frameworks theoretically apply, but no court has tested whether releasing an AI model that a third party uses in a cyberattack constitutes negligence by the developer.
This gap means that Anthropic's decision to restrict Mythos is, at present, entirely voluntary. No law required it. No regulator ordered it. And no enforcement mechanism exists to compel similar restraint from competitors or from open-weight model providers who release comparable capabilities without restrictions.
The Counterfactual
If Anthropic had declined to develop Mythos at all — or developed it and permanently suppressed it — would the world be safer?
The evidence suggests not for long. OpenAI is roughly six months behind. Open-weight models already demonstrate significant offensive capabilities. Google, Meta, and xAI are all developing frontier models with comparable general reasoning abilities. The trajectory of AI cybersecurity research, with over 53,000 papers published since 2011 and output growing exponentially through 2025, indicates a field where capability breakthroughs are distributed across many institutions.
Anthropic's argument for Project Glasswing rests on this timeline: if comparable capabilities will exist within months regardless, the highest-value intervention is to give defenders early access rather than suppress the capability entirely. As Anthropic framed it, "most or all of the world's critical software will need to be patched or rewritten" — and the question is whether defenders get a head start.
The counterargument is that each month of delay matters. If Mythos can find vulnerabilities faster than organizations can patch them — and the 99% unpatched figure suggests it can — then even a six-month head start for defenders may not close the gap before comparable offensive tools become available to adversaries.
What Comes Next
The Mythos episode has made visible a set of problems that existed before the model and will persist after it. There is no regulatory body with jurisdiction over frontier AI offensive capabilities. There is no statute assigning liability for AI-enabled cyberattacks. There is no international coordination mechanism for managing the proliferation of AI cyber tools. And the U.S. government's first improvised attempt at asserting control — designating a domestic company as a national security threat for refusing to remove safety guardrails — was rejected by a federal judge as unconstitutional.
The White House talks with Anthropic remain ongoing. The expedited D.C. appeals court case will produce a ruling within weeks. OpenAI's own restricted model will likely force a parallel set of decisions. And open-weight models with meaningful offensive capabilities are already available to anyone with a GPU.
The question is no longer whether AI can find and exploit software vulnerabilities faster than humans. Mythos has answered that. The question is whether any institution — government, corporate, or international — can build governance structures fast enough to manage the consequences.
Related Stories
White House and Anthropic Hold Talks on Mythos AI Model as Legal Dispute Is Paused
White House Plans to Deploy Anthropic Mythos AI Across Federal Agencies as Finance Ministers Raise Concerns
Anthropic Releases New AI Model 'Mythos,' Raising Safety Questions
Anthropic and OpenAI Move to Restrict Access to Their Latest AI Models
Anthropic Launches Project Glasswing to Counter AI-Enabled Cyberattacks
Sources (18)
- [1] Claude Mythos Preview — Anthropic Red Team Evaluation (red.anthropic.com)
Anthropic's internal red team assessment documenting Mythos's autonomous vulnerability discovery and exploit development capabilities across major operating systems and browsers.
- [2] White House chief of staff to meet with Anthropic CEO over its new AI technology (washingtontimes.com)
Dario Amodei met with Susie Wiles and Scott Bessent at the White House on April 17 in "productive and constructive" discussions about Mythos access.
- [3] Our evaluation of Claude Mythos Preview's cyber capabilities (aisi.gov.uk)
UK AI Safety Institute found Mythos succeeds on expert-level CTF challenges 73% of the time and completed a 32-step attack simulation 3 out of 10 times — the first model to do so.
- [4] Six Reasons Claude Mythos Is an Inflection Point for AI—and Global Security (cfr.org)
CFR analysis reporting 99% of Mythos-discovered vulnerabilities remain unpatched and that OpenAI is approximately six months behind in comparable capabilities.
- [5] Anthropic's Mythos finds software flaws faster than companies can fix them (fortune.com)
Security experts warn that AI-driven vulnerability discovery vastly outpaces patching, with defenders facing "a race they're not yet equipped to win."
- [6] Principles for the Secure Integration of Artificial Intelligence in Operational Technology (cisa.gov)
CISA and international partners issued joint guidance on securing AI in operational technology for critical infrastructure operators.
- [7] Responsible Scaling Policy Version 3.0 (anthropic.com)
Anthropic's February 2026 RSP update acknowledges a "zone of ambiguity" in capability thresholds and restructures some safety commitments as industry-wide recommendations.
- [8] Project Glasswing: Securing critical software for the AI era (anthropic.com)
Anthropic's controlled-access program providing Mythos to ~40 vetted organizations with up to $100M in usage credits for defensive vulnerability scanning.
- [9] Anthropic co-founder confirms the company briefed the Trump administration on Mythos (techcrunch.com)
Jack Clark confirmed Anthropic briefed senior government officials on Mythos, stating "the government has to know about this stuff."
- [10] Deadline looms as Anthropic rejects Pentagon demands it remove AI safeguards (npr.org)
Defense Secretary Pete Hegseth demanded Anthropic grant "any lawful use" access including autonomous weapons and mass surveillance; Amodei refused.
- [11] Anthropic loses appeals court bid to temporarily block Pentagon blacklisting (cnbc.com)
D.C. appeals court left Pentagon's supply chain risk designation in place pending expedited review, while San Francisco judge blocked enforcement.
- [12] On Anthropic's Mythos Preview and Project Glasswing (schneier.com)
Bruce Schneier acknowledged Mythos's capabilities but called Anthropic's messaging "very much a PR play" and cautioned the defensive advantage will shrink as models proliferate.
- [13] Mitigating Cyber Risk in the Age of Open-Weight LLMs: Policy Gaps and Technical Realities (arxiv.org)
Analysis finding that open-weight models like DeepSeek-R1 achieved over 90% accuracy on offensive cyber benchmarks, and safety alignments can be "trivially sidestepped."
- [14] Threat actor abuse of AI accelerates from tool to cyberattack surface (microsoft.com)
Microsoft reports threat actors are embedding AI into cyberattack planning, refinement, and execution at increasing scale and tempo.
- [15] The AI-fication of Cyberthreats: Trend Micro Security Predictions for 2026 (trendmicro.com)
Trend Micro reports that barriers to sophisticated attacks have collapsed, with AI enabling motivated individuals to launch attacks previously requiring nation-state resources.
- [16] OpenAlex: AI Cybersecurity Vulnerability Publication Trend (openalex.org)
Over 53,000 academic papers on AI cybersecurity vulnerability published since 2011, peaking at 21,046 in 2025.
- [17] 2026 AI Laws Update: Key Regulations and Practical Guidance (gunder.com)
No comprehensive US federal AI liability framework exists; state-level laws address bias and safety protocols but not cyberattack liability.
- [18] EU AI Act 2026 Updates: Compliance Requirements and Business Risks (legalnodes.com)
EU AI Act fully applicable from August 2026, with fines up to €35M or 7% of global turnover, but focused on deployment contexts rather than offensive model capabilities.