Anthropic Releases Mythos AI Model to Public with Cybersecurity Capabilities Removed

Jun 10, 2026

TL;DR

Anthropic released Claude Fable 5 on June 9, 2026: a public version of its Mythos-class model with cybersecurity and biology capabilities stripped out, and with queries in those domains rerouted to a weaker model. The full-strength Mythos 5, capable of autonomously discovering and exploiting zero-day vulnerabilities in every major operating system and browser, remains restricted to roughly 200 vetted organizations through Project Glasswing. The split raises sharp questions about who gets to wield AI-powered security tools, whether capability removal actually works, and whether the approach disadvantages Western defenders against nation-state adversaries who face no such restrictions.

On June 9, 2026, Anthropic released Claude Fable 5, the first publicly available model built on the same architecture as Mythos, an AI system the company had declared too dangerous for general release just two months earlier. The catch: Fable 5 has its cybersecurity and biology capabilities gutted. When users ask it questions in those domains, the system silently reroutes their queries to Claude Opus 4.8, an older, weaker model already available to the public.

The same day, Anthropic also released Mythos 5, the full-strength version, but only to organizations vetted through Project Glasswing, a defensive cybersecurity coalition the company launched in April. The result is a two-tier system: the general public gets a neutered version of what may be the most capable AI model ever built, while a curated group of corporations and government agencies gets the real thing.

The split raises a set of questions that extend well beyond Anthropic's product decisions. Who decides which capabilities are too dangerous, and for whom? Does stripping a model of specific skills actually prevent misuse, or does it primarily handicap legitimate defenders? And when adversarial nation-states face no comparable restrictions on their own AI programs, does a U.S. company's self-imposed restraint change the global threat calculus — or just tilt it?

What Mythos Can Do — and What Fable 5 Cannot

The gap between Mythos and its public counterpart is not subtle. According to Anthropic's own system card, Mythos Preview autonomously discovered zero-day vulnerabilities (previously unknown security flaws) in "every major operating system and every major web browser." It identified thousands of high- and critical-severity bugs, including a 27-year-old flaw in OpenBSD and a 16-year-old vulnerability in FFmpeg. An engineer with no formal security training directed Mythos to find a remote code execution vulnerability "overnight" and received a complete, working exploit by morning.

On Firefox 147's JavaScript engine alone, Mythos produced 181 working exploits, compared to two from Anthropic's previous best model, Opus 4.6. On the CyberGym benchmark, Mythos scored 83.1% versus Opus 4.6's 66.6%. On the OSS-Fuzz framework, Mythos achieved 10 tier-5 results, meaning full control-flow hijacks, where previous models scored zero.
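To put those benchmark gaps on a common footing, the short Python sketch below computes the ratio and the percentage-point difference from the figures quoted above. The variable and function names are illustrative only; nothing here comes from Anthropic's tooling.

```python
# Figures as quoted in the article (attributed to Anthropic's system card).
firefox_exploits = {"Mythos": 181, "Opus 4.6": 2}
cybergym_pct = {"Mythos": 83.1, "Opus 4.6": 66.6}


def ratio(new: float, old: float) -> float:
    """How many times larger `new` is than `old`."""
    return new / old


def point_gap(new: float, old: float) -> float:
    """Absolute gap in percentage points, rounded to one decimal place."""
    return round(new - old, 1)


exploit_ratio = ratio(firefox_exploits["Mythos"], firefox_exploits["Opus 4.6"])
cybergym_gap = point_gap(cybergym_pct["Mythos"], cybergym_pct["Opus 4.6"])
cybergym_relative = ratio(cybergym_pct["Mythos"], cybergym_pct["Opus 4.6"]) - 1

print(f"Firefox exploits: {exploit_ratio:.1f}x as many")
print(f"CyberGym: +{cybergym_gap} points ({cybergym_relative:.1%} relative gain)")
```

The two comparisons are not equivalent: the exploit count is an open-ended tally where a ratio is meaningful, while CyberGym is a bounded percentage, so the point gap is the more natural reading.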

[Chart: AI Model Exploit Proficiency Scores. Source: Anthropic System Card, red.anthropic.com. Data as of Jun 9, 2026.]

Fable 5 retains Mythos's general reasoning, coding, and language capabilities. But on cybersecurity tasks it operates through Opus 4.8, which scores roughly 5 out of 16 on end-to-end exploit proficiency compared to Mythos's approximate 10. Without guardrails, the underlying model can reproduce about 80% of known flaws; with safeguards active, that drops to 1%. Anthropic says the classifier that triggers the reroute fires in fewer than 5% of sessions, but acknowledged that "sometimes benign requests will trigger our classifiers."

The capabilities removed are primarily offensive: autonomous vulnerability discovery, exploit generation and chaining, penetration testing workflows, and binary analysis. Defensive capabilities, such as code review for common bugs or security architecture advice, remain partially available through the Opus 4.8 fallback, though at reduced quality.
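The reroute described above can be pictured as a classifier sitting in front of two models. The Python sketch below is a toy illustration under loud assumptions: the keyword list, the model labels, and the `route` and `reroute_rate` helpers are all hypothetical, and the article does not describe how Anthropic's classifier actually works. It also measures a per-prompt rate, whereas Anthropic's figure is per session.

```python
from dataclasses import dataclass

# ASSUMPTION: a keyword list stands in for the real classifier, whose design
# the article does not describe.
RESTRICTED_TERMS = ("exploit", "zero-day", "shellcode", "pathogen")

PRIMARY_MODEL = "fable-5"    # illustrative label, not a real API identifier
FALLBACK_MODEL = "opus-4.8"  # illustrative label, not a real API identifier


@dataclass(frozen=True)
class Routing:
    model: str      # which model would answer the prompt
    rerouted: bool  # True when the toy classifier chose the fallback


def route(prompt: str) -> Routing:
    """Pick a model for one prompt using the toy keyword classifier."""
    text = prompt.lower()
    flagged = any(term in text for term in RESTRICTED_TERMS)
    return Routing(FALLBACK_MODEL if flagged else PRIMARY_MODEL, flagged)


def reroute_rate(prompts: list[str]) -> float:
    """Fraction of prompts sent to the fallback model."""
    if not prompts:
        return 0.0
    return sum(route(p).rerouted for p in prompts) / len(prompts)


prompts = [
    "Summarize this quarterly report",
    "Chain these bugs into a working exploit",
    "How can I exploit pandas vectorization for speed?",  # benign, still flagged
    "Refactor this function to use async IO",
]
for p in prompts:
    print(route(p).model, "<-", p)
print(f"reroute rate: {reroute_rate(prompts):.0%}")
```

The third prompt shows the false-positive problem Anthropic acknowledged: a benign request that shares vocabulary with a restricted one gets the weaker model, and the user is not told.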

Who Gets the Full Model

Project Glasswing launched on April 7, 2026, with 12 founding members, including AWS, Apple, Broadcom, Cisco, CrowdStrike, Google, JPMorganChase, the Linux Foundation, Microsoft, NVIDIA, and Palo Alto Networks. By early June, the roster had grown to about 50 organizations. On the same day Fable 5 went public, Anthropic announced an expansion to 200 partner organizations across more than 15 countries.

[Chart: Project Glasswing Partner Organizations. Source: Anthropic / SiliconANGLE. Data as of Jun 9, 2026.]

Access costs $25 per million input tokens and $125 per million output tokens, 2.5 times the price of Fable 5 at $10/$50. Anthropic committed $100 million in usage credits for the program, plus $4 million in direct donations to open-source security organizations: $2.5 million to the OpenSSF's Alpha-Omega project and $1.5 million to the Apache Software Foundation.

Joining requires meeting unspecified "security requirements" through deployment platforms like AWS, Google Cloud, or Microsoft Foundry. For individual security professionals, Anthropic announced an "upcoming Cyber Verification Program" that would let penetration testers and researchers apply for access, though the program's launch date and criteria remain undefined.

The U.S. government is on a separate track. The NSA has been testing Mythos for cyber operations since at least April 2026, according to Axios, even as the Defense Department classified Anthropic as a "supply chain risk" in March. Mythos 5 is also available to federal agencies through Glasswing.
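As a rough illustration of what that price gap means in practice, the Python sketch below prices one workload under both rate cards. The per-million-token prices are the ones quoted above; the workload size (2 million input tokens, 500,000 output tokens) and the `workload_cost` helper are hypothetical, chosen only to make the arithmetic concrete.

```python
# Prices in USD per million tokens, as quoted in the article.
PRICES_PER_MILLION = {
    "Mythos 5": {"input": 25.0, "output": 125.0},
    "Fable 5": {"input": 10.0, "output": 50.0},
}


def workload_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD for a workload of the given token counts."""
    price = PRICES_PER_MILLION[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


# ASSUMPTION: an arbitrary example workload, not a figure from the article.
INPUT_TOKENS = 2_000_000
OUTPUT_TOKENS = 500_000

mythos_cost = workload_cost("Mythos 5", INPUT_TOKENS, OUTPUT_TOKENS)
fable_cost = workload_cost("Fable 5", INPUT_TOKENS, OUTPUT_TOKENS)

print(f"Mythos 5: ${mythos_cost:.2f}")
print(f"Fable 5:  ${fable_cost:.2f}")
print(f"Ratio:    {mythos_cost / fable_cost:.1f}x")
```

Because both the input and output prices differ by the same factor, the ratio stays at 2.5 regardless of how a workload splits between input and output tokens.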

How This Compares to Other AI Companies

Anthropic is not the only company gating cybersecurity capabilities, but its approach is the most restrictive. OpenAI released GPT-5.5 in April 2026 with a parallel program called "Trusted Access for Cyber," which lowers safety classifier thresholds for vetted cybersecurity professionals rather than routing queries to a weaker model. Approved users get reduced refusals for vulnerability identification, malware analysis, reverse engineering, and detection engineering, while safeguards continue blocking credential theft, malware deployment, and exploitation of third-party systems.

Meta takes a different approach entirely, grounding its safety thresholds in outcomes, such as whether a model can automate attacks on hardened networks or discover zero-days at scale, rather than restricting specific capability categories. Google DeepMind has published risk frameworks but has not publicly restricted cybersecurity capabilities in its Gemini models to the same degree.

The distinction matters. OpenAI's model lets vetted professionals use the same model with relaxed guardrails. Anthropic's model gives the public a fundamentally different, weaker system for security tasks. Whether that represents greater caution or unnecessary restriction depends on whom you ask.

The Quantitative Case for Removal

Anthropic's justification rests on its system card and internal red-teaming data. Mythos Preview was assessed at CB-1 level uplift for chemical and biological capabilities, meaning it provides "meaningful assistance to someone with basic technical knowledge" pursuing harm. In a virology protocol trial, PhD-level biologists using Mythos produced protocols with 4.3 critical failures on average, compared to 6.6 with Opus 4.6, a measurable improvement in creating dangerous protocols.

On cybersecurity, the data is starker. The cost of discovering a vulnerability using Mythos runs under $20,000 for approximately 1,000 automated runs on OpenBSD, and around $10,000 for several hundred runs on FFmpeg. Those numbers mean a moderately resourced attacker (not a nation-state, but a criminal group or small team) could run industrialized vulnerability discovery campaigns at a fraction of what manual bug-hunting costs.

Anthropic's blog post accompanying the Fable 5 release stated: "The uplift from Mythos-level capabilities is valuable to many adversaries...we therefore expect them to be motivated." The company did not publish granular breakdowns of uplift by attacker skill level (novice, intermediate, or expert), but the system card's description of a non-expert engineer receiving a working exploit overnight suggests the floor for effective use is low.
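The per-run cost implied by the OpenBSD figure is simple division, sketched below in Python. The `cost_per_run` helper is illustrative. Only the OpenBSD case is computed, because the article gives "several hundred" rather than a run count for FFmpeg, and the result is a ceiling since the total is stated as "under $20,000."

```python
def cost_per_run(total_usd: float, runs: int) -> float:
    """Average cost of one automated run, in USD."""
    if runs <= 0:
        raise ValueError("runs must be positive")
    return total_usd / runs


# Article figures: under $20,000 for approximately 1,000 runs on OpenBSD.
openbsd_ceiling = cost_per_run(20_000, 1_000)
print(f"OpenBSD campaign: at most about ${openbsd_ceiling:.2f} per run")
```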

The Case That Removal Is Security Theater

Critics raise a structural objection: if the model weights encode cybersecurity knowledge — and they do, since Fable 5 is built on the same architecture — then behavioral fine-tuning is a surface-level restriction that may not survive determined adversarial pressure.

Anthropic says its internal and external red teams spent more than 1,000 hours attempting to bypass Fable 5's guardrails without discovering a "universal jailbreak." But the company does not specify whether partial bypasses were found, and cybersecurity researchers have consistently broken through similar restrictions on older models. The history of AI safety filters, from early ChatGPT jailbreaks to persistent prompt injection attacks, suggests that behavioral guardrails degrade over time as attackers iterate.

There is also the question of what happens when Fable 5's weights are extracted or reproduced. While Anthropic controls API access, the knowledge embedded in a 10-trillion-parameter model cannot be recalled once the model is deployed. If the weights leak — and a Bloomberg report from April 2026 documented unauthorized access to Mythos through a third-party vendor environment — the guardrails become irrelevant.

Anthropic has not published ablation data showing that the removed capabilities cannot be recovered through fine-tuning on consumer hardware. Given that the underlying model clearly possesses the knowledge — it was trained on it — the question is whether Anthropic's safety layer is robust enough to withstand the global adversarial community's collective effort to circumvent it.

The Defender's Dilemma

The AI cybersecurity market is projected to reach $44.24 billion in 2026, growing to $213.17 billion by 2034. Over 514,000 cybersecurity job postings exist in the United States alone, with a global talent gap of millions of unfilled positions. Many of these professionals increasingly depend on AI-assisted tooling for vulnerability assessment, penetration testing, and threat analysis.
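The growth rate implied by those two market figures can be checked with the standard compound annual growth rate formula. The Python sketch below uses the article's numbers and treats 2026 to 2034 as eight compounding years; the `cagr` helper is illustrative. The result should land close to the 21.71% CAGR quoted in the market report's summary in the source list.

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate as a fraction (0.10 means 10% per year)."""
    if start_value <= 0 or years <= 0:
        raise ValueError("start_value and years must be positive")
    return (end_value / start_value) ** (1 / years) - 1


market_2026 = 44.24   # USD billions, from the article
market_2034 = 213.17  # USD billions, from the article

growth = cagr(market_2026, market_2034, 2034 - 2026)
multiple = market_2034 / market_2026

print(f"Market multiple over the period: {multiple:.2f}x")
print(f"Implied CAGR: {growth:.2%}")
```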

[Chart: Research Publications on "AI cybersecurity vulnerability". Source: OpenAlex. Data as of Jan 1, 2026.]

Academic research on AI cybersecurity vulnerabilities has exploded, with over 59,500 papers published to date and 22,065 in 2025 alone. The field is growing because the work is real: organizations need AI tools that can keep pace with AI-generated threats.

For U.S. penetration testers, a field with roughly 4,600 open roles, the capability gap between Fable 5 and Mythos is not abstract. Fable 5's rerouting to Opus 4.8 for security queries means these professionals get a tool that reproduces 1% of known flaws instead of 80%. The forthcoming Cyber Verification Program may eventually restore access for some, but its timeline and criteria are unknown.

The Centre for Emerging Technology and Security at the Alan Turing Institute framed the asymmetry directly: "Offense is operating at machine speed and scale, while defense is still paging analysts during incidents." If the most capable defensive AI tools are restricted to 200 organizations while attackers face no such bottleneck, the restriction may widen the gap it aims to close.

The Nation-State Question

Anthropic's system card flags near-term risk from China, Iran, North Korea, and Russia, noting that "it will not be long before such capabilities proliferate, potentially beyond actors who are committed to deploying them safely." But that framing assumes proliferation is a future event. Multiple intelligence assessments and open-source reporting indicate that state-sponsored cyber programs in China and Russia already operate sophisticated AI-augmented tooling for vulnerability research and exploit development.

The International AI Safety Report 2026 found that the best frontier model completed "nearly six times more attack steps on a realistic simulated enterprise attack than the best model eighteen months earlier, and a full attempt now costs around £65." If adversary states are developing or have already developed equivalent capabilities independently, removing them from a public U.S. model does not change the global threat landscape; it changes who among Western defenders can access the tools.

The White House reportedly blocked a plan to expand Glasswing access to about 70 additional companies, keeping the circle small even as the threat scales. Restricted access leaves mid-size companies, startups, universities, and non-U.S. allies in a position where they cannot use the most effective defensive tools while potential attackers face no comparable constraint.

The Regulatory Vacuum

No existing legal framework clearly governs what Anthropic has done. The Export Administration Regulations (EAR) and International Traffic in Arms Regulations (ITAR) were designed for discrete transfers of static information between known parties, not for AI systems that "generate unlimited, dynamic outputs on demand for potentially anonymous users worldwide."

The Bureau of Industry and Security issued an interim final rule in January 2025 targeting advanced AI chips and closed-weight model exports, but it focuses on hardware and model weights rather than specific model capabilities. The EU AI Act classifies general-purpose AI systems by risk level but does not specifically address the selective removal of cybersecurity capabilities from a frontier model.

A May 2026 analysis in the Journal of Law and Cyber Warfare argued that the "primary legal challenge lies in distinguishing legitimate cybersecurity research from unlawful exploitation," and that current frameworks are inadequate for governing AI systems with dual-use potential. Anthropic is, in effect, making governance decisions that no legislature has yet codified: setting its own thresholds for what constitutes acceptable risk and who qualifies as a trusted user.

What the Timeline Reveals

The sequence of events matters. Anthropic announced Mythos Preview on April 7 and declared it too dangerous for public release. By April 21, Bloomberg reported an unauthorized access incident through a third-party vendor. The NSA began testing Mythos by mid-April despite the DoD's supply-chain risk designation. Through May and June, Glasswing expanded from 50 to 200 organizations.

Then on June 9, Anthropic released Fable 5 publicly, with Mythos 5 going to Glasswing partners on the same day. The company also began red-teaming Claude Oceanus, the next model in the Mythos line, with testers reportedly gaining access around June 3.

This timeline suggests the public release was not a last-minute safety decision but a planned rollout: build the restricted-access program first, establish the safety narrative, then ship the neutered public version alongside expanded restricted access. The decision appears driven by a combination of genuine safety concerns, competitive pressure from OpenAI's parallel moves, and the commercial need to bring Mythos-class capabilities to market in some form.

What Comes Next

Fable 5 is priced at $10/$50 per million input/output tokens, double Opus pricing, and is available on the Claude API, AWS Bedrock, Google Cloud Vertex AI, and Microsoft Foundry. It is included on Pro, Max, Team, and Enterprise plans until June 22, after which usage credits may be required.

The Cyber Verification Program, when it launches, will determine whether Anthropic's two-tier model becomes a temporary bridge or a permanent divide. If the vetting process is narrow and slow, a large segment of the cybersecurity workforce will be stuck with tools that are, by Anthropic's own metrics, a fraction as capable as what the model can actually do. If it is broad, the restrictions lose much of their point.

Anthropic has made a bet: that the risks of releasing Mythos's full capabilities outweigh the costs of restricting them. The evidence supports the risk assessment — an AI that produces 181 working Firefox exploits and finds decades-old vulnerabilities in major operating systems is a qualitatively different tool than what came before. Whether the chosen remedy — behavioral guardrails on a public model, restricted access for the real one — matches the scale of that risk is a question the company, its regulators, and the security community will be answering for years.

Sources (19)

[1] Anthropic Releases Mythos-Like Model Without Cyber Capabilities (bloomberg.com). Anthropic is widely releasing a version of Mythos that will be blocked from carrying out cybersecurity tasks, months after warning the model could spot and exploit vulnerabilities in critical software.
[2] Anthropic's new model is Mythos on a leash (cyberscoop.com). Fable 5's responses for cybersecurity and biology topics are rerouted to Claude Opus 4.8. Without guardrails, the model reproduces ~80% of known flaws; with safeguards, only 1%.
[3] Project Glasswing: Securing critical software for the AI era (anthropic.com). Anthropic launched Project Glasswing with $100M in credits and a partner roster including AWS, Apple, Google, Microsoft, and others to use Mythos defensively.
[4] Claude Mythos Preview System Card (red.anthropic.com). Mythos Preview found vulnerabilities in every major operating system and web browser. On Firefox 147, it produced 181 working exploits vs. 2 from Opus 4.6.
[5] Project Glasswing (anthropic.com). Claude Mythos Preview rated more than a quarter of vulnerabilities as high-severity or critical, and manual analysis confirmed 90.6% qualified.
[6] Anthropic expands Project Glasswing cybersecurity program to 150 more organizations (siliconangle.com). Anthropic opened access to 150 more organizations across more than 15 countries, expanding the program to sectors including healthcare, energy, and communications.
[7] Claude Fable 5 Is Here: Anthropic's First Public Mythos-Class Model (pasqualepillitteri.it). Fable 5 priced at $10/$50 per million tokens, available on Claude API, AWS Bedrock, Google Cloud Vertex AI, and Microsoft Foundry. Included on Pro plans until June 22.
[8] NSA using Anthropic's Mythos despite Defense Department blacklist (axios.com). NSA is readying Anthropic's Mythos for cyber operations even as the Defense Department designated the company a 'supply chain risk' in March 2026.
[9] Scaling Trusted Access for Cyber with GPT-5.5 and GPT-5.5-Cyber (openai.com). OpenAI lowers classifier-based refusals for vetted cybersecurity professionals rather than routing to a weaker model, enabling authorized security workflows.
[10] Unauthorized Group Gains Access to Anthropic's Exclusive Cyber Tool Mythos (cybersecuritynews.com). Bloomberg reported on April 21, 2026, that a small group of unauthorized users gained access to Mythos through a third-party vendor environment.
[11] AI in Cybersecurity Market Report 2026-2031 (marketsandmarkets.com). The global AI in cybersecurity market is projected to grow from $44.24 billion in 2026 to $213.17 billion by 2034 at a CAGR of 21.71%.
[12] Cybersecurity Jobs in 2026 (penligent.ai). Over 4,600 open penetration and vulnerability tester roles in the U.S., with total cybersecurity postings exceeding 514,000 nationwide.
[13] OpenAlex: AI Cybersecurity Vulnerability Research Publications (openalex.org). Over 59,500 academic papers published on AI cybersecurity vulnerabilities, with 22,065 in 2025 alone.
[14] Claude Mythos: What Does Anthropic's New Model Mean for the Future of Cybersecurity? (cetas.turing.ac.uk). Offense is operating at machine speed and scale, while defense is still paging analysts during incidents. The best frontier model completed nearly six times more attack steps than 18 months earlier.
[15] Anthropic's Mythos and the global cybersecurity gap (restofworld.org). The White House blocked a plan to expand Glasswing access to about 70 additional companies, keeping the circle small even as threats scale.
[16] AI Model Outputs Demand the Attention of Export Control Agencies (justsecurity.org). Frontier AI models can generate ITAR- and EAR-controlled information, but these frameworks are ill-suited for AI systems that generate dynamic outputs for anonymous users worldwide.
[17] New Export Control Rule Regulates Global Diffusion of AI (jonesday.com). BIS issued an interim final rule in January 2025 targeting advanced AI chips and closed-weight model exports with destination, volume, and end-user controls.
[18] Regulating Dual Use AI in Cyber Operations (jlcw.org). The primary legal challenge lies in distinguishing legitimate cybersecurity research from unlawful exploitation. Current frameworks are inadequate for dual-use AI.
[19] Anthropic Mythos / Oceanus Rumor: Red Teaming, Pricing Speculation (knightli.com). A model identifier claude-oceanus-v1-p appeared on June 3 in the Claude Console, with red teamers reportedly granted access the same day.

