The speed at which AI discovers vulnerabilities has surpassed the speed at which it patches vulnerabilities.

By: blockbeats|2026/03/30 18:00:01

On March 27, an unsecured data cache at Anthropic exposed around 3000 internal files. One draft blog post revealed the upcoming new model, Mythos, which Anthropic self-rated as "far surpassing any AI model in cybersecurity capability." On the same day, CrowdStrike and Okta each plummeted 7%, while Palo Alto Networks fell by 6%.

The market did not panic because a more powerful model has emerged. It panicked because the model's own creator stated that progress on the attack side has outpaced what the defense side can keep up with.

AI Cybersecurity Dominance

According to the academic benchmark CAIBench's test results, in the Cybench test simulating a real attack-defense environment, Claude Sonnet achieved a 46% success rate. The second-ranking GPT-5 was at 28%, Google's Gemini 2.5 Pro only reached 18%, and the open-source model qwen3-32B dropped even lower to 10%.


While 46% may not seem high, this is the success rate on complex penetration tasks that include steps such as vulnerability discovery, building exploit chains, and privilege escalation. In the more basic Base test, Claude's success rate has already hit 75%, nearing the ceiling.

The difference is not a matter of who is slightly better; it is a difference in magnitude. Claude's success rate on complex attack-defense tasks is about 1.6 times that of GPT-5 and more than 2.5 times that of Gemini. In cybersecurity, the distribution of abilities among models is not a ladder but a gap.
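The multiples quoted above can be reproduced from the Cybench figures cited earlier in this article. The following is a minimal sketch in Python; the dictionary and the helper name ratio_to_leader are illustrative assumptions, and only the four percentages come from the article.

```python
# Cybench success rates (percent) as quoted in this article.
cybench_success = {
    "Claude Sonnet": 46,
    "GPT-5": 28,
    "Gemini 2.5 Pro": 18,
    "qwen3-32B": 10,
}


def ratio_to_leader(scores, leader="Claude Sonnet"):
    """Return how many times higher the leader's rate is than each other model's."""
    top = scores[leader]
    return {
        name: round(top / score, 2)
        for name, score in scores.items()
        if name != leader
    }


ratios = ratio_to_leader(cybench_success)
for name, multiple in ratios.items():
    print(f"Claude Sonnet vs {name}: {multiple}x")
```

On these figures the multiples come out to roughly 1.64 for GPT-5 and 2.56 for Gemini 2.5 Pro, which is consistent with the rounded values in the text.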

Doubling in 6 Months

What deserves closer attention is not the gap between models but the speed of improvement over time.

According to Anthropic's official data, Sonnet 3.7, released in February 2025, achieved a 35.9% success rate on Cybench (10 attempts). In the latter half of the same year, Sonnet 4.5 reached 76.5%. The Anthropic research team's conclusion is: within 6 months, the success rate doubled.
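As a quick check of the "doubled" claim, the two Cybench figures quoted above can be compared directly. This is a minimal sketch; the variable names are illustrative, and the only inputs are the two percentages from the article.

```python
# Cybench success rates (percent, 10 attempts) as quoted in this article.
sonnet_3_7 = 35.9
sonnet_4_5 = 76.5

# Relative growth between the two releases.
growth = sonnet_4_5 / sonnet_3_7

# Absolute gain in percentage points.
gain_points = round(sonnet_4_5 - sonnet_3_7, 1)

print(f"growth: {growth:.2f}x, gain: {gain_points} percentage points")
```

A ratio slightly above 2 matches the research team's description of the success rate doubling.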

What does this speed mean in practice? Consider a real-world case: Claude Opus 4.6 was used to audit the Firefox codebase in March this year. According to InfoQ, 22 security vulnerabilities were discovered within two weeks, 14 of them high-risk. These vulnerabilities had gone undetected despite years of manual audits and millions of CPU hours of fuzz testing. Anthropic's security team previously disclosed that Claude uncovered over 500 high-risk vulnerabilities in multiple production-grade open-source projects, some of which had been present for decades.

By contrast, the industry-standard timeline for a traditional penetration test is 2 to 3 weeks, and that covers just one application. According to the Verizon 2025 Data Breach Investigations Report, the median time from public disclosure of a critical vulnerability to mass exploitation by attackers is 5 days, while the median time to patch is 32 to 38 days.

The speed at which AI discovers vulnerabilities is growing exponentially, while human patching speed is linear. The difference in time is the attack window.
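The size of that window can be estimated from the Verizon figures quoted above. The sketch below is a rough back-of-the-envelope calculation, not a figure from the report: it assumes the exposure window is simply the median patch time minus the median time to mass exploitation, and the names used are illustrative.

```python
# Medians quoted in this article from the Verizon 2025 DBIR.
EXPLOIT_MEDIAN_DAYS = 5        # disclosure to mass exploitation
PATCH_MEDIAN_DAYS = (32, 38)   # low and high end of the median time to patch


def exposure_window(patch_days, exploit_days):
    """Days a system stays unpatched after mass exploitation has begun."""
    # Assumption: subtracting two medians is only an approximation of the
    # real gap, and the window cannot be negative.
    return max(patch_days - exploit_days, 0)


windows = tuple(exposure_window(p, EXPLOIT_MEDIAN_DAYS) for p in PATCH_MEDIAN_DAYS)
print(f"estimated attack window: {windows[0]} to {windows[1]} days")
```

Under this simplification, a typical critical vulnerability stays exploitable for roughly four weeks after attackers begin using it at scale.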

In the leaked Mythos draft, Anthropic wrote that this model "heralds a coming wave of models that can exploit vulnerabilities in a way far beyond the defender's efforts." Based on the publicly known capability curve, this is not an exaggeration.

The Faster the Release, the More Urgent the Warning

If you put Anthropic's actions over the past three years on a timeline, a clear pattern emerges: every release of a stronger model is quickly followed by a higher-level security response.

In July 2023, Anthropic signed a voluntary pledge at the White House, followed by the release of its first Responsible Scaling Policy (RSP v1.0) in September of the same year. In October 2024, the RSP was upgraded to v2.0, adding a threshold for biochemical weapon capabilities. In November 2025, Anthropic disclosed the GTG-1002 incident: a China-backed threat group used Claude Code to target around 30 organizations, with AI independently executing 80% to 90% of the tactical operations. This was the first documented large-scale, AI-orchestrated espionage campaign spanning multiple organizations.

In February 2026, the RSP updated to v3.0, with the simultaneous release of Claude Code Security. In the same month, the Pentagon labeled Anthropic as a "supply chain risk" because Anthropic refused to lift clauses in the contract prohibiting large-scale surveillance and fully autonomous weapons. A month later, the Mythos leak revealed that Anthropic acknowledged in the draft that this model poses "unprecedented network security risks."

The pace of capability releases is accelerating. There was a one-year gap from Claude 1 to Claude 3, and less than three months from Opus 4.5 to Opus 4.6. Security responses are also accelerating, but they are always reactive: capabilities are exploited first, and policy patches come later. The collective drop in cybersecurity stocks on March 27 is the market pricing in this time lag.

A Dark Reading survey earlier this year revealed that 48% of cybersecurity professionals identified AI-powered agents as the top attack vector for 2026. Two years ago, this option was hardly at the top of the list.

Anthropic's Mythos release strategy involves providing early access to defensive organizations, "giving them a first-mover advantage." This statement itself acknowledges the asymmetry between offense and defense. If defenders did not need a head start, it would mean attackers had not yet arrived at the doorstep.
