In early April, the renowned AI company Anthropic unveiled its latest and most powerful AI model, Claude Mythos.
Also known as Project Glasswing, it was introduced as a cybersecurity revolution, supposedly outperforming all but “the most skilled humans.” In its comprehensive 200-page manifesto on the model, Anthropic claimed that Claude Mythos “has already found thousands of high-severity vulnerabilities, including some in every major operating system and web browser.”
The concept of an AI that can hack practically anything may be ever so slightly disturbing, and Anthropic acknowledges the risks it would pose in the wrong hands in the very same document, warning that the fallout for economies, public safety, and national security “could be severe.” Thus, Mythos is not being released to the public until further notice.
Part of Anthropic’s commitment is to ensure, at all costs, that Mythos is used ethically, and the company promises that all developments will be shared with the industry to build a safer, more secure internet.
Anthropic has shared the model with a few dozen companies to give them a cybersecurity head start over other companies, helping them protect their assets from up-and-coming AI hackers.
All of this only adds to the extreme irony of the fact that a copy of Claude Mythos was leaked by a bunch of 20-year-olds in a Discord chat.
A member of the group, working for a third-party company partnered with Anthropic, used his access to the company and information revealed during March’s hack of Mercor, an AI hiring platform, to figure out where the AI was being stored.
Essentially, the hackers were able to guess the URL where the model was hosted, and from there got hold of Glasswing.
They have reportedly yet to do anything malicious with Mythos, using it only to make websites. Had it fallen into the hands of someone with ill intent, however, it would have been a cybersecurity emergency.
Anthropic itself has said how disastrous it would be if Glasswing were to reach the public. No longer would you need to be a tech expert to breach a company’s security: AI agents like Mythos open up the possibility of executing cyberattacks to a far wider audience.
Additionally, AI can work in multiple places at once, as showcased in a study by Stanford University. Every time the study’s agent spotted something noteworthy, it would essentially duplicate itself to investigate, while the main agent continued to search for vulnerabilities.
And that is without mentioning the obvious: unlike humans, AI doesn’t need to sleep, eat, or take breaks. At points where a person would have to give up, an AI agent can keep pushing.
And it does all of these things for $18 an hour, in the case of Stanford’s agent.
The amount of money lost to cybercrime has already been rising; 2024 saw a 33% increase in money lost through online theft compared to previous years. Releasing to the public an AI that can not only commit such crimes but also hack faster than humans can patch the holes can only lead to disaster.
All of this is to say, AI’s role in cybersecurity has thus far been fast-tracked, and Anthropic’s security, or lack thereof, is certainly not helping.
Claude Mythos illustrates both the enormous promise and the extreme dangers posed by advanced AI in cybersecurity. While Anthropic likely means well with its mission to revolutionize cybersecurity using AI technology, the leak serves as a warning of just how vulnerable even the most secure systems can be. Had Mythos fallen into the wrong hands, who knows what the fallout could’ve been?