
GPT-4 autonomously hacks zero-day security flaws with 53% success rate

An unattended laptop, rendered using AI tools, quietly works away autonomously

Researchers were able to successfully hack into more than half of their test websites using autonomous teams of GPT-4 bots, co-ordinating their efforts and spawning new bots at will. And this was done using previously unknown, real-world 'zero-day' exploits.

A couple of months ago, a team of researchers released a paper saying they'd been able to use GPT-4 to autonomously hack one-day (or N-day) vulnerabilities – these are security flaws that are already known, but for which a fix hasn't yet been released. If given the Common Vulnerabilities and Exposures (CVE) list, GPT-4 was able to exploit 87% of critical-severity CVEs on its own.

Skip forward to this week and the same group of researchers released a follow-up paper saying they've been able to hack zero-day vulnerabilities – vulnerabilities that aren't yet known – with a team of autonomous, self-propagating Large Language Model (LLM) agents using a Hierarchical Planning with Task-Specific Agents (HPTSA) method.

Rather than assigning a single LLM agent to juggle many complex tasks, HPTSA uses a "planning agent" that oversees the entire process and launches multiple task-specific "subagents." Much like a boss and their subordinates, the planning agent co-ordinates with a managing agent, which in turn delegates the work to "expert subagents" – reducing the load on any single agent tackling a task it might struggle with.
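The hierarchical pattern described above might be sketched roughly as follows. This is a minimal, hypothetical illustration of the planner/manager/expert structure – every class name, task label, and method here is an assumption for demonstration, not the researchers' actual implementation (which drives real LLMs and tools at each level).

```python
# Hypothetical sketch of Hierarchical Planning with Task-Specific Agents:
# a planning agent explores a target, a manager spawns task-specific
# "expert" subagents on demand and delegates work to them.
from dataclasses import dataclass, field

@dataclass
class ExpertSubagent:
    specialty: str  # e.g. "SQLi", "XSS" - one vulnerability class per expert

    def attempt(self, target: str) -> str:
        # A real agent would prompt an LLM with tool access here;
        # this stub just records which expert probed which target.
        return f"{self.specialty} expert probed {target}"

@dataclass
class ManagerAgent:
    experts: dict = field(default_factory=dict)

    def dispatch(self, task: str, target: str) -> str:
        # Spawn a new task-specific expert if one doesn't exist yet,
        # then delegate the attempt to it.
        expert = self.experts.setdefault(task, ExpertSubagent(task))
        return expert.attempt(target)

@dataclass
class PlanningAgent:
    manager: ManagerAgent

    def run(self, target: str, suspected_flaws: list) -> list:
        # The planner surveys the target, decides which vulnerability
        # classes look plausible, and hands each one to the manager.
        return [self.manager.dispatch(flaw, target) for flaw in suspected_flaws]

planner = PlanningAgent(ManagerAgent())
reports = planner.run("example.test", ["SQLi", "XSS"])
print(reports)
```

The key design point is that no single agent carries the whole job: the planner only plans, the manager only routes, and each expert handles one narrow task class.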

It's a technique similar to what Cognition Labs uses with its Devin AI software development team; it plans a job out, figures out what kinds of workers it'll need, then project-manages the job to completion while spawning its own specialist 'employees' to handle tasks as needed.

AI Teamwork

When benchmarked against 15 real-world web-focused vulnerabilities, HPTSA proved to be 550% more efficient than a single LLM at exploiting them, successfully hacking 8 of the 15 zero-day vulnerabilities. The solo LLM managed to hack only 3 of the 15.

Blackhat or whitehat? There is legitimate concern that these models will allow users to maliciously attack websites and networks. Daniel Kang – one of the researchers and an author of the paper – noted specifically that in chatbot mode, GPT-4 is "insufficient for understanding LLM capabilities" and is unable to hack anything on its own.

That's good news, at least.

When I asked ChatGPT if it could exploit zero-days for me, it replied "No, I am not capable of exploiting zero-day vulnerabilities. My purpose is to provide information and assistance within ethical and legal boundaries," and suggested that I consult a cybersecurity professional instead.

Source: arXiv (Cornell University)

4 comments
Whatsis?
Uh oh.
This is a big dead canary and we're only about five feet past the entrance to the mine.
lonegray
Zero-day (0day) vulnerabilities are just vulnerabilities which do not yet have a patch for them – i.e., zero days since a patch was released. They do discuss 1-day and N-day vulnerabilities, which are ones for which a patch was released 1 day, or N days, ago. However, I'm highly suspicious of this article as it provides no substance – e.g., in a typical scenario there would be enough information provided to reproduce, or try to reproduce, what the authors did, or a reference to a detailed white paper.

There are several philosophical razors for this type of thing, two of which are:
1. Alder's razor – That which cannot be settled by experimentation or evidence is not worthy of debate.
2. Sagan standard – Extraordinary claims require extraordinary evidence.
Brian M
@lonegray
There is nothing particularly shocking or unexpected about these claims for the use of AI (more likely general AI, eventually) – it was only a matter of time.
Publishing the exact methodology to reproduce it would be, well, just irresponsible in an unrestricted article.

As 'Whatsis?' comments, 'This is a big dead canary and we're only about five feet past the entrance to the mine' –
and I would add: but we knew the gas was there in the first place, so we could have spared the canary!
michael_dowling
lonegray: There is only one air-tight defense against zero-day malware – a sandboxing application that runs your browser/email client in an isolated environment that can be easily erased by closing the sandbox, typically every 12 hours. Business versions of Windows 11 come with a built-in sandbox, I have read.