
GPT-4 autonomously hacks zero-day security flaws with 53% success rate

An unattended laptop, rendered using AI tools, quietly works away autonomously

Researchers were able to successfully hack into more than half of their test websites using autonomous teams of GPT-4 bots, co-ordinating their efforts and spawning new bots at will. And this was done using previously unknown, real-world 'zero-day' exploits.

A couple of months ago, a team of researchers released a paper saying they'd been able to use GPT-4 to autonomously hack one-day (or N-day) vulnerabilities – these are security flaws that are already known, but for which a fix hasn't yet been released. If given the Common Vulnerabilities and Exposures (CVE) list, GPT-4 was able to exploit 87% of critical-severity CVEs on its own.

Skip forward to this week and the same group of researchers released a follow-up paper saying they've been able to hack zero-day vulnerabilities – vulnerabilities that aren't yet known – with a team of autonomous, self-propagating Large Language Model (LLM) agents using a Hierarchical Planning with Task-Specific Agents (HPTSA) method.

Rather than assigning a single LLM agent to juggle many complex tasks, HPTSA uses a "planning agent" that oversees the entire process and launches multiple task-specific "subagents." Much like a boss and their subordinates, the planning agent co-ordinates with a managing agent, which in turn delegates the work to "expert subagents" – reducing the load on any single agent tackling a task it might struggle with.
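The hierarchical pattern described above might be sketched roughly as follows. This is a minimal, hypothetical illustration of the planner/manager/expert structure – every class name, task label, and method here is an assumption for demonstration, not the researchers' actual implementation (which drives real LLMs and tools at each level).

```python
# Hypothetical sketch of Hierarchical Planning with Task-Specific Agents:
# a planning agent explores a target, a manager spawns task-specific
# "expert" subagents on demand and delegates work to them.
from dataclasses import dataclass, field

@dataclass
class ExpertSubagent:
    specialty: str  # e.g. "SQLi", "XSS" - one vulnerability class per expert

    def attempt(self, target: str) -> str:
        # A real agent would prompt an LLM with tool access here;
        # this stub just records which expert probed which target.
        return f"{self.specialty} expert probed {target}"

@dataclass
class ManagerAgent:
    experts: dict = field(default_factory=dict)

    def dispatch(self, task: str, target: str) -> str:
        # Spawn a new task-specific expert if one doesn't exist yet,
        # then delegate the attempt to it.
        expert = self.experts.setdefault(task, ExpertSubagent(task))
        return expert.attempt(target)

@dataclass
class PlanningAgent:
    manager: ManagerAgent

    def run(self, target: str, suspected_flaws: list) -> list:
        # The planner surveys the target, decides which vulnerability
        # classes look plausible, and hands each one to the manager.
        return [self.manager.dispatch(flaw, target) for flaw in suspected_flaws]

planner = PlanningAgent(ManagerAgent())
reports = planner.run("example.test", ["SQLi", "XSS"])
print(reports)
```

The key design point is that no single agent carries the whole job: the planner only plans, the manager only routes, and each expert handles one narrow task class.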

It's a technique similar to what Cognition Labs uses with its Devin AI software development team; it plans a job out, figures out what kinds of workers it'll need, then project-manages the job to completion while spawning its own specialist 'employees' to handle tasks as needed.

AI Teamwork

When benchmarked against 15 real-world web-focused vulnerabilities, HPTSA proved to be 550% more efficient than a single LLM at exploiting them, successfully hacking 8 of the 15 zero-day vulnerabilities. The solo LLM managed to hack only 3 of the 15.

Blackhat or whitehat? There is legitimate concern that these models will allow users to maliciously attack websites and networks. Daniel Kang – one of the researchers and an author of the paper – noted specifically that in chatbot mode, GPT-4 is "insufficient for understanding LLM capabilities" and is unable to hack anything on its own.

That's good news, at least.

When I asked ChatGPT if it could exploit zero-days for me, it replied "No, I am not capable of exploiting zero-day vulnerabilities. My purpose is to provide information and assistance within ethical and legal boundaries," and suggested that I consult a cybersecurity professional instead.

Source: arXiv (Cornell University)

4 comments
Whatsis?
Uh oh.
This is a big dead canary and we're only about five feet past the entrance to the mine.
lonegray
Zero-day (0day) vulnerabilities are just vulnerabilities which do not yet have a patch for them – i.e., zero days since a patch was released. They do discuss 1-day and N-day vulnerabilities, which are ones for which a patch was released 1 day, or N days, ago. However, I'm highly suspicious of this article as it provides no substance – e.g., in a typical scenario there would be enough information provided to reproduce, or try to reproduce, what the authors did, or a reference to a detailed white paper.

There are several philosophical razors for this type of thing, two of which are:
1. Alder's razor – That which cannot be settled by experimentation or evidence is not worthy of debate.
2. Sagan standard – Extraordinary claims require extraordinary evidence.
Brian M
@lonegray
There is nothing particularly shocking or unexpected about these claims for the use of AI (more likely general AI, eventually) – it was only a matter of time.
Publishing the exact methodology to reproduce it would be, well, just irresponsible in an unrestricted article.

As 'Whatsis?' comments, 'This is a big dead canary and we're only about five feet past the entrance to the mine' –
and I would add: but we knew the gas was there in the first place, so we could have spared the canary!
michael_dowling
lonegray: There is only one air-tight defense against zero-day malware – a sandboxing application that runs your browser/email client in an isolated environment that can be easily erased by closing the sandbox, typically every 12 hours. Business versions of Windows 11 come with a built-in sandbox, I have read.