Technology

GPT-4 autonomously hacks zero-day security flaws with 53% success rate

An unattended laptop, rendered using AI tools, quietly works away autonomously

Researchers were able to successfully hack into more than half of their test websites using autonomous teams of GPT-4 bots, which co-ordinated their efforts and spawned new bots at will – and they did it by exploiting previously unknown, real-world 'zero-day' vulnerabilities.

A couple of months ago, a team of researchers released a paper saying they'd been able to use GPT-4 to autonomously hack one-day (or N-day) vulnerabilities – security flaws that have already been publicly disclosed, but which haven't yet been patched. If given the Common Vulnerabilities and Exposures (CVE) list, GPT-4 was able to exploit 87% of critical-severity CVEs on its own.

Skip forward to this week and the same group of researchers released a follow-up paper saying they've been able to hack zero-day vulnerabilities – vulnerabilities that aren't yet known – with a team of autonomous, self-propagating Large Language Model (LLM) agents using a Hierarchical Planning with Task-Specific Agents (HPTSA) method.

Instead of assigning a single LLM agent to tackle many complex tasks on its own, HPTSA uses a "planning agent" that oversees the entire process and launches multiple task-specific "subagents." Much like a boss and their subordinates, the planning agent coordinates with a managing agent, which in turn delegates work to each "expert subagent" – reducing the load on any one agent for a task it might struggle with.

It's a technique similar to what Cognition Labs uses with its Devin AI software development team; it plans a job out, figures out what kinds of workers it'll need, then project-manages the job to completion while spawning its own specialist 'employees' to handle tasks as needed.
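
To make the division of labour concrete, here's a minimal structural sketch in Python of how a planner-plus-experts loop might be wired up. To be clear, this is not the researchers' code: every name in it (the subagent functions, the fixed task plan, the test URL) is hypothetical, the "LLM" calls are stubbed out, and the real HPTSA agents drive language models and actual tooling rather than returning canned results.

# A minimal structural sketch (not the researchers' code) of the HPTSA idea:
# a planning agent inspects a target and dispatches task-specific "expert"
# subagents. All names below are hypothetical, and the subagents return
# canned results instead of driving an LLM or real tooling.

from dataclasses import dataclass
from typing import Callable


@dataclass
class Finding:
    agent: str       # which expert subagent produced this result
    task: str        # the task it was asked to attempt
    succeeded: bool  # whether the attempt worked


def sql_injection_agent(task: str) -> Finding:
    # Placeholder: a real subagent would drive an LLM plus browsing/testing tools.
    return Finding(agent="sql_injection", task=task, succeeded=False)


def xss_agent(task: str) -> Finding:
    return Finding(agent="xss", task=task, succeeded=False)


# Registry of task-specific experts the planner can spawn on demand.
EXPERTS: dict[str, Callable[[str], Finding]] = {
    "sql_injection": sql_injection_agent,
    "xss": xss_agent,
}


def planning_agent(target_url: str) -> list[Finding]:
    """Oversee the run: decide which experts to launch and collect their results."""
    # A fixed plan for illustration; a real planning agent would reason over the
    # target and choose tasks dynamically.
    plan = [
        ("sql_injection", f"probe the login form on {target_url}"),
        ("xss", f"probe the search field on {target_url}"),
    ]
    return [EXPERTS[expert](task) for expert, task in plan]


if __name__ == "__main__":
    for finding in planning_agent("http://testsite.local"):
        print(finding)

The point of the shape is that one coordinator owns the plan while each expert only ever sees the narrow task it was built for – the load-splitting described above.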

AI Teamwork

When benchmarked against 15 real-world, web-focused vulnerabilities, HPTSA proved to be 550% more efficient than a single LLM at exploiting vulnerabilities, successfully hacking 8 of the 15 zero-day flaws. The solo LLM managed to hack only 3 of the 15.

Blackhat or whitehat? There is legitimate concern that these models will allow users to maliciously attack websites and networks. Daniel Kang – one of the researchers and an author of the white paper – specifically noted that in chatbot mode, GPT-4 is "insufficient for understanding LLM capabilities" and is unable to hack anything on its own.

That's good news, at least.

When I asked ChatGPT if it could exploit zero-days for me, it replied "No, I am not capable of exploiting zero-day vulnerabilities. My purpose is to provide information and assistance within ethical and legal boundaries," and suggested that I consult a cybersecurity professional instead.

Source: arXiv (Cornell University)

4 comments
Whatsis?
Uh oh.
This is a big dead canary and we're only about five feet past the entrance to the mine.
lonegray
Zero-day (0day) vulnerabilities are just vulnerabilities that do not yet have a patch for them, i.e., zero days since a patch was released. They do discuss 1-day and N-day vulnerabilities ... which are vulnerabilities for which a patch has been released 1 day or N days ago, etc. However, I'm highly suspicious of this article as it provides no substance; e.g., in a typical scenario there would be enough information provided to reproduce, or try to reproduce, what the authors did, or a reference to a detailed white paper, etc.

There are several philosophical razors for this type of thing, two of which are:
1. Alder's razor – that which cannot be explained by experimentation or evidence is not worthy of debate.
2. The Sagan standard – extraordinary claims require extraordinary evidence.
Brian M
@lonegray
There is nothing particularly shocking or unexpected about the claims for the use of AI (more likely general AI, eventually); it was only a matter of time.
Publishing the exact methodology to reproduce it would be, well, just irresponsible in an unrestricted article.

As 'Whatsis?' comments, 'This is a big dead canary and we're only about five feet past the entrance to the mine' – and I would add: we knew the gas was there in the first place, so we could have spared the canary!
michael_dowling
lonegray: There is only one defense against zero-day malware that is airtight – a sandboxing application that runs your browser/email client in an isolated environment that can be easily erased by closing the sandbox, typically every 12 hours. Business versions of Windows 11 come with a built-in sandbox, I have read.