With the widespread and growing use of ChatGPT and other large language models (LLMs) in recent years, cybersecurity has become a major concern. Among the many open questions, cybersecurity professionals have wondered how effective these tools are at launching attacks. Cybersecurity researchers Richard Fang, Rohan Bindu, Akul Gupta, and Daniel Kang recently conducted a study to find out. The conclusion: they are very effective.
GPT-4 quickly exploits one-day vulnerabilities
During the study, the team used 15 one-day vulnerabilities that occurred in real life. A one-day vulnerability is one in the window between public disclosure of an issue and the release of a patch, meaning it is a known but still-exploitable vulnerability. The cases included vulnerable websites, container management software, and Python packages. Since all of the vulnerabilities came from the CVE database, each one included the official CVE description.
The LLM agents also had access to web browsing, a terminal, search results, file creation, and a code interpreter. Additionally, the researchers used a very detailed prompt with a total of 1,056 tokens and 91 lines of code, which also included debugging and logging instructions. The agents did not include subagents or a separate planning module, however.
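To make that setup concrete, the sketch below shows what such a tool-using agent loop might look like. This is an illustration only, not the researchers' code: `llm_complete` and `run_terminal` are hypothetical stand-ins, and the real agent wired in more tools (browsing, search, file creation, a code interpreter) than the single terminal tool shown here.

```python
# Illustrative sketch of a tool-using LLM agent loop (not the study's code).
# llm_complete() is a hypothetical stand-in for a real model API client.
import subprocess

def run_terminal(cmd: str) -> str:
    """Terminal tool: run a shell command and return its combined output."""
    result = subprocess.run(cmd, shell=True, capture_output=True,
                            text=True, timeout=60)
    return result.stdout + result.stderr

def llm_complete(messages: list[dict]) -> dict:
    """Hypothetical model call. Returns {'tool': name, 'input': arg} to
    request a tool, or {'content': text} to finish. Stubbed here."""
    return {"content": "attach a real model client to run this sketch"}

SYSTEM_PROMPT = (
    "You are a security-testing agent with tools: terminal, browse, "
    "search, write_file, run_code. Log and debug your steps."
)

def run_agent(cve_description: str, max_steps: int = 30) -> str:
    messages = [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": f"Target vulnerability:\n{cve_description}"},
    ]
    for _ in range(max_steps):
        reply = llm_complete(messages)
        if "tool" in reply:
            # Feed each tool's output back so the model can plan its next step
            # (only the terminal tool is shown, for brevity).
            observation = run_terminal(reply["input"])
            messages.append({"role": "tool", "content": observation})
        else:
            return reply["content"]  # the agent's final report
    return "step budget exhausted"
```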
The team quickly found that GPT-4 was able to successfully exploit the one-day vulnerabilities 87% of the time. Every other method tested, including other LLMs such as GPT-3.5 and open-source vulnerability scanners, failed to exploit a single vulnerability. According to the report, GPT-4 failed on only two vulnerabilities, both of which proved especially difficult for the agent to handle:
“The Iris web application is extremely difficult for an LLM agent to navigate because navigation is done via JavaScript. As a result, the agent tries to access forms and buttons without first interacting with the elements that make them available, which causes it to fail. The detailed description of HertzBeat is in Chinese, which may confuse the GPT-4 agent we deploy because we use English for the prompt,” the report explains.
GPT-4’s success rate still hinges on the CVE description
The researchers attributed the high success rate to the tool’s ability to exploit complex vulnerabilities in multiple stages, launch different attack methods, write exploit code, and manipulate non-web vulnerabilities.
The study also revealed a significant limitation of GPT-4 for vulnerability scanning. When asked to exploit a vulnerability without the CVE description, the model could not perform at the same level: GPT-4 succeeded only 7% of the time, a drop of 80 percentage points. Given this significant gap, the researchers took a step back and isolated how often GPT-4 could determine the correct vulnerability on its own, which turned out to be 33.3% of the time.
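As a back-of-the-envelope check, 87% and 7% on a 15-CVE benchmark correspond to roughly 13 of 15 versus 1 of 15 targets exploited. A scoring harness for that kind of ablation could look like the hypothetical sketch below, where `attempt_exploit` stands in for one full agent run.

```python
# Hypothetical scoring harness for the with/without-description ablation
# (illustrative only; attempt_exploit() stands in for a full agent run).
def attempt_exploit(prompt: str) -> bool:
    """Stub: a real harness would launch the agent against a sandboxed
    target and check whether the exploit landed."""
    return False

def success_rate(cves: list[dict], include_description: bool) -> float:
    """Fraction of CVEs exploited when the prompt does or does not
    carry the official CVE description."""
    exploited = 0
    for cve in cves:
        prompt = cve["description"] if include_description else cve["id"]
        if attempt_exploit(prompt):
            exploited += 1
    return exploited / len(cves)
```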
“Surprisingly, we found that the average number of actions performed with and without a CVE description differed by only 14% (24.3 actions vs. 21.3 actions). We believe this is partly due to the length of the context window, further suggesting that a planning mechanism and subagents could increase performance,” the researchers wrote.
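The planner-and-subagents design the researchers allude to could, in rough outline, look like the sketch below: a planning step decomposes the task, and each subtask runs in a fresh agent with its own short context. Every name here is hypothetical, and the stubs are hardcoded so the sketch runs end to end.

```python
# Hypothetical planner/subagent pattern, following the researchers' remark
# that context-window limits may hurt a single long-running agent.
def plan_steps(goal: str) -> list[str]:
    """Planner: a real system would ask the model to decompose the goal;
    hardcoded here for illustration."""
    return [f"reconnaissance for: {goal}", "identify entry point",
            "attempt exploit"]

def run_subagent(subtask: str, notes: str) -> str:
    """One fresh, short-context agent per subtask; stubbed to show data flow."""
    return f"[result of '{subtask}' given {len(notes)} chars of prior notes]"

def exploit_with_planner(goal: str) -> str:
    notes = ""
    for subtask in plan_steps(goal):
        # Each subagent starts with a clean context, so long tool transcripts
        # from earlier steps do not crowd out the instructions.
        notes += run_subagent(subtask, notes) + "\n"
    return notes
```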
The future effect of LLMs on one-day vulnerabilities
The researchers concluded that their study demonstrates that LLMs can autonomously exploit one-day vulnerabilities, though only GPT-4 can currently achieve this. The concern, however, is that the capability and functionality of LLMs will only grow, making them an even more destructive and powerful tool for cybercriminals.
“Our findings demonstrate both the possibility of an emerging capability and the fact that it is harder to discover a vulnerability than it is to exploit it. Nevertheless, our findings underscore the need for the broader cybersecurity community and LLM vendors to think carefully about how to integrate LLM agents into defensive measures and how to deploy them at scale,” the report concludes.