P4WN3DGPT: An Exploratory Review of the Current and Prospective Impact of AI on Offensive Information Security Campaigns.

12–18 minutes

·

·

Photo Source: Shutterstock

By Nicolas I. Isamit

1. Introduction

“If cybercrime were a legitimate industry, it would be the third-largest economy in the world.”

These were the words that, in April 2025, John Fokker, former investigator at the Dutch National Police High-Tech Crime Unit and current Head of Threat Intelligence at Trellix, claimed at the RSAC Conference in San Francisco.[1] His remarks were aligned with what other vendors have stated about the subject. Back in 2020, cybercrime was predicted to cost the world $10.5 trillion USD annually by 2025. Now, the prediction has escalated further to $12.2 trillion USD by 2031.[2] In other words, cybercrime would only be surpassed by the United States and China.

With more than $2 billion USD in reported payments from ransomware incidents from 2022 to 2024, it is no surprise that the cybercrime industry is one of the most profitable out there.[3] As a result, threat actors have numerous incentives to exploit recent advances in generative AI to discover zero-day vulnerabilities or facilitate various steps in a kill chain. In 2023, an evil twin of ChatGPT, WormGPT 3, was released and advertised as an AI for criminals.[4] It was shut down the same year, but not long after that, KawaiiGPT, GhostGPT, and other dark LLMs joined WormGPT 4 in its return to business.[5]

For cybercriminals, dark LLMs became an alternative to traditional LLMs, which imposed guardrails designed to prevent the improper use of artificial intelligence tools. However, over time, as prompt injections bypassed system prompts, jailbreaking models like ChatGPT, Claude, or Gemini transformed traditional models into accessible hackers’ tools for phishing, ransomware, or even social engineering. In the past, coding knowledge, along with experience, was essential to becoming a successful hacker. Nowadays, with the arrival of AI, that has now changed, and the entry barrier has been lowered.

In fact, in September 2025, Anthropic detected suspicious activity that later proved to be part of an espionage campaign. According to them, the operation was orchestrated by Chinese state actors who used Claude LLM as the executor of cyberattacks.[6] For the first time in the history of cybersecurity, an autonomous attack by an AI had taken place with almost no human supervision and performed at least 80% of the attack by itself.[7] In December 2025, Anthropic’s CEO was asked to testify before the US Congress Homeland Security Committee regarding the incident.[8]

A month before, Cyber News’ security researchers tested ChatGPT, Gemini, and Claude with adversarial prompts. Using evasion strategies such as roleplaying and setting false premises, they tried out the models to override their system prompts regarding censored categories such as crime, self-harm, and hate speech.[9] To an extent, all three models complied when asked questions framed as part of academic studies, used the third person, or had poor grammar. If anything, the research proved that large language models could still leak potentially harmful information, regardless of safety restrictions. 

To put it differently, artificial intelligence has affected the way in which information security has been approached until now. It has made hunting for bugs and developing exploits easier, but it has also empowered the tactics to exploit the ultimate vulnerability of every security scheme: humans. According to security experts, it has never been this easy to get into cybercrime, maybe not even in the days before Microsoft shifted their focus and positioned security as a top priority after the Code Red worm infected hundreds of thousands of computers running Microsoft, and software had inherent vulnerabilities to take advantage of.[10]

In other words, AI changed everything.

2. Rise of AI-Augmented Techniques and Tools.

For the purposes of this article, techniques should be understood as specific methods a certain actor uses to achieve its goal, as defined in the ATT&CK framework.[11] For example, advanced persistent threats (APTs) that aim to obtain access to a network (the goal) could use a phishing attack as a technique. With the rise of AI, APTs are able to enhance their techniques, relying on artificial intelligence to refine their methods and increase efficiency.

2.1. Social Engineering: V-Phishing (Vishing) & AI Phishing

Phishing has been one of the techniques that has benefited the most from AI. According to CrowdStrike’s Threat Hunting Report for 2025, voice phishing (or vishing) attacks increased by 442% from the first to the second half of 2024. Most importantly, during the first half of 2025, vishing attacks had already surpassed the total number of attacks in 2024.[12] As LLMs perfect their abilities to write convincing emails and agentic AI starts to develop advanced voice capabilities, cybercriminals can turn these tools into sophisticated phishing packages that can target an entire organisation (or spearfish an individual) without needing more than one person behind the keyboard.

In 2024, OpenAI’s voice cloning models were supposedly able to work with only a 15-second sample.[13] Nowadays, those 15 seconds could be found anywhere on social media and then used to target coworkers, family or friends. As a consequence, distress calls from loved ones asking for money could easily be used against anyone. From phone calls from Joe Biden telling people not to vote, to sons asking for money for bail after a car wreck, artificial intelligence has made distinguishing an actual cry for help a challenging task to do.[14] As a technique threatening confidentiality, phishing has been made more accessible, and its entry threshold has been lowered. The dark web plays an important role in this, as security researchers have found AI-enabled phishing kits available for purchase for threat actors who do not have the time or expertise to set up everything by themselves.[15]

2.2.Vibe Coding, Dark LLMs & Prompt Injection

Vibe coding was chosen as the Word[16] Of The Year by Collins Dictionary for 2025, and it is defined as the practice of programming software with the aid of artificial intelligence, which turns natural language into computer code.[17] While not necessarily a technique, vibe coding grants coding knowledge to virtually any user willing to work on their AI prompts, whether to jailbreak a model or to obtain exactly what they need.

Most of the large language models follow a strict set of guidelines that prevent their misuse. For instance, ChatGPT will initially reject a petition of instructions on how to set up a homemade bomb. However, with the launch of WormGPT back in 2023, dark LLMs began to rise as an alternative for cybercriminals, who no longer needed to know how to code.[18] As already stated, this means AI could set up entire campaigns through prompts given by the user, capable of impacting the whole kill chain.

Figure 1. ChatGPT refuses to elaborate on forbidden topics if asked directly (Source: Own elaboration.)[19]

Nonetheless, traditional LLMs, through jailbreak, can potentially deliver the same result and actively make the work of targeting the integrity of systems easier. The incident with Claude proved that highly autonomous attacks could now be executed by AI models that followed guardrails, although it was not the first time an AI model was used to enable efficiency from a red team’s perspective. In 2025, a researcher at a top cybersecurity company used DeepSeek to find two zero-day exploits in one of the world’s leading browsers.[20] Therefore, through prompt injection, threat actors could eventually find vulnerabilities and actively ask the model to develop exploits and tailored payloads, regardless of the efforts of companies such as Google, Anthropic, OpenAI and Microsoft to protect their LLMs, as the user can trick the model to blur the lines between the user and the system prompts.[21] As the golden rule states, never trust user input.

Figure 2. If the right prompt is used, LLMs will be more open to discussing prohibited topics (Source: Own elaboration.)

Figure 3. ChatGPT shares an example of how to write a keylogger. Although simple, it could prove useful for threat actors who lack coding expertise (Source: Own elaboration.)

Initiatives such as HackAPrompt try to raise awareness and mitigate this challenge through offering financial compensation to those who successfully gaslight AI models.[22] An illustrative example of how prompt injection works was developed by Lakera through its Gandalf model, an LLM designed to be jailbroken and with different levels of difficulty.[23]

Figure 4. ChatGPT delves into the techniques used by hackers to evade detection and shares a simple code to transfer captured data through Hypertext Transfer Protocol (Source: Own elaboration.)

As a matter of fact, artificial intelligence has democratised knowledge that had entry barriers before. While that can lead to considerable benefits for humanity, it also leads to the kind of challenges that usually follow closely after an initial revolution has begun. As a consequence, information security, along with hundreds of other industries, has been directly impacted by its effects.

3. Implications for Information Security.

When it comes to information security, artificial intelligence has made it possible for any potential threat actor to access valuable know-how that will suit their goals. As mentioned, during 2024 and 2025, a single threat actor infiltrated more than 300 organisations using AI-enabled techniques and procedures.[24] If the trend continues, AI will continue to scale up the efficiency of attack campaigns, and therefore, it will force blue teamers to adopt AI-enabled tools, like CrowdStrike’s Falcon or Palo Alto Networks’ Cortex AgentiX.

From mitigating AI-enabled deepfakes and phishing to zero-day exploits coded by LLMs, cybersecurity vendors will continue to have only one alternative: fight fire with fire, and use AI to enable detection at machine speed once humans can no longer keep up with the volume and velocity of AI-boosted attacks.[25] In 2024, Mandiant released an analysis of Time-to-Exploit[26] (TTE) trends during 2023 and found that the average TTE was around five days, 27 days quicker than in 2021 and 2022.[27] According to the penetration testing company DeepStrike, this decrease was exacerbated during the first half of 2025, as TTE was reduced to approximately 24 hours.[28]

Figure 5. Time-to-Exploit Trends (Source: Own elaboration based on data from DeepStrike, CyberMindr, and Google Mandiant).[29]

As stated by several threat reports for 2024 and 2025, this has caused a significant percentage of vulnerabilities to be exploited before they are even listed on the Common Vulnerabilities and Exposures database, the publicly known database that assigns them IDs.[30] Considering the undeniable rise in the use of AI to conduct attacks, it is logical to assume that, to an extent, AI has been one of the causes of the decrease in TTE trends.

While one of the arguments to mitigate the effects of AI on information security is to strengthen regulations, the current tendency inclines towards the opposite, as, in the name of innovation, regulation falls behind. In December 2025, U.S. President Donald Trump signed an executive order to block states from creating their own AI regulations.[31] The move fits into the larger geopolitical context where the US and China race as the main contestants in the AI competition, and whoever emerges as the victor will probably dictate the future of artificial intelligence, with all the advantages it brings not only in the technology sector, but also in military, economic, and social sectors.[32]

In the game of the shield and the sword, information security is probably one of the most affected fields by AI. For now, one thing is sure: AI’s impact will continue to escalate as it perfects itself as the ideal hacker companion, either state-sponsored or not. The use of Claude to carry out a complete cyberespionage campaign with almost zero human supervision could be a window through which to look into a future where generative agentic AI could be able to be completely in charge of the full kill chain if the user asks for it. This could lead to the creation of entire sci-fi autonomous AI criminal rings. In short, one way or another, AI has embedded itself in virtually every security professional’s work life out there, and it will probably continue to do so in the future.

References

‘Average Time-to-Exploit in 2025’. CyberMindr, 29 August 2025. https://www.cybermindr.com/blog/average-time-to-exploit-in-2025/.

Baron, Callie, and Piotr Wojtyla. ‘How GhostGPT Empowers Cybercriminals with Uncensored AI’. Abnormal AI, 23 January 2025. https://abnormal.ai/blog/ghostgpt-uncensored-ai-chatbot.

Bethea, Charles. ‘The Terrifying A.I. Scam That Uses Your Loved One’s Voice’. Annals of Artificial Intelligence. The New Yorker, 7 March 2024. https://www.newyorker.com/science/annals-of-artificial-intelligence/the-terrifying-ai-scam-that-uses-your-loved-ones-voice.

Charrier, Casey, and Robert Weiner. ‘How Low Can You Go? An Analysis of 2023 Time-to-Exploit Trends’. Google Cloud Blog. https://cloud.google.com/blog/topics/threat-intelligence/time-to-exploit-trends-2023.

‘Collins – The Collins Word of the Year 2025 Is…’ https://www.collinsdictionary.com/woty.

Cress, Laura. ‘Vibe Coding’ Named Word of the Year by Collins Dictionary. 6 November 2025. https://www.bbc.com/news/articles/cpd2y053nleo.

CrowdStrike. European Threat Landscape Report. CrowdStrike, 2025.

CrowdStrike. Threat Hunting Report. 2025. https://www.crowdstrike.com/explore/2025-threat-hunt-report.

David, Emilia. ‘OpenAI’s Voice Cloning AI Model Only Needs a 15-Second Sample to Work’. The Verge, 29 March 2024. https://www.theverge.com/2024/3/29/24115701/openai-voice-generation-ai-model.

Disrupting the First Reported AI-Orchestrated Cyber Espionage Campaign. Anthropic, 2025. https://assets.anthropic.com/m/ec212e6566a0d47/original/Disrupting-the-first-reported-AI-orchestrated-cyber-espionage-campaign.pdf.

European Union Agency for Law Enforcement Cooperation, ed. European Union Serious and Organised Crime Threat Assessment: The Changing DNA of Serious and Organised Crime. Publications Office, 2025. https://doi.org/10.2813/0758057.

Fox, Taylor. ‘Cybercrime To Cost The World $12.2 Trillion Annually By 2031’. Cybercrime Magazine, 28 May 2025. https://cybersecurityventures.com/official-cybercrime-report-2025/.

‘Gandalf | Lakera – Test Your AI Hacking Skills’. https://gandalf.lakera.ai/baseline.

Garimella, Abhilash. ‘Dark Web Toolkits Using AI Fuel New Phishing Attacks’. NetworkComputing, 2024. https://www.networkcomputing.com/network-security/the-expanding-dark-web-toolkit-using-ai-to-fuel-modern-phishing-attacks.

George, Torsten. ‘Five Cybersecurity Predictions for 2026: Identity, AI, and the Collapse of Perimeter Thinking’. SecurityWeek, 17 December 2025. https://www.securityweek.com/five-cybersecurity-predictions-for-2026-identity-ai-and-the-collapse-of-perimeter-thinking/.

Glenny, Misha. ‘Cyber Crime Is Surging. Will AI Make It Worse?’ The Weekend Essay. Financial Times, 7 June 2025. https://www.ft.com/content/d3119d3f-97bd-4ff4-905d-b471a8828beb.

Greig, Jonathan. More than $2 Billion in Payments from 4,000 Ransomware Incidents Reported to Treasury in Recent Years. n.d. https://therecord.media/fincen-treasury-2-billion-ransomware-payments-report?mkt_tok=Njc4LUZITC03MTAAAAGeoVAsungk6lS43IIbm5GaGx2zZo8KqkF1ng0s-rJ6wXpMLBt6QWPK8ARQqyXZlFXbQaSrRSkMuFd5AzSbxwTXvL52op0PHTClQTxgpN5f.

HackAPrompt. ‘HackAPrompt 2.0’. https://hackaprompt.com.

Heikkilä, Melissa. ‘Tech Groups Step up Efforts to Solve AI’s Big Security Flaw’. Cyber Security. Financial Times, 2 November 2025. https://www.ft.com/content/56cb100e-7146-488f-aae5-55304ae0eff6.

Hoskins, Peter, and Lily Jamali. ‘Trump Signs Order Blocking States from Enforcing Own AI Rules’. BBC, n.d. https://www.bbc.com/news/articles/crmddnge9yro.

Khalil, Mohammed. ‘Vulnerabilities Statistics 2025: Record CVE Surge’. DeepStrike, 8 October 2025. https://deepstrike.io/blog/vulnerability-statistics-2025.

Kovacs, Eduard. ‘DeepSeek’s Malware-Generation Capabilities Put to Test’. SecurityWeek, 13 March 2025. https://www.securityweek.com/deepseeks-malware-generation-capabilities-put-to-test/.

Kovacs, Eduard. ‘WormGPT 4 and KawaiiGPT: New Dark LLMs Boost Cybercrime Automation’. SecurityWeek, 25 November 2025. https://www.securityweek.com/wormgpt-4-and-kawaiigpt-new-dark-llms-boost-cybercrime-automation/.

Perlroth, Nicole. This Is How They Tell Me The World Ends. Bloomsbury Publishing, 2021.

Sabeckis, Mantas, and Jurgita Lapienyté. ‘We Tested ChatGPT, Gemini, and Claude with Adversarial Prompts: Here Are Our Findings and Risks’. Cybernews, 13 November 2025. https://cybernews.com/security/we-tested-chatgpt-gemini-and-claude/.

Sabin, Sam. ‘Exclusive: Anthropic CEO Called to Testify before Congress about Chinese AI Cyberattack’. Axios, 26 November 2025. https://www.axios.com/2025/11/26/anthropic-google-cloud-quantum-xchange-house-homeland-hearing.

Taddeo, Mariarosaria. ‘Agentic AI Is the Hacker’s New Accomplice’. Cyber Warfare. Financial Times, 27 November 2025. https://www.ft.com/content/9966d9e8-7fd3-4324-b57e-f02763795d29.

‘Techniques – Enterprise | MITRE ATT&CK®’. https://attack.mitre.org/techniques/enterprise/.

Tidy, Joe. ‘AI Firm Claims Chinese Spies Used Its Tech to Automate Cyber Attacks’. BBC, 14 November 2025. https://www.bbc.com/news/articles/cx2lzmygr84o.


[1] Misha Glenny, ‘Cyber Crime Is Surging. Will AI Make It Worse?’, The Weekend Essay, Financial Times, 7 June 2025, https://www.ft.com/content/d3119d3f-97bd-4ff4-905d-b471a8828beb.

[2] Taylor Fox, ‘Cybercrime To Cost The World $12.2 Trillion Annually By 2031’, Cybercrime Magazine, 28 May 2025, https://cybersecurityventures.com/official-cybercrime-report-2025/.

[3] Jonathan Greig, More than $2 Billion in Payments from 4,000 Ransomware Incidents Reported to Treasury in Recent Years, n.d., https://therecord.media/fincen-treasury-2-billion-ransomware-payments-report?mkt_tok=Njc4LUZITC03MTAAAAGeoVAsungk6lS43IIbm5GaGx2zZo8KqkF1ng0s-rJ6wXpMLBt6QWPK8ARQqyXZlFXbQaSrRSkMuFd5AzSbxwTXvL52op0PHTClQTxgpN5f.

[4] Glenny, ‘Cyber Crime Is Surging. Will AI Make It Worse?’

[5] Eduard Kovacs, ‘WormGPT 4 and KawaiiGPT: New Dark LLMs Boost Cybercrime Automation’, SecurityWeek, 25 November 2025, https://www.securityweek.com/wormgpt-4-and-kawaiigpt-new-dark-llms-boost-cybercrime-automation/; Callie Baron and Piotr Wojtyla, ‘How GhostGPT Empowers Cybercriminals with Uncensored AI’, Abnormal AI, 23 January 2025, https://abnormal.ai/blog/ghostgpt-uncensored-ai-chatbot.

[6] Disrupting the First Reported AI-Orchestrated Cyber Espionage Campaign (Anthropic, 2025), https://assets.anthropic.com/m/ec212e6566a0d47/original/Disrupting-the-first-reported-AI-orchestrated-cyber-espionage-campaign.pdf.

[7] Mariarosaria Taddeo, ‘Agentic AI Is the Hacker’s New Accomplice’, Cyber Warfare, Financial Times, 27 November 2025, https://www.ft.com/content/9966d9e8-7fd3-4324-b57e-f02763795d29; Tidy, ‘AI Firm Claims Chinese Spies Used Its Tech to Automate Cyber Attacks’; Disrupting the First Reported AI-Orchestrated Cyber Espionage Campaign.

[8] Sam Sabin, ‘Exclusive: Anthropic CEO Called to Testify before Congress about Chinese AI Cyberattack’, Axios, 26 November 2025, https://www.axios.com/2025/11/26/anthropic-google-cloud-quantum-xchange-house-homeland-hearing.

[9] Mantas Sabeckis and Jurgita Lapienyté, ‘We Tested ChatGPT, Gemini, and Claude with Adversarial Prompts: Here Are Our Findings and Risks’, Cybernews, 13 November 2025, https://cybernews.com/security/we-tested-chatgpt-gemini-and-claude/.

[10] Nicole Perlroth, This Is How They Tell Me The World Ends (Bloomsbury Publishing, 2021), 36–40.

[11] ‘Techniques – Enterprise | MITRE ATT&CK®’, https://attack.mitre.org/techniques/enterprise/.

[12] CrowdStrike, European Threat Landscape Report (CrowdStrike, 2025), 4.

[13] Emilia David, ‘OpenAI’s Voice Cloning AI Model Only Needs a 15-Second Sample to Work’, The Verge, 29 March 2024, https://www.theverge.com/2024/3/29/24115701/openai-voice-generation-ai-model.

[14] Charles Bethea, ‘The Terrifying A.I. Scam That Uses Your Loved One’s Voice’, Annals of Artificial Intelligence, The New Yorker, 7 March 2024, https://www.newyorker.com/science/annals-of-artificial-intelligence/the-terrifying-ai-scam-that-uses-your-loved-ones-voice.

[15] Abhilash Garimella, ‘Dark Web Toolkits Using AI Fuel New Phishing Attacks’, NetworkComputing, 2024, https://www.networkcomputing.com/network-security/the-expanding-dark-web-toolkit-using-ai-to-fuel-modern-phishing-attacks.

[16] Even though it is made up of two words.

[17] Laura Cress, ‘Vibe Coding’ Named Word of the Year by Collins Dictionary, 6 November 2025, https://www.bbc.com/news/articles/cpd2y053nleo; ‘Collins – The Collins Word of the Year 2025 Is…’, https://www.collinsdictionary.com/woty.

[18] Eduard Kovacs, ‘DeepSeek’s Malware-Generation Capabilities Put to Test’, SecurityWeek, 13 March 2025, https://www.securityweek.com/deepseeks-malware-generation-capabilities-put-to-test/.

[19] Given the nature of the topics, ChatGPT will not allow the sharing of the link to conversations regarding these matters.

[20] Glenny, ‘Cyber Crime Is Surging. Will AI Make It Worse?’

[21] Melissa Heikkilä, ‘Tech Groups Step up Efforts to Solve AI’s Big Security Flaw’, Cyber Security, Financial Times, 2 November 2025, https://www.ft.com/content/56cb100e-7146-488f-aae5-55304ae0eff6.

[22] ‘HackAPrompt 2.0’, HackAPrompt, https://hackaprompt.com.

[23] ‘Gandalf | Lakera – Test Your AI Hacking Skills’, https://gandalf.lakera.ai/baseline.

[24] CrowdStrike, Threat Hunting Report, 15–19.

[25] Torsten George, ‘Five Cybersecurity Predictions for 2026: Identity, AI, and the Collapse of Perimeter Thinking’, SecurityWeek, 17 December 2025, https://www.securityweek.com/five-cybersecurity-predictions-for-2026-identity-ai-and-the-collapse-of-perimeter-thinking/.

[26] TTE refers to the time it takes a vulnerability to be exploited before or after a patch is released.

[27] Casey Charrier and Robert Weiner, ‘How Low Can You Go? An Analysis of 2023 Time-to-Exploit Trends’, Google Cloud Blog, https://cloud.google.com/blog/topics/threat-intelligence/time-to-exploit-trends-2023.

[28] Mohammed Khalil, ‘Vulnerabilities Statistics 2025: Record CVE Surge’, DeepStrike, 8 October 2025, https://deepstrike.io/blog/vulnerability-statistics-2025.

[29] Khalil, ‘Vulnerabilities Statistics 2025’; ‘Average Time-to-Exploit in 2025’, CyberMindr, 29 August 2025, https://www.cybermindr.com/blog/average-time-to-exploit-in-2025/; Charrier and Weiner, ‘How Low Can You Go?’

[30] ‘Average Time-to-Exploit in 2025’.

[31] Peter Hoskins and Lily Jamali, ‘Trump Signs Order Blocking States from Enforcing Own AI Rules’, BBC, n.d., https://www.bbc.com/news/articles/crmddnge9yro.

[32] Glenny, ‘Cyber Crime Is Surging. Will AI Make It Worse?’