Five Strategies to Mitigate LLM Risks in Cybersecurity Applications

While most CISOs and CIOs have created AI PoliciesIt has become clear that greater due diligence, oversight and governance are needed for the use of AI in a cybersecurity context. Deloitte’s annual cyber threat report, 66% of organizations have experienced ransomware attacks. There has also been a 400% increase in IoT malware attacks. And in 2023 91% of organizations have had to remediate a supply chain attack affecting their code or the systems they used.

This is because long-standing cybersecurity practices that have worked in the past have not kept pace with the capabilities and threats presented by large language models (LLM). These LLMs trained on vast amounts of data can make security operations teams and the threats they’re trying to mitigate smarter. Because LLMs are different from other security tools, we need to take a different set of approaches to mitigate their risks. Some involve new security technologies. Others are proven tactics modified for LLMs. These include:

Contradictory training: As part of the development or testing process, security professionals should expose LLMs to inputs designed to test their limits and prompt the LLM to break rules or behave maliciously. This works best during the training or development phase, before the system is fully implemented. This may involve generating adversarial examples using techniques such as adding noise, creating specific misleading prompts, or using known attack patterns to simulate potential threats. That said, CISOs should have their teams (or vendors) perform adversarial attacks on an ongoing basis to ensure compliance and identify risks or gaps.

Integrating explainability:In LLMs, “explainability” now means the ability to explain why a specific outcome was proposed. This requires cybersecurity LLM providers to add a layer of explainability to their LLM-based tools; the deep neural networks used to create LLM models are in the early stages of developing this layer. Tellingly, few security LLMs today promise explainability. This is because it is very difficult to build this reliably, and even the largest and most well-resourced LLM creators struggle to do so. This lack of explainability logically leads to the next steps of mitigation.

Continuous monitoring: Implementing security control monitoring systems is not new. Asset inventories and security posture management tools attempt to do this. However, LLMs are different, and continuous monitoring must detect anomalous or unexpected results of LLMs in real-world usage. This is especially challenging when the results are unpredictable and potentially infinite. Large AI vendors like OpenAI and Anthropic deploy specific LLMs to monitor their LLMs—a spy to catch a spy, so to speak. In the future, most LLM deployments will operate in pairs—one for output and usage, the other for monitoring.

The human in the loop: Because LLMs are so novel and potentially risky, organizations need to combine LLM suggestions with human expertise for critical decision-making. However, keeping a human in the loop doesn’t completely solve the problem. Research on human decision-making when combined with AIs has shown that LLMs that appear more authoritative cause human operators to “take their hands off the wheel” and overly trust AIs. CISOs and their teams need to create a failsafe process where LLMs aren’t overly trusted or given too much responsibility, such that human operators become overly dependent and unable to discern between LLM errors and hallucinations. One option: Initially introduce LLMs in “Suggestion Only” mode, where they offer advice and guidance but aren’t allowed to make changes, share information, or otherwise interact with systems and the like without the explicit permission of their human operator.

Sandboxing and progressive deployment: It is essential to thoroughly test LLMs in isolated environments before actual deployment. While this is related to adversarial training, it is also different because we need to test the LLM in circumstances that are nearly identical to real cybersecurity processes and workflows. This training should even constitute real attacks based on real vulnerabilities and TTPs in play in the field. Obviously, most security controls and tools go through a similar sandbox deployment process, and for good reason. Because cybersecurity environments are so multifaceted and complex, with organizations deploying dozens of tools, unexpected interactions and behaviors can emerge.

LLMs have an increased risk of unforeseen events, so their integration, usage, and maintenance protocols should be closely monitored. Once the CISO is satisfied that an LLM is sufficiently secure and effective, they can proceed with a gradual and methodical deployment. For best results, deploy the LLM first for the least critical and most complex tasks and gradually introduce it into the most cognitively demanding workflows and processes that require good human judgment.

Aqsa Taylor, Director of Product Management, Gutsy

Texas awards $170 million contract to SAIC for IT and cybersecurity services

Crystal Ball 2025: Insights into AI, Automation, and Insider Threat Detection

IBI Union and Atlantic Overseas Bank announce $1 billion investment in AI and cybersecurity

Five Strategies to Mitigate LLM Risks in Cybersecurity Applications

Related Posts

Texas awards $170 million contract to SAIC for IT and cybersecurity services

Crystal Ball 2025: Insights into AI, Automation, and Insider Threat Detection

IBI Union and Atlantic Overseas Bank announce $1 billion investment in AI and cybersecurity