In recent months, we’ve seen government and business leaders place greater emphasis on securing AI models. If Generative AI is the next big platform to transform the services and functions that society as a whole depends on, ensuring technology is reliable and secure must be the top priority for businesses. While the adoption of generative AI is in its infancy, we need to establish effective strategies to ensure it from the start.
THE IBM Institute for Business Value found that although 64% of CEOs face significant pressure from investors, creditors and lenders to accelerate the adoption of generative AI, 60% are not yet developing a coherent approach to AI generative at the enterprise level. In fact, 84% are concerned on the widespread or catastrophic cybersecurity attacks that the adoption of generative AI could lead to.
As organizations determine how best to integrate generative AI into their business models and evaluate the security risks the technology could introduce, it is useful to examine the top attacks that malicious actors could execute against generative AI models. ‘AI. Although only a small number of actual AI attacks have been reported, IBM X-Force Red tested models to determine the types of attacks most likely to appear in the wild. To understand the potential risks associated with generative AI that organizations must mitigate when adopting the technology, this blog will outline some of the attacks that adversaries are likely to pursue, including rapid injection, data poisoning, Model escape, model extraction, inversion and provisioning. chain attacks.
Types of security attacks represented by their level of difficulty for a malicious actor to execute and their potential impact on a business.
Rapid injection
Rapid injection attacks manipulate Large Language Models (LLM) by creating malicious inputs that seek to override the system prompt (initial instructions for the AI provided by the developer). This may cause a model to be jailbroken for perform involuntary actionscircumvent content policies to generate misleading or harmful responses, or reveal sensitive information.
LLMs are biased toward obedience to the user and are susceptible to the same deceptions as humans, similar to social engineering. Therefore, it is very easy to bypass the content filters in place, often as simple as asking the LLM to “pretend it’s a character,” or for “play a game.” This attack can cause reputational damage by generating harmful content; degradation of services, by creating prompts that trigger excessive use of resources; and intellectual property or data theft, by revealing a confidential system prompt.
Data Poisoning
Data poisoning attacks consist of adversaries tampering with the data used to train AI models in order to introduce vulnerabilities, biases, or change the model’s behavior. This can potentially compromise the effectiveness, security, or integrity of the model. Assuming models are trained on closed datasets, this requires a high level of access to the data pipeline, either through access from a malicious insider or through sophisticated privilege escalation through alternative means. However, models trained on open source datasets would be an easier target for data poisoning because attackers have more direct access to the public source.
The impact of this attack could range from disinformation tries to kill Die Hard 4.0depending on the threat actor’s objective, fundamentally compromising the integrity and effectiveness of a model.
Model Escape
A model evasion attack would allow attackers to modify inputs to the AI model in a way that causes it to misclassify or misinterpret them, thereby changing its intended behavior. This can be done visibly to a human observer (e.g., by putting small stickers on stop signs so that self-driving cars ignore them) or invisibly (e.g., by changing individual pixels of a image by adding noise that fools an object recognition model).
Depending on the complexity of the AI model, this attack may vary in complexity and executability. What is the format and size of the model inputs and outputs? Does the attacker have unlimited access? Depending on the goal of the AI system, a successful pattern evasion attack could have a significant impact on the business. For example, if the model is used for security purposes or to make important decisions such as approving loans, overriding the intended behavior could cause significant harm.
However, given the variables here, attackers opting for the path of least resistance are unlikely to use this tactic to advance their malicious objective.
Model extraction
Model extraction attacks aim to steal the intellectual property (IP) and behavior of an AI model. They are performed by interrogating it thoroughly and monitoring inputs and outputs to understand its structure and decisions, before attempting to reproduce it. However, their execution requires considerable resources and knowledge, and as the complexity of the AI model increases, so does the level of difficulty of executing this attack.
Although the loss of intellectual property could have significant competitive implications, if attackers have the skills and resources to successfully perform model extraction and replication, they will likely find it easier to simply download an open source model and customize it to behave the same way. Additionally, techniques such as strict access controls, monitoring, and rate limiting significantly hinder adversarial attempts without direct access to the model.
Reversal attacks
While extraction attacks aim to steal the behavior of the model itself, inversion attacks aim to find information about a model’s training data, even if they only have access to the model and at its exits. Model inversion allows an attacker to reconstruct the data on which a model was trained, and membership inference attacks can determine whether specific data was used in training the model.
The complexity of the model and the extent of the information derived from it would influence the level of complexity of executing such an attack. For example, some inference attacks exploit the fact that a model generates a confidence value as well as a result. In this case, attackers can attempt to reconstruct an input that maximizes the returned confidence value. That said, attackers are unlikely to have the unrestricted access required to a model or its outputs to make this practical in the wild. However, the potential for data leaks and privacy violations carries risks.
Supply chain attacks
AI models are more integrated than ever into business processes, SaaS applications, plugins, and APIs, and attackers can target vulnerabilities in these connected services to compromise the behavior or functionality of the models. Additionally, companies are using freely available models from repositories like Hugging Face to get a head start on AI development, which could incorporate malicious features like Trojans and backdoors.
Successful exploitation of connected integrations requires in-depth knowledge of the architecture and often the exploitation of multiple vulnerabilities. Although these attacks require a high level of sophistication, they are also difficult to detect and could have a significant impact on organizations without an effective detection and response strategy.
Given the interconnected nature of AI systems and their increasing involvement in critical business processes, protecting against supply chain attacks should be a high priority. Verification of third-party components, monitoring of vulnerabilities and anomalies and implementation DevSecOps best practices are crucial.
Securing AI
IBM recently presented the IBM framework for securing AI — help customers, partners and organizations around the world better prioritize the most important defensive approaches to secure their generative AI initiatives against anticipated attacks. The more organizations understand what types of attacks are possible against AI, the more they can improve their cyber preparedness by developing effective defense strategies. And while it will take time for cybercriminals to invest in the resources needed to attack AI models at scale, security teams have a rare time advantage: the ability to secure AI, before attackers place the technology at the center of their target. No organization is exempt from the need to establish an AI security strategy. This includes both models they actively invest in to optimize their business and tools introduced like shadow AI by employees looking to improve their productivity.
If you want to learn more about securing AI and how AI can improve the time and talent of your security teams, read our authoritative guide.