Picture this: it’s a Saturday morning and you’ve made breakfast for your family. The pancakes came out golden brown and looked great, but everyone, including you, got sick shortly after eating them. Unbeknownst to you, the milk you used in the batter had expired several weeks earlier. The quality of the ingredients ruined the meal, even though everything looked fine on the outside.
The same principle applies to artificial intelligence (AI). Whatever its objective, the output of an AI system is directly tied to the quality of its input. As the popularity of AI continues to grow, so do concerns about the security of the data being fed into it.
Today, the majority of organizations are integrating AI into their business operations to some extent, and bad actors are taking notice. Over the past few years, a tactic known as AI poisoning has become increasingly prevalent. This malicious practice involves injecting misleading or harmful data into AI training sets. The tricky part about AI poisoning is that even when the input is compromised, the output can continue to look normal at first. Only once a malicious actor has a firm grip on the data and launches a full-fledged attack do deviations from the norm become evident. The consequences range from mildly embarrassing outputs to lasting damage to a brand’s reputation.
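To illustrate why poisoned input can go unnoticed at first, here is a minimal sketch of label-flipping poisoning using synthetic data and scikit-learn (both assumptions made purely for illustration, not drawn from the incidents described here). With only a small fraction of training labels flipped, the model’s measured accuracy barely moves; with heavier poisoning, the degradation becomes obvious.

```python
# Minimal sketch: how silently poisoned labels degrade a classifier.
# Synthetic data and model choice are illustrative assumptions.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=5000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

for poison_rate in (0.0, 0.05, 0.20, 0.40):
    y_poisoned = y_train.copy()
    n_poison = int(poison_rate * len(y_poisoned))
    idx = np.random.default_rng(0).choice(len(y_poisoned), n_poison, replace=False)
    y_poisoned[idx] = 1 - y_poisoned[idx]  # flip labels on the poisoned subset
    model = LogisticRegression(max_iter=1000).fit(X_train, y_poisoned)
    print(f"poisoned {poison_rate:.0%} of labels -> "
          f"test accuracy {model.score(X_test, y_test):.3f}")
```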
This is a risk that affects organizations of all sizes, including today’s largest technology providers. Over the past few years, for example, adversaries have launched several large-scale attempts to poison Google’s Gmail spam filters and even turned Microsoft’s Twitter chatbot hostile.
Defending against AI data poisoning
Fortunately, organizations can take the following steps to protect AI technologies from potential poisoning.
- Create a comprehensive data catalog. First, organizations must create a live data catalog that serves as a centralized index of the information fed to their AI systems. Whenever new data is added to an AI system, it should be tracked in this index. The catalog should also classify data flowing through AI systems according to who, what, when, where, why and how, to ensure transparency and accountability (see the catalog sketch after this list).
- Develop a baseline of normal behavior for users and devices interacting with AI data. Once security and IT teams have a solid understanding of all the data in their AI systems and who has access to it, the next step is to establish a baseline of normal user and device behavior.
Compromised credentials are one of the easiest ways for cybercriminals to break into networks. A bad actor only has to play a guessing game or buy one of the roughly 24 billion username-and-password combinations available on cybercriminal markets. Once they have access, a malicious actor can easily work their way into AI training datasets.
By establishing a baseline of user and device behavior, security teams can easily detect anomalies that could indicate an attack. Often, this stops a threat actor before an incident escalates into a full-blown data breach. For example, suppose an IT manager who typically works out of the New York office oversees AI training datasets. One day, that account appears to be active in another country and is adding large amounts of data to the AI systems. If the security team already has a baseline of the manager’s behavior, it can quickly determine that this is abnormal. Security can then contact the manager to verify the activity or, if it was not legitimate, temporarily deactivate the account until the alert has been thoroughly investigated, preventing further damage (see the baselining sketch below).
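To make the first step concrete, here is a minimal sketch in Python of what a catalog entry for AI training data could record. The field names, the hashing step and the register helper are illustrative assumptions, not the schema of any specific data catalog product.

```python
# Minimal sketch of a catalog entry for data fed to an AI system.
# Fields mirror the who/what/when/where/why/how classification above;
# names and structure are illustrative assumptions.
from dataclasses import dataclass, field, asdict
from datetime import datetime, timezone
import hashlib
import json

@dataclass
class CatalogEntry:
    who: str    # user or service account that supplied the data
    what: str   # description of the dataset or record batch
    where: str  # source system or storage location
    why: str    # business purpose for adding it to the training set
    how: str    # ingestion method (API upload, batch job, manual, ...)
    when: str = field(default_factory=lambda: datetime.now(timezone.utc).isoformat())
    sha256: str = ""  # content hash so later tampering can be detected

def register(catalog: list, entry: CatalogEntry, content: bytes) -> None:
    """Record a new addition to the AI training data in the central index."""
    entry.sha256 = hashlib.sha256(content).hexdigest()
    catalog.append(asdict(entry))

catalog = []
register(
    catalog,
    CatalogEntry(who="svc-ingest-01", what="Q3 support-ticket text",
                 where="s3://example-bucket/tickets/q3/",
                 why="fine-tune the support chatbot", how="scheduled batch job"),
    b"...raw training data...",
)
print(json.dumps(catalog, indent=2))
```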
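And here is a minimal sketch of the second step, checking an event against a stored behavioral baseline. The thresholds, the five-times-average rule and the field names are assumptions for illustration, not the logic of any particular SIEM or UEBA product.

```python
# Minimal sketch: flag activity that deviates from a user's baseline.
# Thresholds and fields are illustrative assumptions.
from dataclasses import dataclass

@dataclass
class Baseline:
    usual_countries: set        # where this user normally works from
    avg_daily_upload_mb: float  # typical volume added to training sets

def anomaly_reasons(baseline: Baseline, country: str, upload_mb: float) -> list:
    """Return the ways an event deviates from the user's baseline."""
    reasons = []
    if country not in baseline.usual_countries:
        reasons.append(f"activity from unusual country: {country}")
    if upload_mb > 5 * baseline.avg_daily_upload_mb:  # assumed 5x threshold
        reasons.append(f"unusually large addition to training data: {upload_mb} MB")
    return reasons

it_manager = Baseline(usual_countries={"US"}, avg_daily_upload_mb=200.0)
alerts = anomaly_reasons(it_manager, country="RO", upload_mb=4800.0)
if alerts:
    print("Hold the data and investigate before it reaches the training set:")
    for reason in alerts:
        print(" -", reason)
```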
Take responsibility for AI training sets
Just as you should check the quality of your ingredients before preparing a meal, it is essential to verify the integrity of AI training data. An AI system is only as good as the data it processes. Stronger guidelines, policies, monitoring systems and algorithms play a central role in keeping AI safe and effective. These measures protect against potential threats and enable organizations to harness the transformative potential of AI. It is a delicate balance: organizations must learn to leverage AI’s capabilities while remaining vigilant in the face of an ever-changing threat landscape.