You’d be hard-pressed to find a cybersecurity vendor that doesn’t present its AI-based defenses as a panacea. But AI defenses are by no means a silver bullet – and should not be treated as such.
AI data poisoning: a hidden threat on the horizon
AI data poisoning is not a theoretical concept but a tangible threat. The newly updated OWASP Top 10 for LLM Applications lists Training Data Poisoning among its top vulnerabilities (LLM03). Model creators do not have complete control over the data that goes into an LLM. Cybercriminals exploit this fact, using poisoned data to lead AI-based security defenses astray and teach them to make the wrong decisions. This deliberate manipulation becomes a silent accomplice, opening the door to exploitation of unsuspecting systems.
Imagine you have an ultra-intelligent AI model responsible for detecting anomalies in your system. What if someone sneaked data into its training or development, deliberately teaching it to ignore real threats?
For the attacker, it’s all about disguising the data so that it appears legitimate. In some cases, bad actors will take real data and change only a few values – anything to fool the AI into accepting it as genuine. Essentially, data is used to teach the AI to make bad decisions.
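To make the mechanics concrete, here is a minimal, self-contained sketch. It uses a deliberately simple toy detector (not any vendor’s model, and all numbers are illustrative assumptions): a modest batch of bot-like records mislabeled as “benign” shifts what the model believes normal looks like, and a genuinely bot-like probe slips through.

```python
# Toy illustration of training data poisoning -- not any real product's model.
import numpy as np

rng = np.random.default_rng(7)

def fit_threshold(benign, bots):
    """Toy detector: flag traffic above the midpoint of the class means."""
    return (benign.mean() + bots.mean()) / 2

# Single feature: requests per minute. Humans average ~10, bots ~40.
benign = rng.normal(10, 2, 500)
bots   = rng.normal(40, 2, 500)

clean_threshold = fit_threshold(benign, bots)  # ~25

# Poisoning: attacker sneaks bot-like records into the training set,
# labeled as "benign" so they pass casual inspection.
poisoned_benign = np.concatenate([benign, rng.normal(40, 2, 150)])
poisoned_threshold = fit_threshold(poisoned_benign, bots)  # ~28

probe = 27.0  # a genuinely bot-like burst of traffic
print(f"clean model flags it:    {probe > clean_threshold}")     # True
print(f"poisoned model flags it: {probe > poisoned_threshold}")  # False
```

Real-world poisoning is subtler and targets far more complex models, but the failure mode is the same: the model faithfully learns whatever its training data tells it.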
Fake it til you make it: Bypassing AI security using harvested data
In a more sophisticated variant, the data used to fool the AI actually is the real deal. In this scenario, adversaries harvest genuine data (usually stolen) and replay it to circumvent AI models. In practice, this means collecting device fingerprints along with recorded mouse movements and gestures, then hardcoding and randomizing them inside an automated script. Although the data is technically real, it is inauthentic because it was not originally generated by the person presenting it.
This data is then fed into the security tool’s AI model at scale, which can bypass defenses if the AI or ML model cannot detect that the data has been harvested. Let that sink in: even the most advanced AI models are ineffective for security defense if they cannot verify that the data presented to them is authentic rather than harvested (i.e., fake).
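One illustrative (and deliberately simplified) server-side heuristic: genuine human telemetry is effectively never bit-identical twice, so a behavioral trace that keeps reappearing across supposedly distinct sessions is a strong replay signal. The function names and threshold below are assumptions for illustration, not any product’s logic.

```python
# Sketch: spotting harvest-and-replay by counting repeated behavioral traces.
import hashlib
from collections import Counter

def trace_digest(fingerprint: str, mouse_trace: list[tuple[int, int]]) -> str:
    """Hash the device fingerprint together with the recorded mouse path."""
    payload = fingerprint + "|" + ";".join(f"{x},{y}" for x, y in mouse_trace)
    return hashlib.sha256(payload.encode()).hexdigest()

seen = Counter()

def looks_replayed(fingerprint: str, mouse_trace: list[tuple[int, int]],
                   max_repeats: int = 3) -> bool:
    digest = trace_digest(fingerprint, mouse_trace)
    seen[digest] += 1
    # Real human mouse movement essentially never repeats exactly; the same
    # trace arriving from many "different" sessions suggests replay.
    return seen[digest] > max_repeats
```

Exact-match hashing is easily defeated by the randomization described above, which is why serious defenses also look for the statistical artifacts that randomization leaves behind. The point here is only that authenticity, not mere plausibility, must be tested.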
Outdated anti-bot defenses, unprepared for AI attacks
Teams that rely on traditional bot management solutions can miss more than 90% of malicious bot requests due to these new types of data poisoning and fake-data attacks, in which harvested data and automation are used to bypass their ML and AI security detections.
To illustrate the scale of this problem, take a look at what happened when Kasada’s bot defense was enabled for a leading streaming provider:
Kasada was deployed behind CDN-based bot detection for this streaming company. At the 4:15 p.m. go-live, the large spike in bots detected and mitigated (shown in red) was the result of harvest-and-replay traffic that had been evading the CDN’s AI detections. Because the CDN-based detection accepted falsified data as authentic, it had a 98% false negative rate before Kasada was implemented. Human activity (shown in blue) was completely dwarfed by the scale of the bot problem.
Ironically, the client didn’t realize that the vast majority of their traffic wasn’t real, since the harvested data looked human. This served as a wake-up call for both their fraud and marketing teams, given the implications for digital ad fraud.
One of the most important elements of realizing AI’s potential for cyber defense, while minimizing the impact of input-data tampering, is proof of execution (PoE). This is the collective term Kasada uses for client- and server-side techniques designed to validate that the data presented to the AI detection model is indeed authentic. PoE can verify that data presented to the system was generated and executed in real time.
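Kasada doesn’t publish its PoE internals, so the following is only a rough sketch of the general pattern such techniques tend to follow: the server issues a short-lived signed challenge, the client must actually execute it, and stale or replayed answers are rejected. All names and the five-second window are illustrative assumptions, not Kasada’s implementation.

```python
# Sketch of an execution-proof flow: fresh challenge, bounded response window.
import hashlib
import hmac
import os
import time

SERVER_KEY = os.urandom(32)
MAX_AGE_SECONDS = 5

def issue_challenge() -> tuple[bytes, bytes]:
    """Return (challenge, tag): a fresh nonce plus timestamp, signed by the server."""
    challenge = os.urandom(16) + int(time.time()).to_bytes(8, "big")
    tag = hmac.new(SERVER_KEY, challenge, hashlib.sha256).digest()
    return challenge, tag

def verify_response(challenge: bytes, tag: bytes, client_answer: bytes) -> bool:
    # 1. The challenge really came from us (unforgeable signature).
    expected_tag = hmac.new(SERVER_KEY, challenge, hashlib.sha256).digest()
    if not hmac.compare_digest(expected_tag, tag):
        return False
    # 2. It was answered quickly -- stale (harvested) answers are rejected.
    issued_at = int.from_bytes(challenge[16:], "big")
    if time.time() - issued_at > MAX_AGE_SECONDS:
        return False
    # 3. The answer is a correct execution of the challenge (trivially a hash
    #    here; in practice, client-side computation over live signals).
    return client_answer == hashlib.sha256(challenge).digest()
```

Because each challenge is unique and expires within seconds, data harvested from an earlier session cannot simply be replayed; the client has to do the work live, which is exactly what the attacks above avoid.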
Three Steps Defenders Can Take Now
Data-driven attacks against AI demand attention from defenders using or considering security solutions that use machine learning or AI models. Here are some concrete recommendations you can follow to protect your organization against such attacks:
- Monitor for abnormal behavior: Keep a close eye on security solutions that rely heavily on AI models. Know what human day-night cycles should look like and whether they are observable in your traffic (a minimal sketch follows this list).
- Diversify your defenses: Don’t rely solely on server-side AI learning for bot detection. Implement rigorous client-side detections and validation controls to ensure your security controls work the way you intend.
- Stay vigilant and proactive: Make sure your bot detection and mitigation solutions can validate the authenticity of the data presented to the system. A strong sense of what is real is crucial.
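As a starting point for the first recommendation, here is a minimal sketch of a day-night-cycle check. Human-driven traffic is strongly diurnal, so an hourly request profile that is nearly flat deserves a closer look. The 2.0 peak-to-trough ratio is an arbitrary illustrative threshold; tune it against your own baseline.

```python
# Sketch: flag traffic that lacks the diurnal rhythm of human activity.
import numpy as np

def diurnal_ratio(hourly_requests: list[int]) -> float:
    """Peak-to-trough ratio of a 24-bucket hourly request histogram."""
    counts = np.asarray(hourly_requests, dtype=float)
    return counts.max() / max(counts.min(), 1.0)

def looks_automated(hourly_requests: list[int], min_ratio: float = 2.0) -> bool:
    # Humans sleep; scripts don't. A near-flat profile (ratio close to 1)
    # is a hint -- not proof -- that the traffic is machine-generated.
    return diurnal_ratio(hourly_requests) < min_ratio

human_day = [20, 12, 8, 5, 4, 6, 15, 40, 80, 95, 100, 98,
             97, 96, 92, 90, 88, 85, 80, 70, 60, 45, 35, 25]
bot_day = [60] * 24

print(looks_automated(human_day))  # False: clear day-night cycle
print(looks_automated(bot_day))    # True: suspiciously flat
```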
As more security solutions add “AI” to their technology, the ability to identify and stop data poisoning and harvested-data attacks becomes paramount as adversaries seek to circumvent AI-based security protections. AI has opened up new attack surfaces and data-driven evasion techniques that defenders must now address. The adversarial game of cat and mouse continues, and the question remains: are your defenses dynamic and adaptive enough to meet the challenge?
Kasada was designed with defense in depth in mind to thwart the latest automated attacks and the motivated adversaries behind them. Request a demo to find out how our experts can help you today.
***This is a Security Bloggers Network syndicated blog from Kasada, written by Neil Cohen. Read the original post at: https://www.kasada.io/ai-data-poisoning/