Distinguished Engineer, AI/ML at Cisco. Former Chief Data Scientist at Kenna Security. Co-founder of Dharma Platform and a Forbes 30 Under 30 honoree.
The recent AI/ML inflection point in enterprise C-suites around the world has a specific cause. The buzz around the rise of ChatGPT is, in fact, about the potential that the OpenAI architecture promises: the combination of two proven machine learning approaches, large language models (LLMs) and reinforcement learning from human feedback (RLHF).
LLMs are advanced prediction tools that generate coherent text sequences by analyzing large amounts of text data. Their ability to produce relevant text varies with application and context, which highlights the challenges of unsupervised learning across use cases. In RLHF, a machine learning agent learns optimal actions through interaction and environmental feedback, integrating automated and human evaluations to refine its decisions.
A key innovation in GPT models is combining the LLM with a secondary model that evaluates text quality, using human-ranked outputs to train that evaluator. The combination improves the LLM's ability to align with human preferences, a synergistic approach to improving AI-generated text.
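To make the mechanics concrete, here is a minimal sketch of training such an evaluator, assuming a pairwise ranking (Bradley-Terry) loss over embeddings of a human-preferred and a human-rejected response. The model, tensor shapes and data are toy stand-ins for illustration, not OpenAI's implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Toy reward model: in production the backbone would be the LLM itself
# with a scalar "quality" head; here a small MLP scores fixed-size
# embeddings so the example stays self-contained.
class RewardModel(nn.Module):
    def __init__(self, dim: int = 128):
        super().__init__()
        self.score = nn.Sequential(nn.Linear(dim, 64), nn.ReLU(), nn.Linear(64, 1))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.score(x).squeeze(-1)

model = RewardModel()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

# Stand-ins for embeddings of a human-preferred and a rejected response
# to the same prompt, as collected from a ranking UI.
preferred = torch.randn(32, 128)
rejected = torch.randn(32, 128)

# Pairwise (Bradley-Terry) loss: push the preferred response's score
# above the rejected one's.
loss = -F.logsigmoid(model(preferred) - model(rejected)).mean()
opt.zero_grad()
loss.backward()
opt.step()
```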
The future of AI/ML lies in enabling a new way to interact with knowledge, one that does not require the specialized skills already in short supply amid the security workforce shortage. Vendors have access to a wealth of security data that the average customer or company using their technology does not.
We need to encode this data into LLMs that represent the language of security. In practice, this means training new neural networks that can perform specific security analytics or tasks, and using user interaction data to derive the human feedback for reinforcement learning. The holy grail of security automation has eluded vendors: over the past two decades, they have struggled to create tools that make action possible. Today we are able to build models not only of what is possible, but of what is optimal for every situation our clients may find themselves in.
Large language models
Security is a language. Few practitioners speak all of it. In fact, back in 2013, Dan Geer and Richard Thieme predicted the demise of the security generalist: there are too many subfields and too many specializations for any one person to cover. As a result, a network analyst may not speak the language of the vulnerability manager. Analysts need models that can represent and translate security events and results.
The shortage of security personnel is real and growing, and training new analysts is becoming increasingly difficult. The current zeitgeist in XDR is to augment SOC analysts with tools. The biggest security opportunity is training through translation: new ways to interact with all of an organization's security data, first by training a general model (or multiple models!) that is aware of every type of data a security analyst can see, then by building a second, more contextual model that is aware of the environment of the individual business.
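Here is a minimal sketch of that two-stage idea, with toy tensors and a small classifier standing in for the real models: train a general model on pooled, cross-customer data, then clone it and fine-tune the copy on one organization's much smaller local dataset.

```python
import copy
import torch
import torch.nn as nn

# Toy stand-in for the general model; in practice this would be a
# security-domain LLM rather than a small classifier.
general = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 2))

def train(model, X, y, steps=200, lr=1e-2):
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(steps):
        opt.zero_grad()
        loss_fn(model(X), y).backward()
        opt.step()

# Stage 1: pooled data from many customer environments (toy tensors).
X_pool, y_pool = torch.randn(512, 16), torch.randint(0, 2, (512,))
train(general, X_pool, y_pool)

# Stage 2: start from the general weights and adapt to one tenant's
# data, with a lower learning rate so general knowledge is preserved.
contextual = copy.deepcopy(general)
X_org, y_org = torch.randn(64, 16), torch.randint(0, 2, (64,))
train(contextual, X_org, y_org, steps=50, lr=1e-3)
```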
Asking this model a question during an investigation and getting an answer that would otherwise require two SQL joins and a pivot table across two providers is the holy grail. This is only possible with a broad range of training data: asset, configuration, network, EDR, NDR, threat intelligence and application security data. You need these inputs to form a set of models that will be useful.
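To illustrate what that interaction replaces, here is a sketch using hypothetical, simplified tables standing in for two providers' exports; the SQL is the kind of query the model would have to generate from the analyst's plain-English question.

```python
import sqlite3

# The question an analyst asks in plain English.
question = "Which internet-facing assets have critical vulns with exploits seen this week?"

# Hypothetical, simplified exports from an EDR provider and a
# vulnerability provider, loaded into an in-memory database.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE edr_assets (asset_id INT, hostname TEXT, internet_facing INT);
CREATE TABLE vuln_findings (asset_id INT, cve_id TEXT, severity TEXT);
CREATE TABLE threat_intel (cve_id TEXT, exploit_seen_at TEXT);
INSERT INTO edr_assets VALUES (1, 'web-01', 1), (2, 'db-01', 0);
INSERT INTO vuln_findings VALUES (1, 'CVE-2024-0001', 'critical');
INSERT INTO threat_intel VALUES ('CVE-2024-0001', date('now'));
""")

# The two joins the question implies, spanning the EDR feed and the
# vulnerability feed enriched with threat intelligence.
generated_sql = """
SELECT a.hostname, v.cve_id, t.exploit_seen_at
FROM edr_assets a
JOIN vuln_findings v ON v.asset_id = a.asset_id
JOIN threat_intel t ON t.cve_id = v.cve_id
WHERE a.internet_facing = 1
  AND v.severity = 'critical'
  AND t.exploit_seen_at >= date('now', '-7 days');
"""

print(question)
for row in con.execute(generated_sql):
    print(row)
```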
Reinforcement learning from human feedback
UX is at the heart of machine learning. The big opportunity in security is to systematically collect data on end-user behavior and use that data as the human in the loop. We have years of data tracking actions, clicks, investigations and searches across our products. A smart ETL pipeline would use this data to train a second model that captures analyst preferences in similar situations: "Should we automatically patch this vulnerability, or is there a possibility of downtime?" "Should we quarantine this Windows machine?" These are questions that have been answered dozens of times, but perhaps not at this particular client's site.
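A minimal sketch of that second model, assuming a hypothetical ETL output in which each row is a situation an analyst faced and the action they chose; a simple classifier stands in for whatever preference model a vendor would actually ship.

```python
from sklearn.feature_extraction import DictVectorizer
from sklearn.linear_model import LogisticRegression

# Hypothetical ETL output mined from click and investigation logs:
# the situation an analyst faced, and the action they chose.
situations = [
    {"severity": "critical", "exploited": 1, "uptime_sensitive": 0},
    {"severity": "high",     "exploited": 0, "uptime_sensitive": 1},
    {"severity": "critical", "exploited": 1, "uptime_sensitive": 1},
    {"severity": "low",      "exploited": 0, "uptime_sensitive": 0},
]
actions = ["patch_now", "defer", "patch_now", "defer"]

vec = DictVectorizer()
X = vec.fit_transform(situations)
preference_model = LogisticRegression().fit(X, actions)

# Given a new, similar situation, predict what analysts have
# historically preferred: a stand-in for human feedback at scale.
new_case = vec.transform([{"severity": "critical", "exploited": 1, "uptime_sensitive": 0}])
print(preference_model.predict(new_case)[0])
```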
What's next?
Data comes first. Three approaches will allow the industry to accelerate bringing these new technologies to market.
1. Schema consistency across products: Dataset entropy is the proven weakness of every version of LLM. The more inconsistency there is between structured and unstructured datasets, the harder it is for a model to infer that two entities are the same, and transformer representations of similar entities are at the heart of LLMs and the reason they are so good at linguistic representation. My recommendation is to formalize data schemas where appropriate. This can be done by mandate, top-down, or through machine learning. Large amounts of entropy are themselves a signal, and there is a suite of algorithms that can infer the meaning of fields in datasets by measuring entropy. Is a field a device name or a website? Much of that depends on the entropy of the dataset we're looking at (see the first sketch after this list). Performing this data cleaning as early as possible means a 1,000x ROI in the accuracy and efficiency of the models we build from our data.
2. Risk as an outcome measure: Reward functions are the hardest part of reinforcement learning. While we build this internal human feedback capability, the industry already has a great proxy for a security reward function that ChatGPT did not have: risk measurement. If we give RLHF models a reward function that minimizes the scores of risk models already in production, we can start creating useful models before end users guide the reward functions (see the second sketch after this list). Risk becomes the benchmark in a very real sense.
3. UX as the key to unlocking user interaction data: We need to not only build user interactions that answer these kinds of questions but also require security industry UX teams to instrument and capture user activity the way e-commerce sites do (the third sketch after this list shows the idea). Every click is an asset in the world of AI modeling.
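On point one, a minimal sketch of entropy-based field inference, with hypothetical column data: a near-constant, low-entropy column behaves like an enum or status field, while a high-entropy column behaves like an identifier such as a hostname or URL.

```python
import math
from collections import Counter

def shannon_entropy(values: list[str]) -> float:
    """Entropy in bits of a column's value distribution."""
    counts = Counter(values)
    total = len(values)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

# Hypothetical columns from two vendors' exports: repeated statuses
# score low, near-unique identifiers score high.
columns = {
    "status":   ["open", "open", "closed", "open", "closed"],
    "hostname": ["web-01", "db-02", "mail-03", "vpn-04", "ci-05"],
}
for name, values in columns.items():
    print(f"{name}: {shannon_entropy(values):.2f} bits")
```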
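On point two, a minimal sketch of risk as the reward function, assuming an existing production risk model; the scoring formula here is a made-up stand-in. The agent is rewarded for actions that reduce the risk score, before any end-user feedback exists.

```python
def risk_score(state: dict) -> float:
    """Hypothetical stand-in for a production risk model."""
    return 10.0 * state["open_critical_vulns"] + 5.0 * state["exposed_services"]

def reward(state_before: dict, state_after: dict) -> float:
    """Reward the agent for how much risk its action removed."""
    return risk_score(state_before) - risk_score(state_after)

before = {"open_critical_vulns": 4, "exposed_services": 3}
after = {"open_critical_vulns": 3, "exposed_services": 3}  # agent patched one vuln
print(reward(before, after))  # 10.0
```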
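On point three, a minimal sketch of e-commerce-style event capture for a security UI, with a hypothetical event schema; every analyst click becomes a row of future training data.

```python
import json
import time
import uuid

# Hypothetical event schema: what the analyst did and what was on
# screen when they did it, serialized for the ETL pipeline.
def capture_event(user_id: str, action: str, context: dict) -> str:
    event = {
        "event_id": str(uuid.uuid4()),
        "ts": time.time(),
        "user_id": user_id,
        "action": action,    # e.g., "dismissed_alert", "ran_query"
        "context": context,  # the state the analyst was reacting to
    }
    return json.dumps(event)

print(capture_event("analyst-7", "quarantined_host",
                    {"alert_id": "A-123", "severity": "high"}))
```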