Companies are accumulating more data than ever to fuel their AI ambitions, but they are also concerned about who can access this data, which is often highly private. PVML offers an interesting solution by combining a ChatGPT-like data analysis tool with the security guarantees of differential privacy. Using retrieval-augmented generation (RAG), PVML can access an enterprise’s data without moving it, which removes another security concern.
The Tel Aviv-based company recently announced that it had raised an $8 million seed round led by NFX, with participation from FJ Labs and Gefen Capital.
The company was founded by the husband-and-wife team of Shachar Schnapp (CEO) and Rina Galperin (CTO). Schnapp earned his PhD in computer science, specializing in differential privacy, and then worked on computer vision at General Motors, while Galperin earned her master’s degree in computer science with a focus on AI and natural language processing and worked on machine learning projects at Microsoft.
“A lot of our experience in this area comes from working in large corporations, where we found that things may not be as efficient as we had hoped as naive students,” Galperin said. “The main value we want to bring to organizations as PVML is the democratization of data. This can only happen if, on the one hand, you protect this very sensitive data, but, on the other hand, you allow easy access to it, which is synonymous with AI today. Everyone wants to analyze data using free text. It’s much simpler, faster and more efficient, and our secret sauce, differential privacy, makes this integration very easy.”
Differential privacy is far from a new concept. The core idea is to protect the privacy of individual users in large datasets and to provide mathematical guarantees for that protection. One of the most common ways to achieve this is to introduce a degree of randomness into the dataset, but in a way that does not meaningfully alter the analysis of the data.
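To make the idea of adding randomness concrete, here is a minimal sketch of the Laplace mechanism, a standard textbook technique for differential privacy. It is an illustration only and not PVML's algorithm, which the company has not described here. The function names, the `epsilon` privacy parameter, and the sample salary data are all assumptions made for this example.

```python
import random


def laplace_noise(scale, rng):
    # The difference of two independent exponential draws with mean `scale`
    # follows a Laplace distribution centered at 0 with that scale.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(records, predicate, epsilon, rng=None):
    """Return a noisy count of the records matching `predicate`.

    Hypothetical helper for illustration. Smaller `epsilon` means more
    noise and therefore stronger privacy.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    rng = rng or random.Random()
    true_count = sum(1 for record in records if predicate(record))
    # Adding or removing one person changes a count by at most 1, so the
    # sensitivity is 1 and the noise scale is 1 / epsilon.
    return true_count + laplace_noise(1.0 / epsilon, rng)


if __name__ == "__main__":
    # Made-up salaries (in thousands), purely for demonstration.
    salaries = [30, 50, 70, 90, 110]
    noisy = private_count(salaries, lambda s: s > 60, epsilon=1.0)
    print(f"Noisy count of salaries above 60: {noisy:.2f}")
```

Each individual answer is slightly off, which hides whether any one person is in the data, yet the noise averages out to zero, so aggregate results over a large dataset stay useful.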
The team says current data access solutions are inefficient and create a lot of overhead. Often, for example, a lot of data has to be redacted before employees can securely access it, but this can be counterproductive because the redacted data may not be usable for certain tasks (and the additional delay in getting access to the data often makes real-time use cases impossible).
The promise of differential privacy is that PVML users do not have to modify the original data. This avoids almost all of the overhead and unlocks this information securely for AI use cases.
Almost all large technology companies now use differential privacy in one form or another and make their tools and libraries available to developers. The PVML team argues, however, that most of the data community has not yet truly put the method into practice.
“Current knowledge about differential privacy is more theoretical than practical,” Schnapp said. “We decided to move from theory to practice. And that’s exactly what we’ve done: we develop practical algorithms that perform better on data in real-world scenarios.”
None of the differential privacy work would matter if PVML’s data analytics tools and platform weren’t useful. The most obvious use case here is the ability to chat with your data, all with the guarantee that no sensitive data can leak into the chat. Using RAG, the company says, PVML can reduce hallucinations to almost zero, and the overhead is minimal since the data remains in place.
But there are other use cases as well. Schnapp and Galperin highlighted how differential privacy also allows companies to share data between business units. It may also allow some companies to monetize access to their data by offering it to third parties.
“Today in the stock market, 70% of trades are done by AI,” said Gigi Levy-Weiss, general partner and co-founder of NFX. “It’s a taste of things to come, and organizations that adopt AI today will have a head start tomorrow. But businesses are afraid to connect their data to AI, because they fear exposure, and for good reason. PVML’s unique technology creates an invisible layer of protection and democratizes access to data, enabling monetization use cases today and paving the way for tomorrow.”