The rapid adoption of OpenAI’s ChatGPT, for everything from customer assistance to language learning, is forcing businesses to make sure they’re storing, filtering and protecting data so that it can be used for AI-based applications.
According to The Wall Street Journal, use of the large language models that underlie products like ChatGPT and Google’s Bard is forcing companies to spend more time thinking about data management.
It’s not just about self-protection – it’s also about competition. Businesses with an existing data infrastructure can put LLMs to work more quickly. “Racing to out-innovate their competitors, business technology leaders are facing greater demands to deliver on data frameworks that can help make generative AI applications a reality,” the Journal said.
Proper Data Handling
One issue those executives face is the cost of storing, processing and protecting large amounts of data. The Journal said a number of new businesses are launching with pre-built services to help companies interested in generative AI. One of these companies, Granica, can compress data stored in Amazon’s and Google’s cloud platforms, so that it takes up less storage space and costs less to maintain.
Another company, Nylas, which provides APIs for email, calendar and contact applications, is testing a Granica service that removes sensitive and personal information while it compresses data.
Nylas Vice President of Engineering John Jung told the Journal that for many applications, “You’d want [the data] scrubbed of [personally-identifiable information] so that you don’t potentially have the models hallucinate, and tell information that is sensitive.” (Hallucination happens when an AI delivers a confident response that’s not backed up by its training data. In other words, the AI presents something false as fact.)
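To make the idea of scrubbing concrete, here is a minimal sketch in Python of what removing personal information from text can look like before that text reaches a model. It does not represent how Granica's service or Nylas's pipeline works. The function name `scrub_pii`, the placeholder tokens and the two patterns are assumptions chosen for illustration, and the patterns catch only simple email addresses and US-style phone numbers.

```python
import re

# Illustrative patterns only (assumptions, not taken from any vendor's product).
# They cover simple email addresses and US-style phone numbers such as 555-123-4567.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE_RE = re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b")


def scrub_pii(text: str) -> str:
    """Replace matched emails and phone numbers with placeholder tokens."""
    text = EMAIL_RE.sub("[EMAIL]", text)
    text = PHONE_RE.sub("[PHONE]", text)
    return text


sample = "Contact Jane at jane.doe@example.com or 555-123-4567."
print(scrub_pii(sample))  # prints: Contact Jane at [EMAIL] or [PHONE].
```

Note that the name "Jane" survives in the output. Pattern matching of this kind misses names, addresses and other free-form identifiers, which is one reason companies look to dedicated tooling for the job.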
Then there’s data quality. “The most important thing is not just [to] collect the data, but cleanse, categorize the data, and make sure it’s in a usable format,” said Rob Zelinka, CIO of Jack Henry, a technology firm that serves community and regional financial institutions. “Otherwise you’re just paying to store meaningless data.”
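As a rough illustration of the "cleanse, categorize" step Zelinka describes, the Python sketch below normalizes whitespace, drops empty and duplicate records, and attaches a category label to each remaining record. Everything in it is hypothetical: the helper names `clean_records` and `categorize`, the keyword table and the sample records are assumptions made for the example, not a description of any company's process.

```python
# Hypothetical keyword table used only for this example.
CATEGORY_KEYWORDS = {
    "billing": ("invoice", "refund"),
    "support": ("error", "login"),
}


def clean_records(raw_records):
    """Collapse whitespace, then drop empty and case-insensitive duplicate records."""
    seen = set()
    cleaned = []
    for record in raw_records:
        text = " ".join(record.split())  # trims and collapses runs of whitespace
        if not text:
            continue  # skip records that are empty after trimming
        key = text.lower()
        if key in seen:
            continue  # skip duplicates that differ only in case or spacing
        seen.add(key)
        cleaned.append(text)
    return cleaned


def categorize(text):
    """Return the first category whose keywords appear in the text, else 'other'."""
    lowered = text.lower()
    for category, keywords in CATEGORY_KEYWORDS.items():
        if any(word in lowered for word in keywords):
            return category
    return "other"


raw = [
    "  Refund   request for invoice 1042 ",
    "",
    "refund request for invoice 1042",
    "Login error on mobile app",
]
rows = [{"text": t, "category": categorize(t)} for t in clean_records(raw)]
for row in rows:
    print(row)
```

Of the four raw records, two survive: the blank one and the near-duplicate are discarded, and each remaining record carries a label a downstream application can filter on.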
In addition, the rise of generative AI has accelerated interest in the “quality, context and privacy” of data that’s used with LLMs, said Erick Brethenoux, a distinguished vice president analyst at the research firm Gartner.
The good news: Analysts expect more startups to zero in on helping companies sift through and control access to their data for use by generative AI, the Journal said.