Real-time data capture and hidden AI training practices are drawing regulatory scrutiny and fueling new litigation tactics

Once seen as innovative tools for efficiency and customer engagement, chatbots may now represent one of the most significant emerging risks in cyber liability.
As artificial intelligence rapidly transforms how companies interact with customers, insurers and legal experts warn that the combination of real-time data collection, opaque vendor relationships, and AI model training practices could set the stage for a surge of privacy class actions and regulatory scrutiny.
“Wrongful collection of data is one of the most common types of third-party litigation claims that we’ve seen in the cyber space over the past three years,” said Jennifer Wilson, head of cyber at Newfront. “It’s second only to ransomware and business email compromise.”
Chatbots, which capture user input in real time, are now being scrutinized under the same privacy laws fueling litigation over pixel tracking and behavioral analytics tools. Wilson told Insurance Business that claims are being filed under the EU's General Data Protection Regulation (GDPR), the California Consumer Privacy Act (CCPA), and Illinois' Biometric Information Privacy Act (BIPA) for practices such as saving conversation logs without user consent, recording communications without disclosure, and sharing confidential chat data with third-party AI providers.
“The key to these privacy concerns lies in the disclosure and consent,” Wilson said. “Are you obtaining consent prior to collecting the confidential data? Is that consent opt-in or opt-out? Insurers prefer opt-in. Are you disclosing what you are doing with the data?” Failure to meet those standards can trigger significant statutory penalties, particularly under laws like Illinois’ BIPA, which allows damages on a per-violation basis.
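What an opt-in consent gate might look like in practice is not spelled out in the article, but the sketch below gives one minimal, hypothetical illustration in TypeScript: the chat widget refuses to transmit or log any conversation data until the user has affirmatively agreed to a specific disclosure, and the consent record travels with the data. All names (ConsentRecord, /api/chat, showConsentPrompt) are illustrative assumptions, not a real vendor API.

```typescript
// Hypothetical sketch: an explicit opt-in gate in front of a chat widget.
// Field and endpoint names are illustrative only.

interface ConsentRecord {
  granted: boolean;          // explicit opt-in, never assumed by default
  grantedAt: string | null;  // ISO timestamp, retained for audit purposes
  disclosureVersion: string; // which disclosure text the user actually saw
}

const CONSENT_KEY = "chatbot-consent";

function loadConsent(): ConsentRecord {
  const raw = localStorage.getItem(CONSENT_KEY);
  return raw
    ? (JSON.parse(raw) as ConsentRecord)
    : { granted: false, grantedAt: null, disclosureVersion: "none" };
}

// Called only when the user clicks an explicit "I agree" control,
// after being shown what is collected and which third parties receive it.
function recordOptIn(disclosureVersion: string): ConsentRecord {
  const record: ConsentRecord = {
    granted: true,
    grantedAt: new Date().toISOString(),
    disclosureVersion,
  };
  localStorage.setItem(CONSENT_KEY, JSON.stringify(record));
  return record;
}

// Messages are transmitted (and only then logged server-side)
// only if an opt-in record exists; otherwise the chat stays local.
async function sendChatMessage(text: string): Promise<void> {
  const consent = loadConsent();
  if (!consent.granted) {
    showConsentPrompt(); // hypothetical UI hook: show disclosure + opt-in button
    return;
  }
  await fetch("/api/chat", {                 // hypothetical endpoint
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ text, consent }), // consent record travels with the data
  });
}

function showConsentPrompt(): void {
  // In a real widget this would render the disclosure text and an opt-in control.
  console.log("Consent required before chat data can be collected.");
}
```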
Chatbots in healthcare and retail under the microscope
Sarah Thompson, head of cyber for North America at MSIG USA, said that while chatbots are the newest technology in focus, the legal theories behind these lawsuits are not new. She compared the claims to earlier California lawsuits over the collection of customer zip codes at retail checkouts.
“We’ve seen wrongful data collection litigation tied to pixel tracking and behavioral analytics,” Thompson said. “Users are inputting personal or health information into chat interfaces without realizing how that data is analyzed, stored, or shared. It’s not made very clear to the end user how their information is being collected.”
According to Joshua Mooney, head of US cyber and data privacy at global law firm Kennedys, website owners can be held liable even when the data is captured by third-party AI tools operating in the background.
“If the use of a chatbot or AI tool is not properly disclosed to a website visitor, you’re going to see litigation track what we’re now seeing with online tracking and targeted advertising,” Mooney said. “The question becomes whether the website owner was complicit in the unauthorized interception of electronic communications under wiretapping statutes.”
Mooney emphasized that organizations must scrutinize both outward-facing disclosures and internal contractual language with AI vendors. “It’s critical to clarify how information can be used and in what form to ensure that a platform’s subsequent use does not create liability,” he said.
The AI model training risk: Copyright and proprietary data in focus
While privacy claims are rising on the front end, a second wave of litigation is emerging around how AI models are trained. The use of customer chat data, proprietary business information, or copyrighted materials to train large language models presents a high-risk area with evolving legal precedent.
Wilson pointed to a recent $1.5-billion settlement involving alleged misuse of copyrighted material to train AI models. “The outcome suggests that using copyrighted materials for training may be permissible, but the way in which those materials are obtained is at issue,” she said. “Are you going through legal channels and paying for the materials, or are you using pirated content from third-party vendors?” The same reasoning extends to chatbot-generated data: if user chats are fed into training pipelines without consent, organizations could face both privacy and intellectual property claims.
Carriers are already adjusting underwriting practices as they brace for increasing litigation related to AI technologies. Underwriters are expected to ask detailed questions about chatbot deployment, data handling practices, opt-in mechanisms, and AI vendor contracts. Wilson cautioned that companies relying on boilerplate privacy policies or generic cookie banners may be overlooking major exposures.
“Consent must be clearly documented, disclosures must be specific, and you need traceability over how that data is being used downstream,” she said.
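The article does not define what "traceability" over downstream use would look like, but one hypothetical approach is to attach a provenance record to every stored transcript, documenting which disclosure the user saw and which uses it covered, and then to gate any downstream use (analytics, vendor sharing, model training) against that record. The TypeScript sketch below is an assumption-laden illustration; the types and field names are invented for this example.

```typescript
// Hypothetical sketch: a provenance record attached to each stored chat transcript,
// so downstream use can be traced back to a documented consent event.
// All names and fields are illustrative.

type DownstreamUse = "support" | "analytics" | "vendor-share" | "model-training";

interface ChatProvenance {
  transcriptId: string;
  consentGrantedAt: string;        // when the user opted in
  disclosureVersion: string;       // which disclosure they were shown
  permittedUses: DownstreamUse[];  // uses covered by that disclosure
  processors: string[];            // third-party AI vendors that received the data
}

// Gate every downstream use against the documented consent scope.
function isUsePermitted(record: ChatProvenance, use: DownstreamUse): boolean {
  return record.permittedUses.includes(use);
}

// Example: a transcript collected under a disclosure that covered support and
// analytics, but not model training, fails the training check.
const record: ChatProvenance = {
  transcriptId: "t-1024",
  consentGrantedAt: "2025-03-01T12:00:00Z",
  disclosureVersion: "2025-02",
  permittedUses: ["support", "analytics"],
  processors: ["example-ai-vendor"],
};

console.log(isUsePermitted(record, "analytics"));      // true
console.log(isUsePermitted(record, "model-training")); // false: needs fresh consent
```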
Mitigating liability risks: What companies can do now
As AI chatbots become embedded in every corner of the digital economy, the technology that promised seamless customer experiences may soon be at the center of the next major cyber liability battlefield. To limit that exposure, experts recommend that companies:
- Use explicit opt-in consent mechanisms for any chatbot that collects personal or sensitive data.
- Disclose the use of chatbots and AI tools upfront, not buried in privacy notices.
- Audit vendor contracts to confirm limitations on data usage and training rights.
- Establish internal AI governance policies for how information is retained, anonymized, or shared.
“This is a fast-evolving area, with guardrails being put in place in real time,” said Wilson. “We are learning as we go; however, with both privacy and copyright, the message seems to be clear: consent, disclosure, and legal channels for access are the best avenues to avoid costly litigation.”