In this Cybersecurity Law & Strategy article, Rachel Miller explains how the unprecedented surge in artificial intelligence investment – exemplified by OpenAI’s recent $122 billion funding round – is fundamentally reshaping the commercial contract landscape for technology and data companies. As AI tools become embedded in everyday business operations, legal and business teams on both sides of the negotiating table are confronting an entirely new class of risks that legacy contractual frameworks were simply not designed to address.
Rachel walks through the key pressure points driving this shift, including the rapid proliferation of AI addenda layered onto existing SaaS agreements, the evolving challenge of protecting proprietary algorithms, analytics, and trained model weights as core intellectual property, and the emerging, yet legally complex, opportunity to monetize data assets through AI training licenses. Equally significant is the tension between vendors’ growing need to use customer data to improve their own models and customers’ resistance to any use of their data beyond direct service delivery.
“The fundamental challenge for lawyers advising in this space is that they are negotiating and drafting contracts that govern technologies, business models and regulatory obligations that are still in flux,” Rachel wrote. “…A skilled lawyer who understands both their client’s business and the technology at issue can cut through that dynamic.”
The surge in artificial intelligence investment, as highlighted by OpenAI’s recent $122 billion funding round, underscores the widespread adoption of AI tools not only by consumers but by technology companies themselves. With this adoption comes a new set of challenges for commercial contracts, forcing business and legal teams to rethink frameworks built for a different era. Whether a company is acquiring a technology firm or using AI services such as customer chatbots or data analysis programs, businesses on both sides of the table are being exposed to a new class of legal risks and making decisions that will impact their value proposition. To address these challenges, businesses and investors are increasingly including AI-specific representations and warranties in contracts and agreements, reassessing longstanding data strategies and sharpening their focus on protecting the rights in data that parties provide to one another.
AI Addenda in Commercial Contracts
One visible sign of these changes is the rapid proliferation of AI addenda in vendor contracts. Because many software as a service (SaaS) contracts were signed prior to the current generative AI boom, these addenda are often layered onto existing agreements, modifying the underlying contractual obligations to clarify AI usage, define responsibilities, mitigate legal exposure, and address ownership of both customer data and AI-generated content. In many cases, larger customers – particularly those in regulated industries like financial services – are including these addenda in the form contracts they send to vendors, incorporating emerging AI regulatory frameworks such as the EU AI Act and a growing body of U.S. state AI laws.
These addenda are essential tools for customers to manage a category of risk that standard SaaS contracts were not drafted to address. Customers worry about inaccurate outputs, compliance obligations under sector-specific and AI-specific regulations, and questions about accountability when AI is inserted into decision-making processes. An increasingly significant concern is the provenance of the vendor’s underlying training data. If a vendor’s model was trained on data that was unlicensed, scraped without authorization or otherwise obtained in violation of third-party rights, the customer may face downstream exposure.
That said, these addenda are often drafted with the broadest possible use case in mind and do not always reflect a particular vendor’s practices or risk profile. Vendors that are asked to sign these addenda would be well served to read them carefully. Vendors should ensure that any representations, warranties and obligations accurately reflect how their AI systems actually work and that the definition of “AI services” is scoped to include only the AI features that power the services delivered to a customer, with explicit carveouts for internal productivity and efficiency tools.
Vendors: Protecting IP Beyond the AI System
Proprietary algorithms have become central to competitive advantage in technology, powering AI, analytics and autonomous systems for many SaaS and similar vendors. Such vendors are typically careful to state that the services and systems they provide constitute their intellectual property for which the customer receives a license or access rights. Expanding this framework to cover AI systems and models is a natural extension of existing practice: Vendors should update their standard intellectual property ownership language to explicitly include AI models, trained weights, fine-tuned versions and model architectures.
In this framework, output generated by the vendor’s systems is generally owned by the customer. However, for technology and data companies that provide analytics, a more nuanced and often overlooked risk involves the treatment of their proprietary outputs. These companies face a two-front contractual challenge: ensuring their analytics are not reduced to mere “outputs” owned by their customers, while also ensuring that the underlying data and methodologies that power those analytics remain protected. A vendor may be comfortable allowing a customer to own a particular output but must state precisely that any analytics embedded in or used to generate that output remain the vendor’s retained intellectual property. This is not unlike how vendors have long structured “deliverables” clauses to transfer certain work product to customers while expressly preserving rights in preexisting IP and methodology.
An additional consideration is that in these cases a vendor may also want to restrict how the customer uses the output downstream. Permitting a customer to feed vendor-generated outputs into its own or a third party’s AI system creates several risks that vendors need to anticipate and address contractually. The most obvious risk is that a customer may use the output to train its own model and over time develop the ability to replicate the vendor’s analytical capabilities without the underlying data or methodology. But the risks go further. Even without formal model training, AI systems can absorb and operationalize vendor outputs in ways that are difficult to detect and harder to unwind. There is also a competitive intelligence risk: A customer could use an AI system to generate insights that effectively substitute for the vendor’s service without ever technically training a model. These concerns are compounded where the vendor’s output incorporates data that is itself subject to third-party licensing restrictions such as prohibitions on AI use.
Vendors need to address all of these scenarios expressly, both in their own agreements with customers and by ensuring their upstream data licenses give them the rights they need to deliver and protect their services. For their part, customers can often negotiate limited rights to use vendor output with internal AI tools or external tools where the customer maintains a separate instance, provided that such use excludes model training, competitive development and other practices that would implicate the risks raised above.
Opportunities to Monetize
While protection is essential, sophisticated data companies are beginning to explore the opportunity to monetize their data assets through AI training licenses. Their goal is to create a net-new revenue stream without cannibalizing existing subscription or analytics services. In my practice, I am seeing more companies actively looking for ways to do this, and increasingly interesting business models are emerging around the opportunity.
For technology companies willing to permit customers or third parties to use their data for AI training, the key legal and commercial challenge is to structure these arrangements so that training licenses do not effectively replace the underlying service. When drafting these clauses, it is best to ensure that: (i) permitted uses of the trained model exclude deploying it to compete with the licensor’s core product; (ii) field-of-use restrictions limit the contexts in which the trained model may be deployed, alongside restrictions preventing the licensee from reselling or publicly distributing outputs that replicate the licensor’s analytics; and (iii) the vendor has audit rights sufficient to confirm ongoing compliance (which is trickier, and customers may not agree).
Beyond these protections, companies should think carefully about how training rights are scoped. Some agreements limit use of data to improving a specific model or service, while others permit broader use in developing general-purpose models. The difference is not merely contractual; it can determine whether a company is participating in or effectively subsidizing the development of foundational AI systems that may ultimately compete with its own offerings. Whether training rights are exclusive or nonexclusive – and whether they are limited by field of use, model type or duration – deserves attention as well. Broad, perpetual rights may increase near-term deal value but can enable counterparties to incorporate proprietary data into models that extend well beyond the intended scope of the relationship. Companies should also assess how agreements address derivative use. Even where contracts restrict direct redistribution of datasets, training a model on those datasets may allow counterparties to capture and operationalize their value in ways that are difficult to monitor or unwind after the fact.
Finally, for companies that are considering licensing data that includes personal information, anonymization and aggregation are prerequisites. But such companies need to ask whether their practices satisfy applicable privacy law. Data that appears to be anonymized may still carry re-identification risks, particularly when combined with other datasets a licensee may already have.
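To make that re-identification point concrete, the minimal Python sketch below (toy data and invented column names, not drawn from any actual dataset) shows how a dataset stripped of direct identifiers can be re-linked to individuals when a licensee joins it with an auxiliary dataset on shared quasi-identifiers:

```python
# Minimal sketch of the re-identification risk described above.
# All data, names and column labels are invented for illustration.

# "Anonymized" licensed dataset: direct identifiers removed, but
# quasi-identifiers (zip, birth_year, gender) retained.
licensed_rows = [
    {"zip": "10001", "birth_year": 1984, "gender": "F", "risk_score": 0.91},
    {"zip": "94105", "birth_year": 1990, "gender": "M", "risk_score": 0.12},
]

# Auxiliary dataset the licensee may already hold (e.g., a marketing list).
auxiliary_rows = [
    {"name": "Jane Doe", "zip": "10001", "birth_year": 1984, "gender": "F"},
]

QUASI_IDENTIFIERS = ("zip", "birth_year", "gender")

def quasi_key(row):
    """Build a join key from the quasi-identifier values."""
    return tuple(row[q] for q in QUASI_IDENTIFIERS)

aux_index = {quasi_key(r): r["name"] for r in auxiliary_rows}

# Joining on quasi-identifiers re-attaches an identity to a record
# that was nominally anonymized.
for row in licensed_rows:
    name = aux_index.get(quasi_key(row))
    if name is not None:
        print(f"Re-identified {name}: risk_score={row['risk_score']}")
```

The join here is trivial by design; the point is that removing names alone does not remove linkability.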
The Vendor’s Need for Customer Data: Improving Their Own AI
Technology and data vendors increasingly need rights to use customer data to improve their own models. While this need appears to be in direct tension with what customers are trying to restrict, in some cases the customer data is necessary so that vendors can deliver better services across their customer base.
Sophisticated customers have become increasingly resistant to any use of their data beyond direct service delivery, and the position commonly demanded in customer-drafted agreements is that the vendor may only use data for purposes of performing its contractual obligations. This tension is real but navigable. Practical resolutions gaining traction in negotiations include: (i) permitting vendors to use this data, provided that it has been anonymized and aggregated across multiple customers; (ii) opt-in provisions (rather than opt out) for customers willing to participate in model improvement programs, often in exchange for pricing concessions or enhanced features; and (iii) strict use limitations ensuring that data used to improve the vendor’s service may not be repurposed to train general-purpose or commercially distributed models.
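As a rough illustration of the first resolution above, the sketch below (hypothetical records, field names and threshold) aggregates usage metrics across customers and suppresses any metric drawn from too few customers, so that what reaches a model-improvement pipeline carries no single customer’s identity or signal:

```python
from collections import defaultdict
from statistics import mean

# Illustrative only: anonymize and aggregate usage data across
# customers before any model-improvement use. Records, field names
# and the threshold below are hypothetical.
records = [
    {"customer_id": "cust-a", "feature": "search", "latency_ms": 120},
    {"customer_id": "cust-b", "feature": "search", "latency_ms": 95},
    {"customer_id": "cust-c", "feature": "search", "latency_ms": 140},
    {"customer_id": "cust-c", "feature": "export", "latency_ms": 310},
]

MIN_CUSTOMERS = 3  # assumed contractual floor before a metric may be used

by_feature = defaultdict(list)
for r in records:
    by_feature[r["feature"]].append((r["customer_id"], r["latency_ms"]))

aggregates = {}
for feature, values in by_feature.items():
    distinct_customers = {cid for cid, _ in values}
    # Suppress metrics drawn from too few customers: the surviving
    # aggregate retains no customer identity or single-customer signal.
    if len(distinct_customers) >= MIN_CUSTOMERS:
        aggregates[feature] = mean(latency for _, latency in values)

print(aggregates)  # {'search': 118.33...}; 'export' is suppressed
```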
Of course, personal data warrants additional scrutiny. Once embedded in a model’s parameters through training, personal data is practically impossible to delete or unlearn, meaning that General Data Protection Regulation and California Consumer Privacy Act deletion rights and data subject access obligations become difficult, if not impossible, to satisfy. As a result, many vendors exclude personal data from AI training pipelines or anonymize the data before any training use.
Vendors should define the applicable anonymization standard within the contract itself rather than by reference to applicable law alone, which varies across jurisdictions and continues to evolve. There is also a meaningful downstream liability risk. If a vendor trains on customer data that itself contains third-party personal information, the vendor may face regulatory exposure that neither party anticipated and that the contract failed to address.
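One way to give that contractual definition teeth is to express the standard as a measurable property of the data that either party can test. The sketch below uses k-anonymity over an agreed quasi-identifier list as one illustrative formulation; the parameters and the choice of k-anonymity itself are assumptions for illustration, not a statement of what any statute or regulator requires:

```python
from collections import Counter

# Hypothetical parameters an agreement might fix: the quasi-identifier
# list and the minimum group size k. Values here are illustrative.
QUASI_IDENTIFIERS = ("zip", "birth_year", "gender")
K = 5  # every quasi-identifier combination must cover at least 5 records

def satisfies_k_anonymity(rows, quasi_identifiers=QUASI_IDENTIFIERS, k=K):
    """Return True if every combination of quasi-identifier values
    appears at least k times in the dataset."""
    groups = Counter(tuple(r[q] for q in quasi_identifiers) for r in rows)
    return all(count >= k for count in groups.values())

# Toy check: five identical quasi-identifier tuples form one group of 5.
sample = [{"zip": "10001", "birth_year": 1984, "gender": "F"}] * 5
print(satisfies_k_anonymity(sample))  # True

# A training pipeline could then gate on the agreed standard, e.g.:
# if not satisfies_k_anonymity(customer_rows):
#     raise ValueError("Dataset fails contractual anonymization standard")
```

A testable definition of this kind gives both sides something auditable, rather than leaving “anonymized” to shift with each jurisdiction’s case law.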
Practical Takeaways for Counsel
The fundamental challenge for lawyers advising in this space is that they are negotiating and drafting contracts that govern technologies, business models and regulatory obligations that are still in flux.
Because of that uncertainty, both parties may come to the table driven as much by fear of the unknown as by the specifics of their deal, each seeking to protect against eventualities that neither can fully anticipate or define. For customers who are not deeply steeped in the technology, that fear can translate into overreach with broad protections that attempt to address every conceivable risk but in doing so create obligations that are neither practical nor proportionate to the relationship. For vendors, the uncertain landscape can cause resistance if they fail to distinguish between clients’ asks that represent genuine business risks to them and those that they can easily control for. A skilled lawyer who understands both their client’s business and the technology at issue can cut through that dynamic. By knowing where the real risks lie, counsel can help customers focus their protections on what actually matters rather than casting the widest possible net and help vendors identify where concessions are reasonable and where they are not.
Against that backdrop, the priorities are clear. Audit existing agreements for AI-related gaps before customers or regulators surface them. Ensure that IP ownership language is precise enough to protect not just the AI system itself but the analytics and methodologies embedded in what it delivers. Structure any data monetization arrangements carefully enough that they generate new revenue without undermining the core business. Negotiate data use rights, whether as a vendor seeking to improve its models or a customer seeking to protect its information, with specificity rather than relying on boilerplate. And build in mechanisms for agreements to evolve as the law does.
Reprinted with permission from the May 2026 issue of Cybersecurity Law & Strategy. © 2026 ALM Media Properties, LLC. Further duplication without permission is prohibited. All rights reserved.