IP recap: Generative AI lawsuits and what they mean for startups

Engine
Aug 15, 2024


By Jill Crosby, Policy Analyst, Engine Advocacy & Foundation

High-profile lawsuits over generative AI are poised to create rules of the road for the entire AI ecosystem, including startups. Since 2023, artists, authors, newspapers, record labels, and others have sued some of the biggest names in AI. As litigation progresses through the legal system, courts will set interpretations of copyright law and standards of enforcement, and the outcomes of these cases will have an impact far beyond the companies being sued.

Startups need to be considered in any analysis of the intersection of copyright and AI, given how widely startups use AI across the economy. Engine has highlighted how startups have developed generative AI to, for example, improve water quality, reduce energy waste, expand equitable access to financial systems, and create better health outcomes. The outcomes of ongoing litigation will shape innovation and how the technology is built, and could determine whether startups can participate in the AI ecosystem at all.

How copyright & AI should work

All AI models, not just generative AI, ingest training data in a way similar to how humans gather and analyze content, producing a transitory, non-infringing copy in the process. It is crucial to treat these as separate and distinct legal questions: whether copyrighted content ingested as training data inputs is infringing (it should not be) and whether the outputs generated are infringing (which is possible, despite AI developers' intentions). To maintain an AI ecosystem that is innovative, competitive, and accessible to startups, Engine has long supported treating the ingestion of copyrighted content into training data as a lawful, noninfringing use. This would be the most efficient, straightforward, and low-cost way to resolve the open copyright questions.

That’s in contrast to relying on fair use, a defense that can only be raised once an AI company is already involved in expensive litigation. The AI companies currently being sued are preparing fair use defenses. In considering those defenses, courts will weigh (1) the purpose and character of the use; (2) the nature of the copyrighted work; (3) the amount and substantiality of the portion used in relation to the copyrighted work as a whole; and (4) the effect of the use upon the potential market for or value of the copyrighted work. Even if courts agree that these defendants’ training of AI models is protected by fair use, a fair use finding is fact-specific, meaning different courts considering different cases are unlikely to produce uniform, easily followed guidance for this rapidly evolving field of technology.

Recap of Ongoing IP Litigation

Artist and author suits

The numerous artist-led lawsuits have gained notoriety and are progressing in the courts. In Silverman v. OpenAI, Inc. and Tremblay v. OpenAI, Inc., both filed in mid-2023, authors sued the technology company for copying their books from online repositories of written works to use as training data without their consent. A district court judge dismissed several of the authors’ claims, but the core allegation that the use of copyrighted material in training data sets is directly infringing will proceed to trial. The Andersen v. Stability AI Ltd. case, concerning the generative AI service’s use of artists’ images without their consent, will also continue with the question of direct copyright infringement at the forefront.

If the analysis moves to fair use, the precedent established in Authors Guild, Inc. v. HathiTrust and upheld in Authors Guild v. Google, Inc. should guide the courts’ thinking about whether the use of copyrighted works as generative AI training inputs adds new meaning and purpose, and therefore qualifies as fair use. Those cases held that the mass copying of copyrighted content to provide a search function is a fair use despite the use’s commercial nature.

News and visual media company suits

One of the most notable lawsuits is The New York Times Co. v. Microsoft Corp., in which The Times alleges Microsoft and OpenAI unlawfully copied millions of its articles to use as AI training data. Despite the AI companies’ claim that the unlicensed copying of articles to train AI models is transformative, The Times argues that the AI outputs compete with and mimic the original articles, so fair use does not apply (despite precedent like Sega Enterprises Ltd. v. Accolade, Inc., in which copying copyrighted content for a commercial purpose was found to be a fair use). Following The Times’ lead, eight newspapers, including the New York Daily News and Chicago Tribune, sued Microsoft and OpenAI raising the same infringement claims. Both newspaper publisher filings stand out because they ask the court to take the extreme step of forcing the companies to destroy all models and training sets that include the publishers’ content.

In Getty Images (US), Inc. v. Stability AI, Inc., Getty Images alleges Stability AI used over 12 million of its photographs to train its generative AI image model, which produces allegedly infringing reproductions of the copyrighted images. The significance of the Getty Images litigation lies in where it was filed rather than in its copyright infringement arguments, which echo those in the other suits. In addition to the filing in Delaware federal court, Getty Images filed a separate case against Stability AI in the UK in early 2023 that is now moving to trial after Stability AI’s failed attempt at dismissal. Fair use is not a defense to copyright infringement in the UK as it is in the U.S., so the outcomes of the two cases could show whether one country’s laws are more favorable to AI developers.

Music publisher suits

Major record labels, including Universal Music Group, Sony Music Entertainment, and Warner Records, recently filed copyright infringement suits against the AI music-generation companies Suno and Udio. The lawsuits specifically allege that the AI services produce convincing imitations of music recordings by copying vast quantities of copyrighted recordings. In their legal responses filed August 1, Suno and Udio admitted that they used copyrighted songs as part of their training data, which they call a fair use. The record labels argue that fair use does not apply because Suno’s and Udio’s models take key features of the copyrighted recordings and undermine commercial music markets.

Takeaways for Startups & Innovation

For now, the generative AI litigation is progressing through discovery and initial rulings. We may be years away from final decisions in these cases, which could form the basis of whether and how startups can use copyrighted content in AI training data. Beyond the judicial system, policymakers are actively analyzing the relationship between AI and copyright. Most notably, the U.S. Copyright Office launched an initiative to examine the use of copyrighted materials in AI training and, in July, published Part 1 of its Report on Copyright and Artificial Intelligence, which specifically addresses digital replicas. In Congress, the COPIED Act aims to give copyright owners control over content by requiring their consent for its use as training data, but the bill would ultimately harm the innovation ecosystem by circumventing fair use entirely. In the meantime, it is important for startups to stay aware of the relevant regulations and laws in their respective jurisdictions and to know whether the training data sets used in their AI models include copyrighted material.

Disclaimer: This post provides general information related to the law. It does not, and is not intended to, provide legal advice and does not create an attorney-client relationship. If you need legal advice, please contact an attorney directly.

Engine is a non-profit technology policy, research, and advocacy organization that bridges the gap between policymakers and startups. Engine works with government and a community of thousands of high-technology, growth-oriented startups across the nation to support the development of technology entrepreneurship through economic research, policy analysis, and advocacy on local and national issues.

