A Tale of Three Cases: How Fair Use Is Playing Out in AI Copyright Lawsuits

Ropes & Gray LLP

Key Takeaways

  • The law remains unsettled; fair use outcomes are highly fact-specific and may differ for generative versus non-generative AI models.
  • Copyright holders should focus on demonstrating concrete market harm, including indirect market substitution, when challenging AI training as infringement.
  • AI developers must ensure all training data is lawfully acquired; use of pirated works is unlikely to be protected by fair use.
  • Companies using third-party AI tools should seek strong indemnification and verify the provenance of training data to mitigate infringement risk.

In two recent Northern District of California decisions, AI companies prevailed on a fair use defense after being accused of infringing copyrights in works used to train AI models.1 The decisions, on their face, seem to contrast with an earlier District of Delaware ruling that an AI company was not protected by the doctrine of fair use.2 In this article, we review the courts’ reasoning in each of these three cases to elucidate how the battle for a fair use finding is playing out for AI companies, and what this means for future cases.

Bartz v. Anthropic: Legally Obtained Copyrighted Works Can Be Used to Train Large Language Models

In Bartz v. Anthropic PBC, District Judge William Alsup held that Anthropic engaged in fair use when it used purchased, copyrighted books for one-to-one destructive digitization and for training specific AI models.

Plaintiffs in Bartz are a putative class of authors whose books were copied from both legally purchased and pirated sources.3 Anthropic used these books to train Claude, its large language model (“LLM”). As part of the training process, Anthropic purchased, copied, trained Claude on, and ultimately destroyed copies of books.4 Anthropic tore off bindings from books it bought from major book distributors and scanned each page to create digitized versions.5 However, Anthropic also used over 7 million digital copies of books illegally acquired from pirating sites to train its LLMs.

As is common in AI-related copyright cases,6 Anthropic relied on the affirmative defense of fair use, which allows limited use of copyrighted materials without permission of the owners, based on four non-exclusive statutory factors: 1) the purpose and character of the use, 2) the nature of the copyrighted work, 3) the amount of the work copied, and 4) the use’s effect on the existing and potential market.7 Judges have discretion in determining how much weight to give to each of the factors, and a party need not prevail on all of the factors in order to win a fair use determination.8

Judge Alsup analyzed the four factors separately for purchased and pirated works, evaluating “training specific LLMs” and “building a central library” as two different uses of the copyrighted works.9

As to the purchased works, the judge ruled in favor of fair use, both as to “training specific LLMs” and “building a central library.”10 This decision rested largely on the first factor: Judge Alsup deemed the use of LLM technology in the case “among the most transformative we will see in our lifetimes.”11

Similarly, Judge Alsup found the destructive digitization of purchased copyrighted works to create a central library to be a fair use, because “the format change itself… was transformative,” meaning that “the first factor strongly favors” finding fair use.12 In analyzing the format-shifting and AI training use of the purchased copyrighted works, the court found that all factors except the second, the nature of the copyrighted work, favored a fair use finding.13

Plaintiffs argued that training LLMs using copies of their works would result in an “explosion” of competing works.14 Judge Alsup rejected this position, comparing it to “complain[ing] that training schoolchildren to write well” would lead to a similar result.15 Judge Alsup gave little weight to the market impact argument, stating that the Copyright Act is not meant to “protect authors against competition.”16

As to the pirated works, Judge Alsup placed dispositive weight on the original manner of acquisition, rejecting the fair use defense for any use of pirated works.17 Particularly where pirated works are otherwise available for purchase or other lawful acquisition, the court found piracy of copyrighted work to be “inherently, irredeemably infringing.”18 The “spectacularly” transformative nature of the LLM technology did not justify the use of pirated works, particularly because “every pirated library copy was retained” indefinitely, regardless of whether they might ultimately be used for training.19

The court therefore denied summary judgment as to the pirated copies, allowing those claims to go to trial.20

Kadrey v. Meta: Insufficient Evidence of Market Impact by Plaintiffs Can Lose the Case

Two days after Judge Alsup’s ruling, fellow Northern District of California Judge Vince Chhabria issued his own ruling on fair use involving generative AI. Although both judges ultimately granted summary judgment to defendants, finding their use of the books “highly transformative,” they took different approaches to the four-factor analysis.

In Kadrey et al. v. Meta Platforms, Inc., plaintiffs were 13 authors who accused Meta of downloading their copyrighted books from “shadow libraries” to train its “Llama” LLMs.21 Deciding cross-motions for summary judgment, Judge Chhabria applied the four-factor analysis framework, noting at the outset that the fourth factor—market impacts—is “undoubtedly the single most important element of fair use.”22 In doing so, he criticized Judge Alsup for using the “inapt analogy” of teaching schoolchildren creative writing—which Judge Chhabria stated risked “blow[ing] off the most important factor in the fair use analysis.”23

On the first factor, Judge Chhabria found for Meta on the basis that there was “no serious question that Meta’s use of the plaintiffs’ books… was highly transformative,” rejecting arguments that LLM training is analogous to human reading and that the LLM was merely “repackaging” books.24 Judge Chhabria noted that although piracy is relevant for the first factor (partly because it indicates bad faith), it is not necessarily dispositive, and the plaintiffs had not provided any evidence that Meta’s actions subsidized any “shadow library.”25

Judge Chhabria found that the second factor, the nature of the copyrighted work, favored plaintiffs.26 He rejected Meta’s argument that it uses only the “functional elements” of the books, not their creative expression, as the nature of the books’ content was relevant for the use.27 He found that the third factor, the amount copied, favored Meta, as Meta does not “output any meaningful amount” of the books at issue, and its copying was reasonable for the transformative purpose.28

The majority of Judge Chhabria’s analysis was dedicated to the fourth factor, which he stated is the “most important element” and can be near-dispositive.29 Judge Chhabria outlined three theories of potential market harm for plaintiffs challenging LLMs copying their works: 1) regurgitation of copyrighted works, 2) impact on the licensing market, and 3) “market dilution,” i.e., that the output from the LLM is similar enough to the copyrighted works to reduce demand.30 He rejected the first theory because Llama did not “regurgitate” the works.31 He also rejected the second theory, reasoning that because the proposed potential market was coterminous with the theoretical licensing market, the argument was “circular.”32

Judge Chhabria focused his discussion on the third theory—that LLMs could cause “market dilution” for human-written fiction.33 He rejected Meta’s claim that the outputs themselves must be infringing to count as market dilution, but noted that the effect would vary with the nature of the copied work,34 and that market dilution should be viewed relative to a world with LLMs trained only on the public domain, not to a non-LLM world.35 Ultimately, he granted summary judgment in favor of Meta because the plaintiffs had not raised this market harm theory themselves or presented any evidence countering Meta’s evidence against substantial market harm.36 Summary judgment was thus granted to Meta on its affirmative fair use defense.37

While this decision was a victory for Meta, Judge Chhabria emphasized the narrowness of his holding, which binds only the 13 named plaintiffs.38 He stressed that “this ruling does not stand for the proposition that Meta’s use of copyrighted materials to train its language models is lawful.”39 And he asserted that “market dilution will often cause plaintiffs to decisively win the fourth factor—and thus win the fair use question overall.”40

Thomson Reuters v. ROSS: Training on Copied Material Is Direct Infringement, and AI Training Does Not Necessarily Mean Fair Use

Both of these decisions stand in contrast with the first AI fair use decision, issued by the District of Delaware in February 2025. In May 2020, Thomson Reuters (owner of Westlaw) filed a complaint against ROSS Intelligence in the District of Delaware, alleging copyright infringement.41 ROSS had created a legal research search engine intended to compete with Westlaw’s legal search engine. Unlike the AI models at issue in Bartz and Kadrey, ROSS’s tool was not a generative AI tool—which writes content in response to prompts—but rather an AI search engine that answers a user’s legal question by providing results in the form of published judicial opinions. To train its AI tool, ROSS sought to use Westlaw’s copyrighted headnotes (which are essentially summaries of key points of law and case holdings) and numbering system as a database of legal questions and answers. Prior to training its AI tool, ROSS sought licenses to Westlaw’s copyrighted content, but Thomson Reuters refused.

ROSS then entered an agreement with legal analytics company LegalEase to obtain training data in the form of “Bulk Memos,” which were “compilations of legal questions with good and bad answers” created by lawyers. LegalEase instructed these lawyers to create questions using Westlaw headnotes but to “not just copy and paste headnotes directly into the questions.”42 ROSS bought approximately 25,000 Bulk Memos, which it used to train its non-generative AI search tool. When Thomson Reuters discovered that ROSS built its competing product on memos using Westlaw headnotes, it sued ROSS for copyright infringement.43

In February 2025, Judge Stephanos Bibas (a Third Circuit judge sitting by designation in Delaware) granted summary judgment for Thomson Reuters, rejecting all of ROSS’s copyright defenses, including fair use. Specifically, he held that for certain headnotes, summary judgment should be granted on the direct copyright infringement claim. He denied ROSS’s motions for summary judgment on direct copyright infringement and fair use.44

Judge Bibas held that Thomson Reuters prevailed on factors one and four because the character of ROSS’s use of the copyrighted material was not transformative and could harm both the original and derivative markets for Westlaw’s content. Specifically, Judge Bibas found that ROSS was using the Westlaw headnotes for the same purpose as Thomson Reuters—to facilitate a legal research tool—and that it meant to compete with Westlaw by developing a market substitute. Although he found for ROSS on factors two and three, he held that these factors were less important.

Reconciling the Thomson Reuters Outcome with Bartz and Kadrey

Although Thomson Reuters may appear to be at odds with Bartz and Kadrey, the outcomes hinge on the particular facts. Unlike Thomson Reuters, both Bartz and Kadrey involved generative AI. In both Bartz and Kadrey, the judges agreed that the “character of the use” was highly transformative, but in Thomson Reuters the defendant lost on this first factor. And while the plaintiffs in Bartz and Kadrey could not demonstrate market harm, the court in Thomson Reuters found it particularly important that the defendant meant to compete with Westlaw by developing a market substitute, ruling that the plaintiff prevailed on factor four.

Potential Impact on Other AI Copyright Cases

Dozens of AI copyright cases are ongoing nationwide, with new cases continuously being filed. Litigants and courts are likely to look to Bartz and Kadrey for their fair use analysis in the generative AI context.

Under Judge Alsup’s approach, factor four is of lower relative importance in these cases, because only direct market substitution is cognizable, favoring defendants in AI copyright fair use disputes. By contrast, Judge Chhabria viewed the market harm factor as central to the analysis and endorsed recognition of indirect substitution as cognizable harm, potentially advantaging plaintiffs in such cases. Despite the judges’ differing opinions on which factor is most important, it seems that an influential law review article written 35 years ago by Judge Pierre Leval of the Court of Appeals for the Second Circuit has proved useful for predicting outcomes: as that article anticipated, the fair use analysis is anchored in the question of transformative use.45

Of course, as required by Supreme Court precedent, fair use analyses will continue to employ a case-by-case, fact-intensive inquiry. Generative AI may be considered more transformative; non-generative AI, less so. While books were at issue in Bartz and Kadrey, the fair use analysis may differ for copyrighted songs, images, or news articles. Because decisions are fact-specific, disposition of further cases or a concrete statement on this issue from the appellate level will be necessary to determine which approach becomes more widely adopted by federal trial courts.

Business Takeaways

For AI Developers

  • Two judges have now held that copying works to train LLMs is “transformative” under the fair use doctrine. Developers should be cautious, however, as these decisions are creatures of their particular facts—the law in this area will likely develop further.
  • Because judges have given great weight to the provenance of the work being used to train the LLM, developers should ensure that any works used for training are acquired legally. Piracy can lead to a rejection of fair use regardless of the strength of other factors.

For Copyright Holders

  • The two cases suggest that the first fair use factor will typically strongly favor defendants using copyrighted works for training generative AI models, so plaintiffs filing suits to protect their copyrighted works should ensure they have a strong argument regarding the fourth factor (market harm) supported by sufficient evidence.
  • Although it remains unsettled whether indirect market substitution or “market dilution” is a cognizable market harm under the fourth factor, copyright holders should pursue this theory of market harm when possible.

For Companies Using Third-Party AI Tools

  • Risk allocation in contracts concerning or contemplating AI models should be approached very carefully, and customers of companies offering AI services should ensure that they are properly indemnified from potential copyright infringement liability.
  • It is a best practice to avoid or limit the unauthorized use of copyrighted content in the training of any proprietary software that leverages AI, and to ensure lawful acquisition of copyrighted content when copying of copyrighted works is necessary.

Because fair use analysis is highly fact-specific and the contours of AI are rapidly and continuously shifting, companies should refrain from inferring too much precedential value from early trial-level decisions, and generally exercise caution around the use of copyrighted work for development of AI tools.

Manav Mathews, a summer associate in the Ropes & Gray Washington, DC office, contributed to this article.

  1. Bartz et al. v. Anthropic PBC, No. C 24-05417 WHA (N.D. Cal. June 23, 2025) (pending); Kadrey et al. v. Meta Platforms, Inc., No. 23-CV-03417 VC (N.D. Cal. June 25, 2025) (pending).
  2. See Memorandum Opinion, Thomson Reuters Enterprise Centre GmbH et al. v. ROSS Intelligence Inc., Docket No. 1:20-cv-00613, 17 (D. Del. Feb. 11, 2025) (indicating that a fair use defense would likely not shield AI companies from copyright infringement liability). The case is currently under interlocutory appeal.
  3. Id. at 2.
  4. Id. at 7.
  5. Id. at 1.
  6. See An End-of-Year Update to the Current State of AI-Related Copyright Litigation, Ropes & Gray (December 17, 2024), https://www.ropesgray.com/en/insights/alerts/2024/12/an-end-of-year-update-to-the-current-state-of-ai-related-copyright-litigation.
  7. 17 U.S. Code § 107.
  8. Rich Stim, Measuring Fair Use: The Four Factors, Stanford Libraries, https://fairuse.stanford.edu/overview/fair-use/four-factors/.
  9. Bartz, Order on Fair Use, 9-11, 24.
  10. Id. at 30-31.
  11. Id. at 30.
  12. Id. at 14-15, 30-31.
  13. Id. at 30-31.
  14. Bartz, Order on Fair Use, 28.
  15. Bartz, Order on Fair Use, 28.
  16. Id.
  17. Id. at 30-31.
  18. Id. at 18-19.
  19. Id. at 11, 14, 18-21, 30-31.
  20. Bartz, Order on Fair Use, 31-32.
  21. Id. at 4, 11-12.
  22. Id. at 6 (quoting Harper & Row Publishers, Inc. v. Nation Enterprises, 471 U.S. 539, 566 (1985)).
  23. Kadrey, Order Denying Plaintiffs’ Motion for Partial Summary Judgment and Granting Meta’s Cross-Motion for Partial Summary Judgment, 2-3.
  24. Id. at 16.
  25. Id. at 18-20, where Judge Chhabria notes libraries containing pirated works have been found liable for infringement, but that the plaintiffs in Kadrey have not provided evidence that Meta’s actions have “propped up” such libraries and therefore cannot benefit from findings against the libraries.
  26. Id. at 23; Bartz, Order on Fair Use, 30-31.
  27. Kadrey, Order Denying Plaintiffs’ Motion, 23-24.
  28. Id. at 24-25.
  29. Id. at 26.
  30. Id.
  31. Id.
  32. Id. at 27-28.
  33. Id. at 28-32.
  34. For example, consider the difference between market harm caused to an author of a gardening series and the market harm caused to the author of a series like Harry Potter.
  35. Id.
  36. Id. at 32-36.
  37. Id. at 39-40.
  38. Id. at 5, 14, 38-39; Bartz, Order on Fair Use, at 8. Unlike Bartz, Kadrey involved individual plaintiffs, not a putative class; a class certification motion is pending in Bartz.
  39. Kadrey, Order Denying Plaintiffs’ Motion, 5.
  40. Id. at 31-33.
  41. Complaint, Thomson Reuters Enter. Ctr. GmbH v. ROSS Intel. Inc., No. 1:20-cv-00613-SB (D. Del. filed May 6, 2020).
  42. Memorandum Opinion, Thomson Reuters Enterprise Centre GmbH et al. v. ROSS Intelligence Inc., Docket No. 1:20-cv-00613, 17 (D. Del. Feb. 11, 2025).
  43. See Does Training an AI Model Using Copyrighted Works Infringe the Owners’ Copyright? An Early Decision Says, “Yes.” Ropes & Gray (March 6, 2025), https://www.ropesgray.com/en/insights/alerts/2025/03/does-training-an-ai-model-using-copyrighted-works-infringe-the-owners-copyright.
  44. Memorandum Opinion, Thomson Reuters.
  45. See Pierre N. Leval, Toward a Fair Use Standard, 103 Harv. L. Rev. 1105 (1990); see also Campbell v. Acuff-Rose Music, Inc., 510 U.S. 569 (1994). Note that Leval’s definition of “transformative use” is not limited to a question under the first factor, and may be intrinsically tied to other considerations, such as market harm.

DISCLAIMER: Because of the generality of this update, the information provided herein may not be applicable in all situations and should not be acted upon without specific legal advice based on particular situations. Attorney Advertising.

© Ropes & Gray LLP
