From Books to Bots: Key Takeaways from the Anthropic Fair Use Decision for AI Developers and Copyright Holders

Ropes & Gray LLP

Introduction

On June 23, 2025, the United States District Court for the Northern District of California issued a significant order in Bartz, et al. v. Anthropic PBC, clarifying the application of the fair use doctrine to the use of copyrighted books in training large language models (LLMs). The decision provides important guidance for both AI developers and copyright holders, distinguishing between transformative uses for model training and unauthorized uses involving pirated or format-shifted works. This Alert summarizes the court’s principal findings and highlights the practical implications for AI developers and copyright holders alike.

Background

The plaintiffs, a group of published authors, brought a class action against Anthropic PBC, alleging unauthorized copying of their works for use in training Anthropic’s LLMs, such as its flagship Claude product. The claims focused on three practices: (1) using copyrighted works to train the LLMs, (2) digitizing books that were lawfully purchased in print form, and (3) acquiring pirated digital copies of books and retaining them to build a central digital library. Anthropic moved for summary judgment, asserting that its uses constituted fair use under Section 107 of the Copyright Act.

Key Findings

1. Use of Copyrighted Works for LLM Training—Transformative Fair Use

The court held that Anthropic’s use of copyrighted books for LLM training was “spectacularly” transformative and qualified as fair use. The training process involved analyzing statistical relationships between text fragments to enable the LLM to generate new, original text, rather than reproducing or distributing the original works. The court analogized this process to human learning, emphasizing that copyright law does not extend to the methods, concepts, or principles embodied in a work.

Crucially, the plaintiffs did not allege, and the record did not show, that Anthropic’s Claude LLM produced infringing copies or substantial reproductions of the plaintiffs’ works. The absence of infringing outputs was central to the court’s fair use finding. Indeed, the court expressly distinguished this case from others where the AI system’s outputs might themselves be infringing.

2. Digitization of Purchased Print Books—Format Shifting as Fair Use

The court also found Anthropic’s wide-scale digitization of print books to be fair use. A key consideration in the court’s conclusion was that Anthropic engaged in so-called destructive scanning: it lawfully purchased print books, stripped them of their bindings, and scanned the contents to create digital copies for an internal digital library. Each new digital copy thus replaced the print original, which was destroyed in the digitization process. The court found the format change from print to digital to be transformative because it facilitated storage and searchability without increasing the number of copies or distributing them outside the company.

Notably, the court distinguished this use from cases involving unauthorized distribution or multiplication of copies, analogizing it to permissible space-shifting or time-shifting uses recognized in prior cases as sufficiently transformative for fair use. Importantly, the court found that this format-shifting did not usurp any market reserved to the copyright owner, as Anthropic had lawfully acquired the print copies and did not distribute the digital versions externally.

3. Acquisition and Retention of Pirated Copies—No Fair Use

In contrast, the court held that Anthropic’s acquisition and retention of pirated copies to build a permanent, general-purpose digital library was not justified as fair use. The court rejected the argument that the eventual transformative use of some copies for LLM training could retroactively excuse the initial act of piracy. Obtaining and retaining pirated works for potential future uses, including possible LLM training, was found to be a non-transformative use that directly displaced the market for authorized copies.

The court analogized this conduct to the unauthorized creation of a central library, which was not excused by the possibility of later transformative use. The acquisition of pirated copies was not reasonably necessary to the fair use of training LLMs, especially where lawful alternatives were available.

Practical Implications

For AI Developers

  • Lawful Acquisition Is Essential: The decision draws a clear line between the use of lawfully acquired works and pirated materials. AI developers must ensure that all training data is sourced through lawful means—by purchase, license, or reliance on public domain or open-licensed works. The acquisition and retention of pirated works, even for internal research or potential future use, exposes companies to liability and potential statutory damages, including for willfulness.
  • Transformative Use and Output Controls: The court’s reasoning affirms that training LLMs on lawfully acquired works is likely to be considered transformative, provided the outputs do not reproduce or closely mimic the original works. Developers should implement robust output filtering and monitoring to prevent infringing outputs, as the absence of such outputs was central to the court’s fair use finding.
  • Format Shifting for Internal Use: The conversion of purchased print books to digital format for internal, nondistributive use may be permissible under fair use, but companies should ensure that no additional copies are created or distributed beyond the original purchase. Maintaining clear records of acquisition and destruction of originals is advisable.
  • Recordkeeping and Transparency: The decision underscores the importance of transparency and recordkeeping. Deficiencies in documentation or cooperation in discovery may be held against the party asserting fair use.
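For engineering teams, the output-control point above can be made concrete with a simple overlap check. The following is a minimal sketch only, not a method described in the court's order or in this Alert: the function names, the 8-word window, and the 20% threshold are all hypothetical choices for illustration, and nothing here reflects a legal standard for infringement. It flags a model output when a large share of its word sequences also appear verbatim in a reference text.

```python
from typing import Iterable, Set, Tuple


def word_ngrams(text: str, n: int) -> Set[Tuple[str, ...]]:
    """Return the set of lowercase word n-grams in text (empty if text is shorter than n words)."""
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def overlap_ratio(output: str, reference: str, n: int = 8) -> float:
    """Fraction of the output's n-grams that also occur verbatim in the reference."""
    out = word_ngrams(output, n)
    if not out:
        return 0.0
    return len(out & word_ngrams(reference, n)) / len(out)


def should_block(output: str, references: Iterable[str],
                 n: int = 8, threshold: float = 0.2) -> bool:
    """Flag the output if its overlap with any reference meets the threshold.

    The window size and threshold are illustrative assumptions, not legal tests.
    """
    return any(overlap_ratio(output, ref, n) >= threshold for ref in references)


# Hypothetical reference text standing in for a protected work.
reference = ("the lighthouse keeper counted seven grey ships crossing "
             "the narrow harbor before sunrise")
copied = "the lighthouse keeper counted seven grey ships crossing the narrow harbor"
original = "a short unrelated summary written in entirely different words about boats"

print(should_block(copied, [reference]))
print(should_block(original, [reference]))
```

A check of this kind catches only verbatim or near-verbatim reproduction. It would not detect paraphrase or close mimicry, so it is at most one input into a broader review process developed with counsel.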

For Copyright Holders

  • Limits of Fair Use in the AI Context: While the court recognized the transformative nature of LLM training, it also affirmed that copyright holders retain the right to control the initial acquisition and retention of their works. The unauthorized creation of digital libraries from pirated sources is not protected, and rightsholders may pursue damages for such uses.
  • Monitoring and Enforcement: Copyright holders should monitor the use of their works in AI training and be prepared to challenge unauthorized uses, particularly where pirated copies are involved or where outputs may cross the line into infringement.
  • Licensing Opportunities: The court acknowledged the possibility of an emerging market for licensing works for AI training but clarified that the Copyright Act does not guarantee a right to control all transformative uses. Rightsholders should consider proactive licensing strategies and clear terms for AI-related uses.

Conclusion

Though it is the first of what is likely to be many upcoming decisions addressing fair use and LLMs, the Anthropic decision provides a helpful framework for evaluating fair use in the context of AI training. It affirms the transformative nature of LLM development when based on lawfully acquired materials, while drawing clear boundaries against the unauthorized acquisition and retention of copyrighted works. AI developers should review data acquisition and training practices to ensure compliance with copyright law, and copyright holders should be aware of both the opportunities and limits of enforcement in the evolving landscape of AI and machine learning.

Manav Mathews, a summer associate in the Ropes & Gray Washington, DC office, and Michael MacKay, a summer associate in the Ropes & Gray Boston office, contributed to this article.

DISCLAIMER: Because of the generality of this update, the information provided herein may not be applicable in all situations and should not be acted upon without specific legal advice based on particular situations. Attorney Advertising.

© Ropes & Gray LLP
