Within the same week, two judges in the Northern District of California issued groundbreaking summary judgment rulings regarding whether an artificial intelligence company’s scraping and ingestion of copyrighted works to train its LLMs[1] qualified as fair use. Both decisions carry potentially seismic importance for the tech industry and intellectual property litigators.
Overview of the Rulings in the Anthropic and Meta Lawsuits
The first important recent ruling arose in a proposed class action brought by book authors who alleged that Anthropic violated their copyrights in building its LLM, Claude, in a case captioned Bartz, et al. v. Anthropic PBC, Case No. C-24-05417 (N.D. Cal.). According to the Complaint, Anthropic had built its library of training materials in part by purchasing copyrighted works in printed hard copy form, then digitizing them by scanning their pages and storing the digital versions. Anthropic also allegedly downloaded copyrighted works directly from pirate websites, and did so intentionally. It then trained Claude on this library of collected works. The authors contended that Anthropic’s use of their copyrighted works without the authors’ permission constituted copyright infringement.
Anthropic moved for summary judgment on its fair use defense, which is an affirmative defense available to defendants accused of copyright infringement. To decide whether use of a copyrighted work is “fair use,” courts look at four factors:
- The purpose and character of the use, including whether the use is for commercial or nonprofit educational purposes;
- The nature of the copyrighted work;
- The amount and substantiality of the portion of the work that is used;
- The effect of the use upon the potential market for or value of the copyrighted work.
In his June 23, 2025 ruling granting Anthropic’s motion in part (ruling available here), Judge Alsup first separated Anthropic’s “uses” into two: training the LLM and building a central library of content. Assessing the first prong of the fair use test, he ruled that the use of the copyrighted works to train Claude “was exceedingly transformative and was a fair use.” Anthropic Order at 9. As to the other use, digitizing the purchased works to create a central library, Judge Alsup again found fair use because this involved simply putting the works in a more convenient format “without adding new copies, creating new works, or redistributing existing copies.” Order at 9. In a win for the authors, however, Judge Alsup criticized Anthropic for downloading the copyrighted works from pirate websites and permanently storing them in Anthropic’s central library, ruling that this was not fair use and was “inherently, irredeemably infringing.” Order at 19.
Just two days later, a different Northern District judge (Judge Chhabria) issued his own summary judgment ruling in a similar case brought by famous fiction writers against Meta, in which the authors alleged that Meta had downloaded their books from online “shadow libraries” to train the company’s Llama AI tool. See Order in Kadrey, et al. v. Meta Platforms, Inc., Case No. 23-cv-03417 (N.D. Cal.) (order available here). In this case, both sides moved for summary judgment on fair use, so Judge Chhabria grappled with the same four factors. While Judge Alsup focused more on the first factor, Judge Chhabria emphasized the fourth factor of market impact because an AI could create cheaper expressive works and reduce human beings’ incentive to create. He criticized Judge Alsup’s “inapt analogy” that compared AI companies using books to train a model with a teacher using books to teach students; unlike a teacher’s students, the AI can “generate countless competing works with a miniscule fraction of the time and creativity it would otherwise take.” Meta Order at 3. He did, however, agree with Judge Alsup that “there’s no disputing that” “the companies’ use of the works is transformative.” Meta Order at 3. Judge Chhabria reluctantly granted summary judgment to Meta on fair use, writing that the Court “had no choice” but to do so and at the same time cautioning that “the consequences of this ruling are limited.” Meta Order at 5. The Court later emphasized that the ruling was “dictated by the choice the plaintiffs made to put forward two flawed theories of market harm while failing to present meaningful evidence on the effect of training LLMs like Llama with their books on the market for those books.” Id. at 36.
The similarities between the cases are apparent. Both involved book authors suing AI companies for using their copyrighted works to train their LLMs. In both cases, the AI companies had initially attempted licensing the copyrighted works, but opted for other methods when that proved too impractical—including downloading copyrighted works from pirate websites without authorization from the authors (at least for some of the works). And in both cases, the judges analyzed the fair use factors and found fair use. But, as discussed below, the rulings differ in important respects in their tone, outcomes, and what they might mean for the future of fair use in the AI context.
Comparing the Courts’ Fair Use Analyses
Factor One: The purpose and character of the use, including whether such use is of a commercial nature or is for nonprofit educational purposes.
From the start of the analysis, the Courts took different approaches to how the “uses” should be analyzed. Judge Alsup separated the uses of (1) training the LLM and (2) building a library of copyrighted texts, rejecting Anthropic’s argument that they were all part and parcel of the same use. Anthropic Order at 18. Judge Chhabria, on the other hand, considered these same two “uses” as part of a single analysis, stating that plaintiffs were wrong to ask that they be considered separately because downloading the books had to be considered “in light of its ultimate, highly transformative purpose: training Llama.” Meta Order at 21.
Regardless, the judges agreed that using copyrighted works to train AI systems was “transformative,” and it wasn’t even a close call. Judge Alsup wrote that “the ‘purpose and character’ of using works to train LLMs was transformative—spectacularly so.” Anthropic Order at 11; see also id. at 13 (“quintessentially transformative”), 30 (“the technology at issue was among the most transformative many of us will see in our lifetimes”). Judge Chhabria agreed that the AI company’s use of the works “had a ‘further purpose’ and ‘different character’ than the books—that it was highly transformative.” Meta Order at 16. Still, Judge Chhabria explained that the transformative purpose was not the full fair use analysis, nor even the full factor one analysis, and that other considerations should be taken into account. Besides the fact that “Meta’s copying has the potential to exponentially multiply creative expression in a way that teaching individual people does not,” he also cautioned that the commercial nature of the copying was relevant and should not be brushed aside. Meta Order at 17-18. Even so, these considerations weighing against fair use were not significant enough to overcome the highly transformative nature of AI. And regarding the downloading from shadow libraries, “[b]ecause Meta’s ultimate use of the plaintiffs’ books was transformative, so too was Meta’s downloading of those books.” Meta Order at 21. This was why, although Meta’s use of the shadow libraries could be relevant to bad faith and perpetuating unauthorized distribution, it was not enough to weigh against fair use in the factor one analysis.
Judge Alsup, in analyzing Anthropic’s building of its library as a separate use, drew a distinction between the works Anthropic purchased and those it pirated. He ruled that the retention of purchased works for later uses and changing them from physical to digital versions in order to improve storage and searchability was a fair use, particularly because Anthropic destroyed the original print version and did not share the new digital copy with outsiders. Order at 16. But on the pirated copies, he took a different tack: “[T]he person who copies the textbook from a pirate site has infringed already, full stop,” he stated, noting that the copies were otherwise available for purchase. Order at 18. The use of these pirated copies rather than paid versions to build a library was an improper use on its own terms, regardless of later possible training uses. Where the “piracy was the point,” and where Anthropic downloaded the full-text copies to be maintained forever, Anthropic’s arguments for fair use failed. Order at 20.
The difference between the Courts’ two approaches to this seemingly simple question (are downloading and training a single use, or two?) is emblematic of the challenge that litigants and courts have found in trying to map AI technology to traditional copyright infringement case law. Whether these acts are separated or joined together in the analysis has potentially large implications for how courts address this issue in the future. If building the library is a use standing by itself, the argument for transformative use is much weaker.
Factor Two: The nature of the copyrighted work.
Both judges devoted little attention to factor two, which they agreed weighed against fair use because the authors’ copyrighted works were creative original works of expression that copyright was designed to protect. Judge Chhabria rejected Meta’s argument that it only copied the books to extract non-expressive information (the “statistical relationships” between words) because those statistical relationships were indeed the product of expressive information in the creative work, and Meta copied them for their high-quality expression to create high-quality training data. Meta Order at 24. Judge Alsup explained that factor two is mainly used to assess the other factors, and Judge Chhabria agreed that the second factor “doesn’t mean much for the analysis as a whole.” Meta Order at 24.
Factor Three: The amount and substantiality of the portion used in relation to the copyrighted work as a whole.
The judges came to similar conclusions here regarding training the LLM. First, both pointed out that the relevant question is not the amount of copyrighted material used by the copier, but the amount of copyrighted material the copier makes available to the public in the purported transformative use. Here, Judge Alsup ruled that all copying of entire books was necessary for the transformative use. Anthropic needed billions of words to train an LLM, and AI outputs accessible to the public were not at issue, so this factor weighed in favor of fair use. Anthropic Order at 25. Regarding the Authors’ argument that Anthropic did not need to use their specific works to train the model, Judge Alsup found that “[b]ecause using so many works was reasonably necessary, using any one work for actually training LLMs was about as reasonable as the next.” Anthropic Order at 26. Judge Chhabria similarly concluded that using the whole book to train an AI that needs large volumes of high-quality data is reasonably necessary.
Judge Alsup also examined Anthropic’s other use of the copyrighted works, building a library, and again distinguished between the purchased and pirated copies. For the purchased copies, he found fair use because “[c]opying the entire work was exactly what this purpose required,” “[t]here was no surplus copying,” and “[t]he source copy was destroyed.” Anthropic Order at 27. However, for the pirated copies, he ruled that almost any copying would be too much because Anthropic had no entitlement to them at all, which weighed against fair use.
Factor Four: The effect of the use upon the potential market for or value of the copyrighted work.
The market effect factor reveals the biggest difference between the judges’ views of these disputes. For Judge Alsup, the copies used to train LLMs “did not and will not displace demand for copies of the Authors’ works.” Anthropic Order at 28. He reasoned that the prospect of AI and LLM products producing an explosion of works that compete with the originals was no different from schoolchildren learning to write better by reading books. Furthermore, “the Act seeks to advance original works of authorship, not to protect authors against competition.” Anthropic Order at 28. For the copies used to build a central library, he again drew the “purchased vs. pirated” distinction. For the purchased hard copies that Anthropic digitized, this factor was neutral because the digitization was just a format change and there was no market displacement from buying print rather than electronic versions. But for the pirated copies that “plainly displaced demand for Authors’ books—copy for copy,” the factor weighed against fair use. Anthropic Order at 29-30.
Judge Chhabria emphasized the importance of factor four much more than Judge Alsup did. However, he acknowledged that because Meta’s use was so transformative at factor one, the market impact had to weigh strongly against fair use to carry the day for the entire analysis. He identified three ways that a plaintiff could argue market impact in this context: (1) outputs that are substantially similar; (2) harm to the market for licensing; and (3) outputs that are “similar enough” that they will compete with originals and substitute for them. Meta Order at 26. Here, option (1) failed because Llama did not generate any meaningful portion of the books, and option (2) failed because the argument was circular: the harm from lost licensing fees would be present in every fair use analysis. Judge Chhabria discussed at length the third, “far more promising” option, that AI would cause “market dilution” by creating works using a fraction of the time and creativity. He explained that although LLMs might not harm famous authors like Agatha Christie, they could prevent the emergence of the next Agatha Christie. Meta Order at 29. This “indirect substitution” by vaguely similar works is normally negligible when comparing an original work with a single secondary work. “This case is different,” he wrote, because it “involves a technology that can generate literally millions of secondary works, with a miniscule fraction of the time and creativity used to create the original works it was trained on.” Meta Order at 32. Despite these misgivings, he found that the authors had not put forward enough evidence to create an issue of fact sufficient to defeat summary judgment on that “market dilution” theory. But his ruling expressed discomfort with this outcome, noting that his “conclusion may be in significant tension with reality.” Meta Order at 36.
What (Might) Come Next
There are too many takeaways from these important opinions to count, so we’ll limit ourselves to five observations on what might come next:
- Using copyrighted works to train AI LLMs is “transformative”: Both judges were unequivocal on the transformative purpose of training AI tools with copyrighted works. Judge Alsup emphasized it was “among the most transformative many of us will see in our lifetimes” and Judge Chhabria stated that “there’s no disputing” its transformative nature. Academics and AI companies have been pushing the courts and the Copyright Office to recognize the transformative nature of AI for a long time.[2] These two rulings from respected and highly analytical judges suggest that courts are likely to agree.
- AI outputs are a different case: AI copyright issues involve both what is inputted to the LLM to train the model and what is outputted by the LLM in response to user prompts. But both these sets of allegations and rulings focused on the inputs. Judge Alsup noted that if the allegations involved infringing outputs, “this would be a different case.” Anthropic Order at 12, 28. He all but invited plaintiffs to bring such a case, reminding Authors that they “remain free to bring that case in the future.” Anthropic Order at 28. (The recent complaint filed by Disney and Universal against Midjourney in C.D. Cal. for allegedly generating infringing images of their copyrighted characters is one such example of an output-focused case.) Given the difficulty that the plaintiffs in these cases faced in articulating harm to the market, pursuing cases based on output may be the next step for plaintiffs.
- Pirating copyrighted works to train an AI is an open question: The two courts decided the question of how to deal with pirated works differently. While Judge Alsup found downloading the pirated copies to build a library was itself an inherently infringing use “full stop,” Judge Chhabria said this fact was “not an automatic win” for Plaintiffs because it had to be considered “in light of its ultimate, highly transformative purpose” of training the AI, and found the transformative nature outweighed this fact. One could imagine plaintiffs picking up Judge Alsup’s view and pursuing claims based solely on pirated works.
- Factor four might be the crux: The two courts both found that factor four favored fair use, but examined it differently. Judge Alsup found there was no market displacement for the authors’ works (for the training use), opining that the creation of more works is simply competition which the Copyright Act is not designed to prevent. Judge Chhabria focused instead on “market dilution” and, although he could not find for plaintiffs in this case, he stated that “it seems likely that market dilution will often cause plaintiffs to decisively win the fourth factor—and thus win the fair use question overall—in cases like this.” Meta Order at 32. It seems likely that future plaintiffs will sharpen their market dilution theories in an effort to overcome the highly transformative nature of this technology.
- This is not the end of either story: Neither of these rulings disposes of the entire case. The Anthropic case will go forward on the issue of the pirated copies used to create Anthropic’s central library, and there will be further litigation regarding the library copies for uses other than for training LLMs. Anthropic Order at 31. The Meta case still involves the issue of plaintiffs’ separate claim that Meta unlawfully distributed their works during the downloading process (a byproduct of “torrenting”). While this certainly was not the core conduct that plaintiffs were aiming at when they filed suit, this is a good illustration of the difficulty we discussed above—when does one step in the chain of creating an LLM constitute a separate “use,” and when is it just part of the overall process?
Finally, these cases are a good reminder that the fair use doctrine is adaptable. Both judges applied the traditional fair use factors to this transformative new technology, but they analyzed similar fact patterns in different ways—highlighting the subjectivity and fluidity of the fair use doctrine. While we cannot know exactly what comes next, it seems very likely that these lengthy and thoughtful rulings will form the groundwork of judicial analysis and litigation in the coming years.
[1] LLM stands for Large Language Model, a type of AI system trained on large volumes of text to measure statistical probabilities of words and produce understandable responses to user prompts.
[2] See, e.g., the May 2025 “Copyright and Artificial Intelligence” report issued by the Copyright Office, available at https://www.copyright.gov/ai/Copyright-and-Artificial-Intelligence-Part-3-Generative-AI-Training-Report-Pre-Publication-Version.pdf.