In the past few months, the administration, the Copyright Office, and the courts have weighed in on several material issues at the intersection of copyright law and AI. The White House’s recent announcement of its AI Action Plan offers an opportunity to examine the alignment and discord on key issues relating to fair use.
Specifically, this article dissects three key issues and how they are being considered in the evaluation of fair use: the use of pirated works for training AI models; the “dilution” theory of market harm; and whether legislation and regulation are necessary. While there are clear points of divergence between the White House, the Copyright Office, and the courts, the areas of alignment provide a foundational framework for stakeholders to navigate today’s landscape while also preparing for tomorrow’s inevitable changes.
I. Copyright Office: AI Report
Since July 2024, the Copyright Office has issued a Copyright and Artificial Intelligence Report in three parts, examining critical copyright law and policy issues raised by the development and use of AI. Part 3 of the Copyright Office Report (“Report”), which was pre-published in May 2025, focused on the use of copyrighted works in the training and development of generative AI systems, including whether the fair use doctrine applies to the use of copyrighted works to train generative AI tools.
Although the Report concludes that an ultimate fair use determination depends on the facts and circumstances of each case, it extensively evaluates the fair use factors, opining on the impact of using pirated works as training data, the “dilution” type of market harm, and whether there is a need for legislation or regulation.
a. Use of Pirated Works Is Relevant to Fair Use Factors 1 and 4 and Makes Fair Use Less Likely
In evaluating the fair use factors, Part 3 of the Report discusses using pirated versions of copyrighted works as training data. The Copyright Office concludes that while a determination of fair use will depend on the facts of each case, the use of pirated works will make a finding of fair use less likely:
On one end of the spectrum, uses for purposes of noncommercial research or analysis that do not enable portions of the works to be reproduced in the outputs are likely to be fair. On the other end, the copying of expressive works from pirate sources in order to generate unrestricted content that competes in the marketplace, when licensing is reasonably available, is unlikely to qualify as fair use.
As to the first fair use factor, the character of the use, the Copyright Office explains that it agrees with the comments submitted by stakeholders that whether the AI developer had lawful access to the works used in training should be considered, stating: “In the Office’s view, the knowing use of a dataset that consists of pirated or illegally accessed works should weigh against fair use without being determinative” because “[g]aining unlawful access [] bears on the character of the use.”
As to the fourth factor, potential market harm, the Copyright Office opines that the “use of pirated collections of copyrighted works to build a training library, or the distribution of such a library to the public, would harm the market for access to those works” and would thus again weigh against a finding of fair use.
b. Endorsing the Theory of Market Harm by Dilution
Part 3 of the Report dedicates a section of its analysis of fair use factor 4 to the “Market Dilution” theory, highlighting a tension among stakeholder comments: “[a] number of commenters contended that courts should consider the harms caused where a generative AI model’s outputs, even if not substantially similar to a specific copyrighted work, compete in the market for that type of work” but that “[o]ther commenters argued that the fourth factor analysis considers only harm to markets for the specific copyrighted work.”
The Copyright Office goes on to endorse the market dilution theory, noting that while this is “uncharted territory,” “[t]he statute on its face encompasses any ‘effect’ upon the potential market” and “[t]he speed and scale at which AI systems generate content pose a serious risk of diluting markets for works of the same kind as in their training data.” The Copyright Office opines that AI-generated works could saturate the market, leading to fewer human-authored works being purchased and, specifically for music, diluting established royalties. It argues that the “threat is more acute” when the styles of creators are copied “because of the technology’s ability to produce works so similar in style ‘that the average person cannot discern a difference in the marketplace[,] . . . creat[ing] direct competition with the creators whose works have been used to train the model.’” (quoting the Writers Guild of America’s submission).
c. Legislation and Regulation Are “Premature”
In advance of issuing the Report, the Copyright Office also solicited comment on statutory or regulatory approaches adopted in other countries that could be used in the U.S. The Copyright Office discusses these developments in the international community but notes the lack of consensus in stakeholder support for any statutory change in the U.S. other than the benefits of harmonization. Thus, the Copyright Office states that government intervention would be “premature at this time” and that voluntary licensing markets would instead be the most appropriate way to serve the interests of both the creative and technology industries, concluding:
In our view, American leadership in the AI space would best be furthered by supporting both of these world-class industries that contribute so much to our economic and cultural advancement. Effective licensing options can ensure that innovation continues to advance without undermining intellectual property rights. These groundbreaking technologies should benefit both the innovators who design them and the creators whose content fuels them, as well as the general public.
II. Executive Branch: The White House’s AI Action Plan
On July 23, 2025, the White House released its AI Action Plan (“the Plan”), which includes over 90 policy recommendations with the stated goals of accelerating innovation, building out AI infrastructure, and enhancing global partnerships and security. In conjunction with the Plan, the White House issued executive orders implementing elements of the Plan. While the Plan and related executive orders did not expressly discuss copyright law, President Trump’s accompanying speech announcing the Plan touched briefly on intellectual property, stating that “what we really need to be successful is a very simple phrase called common sense, and that begins with a common-sense application of artificial and intellectual property rules” and providing some insight into the administration’s views regarding copyright law and the fair use framework.
a. Licensing Not Feasible, but the Use of Pirated Works May Be Disfavored
President Trump’s announcement referenced considerations surrounding the use of copyrighted content for training, taking the position that licensing does not seem feasible:
You can’t be expected to have a successful AI program when every single article, book, or anything else that you’ve read or studied, you’re supposed to pay for. Gee, I read a book, I’m supposed to pay somebody. And you know we appreciate that, but you just can’t do it because it’s not doable. And if you’re going to try and do that, you’re not going to have a successful program. I think most of the people in the room know what I mean. When a person reads a book or an article, you’ve gained great knowledge. That does not mean that you’re violating copyright laws or have to make deals with every content provider. And that’s a big thing that you’re working on right now. I know, but you just can’t do it.
He also made statements that echoed the arguments several AI developers made in their stakeholder comments to the administration for consideration in developing the Plan, specifically that requiring licenses would put American developers at a disadvantage compared with China:
China is not doing it [licensing]. And if you’re going to be beating China, and right now we’re leading China very substantially in AI, very, very substantially. And nobody’s seen the [] amount of work that’s going to be bursting upon the scene, but you have to be able to play by the same set of rules.
However, he also signaled that use beyond training would be a step too far:
Of course, you can’t copy or plagiarize an article, but if you read an article and learn from it, we have to allow AI to use that pool of knowledge without going through the complexity of contract negotiations, of which there would be thousands for every time we use AI.
This last statement suggests that the administration would consider the output of an AI tool that was a copy not to be a fair use. Furthermore, a broad reading of the statement suggests that the administration may disfavor the use of pirated libraries, which comprise copied and plagiarized works, as content for training an AI tool.
b. Legislation and Regulation of AI Should Be at the Federal Level
The White House was unequivocal that AI should be regulated by “federal rule and regulation” and “not 50 different states regulating this industry of the future.” While the Plan generally decries overregulation, President Trump stated in his speech: “We need one common sense federal standard that supersedes all states, supersedes everybody, so you don’t end up in litigation with 43 states at one time. You got to go litigation-free. It’s the only way.”
III. Judicial Branch: Federal District Court Fair Use Opinions
There have been two recent federal court decisions from the Northern District of California relating to the use of copyrighted works to train generative AI tools: Kadrey, et al. v. Meta Platforms, Inc., N.D. Cal. Case No. 23-cv-03417-VC (“Meta”) and Bartz et al. v. Anthropic PBC, N.D. Cal. Case No. 24-cv-05417 (“Anthropic”). Although both courts found in favor of fair use, each took a somewhat different approach to the issues of pirated works and the dilution theory of market harm.
a. Use of Pirated Works as Training Data Is Relevant, but Courts Split on Whether Dispositive
In the Anthropic decision, Judge Alsup differentiated between the use of pirated versus authorized copies in the creation of the central library that was used to train the AI tool. The creation of “a permanent, general-purpose library was not itself a fair use excusing Anthropic’s piracy” because “Anthropic had no entitlement to use pirated copies for its central library.” Moreover, he opined, Anthropic was never entitled to create or hold copies of the pirated works, meaning that “almost any unauthorized copying would have been too much.”
On the other hand, in the Meta decision, Judge Chhabria rejected plaintiffs’ argument that the use of “shadow libraries” for training was dispositive against fair use. Instead, the court reasoned that the source of the training material could be relevant to a number of inquiries (whether the “use” of the copyrighted works was in bad faith, whether the defendant’s use of shadow libraries encouraged the infringers who created these libraries to continue their copying, and whether there was potential market harm), but it was not dispositive.
b. Market Dilution Theory of Harm Embraced by One Court and Rejected by the Other
In Meta, Judge Chhabria lauded the market dilution theory of harm when considering the fourth fair use factor, aligning with the approach endorsed by the Copyright Office. Because of the capability of AI technology to “generate literally millions of secondary works” at a fraction of the time and creativity needed to create the underlying works, the court stated that it was “highly relevant” to consider whether the outputs of the AI system could serve as an indirect replacement of plaintiffs’ works. Ultimately, the court determined that market dilution theory did not create a triable issue in that case because of the insufficiency of pleading and proof by the plaintiffs.
In contrast, in Anthropic, Judge Alsup summarily rejected the plaintiffs’ market dilution theory of harm on the basis that it was “not the kind of competitive or creative displacement that concerns the Copyright Act.”
IV. Points of Alignment Provide Some Guidance
In recent months, we have received guidance on pivotal issues relating to fair use and AI regulation. Although there are clear points of disagreement between the approaches endorsed by the White House, the Copyright Office, and the courts, the points of alignment provide companies with some direction.
In sum, as to use of pirated works for training AI, there is fairly good alignment: the Copyright Office opined and one court has found that use of “shadow libraries” or pirated works weighs against fair use; another court found that it was relevant even if not dispositive to fair use, and the administration hinted that it may follow that view.
As to the “market dilution” theory of financial harm and the evaluation of the fourth fair use factor, there is less consensus: the Copyright Office supports the theory; one court has found in favor of this theory, while another court rejected the theory; and the administration has not stated a clear position.
As to whether new legislation is needed to address the use of copyrighted works for training AI, the status quo and the market may well supersede any legislation, given weak consensus: the Copyright Office suggested that federal legislation is premature and that voluntary free market licensing is the preferred solution; the administration firmly stated that federal legislation—as opposed to state legislation—is appropriate, but did not indicate imminent legislation related to copyright or fair use; and meanwhile, many stakeholders, including some of the largest AI developers, are entering licensing arrangements with content creators.
Indeed, there appears to be a general consensus among stakeholders supporting a free-market approach to licensing content for AI tool development. This is occurring in real time in the marketplace, as reflected by a growing licensing market. Many AI developers are exploring scalable licensing mechanisms, which will enable them to stay competitive, ensure access to the most valuable content (and arguably ensure continued creation of such content), reduce the risks associated with using pirated works, and reduce the risks of non-copyright data-scraping claims such as breach of contract/terms of service, unjust enrichment, and trespass to chattels, which are emerging in litigation.
Many AI developers are also implementing guardrails in generative AI tools to prevent output that directly replicates their training data or known intellectual property. Such measures seemingly align with the shared interest across the administration, the Copyright Office, and the courts in solutions that promote innovation while protecting IP owners.
V. Looking Forward
Like AI technology itself, the legal landscape will continue to change, and material developments are likely in the coming months. The federal court rulings on fair use discussed above are being, or are likely to be, appealed, and numerous other AI copyright infringement cases are pending in the district courts. Until these appeals conclude, issues regarding the use of pirated works and the dilution theory of market harm will continue to be district- and fact-specific. Furthermore, the White House’s AI Action Plan includes over 90 policy recommendations, only a limited number of which were addressed in the related executive orders. Thus, although there is a broad framework to the administration’s approach to AI, actual implementation remains unclear. Given this state of flux, stakeholders should remain vigilant and monitor developments across the legal and regulatory landscape, with a particular focus on points of alignment between the positions taken by the White House, the Copyright Office, and the courts. While business decisions based on the status quo may need to be adjusted, those based on the aligned positions and market trends will be on firmer ground.