A billion dollars isn’t what it used to be—but it still focuses the mind. At least it did for me when I heard that the AI company Anthropic agreed to a settlement of at least $1.5 billion for authors and publishers whose books were used to train an early version of its large language model, Claude. This came after a judge issued a summary judgment that the company had pirated the books it used. The proposed agreement—which is still under scrutiny by the wary judge—would reportedly grant authors a minimum of $3,000 per book. I’ve written eight and my wife has notched five. We are talking bathroom-renovation dollars here!
Since the settlement is based on pirated books, it doesn’t really address the big issue of whether it’s OK for AI companies to train their models on copyrighted works. But it’s significant that real money is involved. Previously the argument over AI copyright was based on legal, moral, and even political hypotheticals. Now that things are getting real, it’s time to tackle the fundamental issue: Since elite AI depends on book content, is it fair for companies to build trillion-dollar businesses without paying authors?
Legalities aside, I have been struggling with the issue. But now that we’re moving from the courthouse to the checkbook, the scales have fallen from my eyes. I deserve those dollars! Paying authors feels like the right thing to do, despite the powerful forces (including US president Donald Trump) arguing otherwise.
Fine-Print Disclaimer
Before I go further, let me drop a whopper of a disclaimer. As I mentioned, I’m an author myself, and I stand to gain or lose from the outcome of this argument. I’m also on the council of the Authors Guild, which is a strong advocate for authors and is suing OpenAI and Microsoft for including authors’ works in their training runs. (Because I cover tech companies, I abstain on votes involving litigation with those firms.) Obviously, I’m speaking for myself today.
In the past, I’ve been a secret outlier on the council, genuinely torn on the issue of whether companies have the right to train their models on legally purchased books. The argument that humanity is building a vast compendium of human knowledge genuinely resonates with me. When I interviewed the artist Grimes in 2023, she expressed enthusiasm over being a contributor to this experiment: “Oh, sick, I might get to live forever!” she said. That vibed with me, too. Spreading my consciousness widely is a big reason I love what I do.
But embedding a book inside a large language model built by a giant corporation is something different. Keep in mind that books are arguably the most valuable corpus that an AI model can ingest. Their length and coherence are unique tutors of human thought. The subjects they cover are vast and comprehensive. They are much more reliable than social media and provide a deeper understanding than news articles. I would venture to say that without books, large language models would be immeasurably weaker.
So one might argue that OpenAI, Google, Meta, Anthropic, and the rest should pay handsomely for access to books. Late last month, at that shameful White House tech dinner, CEOs took turns impressing Donald Trump with the insane sums they were allegedly investing in US-based data centers to meet AI’s computation demands. Apple promised $600 billion, and Meta said it would match that amount. OpenAI is part of a $500 billion joint venture called Stargate. Compared to those numbers, the $1.5 billion that Anthropic agreed to distribute to authors and publishers in the settlement doesn’t sound so impressive.
Unfair Use
Nonetheless, it could well be that the law is on the side of those companies. Copyright law allows for something called “fair use,” which permits the uncompensated exploitation of books and articles based on several criteria, one of which is whether the use is “transformative”—meaning that it builds on the book’s content in an innovative manner that doesn’t compete with the original product. The judge in charge of the Anthropic infringement case has ruled that using legally obtained books in training is indeed protected by fair use. Determining this is an awkward exercise, since we are dealing with legal yardsticks drawn up before the internet—let alone AI.
Obviously, there needs to be a solution based on contemporary circumstances. The White House’s AI Action Plan, announced this summer, didn’t offer one. But in his remarks about the plan, Trump weighed in on the issue. In his view, authors shouldn’t be paid—because it’s too hard to set up a system that would pay them fairly. “You can’t be expected to have a successful AI program when every single article, book, or anything else that you’ve read or studied, you’re supposed to pay for,” Trump said. “We appreciate that, but just can’t do it—because it’s not doable.” (An administration source told me this week that the statement “sets the tone” for official policy.)
The “too hard to implement” argument is absurd. The overlords of AI constantly boast that their products are going to solve the deep mysteries of the universe. Surely they can handle what is essentially a bookkeeping challenge. We’ve managed to pull off the more difficult task of accounting for creator rights in the music industry, where an elaborate system of tracking helps identify when and where songs are played. “All we need to do is put in place a collective licensing system, and we have come up with several different proposals for this,” says Authors Guild CEO Mary Rasenberger. “The AI companies aren’t biting because it might undermine their fair use argument.”
Nonetheless, that bogus claim about the difficulty of paying authors is a critical part of a new justification for withholding payment. This argument holds that the survival of the United States depends on beating China in AI. Since book content is crucial in building elite AI, paying authors is an unaffordable distraction. It’s a national security issue! When the judge in the Anthropic case ruled that training LLMs with books was fair use, AI czar David Sacks applauded the decision. “China is going to train on all the data regardless, so without fair use, the US would lose the AI race,” he said. So now we’re supposed to emulate China in protecting creative rights?
National security is also the underpinning of an unsolicited “solution” to the problem offered by Stewart Baker, a former general counsel of the NSA. He recently posted that Trump should invoke a wartime law called the Defense Production Act, which allows the government to commandeer businesses in times of war, to justify the unapproved use of copyrighted books to train AI models. Authors would be entitled to no more than the royalties they might receive from a single book. “If the companies bought the book there would be no damage [to the author],” he told me. Explain that to someone like Robert Caro, who spent years researching the US Senate for Book 3 of his Lyndon Johnson biography. His magisterial account of that institution likely informs what we see in LLMs, and millions of paid users benefit. Enjoy that pumpkin latte bought with your compensation, Bob!
Of course, even if authors do get four- or even five-figure sums for the use of their books to train AI, that would not address the most serious problem of all—the fact that people aren’t reading. In my years covering the tech business I am often shocked at the disregard some leaders have for books and authors. Last April, Jack Dorsey tweeted “Delete all IP law.” Elon Musk replied “I agree.” When I did my book about Google, Sergey Brin told me how antiquated an idea it was to spend so much time writing a long narrative when questions about the company might best be answered through a search engine. Now, I imagine, he and his colleagues would say that an appetite for deep learning can be more than satisfied by responses to AI prompts. I defy you to curl up with one.
This is an edition of Steven Levy’s Backchannel newsletter. Read previous newsletters here.