There is an easy answer to the question “How long does it take to write a book?” but it is not a true one. The easy answer is the duration between when you actually start putting your pen down on the first draft and when the book is finished. Maybe that’s two or three years. Maybe it’s 10. But the true answer is that it takes a lifetime to write a book.
Like creative art of any kind, the work is an accumulation of experiences. A book is not just words in a random order. It is made up of all the smells and sounds and emotions and people you’ve ever encountered. At their best, books provide a snapshot of what a person’s life looked like for a specific period of time, or at least how they felt. My most recent book, for example, has everything I read and thought and dreamed about for at least half a decade in it.
Writing the words down in a particular order and arranging them for publication is not the work of a book. The work is the life you live in the world. And because a book is a life, and a person’s life in particular and all of their effort, it’s especially egregious and evil for a bunch of wannabe billionaires to have stolen hundreds of thousands of books and fed them into their little machines so that they could create Large Language Models that quite literally encourage people to kill themselves.
This past Friday, the AI company Anthropic agreed to settle a landmark class-action lawsuit brought against it by book authors who claimed the company pirated their works. The settlement is reported to be $1.5 billion, which comes out to about $3,000 for each of the approximately 500,000 stolen works. If the presiding judge approves the settlement (which could happen as early as today), this would be a landmark case for creative copyright, the first of its kind in the AI era.
Anthropic is the company behind the Claude chatbot, a competitor to ChatGPT. Claude was trained using books: In simple terms, engineers feed the text of books to Claude as the basis for the predictive calculations the chatbot makes when it assembles a response to a prompt. A court has ruled that this is fine. But some of the books that Claude was trained on were illegally downloaded from the file-sharing service Library Genesis (or LibGen), which is one of the largest repositories of stolen books in the world. Because this sucks in one thousand ways, a group of three authors brought a case against Anthropic that became a class-action lawsuit to represent all of the authors whose books were stolen, and Anthropic settled the case before it could reach a potentially damaging judgment.
Courts have ruled—though the AI companies had already taken the books and used them by the time of these rulings, and I personally find this a little dubious as well—that the AI companies could have used published books and papers to train their models if they had acquired those works by legal means. Had the companies purchased the books and fed them into their little stupid machines, judges have ruled that would be “fair use” because the chatbot’s output is a transformation of the original material. (Some AI firms have begun purchasing works, but not before they had committed crimes, and not without arguing that being expected not to steal things is unfair to them.) Instead, the companies downloaded millions of books illegally and used them to train their precious AIs.
To be clear, Anthropic is far from the only company to have done this. It is just the first company being forced into legal conversations with its victims. OpenAI and its partner Microsoft are being accused of stealing copyrighted material. Another case is being brought against Meta.
One and a half billion dollars sounds like a lot of money, but in proper scale it’s not. Just a week ago Anthropic raised $13 billion in a deal that valued the company at $183 billion; the settlement sum represents less than one percent of that valuation. In order to achieve that valuation, Anthropic stole the work of many thousands of people, and then paid less than 1/100th of its valuation to simply make the whole thing go away—the settlement does not even require Anthropic to admit any wrongdoing. As a country we should not allow multibillion-dollar companies to commit theft at vast scale and then settle out of court for a negligible fraction of their ill-gotten gains.
The Washington Post estimates that as many as 500,000 books might be eligible for payment from this settlement, in which case payouts would come to about $3,000 per book stolen. For most of us $3,000 is a significant sum of money; there may be some authors who’d happily give over their work to train language models for that much. I personally would not. Is there an amount of money I would have agreed to? Sure, everyone has a price. But that is irrelevant here!
Stealing is a crime. It is theft. The perpetrators of the crimes are criminals. Sure, there are varied levels of crime, but you do not get to settle out of court when you steal other things. If I walked into a bank today and demanded a teller put a bunch of the bank’s money into my bag, I could not then use that money to buy lottery tickets, win, and then settle out of court for an agreement to give the bank its money back. Or at any rate that would not save me from being thrown into jail for robbing a bank. That’s just not how it works. It does not matter that the stolen books still exist and can be read, just as it would not matter if the money eventually returned to the bank. The problem is that you stole the money in the first place, which is illegal and immoral, and used it for your own personal gain.
“The lesson for A.I. developers is clear,” Kristian Stout, director of innovation policy at the International Center for Law and Economics, told The New York Times: “Respect copyright in how data sets are acquired, and follow the example Anthropic itself has now committed to.” That example—which Anthropic is now setting only because a court ordered it—is purchasing the books it wants to use to train its AI. The example it is setting is … following the laws of the country in which it does business.
Absolutely not! The lesson these tech companies with their “move fast and break things” attitudes have been learning for decades is that the penalties for rampant lawbreaking will be cheaper and easier to bear than complying with the law would have been. This is not how a country should run. That is not how this country runs for most of us when we break a law.
As an astute Bluesky user wrote:
This is the core of the issue with the entire ethos of moving fast and breaking things. Over and over again for several decades now, young men in Silicon Valley have made the calculation that breaking dozens of laws is worth it to make infinite money. Take what Mike Isaac wrote in his book Super Pumped: The Battle for Uber about Uber founder Travis Kalanick: “Kalanick viewed fines and tickets as just another cost of doing business,” and “Uber spared no expense on lobbying campaigns. The company regularly topped the list of biggest spenders […] throwing down tens of millions of dollars annually to sway lawmakers.” The laws effectively do not apply to these people, and whenever it seems like they might, the companies will throw around Monopoly money gifted to them by venture capital funds to simply change the laws.
“The [Anthropic] agreement is reminiscent of the early 2000s,” Cade Metz writes in the New York Times, “when courts ruled that file-sharing services like Napster and Grokster infringed on rights holders by allowing copyrighted songs, movies and other material to be shared free on the internet.” Perhaps it is reminiscent in terms of rights holders (artists) turning to the courts to get paid for their pilfered work, but these smarmy Silicon Valley idiots haven’t just stolen the art and given it away for free. Napster and Grokster pirated music—but the pirated Britney Spears song was still, when it reached the Napster user’s ears, the Britney Spears song. This is worse!
At least when a book or a song or a movie is pirated the person pirating it in theory consumes the art and maybe gets something out of it—the work itself retains its artistic value and meaning. This newer brand of thievery both rips off the author and churns their art into something evil. The AI companies have stolen our work not so that people might enjoy it illegally but so that the AI companies themselves can grind it into profitable sludge.
It is an educated guess that at least one of my books was stolen by these companies to train their machines. Both of my books appear in LibGen, the pirated books database that Anthropic used to train its AI. “Authors are celebrating a ‘historic’ settlement,” Ars Technica wrote. Not me. I am not fucking celebrating. In fact, I am more furious than I was before the settlement!
This of course is part of the problem with a class-action lawsuit: Not everyone represented by the class will be happy with the outcome. It’s possible my book isn’t even one of the ones Anthropic stole. But this isn’t about wanting a bigger payout. I do not care about how much money I get in a class-action settlement with a bunch of evil Silicon Valley companies because I do not ever want to settle with them. I want them to lose gargantuan sums in legal fees for years fighting for a version of reality in which they aren’t the bad guys, in which they didn’t do blatantly illegal and immoral things in order to make quick money with their new machines that have not changed the world for the positive in any way so far.
“Because the library of books amassed by Anthropic was thought to contain approximately seven million works,” Kate Knibbs writes in Wired, “the AI company was potentially facing court-imposed penalties amounting to billions, possibly more than $1 trillion dollars.” That is what I would have preferred, as an author, as someone who makes creative work. Sure, that might mean that people don’t get a payout immediately, and that the case could be tied up in courts forever as the parties file alternating appeals. This is obviously not to the benefit of plaintiffs’ lawyers looking to get paid sooner, but it is to the benefit of prevention. The consequences for theft should include that the profit you made off of the theft is taken away. The cost of building a company on theft should be the company itself. It should be everything you worked for because really, you didn’t work for it: You stole it.