Thousands of authors sign letter urging AI makers to stop stealing books

If you ask GPT-4 to write a passage in the style of Carmen Machado or Margaret Atwood or Alexander Chee, it will do a fair job of it, and for good reason: It likely ingested all their works in the training process, and now draws on their ingenuity as its own. But these authors, and thousands more, are not happy with this fact.

In an open letter signed by more than 8,500 authors of fiction, non-fiction and poetry, the tech companies behind large language models like ChatGPT, Bard, LLaMa and more are taken to task for using their writing without permission or compensation.

“These technologies mimic and regurgitate our language, stories, style, and ideas. Millions of copyrighted books, articles, essays, and poetry provide the ‘food’ for AI systems, endless meals for which there has been no bill,” the letter reads.

Despite their systems proving capable of quoting and imitating the authors in question, AI developers have not substantially addressed the provenance of these works. Are they trained on samples scraped from bookstores and reviews? Did they borrow every book from the library? Or perhaps they simply downloaded one of the many illegal archives, like Libgen?

One thing is certain: They did not go to publishers and license them — no doubt the preferred method, and arguably the only legal and ethical one. As the authors write:

“Not only does the recent Supreme Court decision in Warhol v. Goldsmith make clear that the high commerciality of your use argues against fair use, but no court would excuse copying illegally sourced works as fair use. As a result of embedding our writings in your systems, generative AI threatens to damage our profession by flooding the market with mediocre, machine-written books, stories, and journalism based on our work.”

Indeed, we have already seen this happen. Recently a number of very low-quality AI-generated works were climbing the YA best-seller lists at Amazon; publishers are inundated with generated works; and every day this very website (and shortly, this post) is scraped for content to be repurposed into chum for SEO.

These malicious actors are using the tools, APIs and agents developed by the likes of OpenAI and Meta, which can themselves be said to be malicious actors in this context. After all, who else would knowingly steal millions of works to power a new commercial product? (Well, Google, of course, but search indexing is meaningfully different from AI ingestion, and Google Books at least had the excuse that it was meant to be a dedicated index.)

With fewer authors able to make a living writing due to the complexities and narrow margins of large-scale publishing, the open letter warns that this is an untenable situation for them, “especially young writers and voices from under-represented communities.”

The letter asks the companies to do the following:

1. Obtain permission for use of our copyrighted material in your generative AI programs.

2. Compensate writers fairly for the past and ongoing use of our works in your generative AI programs.

3. Compensate writers fairly for the use of our works in AI output, whether or not the outputs are infringing under current law.

No legal threat is made. As the CEO of the Authors Guild (and signatory) Mary Rasenberger told NPR, “Lawsuits are a tremendous amount of money. They take a really long time.” And AI is harming authors now.

Which company will be the first to say “yes, we built our AI on stolen works and we’re sorry, and we’re going to pay for it”? It’s anyone’s guess, but there seems to be little incentive to do so. Most people are not aware or concerned that LLMs are created through what amount to illicit means, and that they may in fact contain and regurgitate copyrighted works. It’s easier to see the (very similar) problem when it’s a generated image reproducing an artist’s distinctive style, and there is some pushback there.

But the subtler harm of using all of George Saunders’ or Diana Gabaldon’s books as “food” for one’s AI may not spur as many to action, though plenty of authors are ready to fight.
