Cloudflare launches a tool to combat AI bots

Image: grey robot head on red background (Image Credits: Getty Images)

Cloudflare, the publicly traded cloud service provider, has launched a new, free tool to prevent bots from scraping websites hosted on its platform for data to train AI models.

Some AI vendors, including Google, OpenAI and Apple, allow website owners to block the bots they use for data scraping and model training by amending their site’s robots.txt, the text file that tells bots which pages they can access on a website. But, as Cloudflare points out in a post announcing its bot-combating tool, not all AI scrapers respect those rules.
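For readers unfamiliar with how a robots.txt opt-out works in practice, here is a minimal sketch using Python's standard `urllib.robotparser`. The user-agent tokens `GPTBot`, `Google-Extended` and `Applebot-Extended` are the ones those three vendors publish for AI training opt-outs; they are not named in Cloudflare's post, and `example.com` and the `is_allowed` helper are placeholders for illustration. The check only shows what a rule-following crawler would conclude, since nothing in robots.txt forces a bot to comply.

```python
from urllib.robotparser import RobotFileParser

# Illustrative robots.txt: block three AI-training crawlers, allow everyone else.
# The tokens are the vendors' published ones; the file itself is an assumption.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: Google-Extended
Disallow: /

User-agent: Applebot-Extended
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if a compliant crawler with this user agent may fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())  # parse from memory, no network call
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "https://example.com/article"  # placeholder URL
    for agent in ("GPTBot", "Google-Extended", "Applebot-Extended", "Mozilla"):
        print(agent, "allowed" if is_allowed(agent, page) else "blocked")
```

A crawler that skips this check, or that identifies itself with a browser-like user agent, is unaffected by the file, which is the gap Cloudflare says its tool is meant to close.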

“Customers don’t want AI bots visiting their websites, and especially those that do so dishonestly,” the company writes on its official blog. “We fear that some AI companies intent on circumventing rules to access content will persistently adapt to evade bot detection.”

So, in an attempt to address the problem, Cloudflare analyzed AI bot and crawler traffic to fine-tune automatic bot detection models. The models consider, among other factors, whether an AI bot might be trying to evade detection by mimicking the appearance and behavior of someone using a web browser.

“When bad actors attempt to crawl websites at scale, they generally use tools and frameworks that we are able to fingerprint,” Cloudflare writes. “Based on these signals, our models [are] able to appropriately flag traffic from evasive AI bots as bots.”

Cloudflare has set up a form for hosts to report suspected AI bots and crawlers and says that it’ll continue to manually blacklist AI bots over time.

The problem of AI bots has come into sharp relief as the generative AI boom fuels the demand for model training data.

Many sites, wary of AI vendors training models on their content without alerting or compensating them, have opted to block AI scrapers and crawlers. Around 26% of the top 1,000 sites on the web have blocked OpenAI’s bot, according to one study; another found that more than 600 news publishers had blocked the bot.

Blocking isn’t a surefire protection, however. As alluded to earlier, some vendors appear to be ignoring standard bot exclusion rules to gain a competitive advantage in the AI race. AI search engine Perplexity was recently accused of impersonating legitimate visitors to scrape content from websites, and OpenAI and Anthropic are said to have at times ignored robots.txt rules.

In a letter to publishers last month, content licensing startup TollBit said that, in fact, it sees “many AI agents” ignoring the robots.txt standard.

Tools like Cloudflare’s could help, but only if they prove accurate at detecting clandestine AI bots. And they won’t solve the more intractable problem that publishers risk losing referral traffic from AI tools like Google’s AI Overviews, which leave out sites that block specific AI crawlers.
