Cloudflare offers free tool to block AI bots

The San Francisco-based internet infrastructure company Cloudflare has introduced a new tool that blocks AI bots, web crawlers and scraping software.

Training AI tools requires large amounts of data. Developers gather much of it from the public internet, using AI crawlers or scraping software to collect what they need.

Not all website owners are happy with these practices. In an attempt to stop the crawlers, they add rules to their robots.txt file stating that certain so-called user-agents are not allowed to scrape their website.
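
As a rough illustration of how such an opt-out works, the sketch below feeds a small robots.txt file to Python's standard-library parser and asks whether two crawlers may fetch a page. The file contents, the example.com URL and the name SomeOtherBot are assumptions made up for this example. Note that the check only matters if the crawler actually performs it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: one named AI crawler is barred from the
# whole site, every other user-agent is allowed.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text held in memory instead of fetching it."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)
page = "https://example.com/article"  # placeholder URL

# A well-behaved crawler runs this check before requesting the page.
gptbot_allowed = parser.can_fetch("GPTBot", page)
other_allowed = parser.can_fetch("SomeOtherBot", page)

print("GPTBot allowed:", gptbot_allowed)
print("SomeOtherBot allowed:", other_allowed)
```

Nothing in the file enforces the rule: a crawler that never calls a check like can_fetch simply requests the page anyway.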

However, not all AI crawlers honor those wishes. Business Insider recently reported that companies like OpenAI and Anthropic are ignoring requests from media publishers not to scrape their content for AI training purposes.

“We hear clearly that customers don’t want AI bots visiting their websites, and especially those that do so dishonestly,” Cloudflare states in a blog post. That’s why the American company created a new tool that blocks all AI bots with just one click.

Cloudflare analyzed its network traffic to identify the AI crawler user-agents with the highest request volumes. Bytespider, which is operated by ByteDance, the Chinese company that owns TikTok, came out on top, followed by GPTBot and ClaudeBot.

Analysis of the request volumes showed that AI bots accessed around 39 percent of the top one million internet properties using Cloudflare, but only 2.98 percent of those properties actually tried to block the requests. Moreover, a robots.txt block is only reliable if the bot operator respects the file.

“Sadly, we’ve observed bot operators attempt to appear as though they are a real browser by using a spoofed user-agent. We’ve monitored this activity over time, and we’re proud to say that our global machine learning model has always recognized this activity as a bot, even when operators lie about their user agent,” Cloudflare adds.
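
To see why a spoofed user-agent defeats simple filtering, consider the minimal sketch below. It is not Cloudflare's method, which the company describes as a machine learning model. It is a deliberately naive filter that only matches crawler names in the user-agent header. The function name, the token list and both user-agent strings are illustrative assumptions.

```python
# Crawler names taken from the article, lower-cased for matching.
BLOCKED_AGENT_TOKENS = ("bytespider", "gptbot", "claudebot")


def is_blocked_by_user_agent(user_agent: str) -> bool:
    """Naive check: block if the header contains a known crawler name."""
    ua = user_agent.lower()
    return any(token in ua for token in BLOCKED_AGENT_TOKENS)


# Illustrative header of a crawler that identifies itself honestly.
honest_ua = "Mozilla/5.0 (compatible; GPTBot/1.0)"

# Illustrative header of the same crawler posing as a desktop browser.
spoofed_ua = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36"
)

print("honest crawler blocked:", is_blocked_by_user_agent(honest_ua))
print("spoofed crawler blocked:", is_blocked_by_user_agent(spoofed_ua))
```

Because the header is supplied by the client, a filter like this catches only operators that announce themselves, which is why detection has to rely on other signals than the user-agent string alone.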

The AI crawler blocking feature is called ‘AI Scrapers and Crawlers’ and is located in the ‘Bots’ menu in the Cloudflare dashboard. To activate it, you just have to turn on the toggle. The new tool is available to all Cloudflare customers.

