Cloudflare aims to save the World Wide Web by blocking AI crawlers that lack explicit consent

Cloudflare introduces a consent-based approach for AI crawlers, giving website operators more power over their content. Media companies and platforms hope it will create fairer terms for original content online.

Cloudflare will now, by default, prompt owners of new websites to block AI crawlers that try to access content without explicit permission. The company says it's trying to restore balance between content creators and AI companies.

Under the new system, AI companies can disclose if they're collecting data for training, inference, or search. Website operators can then decide whether to grant access.
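The decision logic described above can be sketched in a few lines of Python. This is only an illustration of the consent model, not Cloudflare's actual API: the `Purpose` enum, the `is_allowed` function, and the `operator_grants` set are hypothetical names, and the deny-by-default rule for crawlers that declare nothing is an assumption.

```python
from enum import Enum
from typing import Optional


class Purpose(Enum):
    """The three purposes an AI crawler can disclose, per the article."""
    TRAINING = "training"
    INFERENCE = "inference"
    SEARCH = "search"


def is_allowed(declared: Optional[Purpose], granted: set[Purpose]) -> bool:
    """Return True only if the crawler declared a purpose the operator granted.

    Assumption: a crawler that declares no purpose is denied by default.
    """
    if declared is None:
        return False
    return declared in granted


# Hypothetical operator policy: allow search crawlers, nothing else.
operator_grants = {Purpose.SEARCH}

print(is_allowed(Purpose.SEARCH, operator_grants))    # True
print(is_allowed(Purpose.TRAINING, operator_grants))  # False
print(is_allowed(None, operator_grants))              # False
```

The point of the sketch is that consent is opt-in: access follows from an explicit grant, not from the absence of a block.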

Cloudflare argues that the current model, in which AI systems scrape content automatically and use it without attribution or payment, is no longer acceptable.

"Original content is what makes the Internet one of the greatest inventions in the last century, and it's essential that creators continue making it," said CEO Matthew Prince in a press release.

Cloudflare warns that without incentives to create high-quality content, the long-term health of the web is at risk. The company's own business depends on people running and visiting websites.

Crawler access now requires consent

Since September 2024, Cloudflare customers have been able to block AI crawlers with a single click. The company says more than one million customers have already used the feature.

Cloudflare is taking things a step further: when customers register new domains, they're now prompted by default to allow or deny access to AI crawlers. Operators no longer need to adjust these settings manually, but can still change their preferences at any time.

To tighten control further, Cloudflare is helping develop a new protocol so AI bots can identify themselves in a standardized way. The aim is to give website operators more reliable verification tools and block anonymous crawlers from abusing access.

Media companies back the new controls

The move has drawn support from major media companies and platforms. Roger Lynch, CEO of Condé Nast, calls it "a critical step toward creating a fair value exchange on the Internet." Neil Vogel of Dotdash Meredith says the change means access can now be restricted to partners "willing to engage in fair arrangements."

Renn Turiano of Gannett Media, which runs the USA TODAY network, stated that blocking unauthorized scraping and the use of its original content without fair compensation is "critically important." Pinterest, Reddit, and Ziff Davis also see the model as a step toward a more sustainable digital ecosystem.

The consent-based approach has backing from TIME, BuzzFeed, The Atlantic, Snopes, Sky News Group, USA TODAY Network, Reddit, Quora, Pinterest, and Stack Overflow, among others. Supporters also include organizations like Digital Content Next, IAB Tech Lab, and O'Reilly Media. All see it as a first step toward a fairer relationship between AI operators and digital content creators.

Cloudflare CEO warns of a collapsing open web

In a recent interview, Cloudflare CEO Matthew Prince warned that the open web's business model is under serious threat. Ten years ago, Google would send one visitor to a site for every two pages crawled. Today, he says, it takes 18 crawled pages to produce one visitor. With AI models from OpenAI or Anthropic, he says, as many as 60,000 pages may be scraped for every single visitor the site receives.
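To put those figures side by side, here is a small Python sketch that reads each ratio as pages crawled per visitor referred. That reading is an interpretation of Prince's numbers, not a formula published by Cloudflare, and the function and variable names are made up for illustration.

```python
def pages_per_visitor(pages_crawled: float, visitors: float) -> float:
    """Crawl-to-referral ratio: pages crawled for each visitor sent back."""
    return pages_crawled / visitors


# Figures quoted in the article (pages crawled : visitors referred).
google_ten_years_ago = pages_per_visitor(2, 1)
google_today = pages_per_visitor(18, 1)
ai_models_upper_bound = pages_per_visitor(60_000, 1)

# How much worse the exchange has become for site owners.
print(google_today / google_ten_years_ago)             # 9.0
print(round(ai_models_upper_bound / google_today))     # 3333
```

On these numbers, Google's ratio has worsened ninefold in a decade, and the upper-bound AI figure is more than 3,000 times worse again.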

Early studies show that AI-powered answer engines, even when they credit their sources, can significantly reduce traffic to website owners. These models are trained on large-scale datasets built from original website content, and their answers are composed in real time using up-to-date information pulled directly from website owners' pages. Without the open web and its content, these services wouldn't exist.
