Many of the biggest websites have opted out of Apple Intelligence training

August 30, 2024

Generative AI systems are trained in part on content scraped from the web. Apple allows publishers to opt out of its scraping, and a new report says that many of the biggest websites have specifically opted out of Apple Intelligence training.

This includes both Facebook and Instagram, as well as many high-profile news and media sites like The New York Times and The Atlantic.

Apple’s AI training

Large language models like the ones behind ChatGPT are trained on vast amounts of source material, ranging from news stories to user comments.

In Apple’s case, the company has for years been using Applebot to train Siri and surface Spotlight suggestions. More recently, the company has also been using Applebot to train Apple Intelligence.

The practice is controversial, as AIs are effectively using copyrighted material to generate their own versions of it. For more niche topics, where source material is scarce, they have even been found to regurgitate entire paragraphs with almost no changes made.

But Apple does this in an ethical way, allowing publishers to opt out, and screening out personal data (though it did get caught out by one third-party source). The company describes its approach this way:

We train our foundation models on licensed data, including data selected to enhance specific features, as well as publicly available data collected by our web-crawler, AppleBot. Web publishers have the option to opt out of the use of their web content for Apple Intelligence training with a data usage control […]

We apply filters to remove personally identifiable information like social security and credit card numbers that are publicly available on the Internet.

Apple uses a separate Applebot-Extended user agent, which sites can disallow in their robots.txt file, to let them opt out of AI training while still allowing search indexing – meaning that their pieces can still be included in Spotlight and Siri searches.
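To make the opt-out concrete, here is a minimal sketch of how it can be checked, using Python's standard `urllib.robotparser`. The sample robots.txt, the `example.com` URL, and the `crawl_permissions` helper are hypothetical and only for illustration; they are not taken from any real site or from Apple's documentation. Python's matching rules may also differ in detail from how Apple interprets a file, so treat the result as an approximation.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: opt out of AI training only,
# while leaving ordinary crawling (including Applebot) allowed.
SAMPLE_ROBOTS_TXT = """\
User-agent: Applebot-Extended
Disallow: /

User-agent: *
Allow: /
"""


def crawl_permissions(robots_txt: str, url: str) -> dict[str, bool]:
    """Report whether each Apple user agent may use `url` under `robots_txt`."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return {
        agent: parser.can_fetch(agent, url)
        for agent in ("Applebot", "Applebot-Extended")
    }


if __name__ == "__main__":
    print(crawl_permissions(SAMPLE_ROBOTS_TXT, "https://example.com/story"))
```

To check a live site instead, the same parser can load a real file with `set_url()` and `read()` before `can_fetch()` is called.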

Many big web publishers opting out

Since opting out is done using a publicly accessible robots.txt file, it’s easy to see which sites have done this. Wired checked a number of the biggest news and social media sites.

WIRED can confirm that Facebook, Instagram, Craigslist, Tumblr, The New York Times, The Financial Times, The Atlantic, Vox Media, the USA Today network, and WIRED’s parent company, Condé Nast, are among the many organizations opting to exclude their data from Apple’s AI training […]

In a separate analysis conducted this week, data journalist Ben Welsh found that just over a quarter of the news websites he surveyed (294 of 1,167 primarily English-language, US-based publications) are blocking Applebot-Extended.

Applebot-Extended is a relatively new user agent, so it’s likely that more websites will also opt out once awareness increases.

Money is of course one factor

Apple is believed to have struck deals with some media companies, paying a fee in return for the right to use their content for training. It’s likely this is the motivation for at least some sites currently blocking Apple – holding out for a payment offer.

“A lot of the largest publishers in the world are clearly taking a strategic approach,” says Originality AI founder Jon Gillham. “I think in some cases, there’s a business strategy involved—like, withholding the data until a partnership agreement is in place.”

iOS 18.1 beta 3 includes several new Apple Intelligence features, including Photo Clean Up and more notification summaries.
