Applebot
Apple crawler for search and related Apple features.
Apple control token that lets site owners opt out of having content crawled by Applebot used to train Apple's foundation models.
User-agent: Applebot-Extended
Disallow: /
(http.user_agent contains "Applebot-Extended")
RewriteEngine On
RewriteCond %{HTTP_USER_AGENT} Applebot\-Extended [NC]
RewriteRule ^ - [F,L]
if ($http_user_agent ~* "Applebot-Extended") { return 403; }
Applebot-Extended is not a crawler. Apple says it is used only to determine how Apple may use data already crawled by Applebot, especially for foundation-model training.
Disallowing Applebot-Extended does not remove pages from Apple’s search features if Applebot itself is still allowed.
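The distinction between the two tokens can be checked locally. The following is a minimal sketch, assuming a robots.txt that contains only the Applebot-Extended rule shown earlier; the helper name `can_fetch` and the `example.com` URL are hypothetical. It uses Python's standard `urllib.robotparser`, whose matching approximates but is not guaranteed to be identical to how Apple interprets robots.txt.

```python
from urllib.robotparser import RobotFileParser

# Assumed robots.txt content: only the Applebot-Extended opt-out, nothing else.
ROBOTS_TXT = """\
User-agent: Applebot-Extended
Disallow: /
"""


def can_fetch(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() accepts an iterable of lines, so no network request is made.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "https://example.com/article"  # hypothetical URL
    for agent in ("Applebot-Extended", "Applebot"):
        print(agent, "allowed:", can_fetch(ROBOTS_TXT, agent, page))
```

Under these assumptions the rule applies to the Applebot-Extended token only, so Applebot is still permitted to crawl, which is why pages can remain in Apple's search features.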
The Apple App Site Association is used to support "Universal Links" that can open in native iOS apps.
Apple Podcasts crawler for registered podcast content.
Google control token for Gemini training and grounding permissions.
Webz.io extended web crawler that maintains a repository of web crawl data.
