scraper

SkypeUriPreview Neutral

Skype

Skype's URI Preview service fetches a page preview when someone posts a URL in a Skype message.

Operator and family: Skype Type: Scraper Purpose: Scraping
Verified robots.txt: No Block: Depends Verified Bot
Slack-ImgProxy Neutral

Slack Image Proxy

This robot is used to fetch and cache images posted into Slack channels.

Operator and family: Slack Type: Scraper Purpose: Scraping
Verified robots.txt: No Block: Depends Verified Bot
SmartologyBot Neutral

SmartologyBot

SmartologyBot generates semantic vectors from domain pages in order to serve semantically relevant ads on those pages.

Operator and family: Smartology Type: Scraper Purpose: Scraping
Verified robots.txt: No Block: Depends Verified Bot
SteamChat Neutral

Steam Chat

The Steam Chat bot fetches previews of URLs shared within the Steam client's chat feature.

Operator and family: Valve Software Type: Scraper Purpose: Scraping
Verified robots.txt: No Block: Depends Verified Bot
Terracotta Safe

Terracotta

The Terracotta bot scrapes websites to build the indices that serve searches in Ceramic's search product.

Operator: Ceramic Family: Terracotta Type: Scraper Purpose: Scraping
Verified robots.txt: No Block: Depends Verified Bot
TikTokSpider Safe

TikTokSpider

TikTokSpider is a crawler operated by ByteDance and used for scraping, SEO analysis, or data collection.

Operator and family: ByteDance Type: Scraper Purpose: Scraping
Verified robots.txt: Yes Block: Depends Verified Bot
Trellis-Services Neutral

Trellis-Services

Trellis-Services is a crawler operated by Mediavine and used for scraping, SEO analysis, or data collection.

Operator and family: Mediavine Type: Scraper Purpose: Scraping
Verified robots.txt: No Block: Depends Verified Bot
Tumblr Neutral

Tumblr

On Tumblr, post authors can paste a URL in their post, and we'll "unfurl" that URL into a pretty Link "Block" for their post by making a request to the URL and parsing the response.

Operator and family: Automattic Type: Scraper Purpose: Scraping
Verified robots.txt: No Block: Depends Verified Bot
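
The "parsing the response" step of link unfurling described above can be sketched as follows. This is a minimal illustration, not Tumblr's implementation: it assumes the preview is built from the page's <title> and Open Graph <meta> tags, and the names LinkPreviewParser and build_preview are hypothetical. The HTTP request itself is left out so the sketch stays self-contained.

```python
from html.parser import HTMLParser


class LinkPreviewParser(HTMLParser):
    """Collects the <title> text and Open Graph <meta> tags from a page.

    Hypothetical helper: relying on Open Graph tags is an assumption,
    not something the bot's description confirms.
    """

    def __init__(self):
        super().__init__()
        self.preview = {}
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta":
            prop = attrs.get("property") or ""
            content = attrs.get("content")
            # Keep the first value seen for each og:* property.
            if prop.startswith("og:") and content:
                self.preview.setdefault(prop[3:], content)

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        # Record the first non-empty text chunk inside <title>.
        if self._in_title and data.strip():
            self.preview.setdefault("page_title", data.strip())


def build_preview(html_text):
    """Return a dict of preview fields parsed from already-fetched HTML."""
    parser = LinkPreviewParser()
    parser.feed(html_text)
    parser.close()
    return parser.preview
```

A caller would fetch the shared URL with any HTTP client, pass the response body to build_preview, and render the resulting fields (for example "title", "image", "page_title") as the link block.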
Turnitin Neutral

TurnitinBot

Turnitin.com offers various services to the educational community. Most prominently, we provide a widely used and effective plagiarism detection service. Part of the plagiarism prevention.

Operator and family: Turnitin Type: Scraper Purpose: Scraping
Verified robots.txt: No Block: Depends Verified Bot
Twitterbot Neutral

twitterbot

A Twitter bot is a type of bot software that controls a Twitter account via the Twitter API. The bot software may autonomously perform actions such as tweeting, re-tweeting, liking.

Operator and family: Twitter Type: Scraper Purpose: Scraping
Verified robots.txt: No Block: Depends Verified Bot
W3C-checklink Neutral

W3 Validator Services

W3 Validator Services is a crawler operated by the World Wide Web Consortium (W3C) and used for scraping, SEO analysis, or data collection.

Operator and family: World Wide Web Consortium (W3C) Type: Scraper Purpose: Scraping
Verified robots.txt: No Block: Depends Verified Bot
WordCountBot Neutral

WordCountBot

WordCountBot analyzes a website's word count based on its public pages. It counts all words that belong to public pages and are included in the HTML source code.

Operator and family: Weglot Type: Scraper Purpose: Scraping
Verified robots.txt: No Block: Depends Verified Bot
ygs-scraper-bot Neutral

YGS Group Falconer Scraper

A content-based scraper used only for partners we collaborate with who have given permission to have their websites scraped.

Operator: YGS Group Family: YGS Group Falconer Scraper Type: Scraper Purpose: Scraping
Verified robots.txt: No Block: Depends Verified Bot