Confidence: Medium
FishBot
FishBot crawls webpages to deliver Open Source AI for All
Verified robots.txt: No Neutral Block: Depends
The Yext Crawler provides Yext customers with a tool to retrieve data from their own websites.
robots.txt:
User-agent: YextBot
Disallow: /
Cloudflare rule expression:
(http.user_agent contains "YextBot")
Apache (.htaccess):
RewriteEngine On
RewriteCond %{HTTP_USER_AGENT} YextBot [NC]
RewriteRule ^ - [F,L]
nginx:
if ($http_user_agent ~* "YextBot") { return 403; }
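The server rules for YextBot all come down to one check: does the User-Agent header contain "YextBot"? The following is a minimal Python sketch of that check, assuming a case-insensitive substring match like the Apache `[NC]` flag and the nginx `~*` operator. The function names `is_yextbot` and `status_for` are hypothetical and are not part of any Yext or web-server API.

```python
def is_yextbot(user_agent: str) -> bool:
    """Return True if the User-Agent header contains 'YextBot', ignoring case."""
    return "yextbot" in user_agent.lower()


def status_for(user_agent: str) -> int:
    """Hypothetical helper: 403 for YextBot requests, 200 for everything else."""
    return 403 if is_yextbot(user_agent) else 200
```

A substring match like this can be spoofed in both directions, since any client can send or omit the token, so it is a filter for well-behaved traffic rather than an access control.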
Known user-agent patterns: YextBot
Known user-agent strings: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) HeadlessChrome/87.0.4280.88 YextBot/Java Safari/537.36
Robots.txt handling in the directory: no.
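To see what the robots.txt rule for YextBot asks of a compliant crawler, it can be evaluated with Python's standard `urllib.robotparser`. This is only a sketch of the rule's meaning: this entry does not establish whether YextBot actually honors robots.txt. The agent name `SomeOtherBot` and the path `/pricing` are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# The rule from this entry: disallow everything for YextBot only.
ROBOTS_TXT = """\
User-agent: YextBot
Disallow: /
"""


def allowed(agent: str, path: str) -> bool:
    """Return True if the rules above permit `agent` to fetch `path`."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(agent, path)


print(allowed("YextBot", "/pricing"))       # rule applies to this agent
print(allowed("SomeOtherBot", "/pricing"))  # no rule names this agent
```

Because the file names only YextBot, other crawlers remain unrestricted; a crawler that ignores robots.txt has to be blocked at the server or CDN instead.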
Operator documentation: https://www.yext.com
A content-based scraper used only for partners we collaborate with who have given permission to have their websites scraped.
Yandex crawler for mobile-layout checks.
Yahoo Mail Proxy is a content fetch proxy that retrieves the page content of URLs that are embedded within emails sent to Yahoo Mail users.
