Diffbot is listed in the Botcrawl directory as a search crawler operated by Diffbot. The primary identifier for log review is the user-agent token `Diffbot`.
Identification
- User-agent pattern: Diffbot
- Family: Diffbot
- Type: Search
- Kind: Crawler
Common use
Search indexing, content discovery, rendering, or search-result freshness checks.
Verification and handling
Confirm the user-agent against server logs and use published operator documentation, IP ranges, reverse DNS, or other trust signals when available.
Directory guidance marks the risk level as Neutral and the blocking decision as Depends. Do not rely on the user-agent string alone because user-agent strings can be copied or spoofed.
Robots.txt handling: Yes.
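One way to apply the reverse DNS signal mentioned above is a forward-confirmed reverse DNS check: look up the PTR name for the client IP, check it against a hostname suffix the operator documents, then resolve that name forward and confirm it maps back to the same IP. The sketch below is a minimal illustration, not an operator-endorsed procedure. The function name `is_forward_confirmed` and the `allowed_suffixes` parameter are hypothetical, and this entry does not state which hostname suffixes (if any) Diffbot publishes, so the suffix must come from the operator's own documentation.

```python
import socket


def _reverse(ip):
    # PTR lookup: IP address -> primary hostname
    return socket.gethostbyaddr(ip)[0]


def _forward(host):
    # A/AAAA lookup: hostname -> set of IP address strings
    return {info[4][0] for info in socket.getaddrinfo(host, None)}


def is_forward_confirmed(ip, allowed_suffixes, reverse=_reverse, forward=_forward):
    """Return True only if the PTR name for `ip` falls under one of
    `allowed_suffixes` and resolves forward to the same `ip`.

    `allowed_suffixes` is an assumption supplied by the caller; take it
    from the operator's published documentation, never from the request.
    """
    try:
        host = reverse(ip).rstrip(".").lower()
    except OSError:
        # No PTR record or resolver failure: treat as unverified.
        return False
    if not any(host == s or host.endswith("." + s) for s in allowed_suffixes):
        return False
    try:
        return ip in forward(host)
    except OSError:
        return False
```

The resolver functions are injectable so the check can be exercised without network access. A failed lookup is treated as unverified rather than as proof of spoofing.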
Detection Notes
Match `Diffbot` as a case-insensitive substring in HTTP user-agent logs. Review bot_aliases for alternate names or product labels. Do not treat a user-agent match alone as proof of identity for allow-listing.
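The substring match described above can be sketched in a few lines. The function name `matches_diffbot` and the sample user-agent strings are illustrative assumptions, not values taken from the directory. A match only flags a request for further verification.

```python
def matches_diffbot(user_agent, token="Diffbot"):
    """Case-insensitive substring test against a raw User-Agent header."""
    return token.casefold() in (user_agent or "").casefold()


# Hypothetical log values, for illustration only.
sample_agents = [
    "Mozilla/5.0 (compatible; Diffbot/0.1)",
    "mozilla/5.0 (compatible; diffbot/0.1)",
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64)",
]

# Keep only the entries that mention the token in any letter case.
flagged = [ua for ua in sample_agents if matches_diffbot(ua)]
```

Here `flagged` holds the first two strings. Because anyone can send this header, treat the result as a candidate list, not as an allow-list.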
Rules And Blocking Notes
User-agent: Diffbot
Disallow: /
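Before deploying the rule, it can be checked locally with Python's standard `urllib.robotparser`. This is a minimal sketch: the `example.com` URL and the `SomeOtherBot` name are placeholders, and the parser only models how a compliant crawler would read the file. It does not stop a client that ignores robots.txt.

```python
from urllib.robotparser import RobotFileParser

# The blocking rule from this entry, one directive per line.
rules = """\
User-agent: Diffbot
Disallow: /
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

# A compliant crawler identifying as Diffbot is denied every path.
diffbot_allowed = parser.can_fetch("Diffbot", "https://example.com/page")

# Agents not named in the file are unaffected by this rule.
other_allowed = parser.can_fetch("SomeOtherBot", "https://example.com/page")
```

Scoping the rule to `User-agent: Diffbot` leaves all other crawlers unaffected, which fits the directory's "Depends" blocking guidance.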