Bot intelligence record
Diffbot
Review first: Use the Diffbot identifier to separate Diffbot search indexing or content discovery traffic from normal visitor requests in server logs.
- Operator: Diffbot
- Family: Diffbot
- Type: Search
- Source type: Official
- Last checked: 2026-05-20
User-Agent Pattern
Diffbot
User-agent strings are identification signals, not proof of identity. Confirm important allow, block, or rate-limit decisions with logs, DNS or IP evidence, request behavior, or operator documentation when available.
Robots.txt Snippet
User-agent: Diffbot
Disallow: /
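To sanity-check the rule before deploying it, the snippet can be fed to Python's standard-library robots.txt parser. This is a minimal sketch: the helper name `can_fetch` and the `example.com` URL are illustrative assumptions, not part of the record.

```python
from urllib.robotparser import RobotFileParser

# The rule from the snippet above, one directive per line.
ROBOTS_TXT = """\
User-agent: Diffbot
Disallow: /
"""


def can_fetch(robots_txt: str, agent_token: str, url: str) -> bool:
    """Return True if robots_txt permits agent_token to fetch url.

    Pass the short product token (for example "Diffbot"), not a full
    User-Agent header: the parser only considers the text before the
    first "/" of the name it is given.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent_token, url)


if __name__ == "__main__":
    # example.com is a placeholder host (assumption for illustration).
    for token in ("Diffbot", "SomeOtherBot"):
        print(token, can_fetch(ROBOTS_TXT, token, "https://example.com/page"))
```

A robots.txt rule is advisory: it only affects crawlers that choose to honor it, so it does not replace server-side blocking for spoofed traffic.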
Handling Guidance
Depends: Use this record as bot intelligence, then verify the request source and behavior before allowing, blocking, or rate limiting.
Common use: search indexing, content discovery, rendering, or search-result freshness checks.
Record Details
Structured data
- Operator: Diffbot
- Family: Diffbot
- Type: Search
- Purpose: Indexing
- Identity type: Official Documented
- Confidence: High
- Last verified: 2026-04-29
- Last checked: 2026-05-20
- Source type: Official
- Verification: Compare the observed user-agent against the documented Diffbot pattern. Where available, confirm with operator documentation, published IP ranges, reverse DNS, signed-agent metadata, signatures, or other trust signals.
- Spoofing risk: User-agent strings can be spoofed. For allow-listing or low-friction rules, pair the published identifier with operator documentation or reverse DNS/IP verification when available.
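Where the operator publishes IP ranges, the IP check mentioned above can be sketched with Python's `ipaddress` module. The CIDR blocks below are reserved documentation ranges used purely as placeholders; they are not Diffbot's addresses, and the real list would have to come from operator documentation.

```python
import ipaddress
from typing import Iterable

# PLACEHOLDER ranges (RFC 5737 documentation blocks), NOT Diffbot's real
# ranges. Replace with ranges taken from operator documentation.
EXAMPLE_RANGES = ["192.0.2.0/24", "198.51.100.0/24"]


def ip_in_ranges(ip: str, cidrs: Iterable[str]) -> bool:
    """Return True if ip parses as an address inside any of the CIDR blocks."""
    try:
        addr = ipaddress.ip_address(ip)
    except ValueError:
        # Malformed client address: treat as unverified.
        return False
    return any(addr in ipaddress.ip_network(cidr) for cidr in cidrs)


if __name__ == "__main__":
    print(ip_in_ranges("192.0.2.44", EXAMPLE_RANGES))
    print(ip_in_ranges("203.0.113.9", EXAMPLE_RANGES))
```

A request that claims the Diffbot user-agent but fails this check is a candidate for the stricter handling path, not automatic proof of abuse, since published ranges can lag behind the operator's infrastructure.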
Notes
The Botcrawl directory lists Diffbot as a search crawler operated by Diffbot. The primary identifier for log review is the user-agent token `Diffbot`.
Identification
- User-agent pattern: Diffbot
- Family: Diffbot
- Type: Search
- Kind: Crawler
Common use
Search indexing, content discovery, rendering, or search-result freshness checks.
Verification and handling
Confirm the user-agent against server logs and use published operator documentation, IP ranges, reverse DNS, or other trust signals when available.
Directory guidance marks the risk level as Neutral and the blocking decision as Depends. Do not rely on the user-agent string alone because user-agent strings can be copied or spoofed.
Robots.txt handling: Yes.
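The reverse DNS check mentioned above is usually done as a forward-confirmed lookup: resolve the client IP to a hostname, check the hostname against an expected domain, then resolve that hostname back and confirm it returns the same IP. The sketch below assumes an IPv4 client and takes the expected domain suffix as a parameter, because this record does not state which hostnames Diffbot uses; `crawler.example` in the usage lines is a made-up value.

```python
import socket
from typing import Callable, List


def reverse_lookup(ip: str) -> str:
    """PTR lookup: return the primary hostname for ip."""
    return socket.gethostbyaddr(ip)[0]


def forward_lookup(hostname: str) -> List[str]:
    """A-record lookup: return the IPv4 addresses for hostname."""
    return socket.gethostbyname_ex(hostname)[2]


def forward_confirmed(
    ip: str,
    allowed_suffix: str,
    reverse: Callable[[str], str] = reverse_lookup,
    forward: Callable[[str], List[str]] = forward_lookup,
) -> bool:
    """Return True only if ip -> hostname -> ip round-trips under allowed_suffix."""
    suffix = "." + allowed_suffix.strip(".").lower()
    try:
        hostname = reverse(ip).rstrip(".").lower()
    except OSError:  # covers socket.herror and socket.gaierror
        return False
    if not hostname.endswith(suffix):
        return False
    try:
        return ip in forward(hostname)
    except OSError:
        return False


if __name__ == "__main__":
    # Stub resolvers keep the demo offline; the hostname and suffix are
    # hypothetical and do not describe Diffbot's real infrastructure.
    ok = forward_confirmed(
        "192.0.2.10",
        "crawler.example",
        reverse=lambda ip: "bot-1.crawler.example",
        forward=lambda host: ["192.0.2.10"],
    )
    print(ok)
```

Passing the resolvers as parameters keeps the function testable without network access; in production the defaults perform real DNS queries, so results are worth caching per IP.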
Evidence and Source
- Match `Diffbot` as a case-insensitive substring in HTTP user-agent logs. Review bot_aliases for alternate names or product labels. Do not treat a user-agent match alone as proof of identity for allow-listing.
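The case-insensitive substring match described above can be sketched as a small log filter. It assumes access logs in the common "combined" format, where the user-agent is the last double-quoted field; the sample lines, including the user-agent strings in them, are invented for illustration and are not Diffbot's documented strings.

```python
import re
from typing import Iterable, List, Optional

DIFFBOT_RE = re.compile(r"diffbot", re.IGNORECASE)
QUOTED_FIELD_RE = re.compile(r'"([^"]*)"')


def user_agent_of(log_line: str) -> Optional[str]:
    """Return the last double-quoted field of a combined-format log line."""
    fields = QUOTED_FIELD_RE.findall(log_line)
    return fields[-1] if fields else None


def diffbot_lines(lines: Iterable[str]) -> List[str]:
    """Keep lines whose user-agent field contains 'diffbot' in any casing."""
    matches = []
    for line in lines:
        agent = user_agent_of(line)
        if agent is not None and DIFFBOT_RE.search(agent):
            matches.append(line)
    return matches


# Invented sample lines (assumption): addresses, paths, and agents are made up.
SAMPLE = [
    '192.0.2.10 - - [20/May/2026:10:00:00 +0000] "GET / HTTP/1.1" 200 512 "-" "Mozilla/5.0 (compatible; Diffbot/0.1)"',
    '192.0.2.11 - - [20/May/2026:10:00:01 +0000] "GET /diffbot-guide HTTP/1.1" 200 900 "-" "Mozilla/5.0 (X11; Linux x86_64)"',
    '192.0.2.12 - - [20/May/2026:10:00:02 +0000] "GET /a HTTP/1.1" 200 300 "-" "diffbot"',
]

if __name__ == "__main__":
    for line in diffbot_lines(SAMPLE):
        print(line)
```

Matching on the user-agent field only, instead of the whole line, avoids false positives from URLs or referrers that happen to contain the word. A match is a candidate for review, not proof of identity.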
Monitor This Bot In Edge
Use Botcrawl Edge to see matching traffic, create allow or block rules, and control this bot across connected sites.
