Bot intelligence record
Diffbot-User
Review first: Use the Diffbot-User identifier to separate Diffbot scraping, SEO, or data-collection traffic from normal visitor requests in server logs.
- Operator: Diffbot
- Family: Diffbot
- Type: Scraper
- Source type: Official
- Last checked: 2026-05-20
User-Agent Pattern
Diffbot-User (family: Diffbot)
User-agent strings are identification signals, not proof of identity. Confirm important allow, block, or rate-limit decisions with logs, DNS or IP evidence, request behavior, or operator documentation when available.
Robots.txt Snippet
User-agent: Diffbot-User
Disallow: /
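Before deploying the snippet, it can help to check what the rules actually say for a given agent token. The following is a minimal sketch using Python's standard `urllib.robotparser`; the helper name `is_allowed` and the `example.com` URL are illustrative assumptions, not part of this record. It only evaluates the rules as written and says nothing about whether a given client honors them.

```python
from urllib.robotparser import RobotFileParser

# The robots.txt rules from this record, as they would appear in the file.
ROBOTS_TXT = """\
User-agent: Diffbot-User
Disallow: /
"""


def is_allowed(robots_txt: str, agent_token: str, url: str) -> bool:
    """Return True if the given rules permit agent_token to fetch url.

    is_allowed is a hypothetical helper name used for illustration only.
    """
    parser = RobotFileParser()
    # parse() accepts an iterable of lines, so no network request is made.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent_token, url)


if __name__ == "__main__":
    # example.com is a placeholder host.
    print(is_allowed(ROBOTS_TXT, "Diffbot-User", "https://example.com/page"))
    print(is_allowed(ROBOTS_TXT, "ExampleBrowser", "https://example.com/page"))
```

Pass the bare product token (`Diffbot-User`) rather than a full user-agent header, since the parser compares group names against the part of the string before the first `/`.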
Handling Guidance
Depends: Use this record as bot intelligence, then verify the request source and behavior before allowing, blocking, or rate limiting.
Public web data collection, SEO analysis, content extraction, or third-party crawling activity.
Record Details
Structured data
- Operator: Diffbot
- Family: Diffbot
- Type: Scraper
- Purpose: Scraping
- Identity type: Official Documented
- Confidence: High
- Last verified: 2026-04-29
- Last checked: 2026-05-20
- Source type: Official
- Verification: Compare the observed user-agent against the documented Diffbot-User pattern. Where available, confirm with operator documentation, published IP ranges, reverse DNS, signed-agent metadata, signatures, or other trust signals.
- Spoofing risk: User-agent strings can be spoofed. For allow-listing or low-friction rules, pair the published identifier with operator documentation or reverse DNS/IP verification when available.
Notes
Diffbot-User is listed in the Botcrawl directory as a Diffbot crawler used for scraping, SEO analysis, or data collection. The primary identifier for log review is Diffbot-User.
Identification
- User-agent pattern: Diffbot-User
- Family: Diffbot
- Type: Scraper
- Kind: Fetcher
Common use
Public web data collection, SEO analysis, content extraction, or third-party crawling activity.
Verification and handling
Confirm the user-agent against server logs and use published operator documentation, IP ranges, reverse DNS, or other trust signals when available.
Directory guidance marks the risk level as Neutral and the blocking decision as Depends. Do not rely on the user-agent string alone because user-agent strings can be copied or spoofed.
Robots.txt handling: Yes.
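One way to pair the user-agent with a second trust signal is to check the source IP against ranges the operator publishes. The sketch below assumes such a list is available; the CIDR blocks shown are RFC 5737 documentation placeholders, not real Diffbot ranges, and the function names and result labels are hypothetical.

```python
import ipaddress
from typing import Iterable

# PLACEHOLDER ranges (RFC 5737 documentation blocks), NOT real Diffbot ranges.
# Replace with ranges taken from operator documentation, if any are published.
PLACEHOLDER_RANGES = ["192.0.2.0/24", "198.51.100.0/24"]


def ip_in_ranges(ip: str, cidrs: Iterable[str]) -> bool:
    """Return True if ip falls inside any of the given CIDR blocks."""
    addr = ipaddress.ip_address(ip)
    return any(addr in ipaddress.ip_network(cidr) for cidr in cidrs)


def classify(user_agent: str, ip: str, cidrs: Iterable[str]) -> str:
    """Combine the user-agent signal with an IP-range check.

    The returned labels are illustrative, not directory terminology.
    """
    if "diffbot-user" not in user_agent.lower():
        return "not-diffbot-user"
    # A matching user-agent alone is only a claim; the IP check backs it up.
    return "verified" if ip_in_ranges(ip, cidrs) else "unverified-claim"
```

A request labeled `unverified-claim` is not necessarily malicious; it only means the user-agent match was not backed by the second signal, so it should not qualify for allow-listing on that basis.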
Evidence and Source
- Compare the observed user-agent against the documented Diffbot-User pattern. Where available, confirm with operator documentation, published IP ranges, reverse DNS, signed-agent metadata, signatures, or other trust signals.
- Match `Diffbot-User` as a case-insensitive substring in HTTP user-agent logs. Review bot_aliases for alternate names or product labels. Do not treat a user-agent match alone as proof of identity for allow-listing.
- Public web data collection, SEO analysis, content extraction, or third-party crawling activity.
- User-agent strings can be spoofed. For allow-listing or low-friction rules, pair the published identifier with operator documentation or reverse DNS/IP verification when available.
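The case-insensitive substring match described above could be sketched as follows. This assumes access logs in a combined-style format where the user-agent is the last double-quoted field on each line; the helper names are hypothetical, and a match only flags a line for review rather than proving identity.

```python
import re
from typing import Iterable, Iterator

PATTERN = "Diffbot-User"

# Assumption: the user-agent is the final double-quoted field on the line,
# as in the common "combined" access log layout.
_LAST_QUOTED_FIELD = re.compile(r'"([^"]*)"\s*$')


def matches_diffbot_user(user_agent: str) -> bool:
    """Case-insensitive substring match against the documented pattern."""
    return PATTERN.lower() in user_agent.lower()


def flag_lines(lines: Iterable[str]) -> Iterator[str]:
    """Yield log lines whose user-agent field contains the pattern."""
    for line in lines:
        found = _LAST_QUOTED_FIELD.search(line)
        if found and matches_diffbot_user(found.group(1)):
            yield line
```

For a real log, `flag_lines(open("access.log"))` would stream matching lines (the file name is a placeholder); the flagged requests still need the source and behavior checks listed above before any allow-list decision.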
