Bot intelligence record

Diffbot-User

Review first

Use the Diffbot-User identifier to separate Diffbot scraping, SEO, or data-collection traffic from normal visitor requests in server logs.

Type: Scraper | Purpose: Scraping | Identity type: Official Documented | Confidence: High | Verified: Yes | robots.txt: Yes
Operator: Diffbot
Family: Diffbot
Type: Scraper
Source type: Official
Last checked: 2026-05-20

User-Agent Pattern

Diffbot
Diffbot-User
Verification note

User-agent strings are identification signals, not proof of identity. Confirm important allow, block, or rate-limit decisions with logs, DNS or IP evidence, request behavior, or operator documentation when available.
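The DNS check mentioned above is commonly done as a forward-confirmed reverse DNS lookup: resolve the IP to a hostname, check the hostname against a trusted suffix, then resolve the hostname back and confirm it returns the same IP. The following is a minimal sketch of that general technique, not a documented Diffbot procedure. The suffix `bot.example` is a hypothetical placeholder; this record does not state which hostnames, if any, the operator publishes. The `reverse` and `forward` parameters are illustrative hooks so the logic can be exercised without network access.

```python
import socket
from typing import Callable, List, Sequence


def reverse_lookup(ip: str) -> str:
    """Return the PTR hostname for an IP address."""
    return socket.gethostbyaddr(ip)[0]


def forward_lookup(hostname: str) -> List[str]:
    """Return the IPv4 addresses a hostname resolves to."""
    return socket.gethostbyname_ex(hostname)[2]


def forward_confirmed(
    ip: str,
    allowed_suffixes: Sequence[str],
    reverse: Callable[[str], str] = reverse_lookup,
    forward: Callable[[str], List[str]] = forward_lookup,
) -> bool:
    """True only if the PTR name is under a trusted suffix AND resolves back to ip."""
    try:
        hostname = reverse(ip).rstrip(".").lower()
    except OSError:  # covers socket.herror and socket.gaierror
        return False
    # Require an exact match or a dot-delimited suffix, so that
    # "notbot.example" does not pass for "bot.example".
    trusted = any(
        hostname == suffix or hostname.endswith("." + suffix)
        for suffix in allowed_suffixes
    )
    if not trusted:
        return False
    try:
        return ip in forward(hostname)
    except OSError:
        return False
```

A call such as `forward_confirmed(client_ip, ["bot.example"])` would only be meaningful once the placeholder suffix is replaced with a value taken from operator documentation. The default `forward_lookup` handles IPv4 only.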

Robots.txt Snippet

User-agent: Diffbot-User
Disallow: /

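Before deploying a rule like this, it can help to check how a standard parser reads it. The sketch below uses Python's standard-library `urllib.robotparser` to parse the two-line rule and confirm that it blocks the `Diffbot-User` token while leaving other agents unaffected. The helper name `is_allowed`, the agent name `SomeOtherBot`, and the `example.com` URL are illustrative assumptions. The result shows how this parser interprets the rule, not how any particular crawler behaves.

```python
from urllib.robotparser import RobotFileParser

# The rule from this record, written as two separate directives.
ROBOTS_TXT = """\
User-agent: Diffbot-User
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Parse robots.txt text and report whether user_agent may fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "https://example.com/page"  # hypothetical URL
    print(is_allowed(ROBOTS_TXT, "Diffbot-User", page))
    print(is_allowed(ROBOTS_TXT, "SomeOtherBot", page))
```

robots.txt is advisory: it only has an effect on clients that choose to honor it, so it complements server-side rules rather than replacing them.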

Handling Guidance

Depends

Use this record as bot intelligence, then verify the request source and behavior before allowing, blocking, or rate limiting.

Expected activity: public web data collection, SEO analysis, content extraction, or third-party crawling.

Record Details

Structured data
Operator: Diffbot
Family: Diffbot
Type: Scraper
Purpose: Scraping
Identity type: Official Documented
Confidence: High
Last verified: 2026-04-29
Last checked: 2026-05-20
Source type: Official
Verification: Compare the observed user-agent against the documented Diffbot-User pattern. Where available, confirm with operator documentation, published IP ranges, reverse DNS, signed-agent metadata, signatures, or other trust signals.
Spoofing risk: User-agent strings can be spoofed. For allow-listing or low-friction rules, pair the published identifier with operator documentation or reverse DNS/IP verification when available.

Notes

Diffbot-User is listed in the Botcrawl directory as a Diffbot crawler used for scraping, SEO analysis, or data collection. The primary identifier for log review is Diffbot-User.

Identification

  • User-agent pattern: Diffbot-User
  • Family: Diffbot
  • Type: Scraper
  • Kind: Fetcher

Common use

Public web data collection, SEO analysis, content extraction, or third-party crawling activity.

Verification and handling

Confirm the user-agent against server logs and use published operator documentation, IP ranges, reverse DNS, or other trust signals when available.

Directory guidance marks the risk level as Neutral and the blocking decision as Depends. Do not rely on the user-agent string alone because user-agent strings can be copied or spoofed.
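One way to act on a "Depends" decision is to classify each request by both its claimed identity and its source address, and only treat the claim as confirmed when the two agree. The sketch below assumes a list of operator-published CIDR ranges is available. The ranges shown are reserved documentation blocks used as hypothetical placeholders, not Diffbot addresses, and the function names and result labels are illustrative.

```python
import ipaddress
from typing import Sequence

# Hypothetical placeholders from reserved documentation ranges
# (RFC 5737 and RFC 3849). These are NOT Diffbot addresses.
PLACEHOLDER_RANGES = ["192.0.2.0/24", "2001:db8::/32"]


def ip_in_ranges(ip: str, cidrs: Sequence[str]) -> bool:
    """True if ip falls inside any of the given CIDR blocks."""
    addr = ipaddress.ip_address(ip)
    for cidr in cidrs:
        net = ipaddress.ip_network(cidr)
        # Compare only within the same IP version.
        if net.version == addr.version and addr in net:
            return True
    return False


def classify(user_agent: str, ip: str, cidrs: Sequence[str]) -> str:
    """Label a request by user-agent claim and source-IP evidence."""
    if "diffbot-user" not in user_agent.lower():
        return "no-match"
    if ip_in_ranges(ip, cidrs):
        return "claimed-and-ip-confirmed"
    return "claimed-unverified"
```

Requests labeled `claimed-unverified` are the ones the spoofing warning applies to: the user-agent matches, but nothing else supports the claim, so they are poor candidates for allow-listing.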

Robots.txt handling: Yes.

Evidence and Source

  • Compare the observed user-agent against the documented Diffbot-User pattern. Where available, confirm with operator documentation, published IP ranges, reverse DNS, signed-agent metadata, signatures, or other trust signals.
  • Match `Diffbot-User` as a case-insensitive substring in HTTP user-agent logs. Review bot_aliases for alternate names or product labels. Do not treat a user-agent match alone as proof of identity for allow-listing.
  • Common use: public web data collection, SEO analysis, content extraction, or third-party crawling activity.
  • User-agent strings can be spoofed. For allow-listing or low-friction rules, pair the published identifier with operator documentation or reverse DNS/IP verification when available.
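The case-insensitive substring match described in the list above can be sketched in a few lines. The sample user-agent strings are invented for illustration and are not real Diffbot headers. A match only flags a request as a candidate for further verification.

```python
from typing import Iterable, List

PATTERN = "diffbot-user"  # lowercase form of the documented identifier


def matches_diffbot_user(user_agent: str) -> bool:
    """Case-insensitive substring match on the user-agent header."""
    return PATTERN in user_agent.lower()


def filter_candidates(user_agents: Iterable[str]) -> List[str]:
    """Keep only the user-agents that claim to be Diffbot-User."""
    return [ua for ua in user_agents if matches_diffbot_user(ua)]


if __name__ == "__main__":
    # Invented sample values, for illustration only.
    samples = [
        "ExampleClient/1.0 (compatible; Diffbot-User)",
        "exampleclient/1.0 (compatible; DIFFBOT-USER)",
        "Mozilla/5.0 (X11; Linux x86_64)",
    ]
    for ua in filter_candidates(samples):
        print("candidate:", ua)
```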
