Bot intelligence record
CCBot
Review first
Use the CCBot identifier to separate Common Crawl security scanning or verification traffic from normal visitor requests in server logs.
- Operator: Common Crawl
- Family: Common Crawl
- Type: Security
- Source type: Official
- Last checked: 2026-05-20
User-Agent Pattern
Common Crawl: CCBot
User-agent strings are identification signals, not proof of identity. Confirm important allow, block, or rate-limit decisions with logs, DNS or IP evidence, request behavior, or operator documentation when available.
Robots.txt Snippet
User-agent: CCBot
Disallow: /
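To check how a standards-following parser reads the snippet above, the rule can be fed to Python's standard-library `urllib.robotparser`. This is a minimal sketch: the URL `https://example.com/page` and the agent name `ExampleBrowser` are placeholders chosen for illustration, and the result only describes how a compliant client would interpret the rule, not what any particular client actually does.

```python
from urllib.robotparser import RobotFileParser

# The robots.txt rule from this record, written as file content.
ROBOTS_TXT = """\
User-agent: CCBot
Disallow: /
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text without fetching anything over the network."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# A client identifying as CCBot is disallowed everywhere on the site.
ccbot_allowed = parser.can_fetch("CCBot", "https://example.com/page")

# A client with a different (placeholder) name is not covered by the rule.
other_allowed = parser.can_fetch("ExampleBrowser", "https://example.com/page")

print(ccbot_allowed, other_allowed)
```

A robots.txt rule is a request to the crawler, so it does nothing against clients that spoof the CCBot name and ignore the file.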
Handling Guidance
Depends
Use this record as bot intelligence, then verify the request source and behavior before allowing, blocking, or rate limiting.
Security scanning, malware checks, abuse prevention, compliance review, or vulnerability monitoring.
Record Details
Structured data
- Operator: Common Crawl
- Family: Common Crawl
- Type: Security
- Purpose: Security
- Identity type: Official Documented
- Confidence: High
- Last verified: 2026-04-01
- Last checked: 2026-05-20
- Source type: Official
- Verification: Verify reverse DNS in crawl.commoncrawl.org and match IPs against Common Crawl's published JSON ranges.
- IP ranges: https://index.commoncrawl.org/ccbot.json
- Spoofing risk: User-agent strings for CCBot can be spoofed. Treat user-agent detection as a classification signal, then verify with published IP ranges, reverse DNS, signatures, operator documentation, or other verified identity signals before allow-listing.
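The IP-range check in this record can be sketched with Python's `ipaddress` module. The JSON layout assumed here (a top-level `prefixes` list whose entries carry `ipv4Prefix` or `ipv6Prefix`) is an assumption about the published file and should be confirmed against the live document at the IP ranges URL. The prefixes in `SAMPLE_DOCUMENT` are reserved documentation ranges, not Common Crawl's real addresses, and the function names are hypothetical.

```python
import ipaddress
import json

# Placeholder data using reserved documentation ranges (RFC 5737 / RFC 3849).
# ASSUMPTION: the real file uses this "prefixes" layout; verify before relying on it.
SAMPLE_DOCUMENT = """
{
  "prefixes": [
    {"ipv4Prefix": "192.0.2.0/24"},
    {"ipv6Prefix": "2001:db8::/32"}
  ]
}
"""


def load_prefixes(document: str) -> list:
    """Turn the JSON text into a list of ip_network objects."""
    data = json.loads(document)
    networks = []
    for entry in data.get("prefixes", []):
        cidr = entry.get("ipv4Prefix") or entry.get("ipv6Prefix")
        if cidr:
            networks.append(ipaddress.ip_network(cidr, strict=False))
    return networks


def ip_in_ranges(ip: str, networks: list) -> bool:
    """Return True when the address falls inside any listed network."""
    try:
        address = ipaddress.ip_address(ip)
    except ValueError:
        return False  # not a valid IP literal
    return any(
        address.version == net.version and address in net for net in networks
    )


networks = load_prefixes(SAMPLE_DOCUMENT)
print(ip_in_ranges("192.0.2.10", networks))
print(ip_in_ranges("198.51.100.1", networks))
```

In practice the JSON text would be downloaded from the IP ranges URL and cached, since published ranges can change over time.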
Notes
CCBot is listed in the Botcrawl directory as a security scanner from Common Crawl. The primary identifier for log review is CCBot.
Identification
- User-agent pattern: CCBot
- Family: Common Crawl
- Type: Security
- Kind: Scanner
Common use
Security scanning, malware checks, abuse prevention, compliance review, or vulnerability monitoring.
Verification and handling
Verify reverse DNS in crawl.commoncrawl.org and match IPs against Common Crawl's published JSON ranges.
Directory guidance marks the risk level as Neutral and the blocking decision as Depends. Do not rely on the user-agent string alone because user-agent strings can be copied or spoofed.
Robots.txt handling: Yes.
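The reverse DNS check described in this section might be sketched as follows. The forward-confirmation step (resolving the returned hostname back to the same IP) is not stated in this record; it is added here as a common hardening step and should be read as an assumption. The function names and the injectable `reverse_lookup` and `forward_lookup` parameters are hypothetical and exist so the logic can be exercised without live DNS.

```python
import socket

TRUSTED_DOMAIN = "crawl.commoncrawl.org"


def hostname_in_domain(hostname: str, domain: str = TRUSTED_DOMAIN) -> bool:
    """True when hostname is the domain itself or a subdomain of it."""
    host = hostname.rstrip(".").lower()
    return host == domain or host.endswith("." + domain)


def _reverse(ip: str) -> str:
    # PTR lookup; the first tuple element is the primary hostname.
    return socket.gethostbyaddr(ip)[0]


def _forward(hostname: str) -> set:
    # Every address the hostname resolves to (IPv4 and IPv6).
    return {info[4][0] for info in socket.getaddrinfo(hostname, None)}


def verify_reverse_dns(ip: str, reverse_lookup=None, forward_lookup=None) -> bool:
    """Check the PTR hostname, then confirm it resolves back to the same IP."""
    reverse_lookup = reverse_lookup or _reverse
    forward_lookup = forward_lookup or _forward
    try:
        hostname = reverse_lookup(ip)
        if not hostname_in_domain(hostname):
            return False
        return ip in forward_lookup(hostname)
    except OSError:
        # socket.herror and socket.gaierror are both OSError subclasses.
        return False
```

A suffix comparison on the full hostname, rather than a substring search, avoids accepting names such as `crawl.commoncrawl.org.evil.example`.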
Evidence and Source
- Verify reverse DNS in crawl.commoncrawl.org and match IPs against Common Crawl's published JSON ranges.
- Match `CCBot` as a case-insensitive substring in HTTP user-agent logs. Review bot_aliases for alternate names or product labels. Use bot_http_agent for full user-agent examples when the client sends a longer browser-like string. Do not treat a user-agent match alone as proof of identity for allow-listing.
- User-agent strings for CCBot can be spoofed. Treat user-agent detection as a classification signal, then verify with published IP ranges, reverse DNS, signatures, operator documentation, or other verified identity signals before allow-listing.
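The case-insensitive substring match described in the list above can be sketched in a few lines of Python. The sample user-agent strings are illustrative values, not entries taken from this record, and `looks_like_ccbot` is a hypothetical helper name. A match only flags a request as a candidate for further verification.

```python
def looks_like_ccbot(user_agent: str) -> bool:
    """Case-insensitive substring match on the CCBot identifier."""
    return "ccbot" in user_agent.casefold()


# Illustrative user-agent values as they might appear in an access log.
SAMPLE_USER_AGENTS = [
    "CCBot/2.0 (https://commoncrawl.org/faq/)",
    "Mozilla/5.0 (compatible; ccbot/2.0)",
    "Mozilla/5.0 (X11; Linux x86_64) ExampleBrowser/1.0",
]

# Requests that claim to be CCBot; each still needs IP or DNS verification.
candidates = [ua for ua in SAMPLE_USER_AGENTS if looks_like_ccbot(ua)]

for ua in candidates:
    print("claims CCBot:", ua)
```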
