Bot intelligence record

AI2Bot

Review first

Allen Institute for AI crawler used for AI research and related indexing workflows.

Type: AI; Purpose: AI Research Crawling; Identity type: Official Documented; Confidence: High; Verified: Yes; robots.txt: Yes
Operator
Allen Institute for AI
Type
AI
Source type
Official
Last checked
2026-06-02

User-Agent Pattern

Allen Institute for AI
AI2Bot
Verification note

User-agent strings are identification signals, not proof of identity. Confirm important allow, block, or rate-limit decisions with logs, DNS or IP evidence, request behavior, or operator documentation when available.
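One way to gather the DNS evidence mentioned above is a forward-confirmed reverse DNS check: look up the PTR hostname for the requesting IP, check its suffix, then resolve that hostname back and confirm it returns the same IP. The sketch below is illustrative only. This record does not state that the operator publishes a crawler hostname suffix, so `crawler.example` is a hypothetical placeholder, and the function name `forward_confirmed` and its `reverse`/`forward` parameters are assumptions introduced here so the lookups can be stubbed.

```python
import socket


def forward_confirmed(ip, allowed_suffixes, reverse=None, forward=None):
    """Return True if ip's PTR hostname has an allowed suffix AND resolves back to ip.

    reverse/forward are injectable lookups (hypothetical parameters, useful for
    testing); by default they use the system resolver. The default forward
    lookup (gethostbyname_ex) only returns IPv4 addresses.
    """
    reverse = reverse or (lambda addr: socket.gethostbyaddr(addr)[0])
    forward = forward or (lambda host: socket.gethostbyname_ex(host)[2])

    try:
        host = reverse(ip).rstrip(".").lower()
    except OSError:  # covers socket.herror / socket.gaierror
        return False

    # Require a full label boundary so "notcrawler.example" does not pass.
    suffixes = [s.strip(".").lower() for s in allowed_suffixes]
    if not any(host == s or host.endswith("." + s) for s in suffixes):
        return False

    try:
        return ip in forward(host)
    except OSError:
        return False


if __name__ == "__main__":
    # "crawler.example" is a placeholder suffix, not a documented AI2 hostname.
    print(forward_confirmed("192.0.2.10", ["crawler.example"]))
```

A failed check does not prove the request is hostile; it only means the DNS evidence does not support the claimed identity, so other signals (logs, request behavior, operator documentation) still apply.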

Robots.txt Snippet

User-agent: AI2Bot
Disallow: /

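To sanity-check what the snippet above does before deploying it, the rule can be fed to Python's standard `urllib.robotparser`. This is a minimal sketch: the helper name `is_allowed` and the `example.com` URL are assumptions for illustration, and the parser reflects how Python interprets the rule, which may differ in edge cases from how any particular crawler does.

```python
from urllib.robotparser import RobotFileParser

# The two-line rule from this record: block AI2Bot from the whole site.
ROBOTS_TXT = """\
User-agent: AI2Bot
Disallow: /
"""


def is_allowed(agent, url, robots_txt=ROBOTS_TXT):
    """Return True if `agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    # example.com is a placeholder host.
    print("AI2Bot:", is_allowed("AI2Bot", "https://example.com/page"))
    print("OtherBot:", is_allowed("OtherBot", "https://example.com/page"))
```

Because the file contains no `User-agent: *` group, agents other than AI2Bot are unaffected by this rule. robots.txt is advisory, so it only takes effect when the crawler chooses to honor it.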

Handling Guidance

Depends

Use this record as bot intelligence, then verify the request source and behavior before allowing, blocking, or rate limiting.

Academic and AI research crawling, including discovery of web documents for Allen Institute research systems.

Record Details

Structured data
Operator
Allen Institute for AI
Type
AI
Purpose
AI Research Crawling
Identity type
Official Documented
Confidence
High
Last verified
2026-06-02
Last checked
2026-06-02
Source type
Official
Verification
Validate the identifying user-agent against operator documentation, reverse DNS, published IP ranges, signatures, or other trust signals before creating hard allow rules.
Spoofing risk
User-agent strings can be spoofed. Pair the claimed identifier with operator documentation, IP verification, reverse DNS, signatures, or other available trust signals before creating low-friction allow rules.

Notes

AI2Bot is listed in the Botcrawl directory as an AI bot from the Allen Institute for AI. The primary identifier for log review is AI2Bot.

Identification

  • User-agent pattern: AI2Bot
  • Family: Allen Institute for AI
  • Type: AI
  • Kind: Crawler

Common use

Academic and AI research crawling, including discovery of web documents for Allen Institute research systems.

Verification and handling

Confirm the user-agent against server logs and use published operator documentation, IP ranges, reverse DNS, signatures, or other trust signals when available.

Directory guidance marks the risk level as Neutral and the blocking decision as Depends. Do not rely on the user-agent string alone because user-agent strings can be copied or spoofed.
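The "Depends" decision can be made concrete by pairing the user-agent claim with an IP check. The sketch below assumes the operator's published ranges have already been obtained; this record does not list them, so `PLACEHOLDER_RANGES` uses reserved documentation blocks only, and the names `ip_in_ranges` and `classify` and the three labels are hypothetical.

```python
import ipaddress

# Placeholder networks (reserved documentation blocks), NOT real AI2 ranges.
# Replace with ranges taken from operator documentation, if published.
PLACEHOLDER_RANGES = ["192.0.2.0/24", "2001:db8::/32"]


def ip_in_ranges(ip, ranges=PLACEHOLDER_RANGES):
    """Return True if `ip` parses and falls inside any of the given networks."""
    try:
        addr = ipaddress.ip_address(ip)
    except ValueError:
        return False
    return any(addr in ipaddress.ip_network(r) for r in ranges)


def classify(user_agent, ip, ranges=PLACEHOLDER_RANGES):
    """Label a request; the label names are illustrative, not directory terms."""
    if "ai2bot" not in user_agent.lower():
        return "not-ai2bot"
    # The user-agent claims AI2Bot; only the IP evidence can support that claim.
    return "claimed-and-ip-matched" if ip_in_ranges(ip, ranges) else "claimed-unverified"


if __name__ == "__main__":
    print(classify("Mozilla/5.0 (compatible) AI2Bot", "192.0.2.44"))
    print(classify("Mozilla/5.0 (compatible) AI2Bot", "203.0.113.7"))
```

Requests labeled as claimed but unverified are the ones most worth reviewing before any allow rule is created, since they carry the identifier without supporting evidence.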

Robots.txt handling: Yes.

Evidence and Source

  • Match `AI2Bot` as a case-insensitive substring in HTTP user-agent logs. Do not treat a user-agent match alone as proof of identity for allow-listing. Official AllenAI documentation publishes the AI2Bot user-agent string.
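The case-insensitive substring match described above can be sketched as a small log filter. The sample log lines and their user-agent strings are made up for illustration and are not the exact string the operator publishes; `matching_lines` is a hypothetical helper name.

```python
import re

# Case-insensitive substring match on the identifier from this record.
AI2BOT = re.compile(r"ai2bot", re.IGNORECASE)


def matching_lines(lines):
    """Return the log lines whose text contains 'AI2Bot' in any letter case."""
    return [line for line in lines if AI2BOT.search(line)]


if __name__ == "__main__":
    # Illustrative access-log lines; IPs are documentation addresses.
    sample = [
        '192.0.2.44 - - "GET / HTTP/1.1" 200 512 "-" "Mozilla/5.0 (compatible) AI2Bot"',
        '203.0.113.7 - - "GET /a HTTP/1.1" 200 90 "-" "ai2bot"',
        '198.51.100.3 - - "GET /b HTTP/1.1" 200 77 "-" "SomeBrowser/1.0"',
    ]
    for line in matching_lines(sample):
        print(line)
```

A match only shows that a request claimed the identifier, so the output is a list of candidates for verification, not a list of confirmed AI2Bot requests.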
