AI Crawler
ClaudeBot
Anthropic's web crawler. Powers Claude's retrieval and training. It honours robots.txt, so your access controls are increasingly material to citation share.
What it is
ClaudeBot is Anthropic's web crawler that retrieves publicly available pages to support Claude's retrieval and training pipelines. It declares the ClaudeBot user-agent and respects robots.txt directives.
Why it matters
As Claude grows as an answer surface, ClaudeBot access is increasingly material to whether your content is eligible for retrieval and citation in Claude responses. Blocking it can quietly erode your citation share on that platform.
How it works
Configure access by naming the ClaudeBot user-agent in robots.txt with Allow or Disallow rules, applied site-wide or per path. Because it honours robots.txt, granular path control behaves predictably.
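A minimal sketch of how such rules could be checked before deployment, using Python's standard-library urllib.robotparser. The robots.txt content, the example.com URLs, and the is_allowed helper are hypothetical and only illustrate the idea. The sketch shows how this parser interprets the file, which is not necessarily how ClaudeBot itself does.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, for illustration only.
# The Disallow line comes first because Python's parser applies the first
# matching rule; parsers that apply the longest match reach the same result.
ROBOTS_TXT = """\
User-agent: ClaudeBot
Disallow: /internal/
Allow: /

User-agent: *
Disallow: /drafts/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if this parser would let user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for path in ("/blog/post", "/internal/report"):
        url = "https://example.com" + path
        print(path, is_allowed(ROBOTS_TXT, "ClaudeBot", url))
```

A group that names ClaudeBot replaces the wildcard group for that crawler. Any path you want closed to ClaudeBot therefore has to be listed in its own group, even if it is already listed under User-agent: *.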
When it applies
Allow ClaudeBot when you want eligibility for Claude retrieval and citation; block it for confidential, low-value, or licence-restricted sections.
Examples
- robots.txt: User-agent: ClaudeBot then Allow: / to permit full retrieval
- robots.txt: User-agent: ClaudeBot then Disallow: /internal/ to fence off private areas
- Server log shows ClaudeBot fetching sitemap.xml followed by a burst of article URLs
How it is measured
- Daily crawl request count attributed to the ClaudeBot user-agent
- Share of priority URLs fetched by ClaudeBot, tracked alongside total indexable pages
- Crawl frequency and revisit interval for updated pages
- HTTP status distribution returned to ClaudeBot, watching for unexpected 4xx or 5xx
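Several of these measures can be derived from ordinary access logs. The sketch below assumes logs in the common "combined" format. The sample lines, the shortened user-agent string, the priority paths, and the summarise helper are all hypothetical. It matches on the user-agent string alone, which a client can spoof, so treat the numbers as indicative.

```python
import re
from collections import Counter

# Pattern for the "combined" access-log format (an assumption about your logs).
LOG_LINE = re.compile(
    r'\S+ \S+ \S+ \[(?P<day>[^:]+):[^\]]+\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<ua>[^"]*)"'
)

# Hypothetical log lines; the user-agent strings are shortened for readability.
SAMPLE_LOG = [
    '203.0.113.5 - - [01/Mar/2025:10:00:00 +0000] "GET /sitemap.xml HTTP/1.1" 200 512 "-" "ClaudeBot/1.0"',
    '203.0.113.5 - - [01/Mar/2025:10:00:02 +0000] "GET /articles/a HTTP/1.1" 200 9000 "-" "ClaudeBot/1.0"',
    '203.0.113.5 - - [01/Mar/2025:10:00:03 +0000] "GET /articles/b HTTP/1.1" 404 300 "-" "ClaudeBot/1.0"',
    '203.0.113.5 - - [02/Mar/2025:08:30:00 +0000] "GET /articles/a HTTP/1.1" 200 9000 "-" "ClaudeBot/1.0"',
    '198.51.100.7 - - [02/Mar/2025:09:00:00 +0000] "GET /articles/c HTTP/1.1" 200 7000 "-" "Mozilla/5.0"',
]


def summarise(lines, priority_paths):
    """Return (daily request counts, status-class counts, priority coverage)."""
    daily, statuses, fetched = Counter(), Counter(), set()
    for line in lines:
        m = LOG_LINE.match(line)
        if not m or "claudebot" not in m["ua"].lower():
            continue  # skip unparseable lines and other clients
        daily[m["day"]] += 1
        statuses[m["status"][0] + "xx"] += 1  # e.g. "404" -> "4xx"
        fetched.add(m["path"])
    wanted = set(priority_paths)
    # Coverage counts any request for a priority path, whatever the status.
    coverage = len(fetched & wanted) / len(wanted) if wanted else 0.0
    return daily, statuses, coverage


if __name__ == "__main__":
    priority = ["/articles/a", "/articles/b", "/articles/c", "/articles/d"]
    daily, statuses, coverage = summarise(SAMPLE_LOG, priority)
    print(dict(daily), dict(statuses), coverage)
```

Revisit interval could be added in the same way by keeping the timestamps per path and comparing consecutive fetches.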
Related terms in AI Crawler
- Google-Extended: Google's opt-out signal for using your content in Bard, Gemini, and AI-powered Search features without affecting classical Search ranking. A separate lever from Googlebot.
- GPTBot: OpenAI's web crawler. Used to gather training data and to power ChatGPT browsing. Can be allowed or disallowed in robots.txt; blocking may reduce ChatGPT citation eligibility.
- OAI-SearchBot: OpenAI's user-agent for ChatGPT Search retrieval (distinct from GPTBot, which is for training). Allowing this is required for inclusion in ChatGPT Search results.
- PerplexityBot: Perplexity's crawler. Allows real-time retrieval for Perplexity answers. Blocking reduces eligibility for Perplexity citation.