AI Crawler

ClaudeBot

Anthropic's web crawler. It powers Claude's retrieval and training and honours robots.txt, so access controls are increasingly material to citation share.

What it is

ClaudeBot is Anthropic's web crawler. It retrieves publicly available pages to support Claude's retrieval and training pipelines, identifies itself with the ClaudeBot user-agent, and respects robots.txt directives.

Why it matters

As Claude grows as an answer surface, ClaudeBot access is increasingly material to whether your content is eligible for retrieval and citation in Claude responses. Blocking it can quietly erode your citation share on that platform.

How it works

Configure access by naming the ClaudeBot user-agent in robots.txt with Allow or Disallow rules, applied site-wide or per path. Because it honours robots.txt, granular path control behaves predictably.
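
Before deploying a rule set, it can help to check how a parser resolves it for the ClaudeBot user-agent. The following is a minimal sketch using Python's standard urllib.robotparser module. The robots.txt content, the example.com URLs, and the claudebot_allowed helper are illustrative assumptions, and Python's parser is only one implementation, so treat the result as a sanity check rather than a statement of how ClaudeBot itself evaluates rules.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for illustration; /internal/ is an assumed path.
# The Disallow line comes first because Python's parser applies the first
# matching rule, while other parsers may prefer the longest match. This
# ordering gives the same answer under both strategies.
ROBOTS_TXT = """\
User-agent: ClaudeBot
Disallow: /internal/
Allow: /

User-agent: *
Disallow: /
"""


def claudebot_allowed(robots_txt: str, url: str) -> bool:
    """Return True if the given rules let the ClaudeBot user-agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch("ClaudeBot", url)
```

Calling claudebot_allowed(ROBOTS_TXT, "https://example.com/blog/post") checks a public path, and the same call with a URL under /internal/ checks the fenced-off section.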

When it applies

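Allow ClaudeBot when you want eligibility for Claude retrieval and citation; block it for confidential, low-value, or licence-restricted sections. Because robots.txt is advisory and publicly readable, confidential sections should also sit behind authentication.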

Examples

robots.txt: a "User-agent: ClaudeBot" line followed by "Allow: /" permits full retrieval
robots.txt: a "User-agent: ClaudeBot" line followed by "Disallow: /internal/" keeps the crawler out of private areas
Server log: ClaudeBot fetches sitemap.xml, then a burst of article URLs

How it is measured

Daily crawl request count attributed to the ClaudeBot user-agent
Coverage of priority URLs fetched by ClaudeBot versus total indexable pages
Crawl frequency and revisit interval for updated pages
HTTP status distribution returned to ClaudeBot, watching for unexpected 4xx or 5xx
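
Several of these measures can be derived from ordinary access logs. The sketch below assumes the common "combined" log format and matches on the substring "claudebot" in the user-agent field. The sample lines, IP addresses, paths, and the summarise_claudebot helper are all hypothetical, and the user-agent strings are simplified placeholders rather than the crawler's real header. It covers daily request count, priority-URL coverage, and status distribution; revisit interval is left out.

```python
import re
from collections import Counter

# Assumes the "combined" access log format; adjust the pattern for other formats.
LOG_PATTERN = re.compile(
    r'\S+ \S+ \S+ \[(?P<day>[^:]+):[^\]]+\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"'
)

# Hypothetical log lines with simplified, made-up user-agent strings.
SAMPLE_LINES = [
    '203.0.113.7 - - [12/Mar/2025:10:01:02 +0000] "GET /sitemap.xml HTTP/1.1" 200 512 "-" "example (compatible; ClaudeBot)"',
    '203.0.113.7 - - [12/Mar/2025:10:01:05 +0000] "GET /articles/a HTTP/1.1" 200 2048 "-" "example (compatible; ClaudeBot)"',
    '203.0.113.7 - - [13/Mar/2025:09:00:00 +0000] "GET /articles/b HTTP/1.1" 404 128 "-" "example (compatible; ClaudeBot)"',
    '198.51.100.9 - - [13/Mar/2025:09:05:00 +0000] "GET /articles/b HTTP/1.1" 200 2048 "-" "example (compatible; OtherBot)"',
]


def summarise_claudebot(lines, priority_paths):
    """Return (requests per day, status-class counts, priority coverage ratio)."""
    daily = Counter()
    status_classes = Counter()
    fetched_ok = set()
    for line in lines:
        match = LOG_PATTERN.match(line)
        if not match or "claudebot" not in match.group("agent").lower():
            continue  # skip unparseable lines and other user-agents
        status = match.group("status")
        daily[match.group("day")] += 1
        status_classes[status[0] + "xx"] += 1
        if status.startswith("2"):
            # Only successful responses count towards coverage.
            fetched_ok.add(match.group("path"))
    priority = set(priority_paths)
    coverage = len(fetched_ok & priority) / len(priority) if priority else 0.0
    return daily, status_classes, coverage
```

A user-agent string can be spoofed, so counts produced this way are attributed to ClaudeBot rather than verified as coming from it.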
