AI Crawler

ClaudeBot

Anthropic's web crawler, which supports Claude's retrieval and training. It honours robots.txt, so the access rules you set increasingly affect your citation share in Claude responses.

What it is

ClaudeBot is Anthropic's web crawler that retrieves publicly available pages to support Claude's retrieval and training pipelines. It declares the ClaudeBot user-agent and respects robots.txt directives.

Why it matters

As Claude grows as an answer surface, ClaudeBot access is increasingly material to whether your content is eligible for retrieval and citation in Claude responses. Blocking it can quietly erode your citation share on that platform.

How it works

Configure access by naming the ClaudeBot user-agent in robots.txt with Allow or Disallow rules, applied site-wide or per path. Because it honours robots.txt, granular path control behaves predictably.
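As a minimal sketch of how such rules can be checked before deployment, the snippet below feeds a robots.txt body to Python's standard urllib.robotparser and asks whether the ClaudeBot user-agent may fetch a given URL. The robots.txt content, the example.com URLs, and the helper name can_claudebot_fetch are illustrative assumptions, not taken from any real site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt body, for illustration only.
ROBOTS_TXT = """\
User-agent: ClaudeBot
Disallow: /internal/
Allow: /

User-agent: *
Disallow: /drafts/
"""


def can_claudebot_fetch(robots_txt: str, url: str) -> bool:
    """Return True if the given rules let the ClaudeBot user-agent fetch url."""
    parser = RobotFileParser()
    # parse() accepts an iterable of lines, so no network request is made.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch("ClaudeBot", url)


if __name__ == "__main__":
    for url in (
        "https://example.com/articles/guide",
        "https://example.com/internal/report",
    ):
        print(url, can_claudebot_fetch(ROBOTS_TXT, url))
```

Python's parser applies the first matching rule in file order, so this sketch lists the Disallow line before the broader Allow line. A crawler may resolve overlapping rules differently, so treat the result as a sanity check rather than a guarantee of crawler behaviour.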

When it applies

Allow ClaudeBot when you want eligibility for Claude retrieval and citation; block it for confidential, low-value, or licence-restricted sections.

Examples

  • robots.txt: User-agent: ClaudeBot followed by Allow: / permits full retrieval
  • robots.txt: User-agent: ClaudeBot followed by Disallow: /internal/ fences off private areas
  • Server log shows ClaudeBot fetching sitemap.xml followed by a burst of article URLs

How it is measured

  • Daily crawl request count attributed to the ClaudeBot user-agent
  • Coverage of priority URLs fetched by ClaudeBot versus total indexable pages
  • Crawl frequency and revisit interval for updated pages
  • HTTP status distribution returned to ClaudeBot, watching for unexpected 4xx or 5xx
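
The metrics above can be approximated from access logs. The sketch below assumes logs in the common "combined" format and matches ClaudeBot by a case-insensitive substring of the user-agent field. The sample log lines, the simplified user-agent string, the priority URL list, and the function name summarise_claudebot are all hypothetical. Coverage is computed here against the supplied priority list, which is one possible reading of the coverage metric. Revisit intervals are left out to keep the example short.

```python
import re
from collections import Counter

# Assumed "combined" access log layout; adjust the pattern to your server.
LOG_PATTERN = re.compile(
    r'\S+ \S+ \S+ \[(?P<day>[^:]+):[^\]]+\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"'
)

# Fabricated sample lines with a simplified user-agent string.
BOT = "Mozilla/5.0 (compatible; ClaudeBot/1.0)"
SAMPLE_LOG = [
    f'203.0.113.5 - - [10/Mar/2025:06:00:01 +0000] "GET /sitemap.xml HTTP/1.1" 200 5120 "-" "{BOT}"',
    f'203.0.113.5 - - [10/Mar/2025:06:00:02 +0000] "GET /articles/a HTTP/1.1" 200 9000 "-" "{BOT}"',
    f'203.0.113.5 - - [10/Mar/2025:06:00:03 +0000] "GET /articles/missing HTTP/1.1" 404 300 "-" "{BOT}"',
    '198.51.100.7 - - [10/Mar/2025:07:15:00 +0000] "GET /articles/a HTTP/1.1" 200 9000 "-" "Mozilla/5.0 (X11; Linux)"',
    f'203.0.113.5 - - [11/Mar/2025:06:00:01 +0000] "GET /articles/b HTTP/1.1" 200 8000 "-" "{BOT}"',
]


def summarise_claudebot(lines, priority_paths):
    """Aggregate daily counts, status classes, and priority-URL coverage."""
    daily = Counter()
    statuses = Counter()
    fetched = set()
    for line in lines:
        match = LOG_PATTERN.match(line)
        # Skip unparseable lines and requests from other user-agents.
        if match is None or "claudebot" not in match["agent"].lower():
            continue
        daily[match["day"]] += 1
        statuses[match["status"][0] + "xx"] += 1  # e.g. "404" -> "4xx"
        fetched.add(match["path"])
    covered = fetched & set(priority_paths)
    coverage = len(covered) / len(priority_paths) if priority_paths else 0.0
    return {
        "daily_requests": dict(daily),
        "status_classes": dict(statuses),
        "priority_coverage": coverage,
    }


summary = summarise_claudebot(
    SAMPLE_LOG, ["/articles/a", "/articles/b", "/articles/c"]
)

if __name__ == "__main__":
    print(summary)
```

A user-agent string can be spoofed, so counts produced this way are best read as an upper bound on genuine ClaudeBot traffic.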
