AI Crawler
Google-Extended
Google's opt-out token for using your content to train and ground Gemini (formerly Bard) models, without affecting crawling, indexing, or ranking in Google Search. A separate lever from Googlebot.
What it is
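Google-Extended is a robots.txt control token that lets publishers opt out of having content used to train future Gemini models and to ground Gemini answers. It is not a crawler with its own fetch traffic: Google's existing crawlers fetch the pages, and the token only governs how that content may be used. Google documents that the token does not affect inclusion or ranking in Google Search; AI Overviews are treated as a Search feature and follow Googlebot and snippet controls rather than this token.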
Why it matters
It separates Gemini usage from classical ranking, so you can stay fully indexed in Search while choosing whether your content feeds Gemini training and grounded Gemini answers. This makes it a targeted lever for managing citation exposure in Gemini without sacrificing organic visibility.
How it works
Add a Google-Extended group to robots.txt; a Disallow rule opts the matching paths out of Gemini training and grounding while Googlebot continues to crawl for Search. Because it is a separate token, rules for Googlebot and Google-Extended are evaluated independently.
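The independence of the two tokens can be checked locally before a robots.txt change is deployed. The following is a minimal sketch using Python's standard urllib.robotparser; the robots.txt content, the example.com URL, and the helper name is_allowed are assumptions for illustration, and Python's parser only approximates how Google itself matches rules.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: opt out of Google-Extended, leave Googlebot unrestricted.
ROBOTS_TXT = """\
User-agent: Google-Extended
Disallow: /

User-agent: Googlebot
Allow: /
"""


def is_allowed(robots_txt: str, token: str, url: str) -> bool:
    """Return True if the given user-agent token may fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(token, url)


URL = "https://example.com/article"  # placeholder URL
googlebot_allowed = is_allowed(ROBOTS_TXT, "Googlebot", URL)
extended_allowed = is_allowed(ROBOTS_TXT, "Google-Extended", URL)

print("Googlebot allowed:", googlebot_allowed)
print("Google-Extended allowed:", extended_allowed)
```

With this file, the Googlebot check should pass and the Google-Extended check should fail, because each token matches only its own group.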
When it applies
Disallow Google-Extended when you want to remain in classical Search but withhold content from Gemini training and grounding; allow it (or omit the group, since robots.txt defaults to allow) when you want to stay eligible for citation in Gemini answers.
Examples
- robots.txt: User-agent: Google-Extended then Disallow: / to opt out of AI grounding while staying indexed
- robots.txt: User-agent: Google-Extended then Allow: / kept alongside an unrestricted Googlebot block
- Googlebot continues fetching pages in logs while Gemini citations of the site decline after a Google-Extended Disallow
How it is measured
- Presence or absence of your URLs in Gemini citations over time, with AI Overview citations tracked separately as a control that Google-Extended is not expected to change
- Stability of Googlebot crawl volume after changing Google-Extended rules, confirming crawling for Search is unaffected
- Search Console impressions for affected pages, used to verify classical Search is untouched
- Tracked appearance rate of the domain in AI-generated answer panels
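The crawl-volume check above can be sketched as a before/after comparison over access-log records. This is an illustrative sketch only: the record format (a date paired with a user-agent string), the sample data, and the function names are assumptions, and a production check would also verify Googlebot by reverse DNS, since user-agent strings can be spoofed.

```python
from datetime import date
from statistics import mean

GOOGLEBOT_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
OTHER_UA = "Mozilla/5.0 (X11; Linux x86_64)"


def daily_googlebot_hits(records):
    """Count requests per day whose user-agent string mentions Googlebot."""
    counts = {}
    for day, user_agent in records:
        if "Googlebot" in user_agent:
            counts[day] = counts.get(day, 0) + 1
    return counts


def relative_change(counts, change_day):
    """Relative change in mean daily hits from before change_day to on/after it.

    Days with no Googlebot hits are absent from counts and are not averaged in.
    Returns None if either side of the change has no data.
    """
    before = [n for d, n in counts.items() if d < change_day]
    after = [n for d, n in counts.items() if d >= change_day]
    if not before or not after:
        return None
    return (mean(after) - mean(before)) / mean(before)


# Made-up sample: two Googlebot hits and one other hit per day for four days.
records = []
for day_of_month in (1, 2, 3, 4):
    day = date(2024, 1, day_of_month)
    records += [(day, GOOGLEBOT_UA)] * 2
    records.append((day, OTHER_UA))

counts = daily_googlebot_hits(records)
change = relative_change(counts, date(2024, 1, 3))  # hypothetical rule-change date
print(change)
```

A value near zero suggests crawl volume held steady across the change; what counts as "near" is a judgment call that depends on the site's normal day-to-day variance.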
Related terms in AI Crawler
- ClaudeBot: Anthropic's web crawler. Powers Claude's retrieval and training. Honours robots.txt; access controls are increasingly material to citation share.
- GPTBot: OpenAI's web crawler. Used to gather training data and to power ChatGPT browsing. Can be allowed or disallowed in robots.txt; blocking may reduce ChatGPT citation eligibility.
- OAI-SearchBot: OpenAI's user-agent for ChatGPT Search retrieval (distinct from GPTBot, which is for training). Allowing this is required for inclusion in ChatGPT Search results.
- PerplexityBot: Perplexity's crawler. Allows real-time retrieval for Perplexity answers. Blocking reduces eligibility for Perplexity citation.