robots.txt

robots.txt is a plain-text file at the root of a domain (/robots.txt) that tells crawlers which paths they may or may not access, following the Robots Exclusion Protocol.
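
As a small illustration of "at the root of a domain", the sketch below derives the robots.txt location for any page URL. The function name robots_url and the shop.example.com address are made up for this example; it assumes the file applies to one scheme, host, and port combination, so the port is kept when present.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url(page_url: str) -> str:
    """Return the robots.txt URL that governs the given page URL."""
    parts = urlsplit(page_url)
    # Keep scheme and host (with port, if any); drop path, query, and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


if __name__ == "__main__":
    print(robots_url("https://shop.example.com/products/42?ref=home"))
```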

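Format: rules are grouped by User-agent, and each group lists Allow and Disallow path patterns. A crawler obeys only the most specific group that matches its name; groups are not merged, so in the example below the * group does not apply to Googlebot or GPTBot: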

User-agent: Googlebot
Disallow: /cart
Disallow: /checkout
Allow: /

User-agent: GPTBot
Allow: /

User-agent: *
Disallow: /admin
Disallow: /api
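
One way to check how rules like these are read is Python's standard urllib.robotparser module. The sketch below parses the example above and asks which paths each crawler may fetch. SomeOtherBot is a made-up name standing in for any crawler without its own group, and the results reflect this one parser's reading, which real crawlers may not match in every detail.

```python
from urllib.robotparser import RobotFileParser

# The example rules from this entry, copied as-is.
EXAMPLE_RULES = """\
User-agent: Googlebot
Disallow: /cart
Disallow: /checkout
Allow: /

User-agent: GPTBot
Allow: /

User-agent: *
Disallow: /admin
Disallow: /api
"""


def load_rules(text: str) -> RobotFileParser:
    """Parse robots.txt text without fetching anything over the network."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = load_rules(EXAMPLE_RULES)

if __name__ == "__main__":
    checks = [
        ("Googlebot", "/cart"),
        ("Googlebot", "/products/shoe"),
        ("GPTBot", "/admin"),          # GPTBot has its own group, so * is not applied
        ("SomeOtherBot", "/admin"),    # hypothetical bot that falls through to *
    ]
    for agent, path in checks:
        print(agent, path, parser.can_fetch(agent, path))
```

Note the third check: because GPTBot has its own group containing only Allow: /, the /admin and /api rules under * do not cover it. Any path that should be blocked for a named crawler has to be listed in that crawler's group.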

For ecommerce in 2026, the scope of robots.txt has expanded: it now needs explicit policies for AI crawlers and AI-related user-agent tokens (GPTBot, ClaudeBot, anthropic-ai, PerplexityBot, Google-Extended, OAI-SearchBot, ChatGPT-User, CCBot). A common approach is to allow these on marketing routes, to maximize citation in AI search, while disallowing them from cart, checkout, account, and admin paths.
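
The policy described above can be sketched as a small generator that writes one group per AI crawler, each carrying the same Disallow lines. This is an illustration, not a recommended production file: the function name build_robots_txt is made up, and the exact path prefixes (/cart, /checkout, /account, /admin) are assumptions that would need to match the store's real routes.

```python
# User-agent tokens named in this entry.
AI_AGENTS = [
    "GPTBot", "ClaudeBot", "anthropic-ai", "PerplexityBot",
    "Google-Extended", "OAI-SearchBot", "ChatGPT-User", "CCBot",
]

# Assumed path prefixes for the transactional and private areas of a store.
BLOCKED_PATHS = ["/cart", "/checkout", "/account", "/admin"]


def build_robots_txt(agents, blocked_paths):
    """Build robots.txt text with one group per agent.

    Every group repeats the Disallow lines, so no agent depends on
    rules that appear only in another group.
    """
    groups = []
    for agent in agents:
        lines = [f"User-agent: {agent}"]
        lines += [f"Disallow: {path}" for path in blocked_paths]
        lines.append("Allow: /")  # everything else stays crawlable
        groups.append("\n".join(lines))
    # Blank lines separate groups; end the file with a newline.
    return "\n\n".join(groups) + "\n"


if __name__ == "__main__":
    print(build_robots_txt(AI_AGENTS, BLOCKED_PATHS))
```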

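robots.txt is advisory, not enforceable: well-behaved crawlers respect it, malicious ones ignore it. For sensitive content, rely on authentication, and use noindex meta tags to keep pages out of search results. A crawler can only see a noindex tag on a page it is allowed to fetch, so do not disallow a page that depends on noindex.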
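
Because robots.txt cannot enforce anything, protection has to live on the server. The sketch below is a hypothetical request check, not tied to any framework: the respond function, the prefix list, and the realm name are all assumptions. It also swaps in the X-Robots-Tag response header as a stand-in for the noindex meta tag, since a header is easier to show in a few lines.

```python
# Assumed prefixes for private areas; adjust to the store's real routes.
PROTECTED_PREFIXES = ("/account", "/admin")


def respond(path: str, authenticated: bool) -> tuple[int, dict[str, str]]:
    """Return (status, headers) for a request, regardless of robots.txt."""
    if path.startswith(PROTECTED_PREFIXES):
        if not authenticated:
            # Enforced server-side, so it also stops crawlers that ignore robots.txt.
            return 401, {"WWW-Authenticate": 'Basic realm="store"'}
        # Authenticated pages still ask search engines not to index them.
        return 200, {"X-Robots-Tag": "noindex"}
    return 200, {}


if __name__ == "__main__":
    print(respond("/admin/users", authenticated=False))
    print(respond("/products/shoe", authenticated=False))
```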

Related terms

Sitemap
IndexNow
llms.txt

