SEO · High

robots.txt

Checks that robots.txt has valid syntax and allows search and AI crawlers.

What this check measures

We fetch `/robots.txt`, parse User-agent blocks and Disallow/Allow rules, and specifically verify that Googlebot, Bingbot, GPTBot, ClaudeBot, and PerplexityBot are not accidentally blocked.

Why it matters

robots.txt is the first file every crawler requests. A mistake here (accidentally blocking all bots, missing AI crawler allow rules) can make you invisible to search engines and AI tools entirely.

How our audit detects it

GET /robots.txt. Parse the User-agent, Disallow, Allow, and Sitemap directives, then simulate crawler access for each of the major search and AI bots listed above.
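
For intuition, the simulation step boils down to evaluating each bot's user-agent string against the parsed rules. Here is a minimal sketch using Python's standard-library `urllib.robotparser`; the bot list and example URL are assumptions for illustration, not the audit's actual internals.

```python
# Sketch of a crawler-access simulation with urllib.robotparser.
# Note: the stdlib parser does not implement Google's wildcard (*, $)
# path semantics, so it is an approximation of real crawler behavior.
from urllib.robotparser import RobotFileParser

BOTS = ["Googlebot", "Bingbot", "GPTBot", "ClaudeBot", "PerplexityBot"]

def simulate_crawler_access(site: str, path: str = "/") -> dict[str, bool]:
    """Return {bot_name: allowed} for the site's robots.txt rules."""
    site = site.rstrip("/")
    parser = RobotFileParser()
    parser.set_url(f"{site}/robots.txt")
    parser.read()  # a 404 here means default-allow for every bot
    return {bot: parser.can_fetch(bot, f"{site}{path}") for bot in BOTS}

if __name__ == "__main__":
    for bot, allowed in simulate_crawler_access("https://example.com").items():
        print(f"{bot}: {'allowed' if allowed else 'BLOCKED'}")
```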

Typical findings

  • Blanket `Disallow: /` under `User-agent: *` — blocks everything.
  • GPTBot, ClaudeBot, PerplexityBot not explicitly allowed — AI search invisibility.
  • /robots.txt returns 404 — default-allow, but no Sitemap directive either.
  • Rules with mismatched casing: Google matches user-agent names case-insensitively but path rules case-sensitively, so `Disallow: /Admin` does not block `/admin`.

How to fix

Create a permissive robots.txt that allows Google, Bing, and the major AI crawlers, blocks internal paths like /api and /admin, and ends with a Sitemap: line. See the /prompts/add-robots-txt guide for copy-paste variants.
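
For reference, a minimal permissive file along those lines might look like this sketch. The blocked paths and the sitemap URL are placeholders; adapt them to your site or use one of the stack-specific variants from the guide.

```
# Explicitly welcome the major search and AI crawlers
User-agent: Googlebot
User-agent: Bingbot
User-agent: GPTBot
User-agent: ClaudeBot
User-agent: PerplexityBot
Allow: /

# Everyone else: allow the site, keep internal paths private
User-agent: *
Disallow: /api/
Disallow: /admin/
Allow: /

# Absolute URL required
Sitemap: https://example.com/sitemap.xml
```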

Copy-paste fix prompt for your stack

Lovable · Cursor · Bolt · v0 · Replit · Windsurf · Claude Code · Base44

View the fix prompt →

Frequently asked questions

Does Disallow remove pages from Google?
No. robots.txt blocks crawling, not indexing; a disallowed URL can still appear in results if other pages link to it. To remove a URL from the index, use a `noindex` meta tag or an `X-Robots-Tag: noindex` response header, and keep the page crawlable so Google can see the directive.
Can I rate-limit crawlers?
Use `Crawl-delay:` for bots that respect it (Bing, Yandex). Google ignores the directive and manages its own crawl rate based on how your server responds.
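For bots that honor it, the directive sits inside that bot's group; the 10-second value below is an arbitrary example.

```
User-agent: Bingbot
Crawl-delay: 10
```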

Want this checked on your site?

Pantra runs the full audit (SEO, Security, GEO, Performance, Schema, Technical, Images) in 10 seconds and generates stack-specific fix prompts.

Scan my site

Related checks