Shopify Robots.txt Analyzer
Audit your Shopify robots.txt for crawl-blocking mistakes — free, no signup.
What this tool checks
- Whether /robots.txt is reachable on your domain and what it returns.
- Every User-agent block, plus its Allow and Disallow rules — colour-coded green for allow, red for disallow.
- Whether your robots.txt blocks /collections/, /products/, /blogs/ or /pages/ — the four URL prefixes Shopify stores need to keep crawlable.
- Whether a Sitemap: directive is declared (Shopify normally injects this automatically; missing usually means a custom robots.txt.liquid is overriding the default).
- Crawl-delay values for major search engine bots (Googlebot, Bingbot, DuckDuckBot, Yandex, Applebot) and AI crawlers (GPTBot, OAI-SearchBot, PerplexityBot).
- Catastrophic mistakes like Disallow: / under User-agent: * — which would deindex the entire store.
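For reference, the catastrophic pattern mentioned in the last check looks like this in a robots.txt file (an illustrative snippet, not output from the analyzer):

```
# Catastrophic: tells every compliant crawler to skip every URL on the store.
User-agent: *
Disallow: /
```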
Why it matters for Shopify stores
robots.txt is the first file Google fetches for any domain. Every wrong rule has compounding cost — a single bad Disallow pattern can prevent thousands of product pages from being crawled. Worse, the impact is silent: Google stops crawling, the URLs drop out of the index over weeks, and traffic disappears with no error message in Search Console.
Shopify ships a sensible default robots.txt that blocks /admin, /cart, /checkout, /search and other transactional URLs while allowing /products, /collections, /blogs and /pages. The default is correct for almost every store. Problems start when merchants edit the file (Shopify allows custom robots.txt.liquid) and accidentally over-block, or when a developer copies a robots.txt from another platform that has different URL conventions.
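As a rough illustration of that shape, an abridged sketch is below; it is not a verbatim copy of Shopify's current default, which contains more paths and several per-bot groups:

```
User-agent: *
Disallow: /admin
Disallow: /cart
Disallow: /checkout
Disallow: /orders
Disallow: /search
# /products, /collections, /blogs and /pages carry no Disallow,
# so they stay crawlable by default.

Sitemap: https://yourdomain.com/sitemap.xml
```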
AI crawlers are also a fast-moving consideration. GPTBot, OAI-SearchBot, PerplexityBot and ClaudeBot all read robots.txt before deciding to fetch your pages. If you want your products surfaced in ChatGPT shopping results or Perplexity answers, you must allow these user-agents — and many merchants accidentally block them while trying to block scraping bots.
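A crawler that finds no matching group simply falls back to the User-agent: * rules, so an explicit group per bot is only needed to override a broader block. An illustrative example of explicitly re-allowing the AI crawlers named above:

```
User-agent: GPTBot
Allow: /

User-agent: OAI-SearchBot
Allow: /

User-agent: PerplexityBot
Allow: /
```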
Finally, robots.txt is a crawl directive, not an access control or an indexing tool. Disallow does not stop a page from being indexed if other sites link to it — Google may still show the URL with a "no description available" snippet. To remove a page from search results properly you need a noindex meta tag (which requires the page to remain crawlable). Blocking via robots.txt alone is rarely the right tool.
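In a Shopify theme the usual place for that tag is the head of layout/theme.liquid. A minimal sketch, assuming you want to noindex the search template; the condition is illustrative and should be adapted to the pages you actually need removed:

```liquid
{%- comment -%} Keep the page crawlable, but ask engines not to index it. {%- endcomment -%}
{%- if template contains 'search' -%}
  <meta name="robots" content="noindex">
{%- endif -%}
```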
How to fix this in Shopify admin
1. Locate your robots.txt.liquid file
In Shopify Admin, go to Online Store → Themes → Actions → Edit code, then look under Templates for robots.txt.liquid. If this file does not exist, Shopify is serving the default robots.txt (which is usually correct — proceed only if you know you need a custom one). A sketch of the default template appears after these steps.
2. Remove blanket Disallow rules
Search the file for "Disallow: /" or "Disallow: /*". If either appears under "User-agent: *", delete it immediately. The Shopify default Disallow list targets specific paths like /cart and /admin, never the root.
3. Restore product, collection and blog crawling
Ensure no Disallow rule blocks /products, /collections, /blogs or /pages — the four prefixes that hold your indexable content. If you find one, remove it. Save robots.txt.liquid; the new file deploys instantly.
4. Verify the Sitemap: directive is present
Shopify automatically appends a Sitemap: https://yourdomain.com/sitemap.xml line. If you are using a custom robots.txt.liquid, make sure the file still outputs that directive — otherwise some search engines will not discover your sitemap.
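For reference, a customised robots.txt.liquid normally starts from the loop Shopify documents, which re-emits the default groups (including each group's Sitemap line) so later Shopify updates still flow through. A sketch of that pattern is below; verify it against Shopify's current documentation before relying on it:

```liquid
{%- comment -%}
  Re-emit Shopify's default robots.txt groups. Add or remove rules by
  editing inside this loop rather than replacing it wholesale.
{%- endcomment -%}
{% for group in robots.default.groups %}
  {{- group.user_agent }}
  {%- for rule in group.rules %}
    {{ rule }}
  {%- endfor %}
  {%- if group.sitemap != blank %}
    {{ group.sitemap }}
  {%- endif %}
{% endfor %}
```

If you do replace the loop with hand-written rules, end the file with a sitemap line such as Sitemap: {{ shop.secure_url }}/sitemap.xml so the directive from step 4 is still present (shop.secure_url is the standard Liquid attribute for the store's primary https URL).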
Common Shopify mistakes
Custom robots.txt.liquid overriding the safe default
A developer creates robots.txt.liquid to "tighten security" and ends up blocking far more than the default. If you are not sure why the file exists, delete it — Shopify will revert to its built-in robots.txt which is correct for nearly every store.
Disallowing /collections/all or paginated collections
Some merchants block /collections/all or pagination URLs (?page=2) hoping to consolidate ranking signals. This is the wrong tool — use canonical tags instead. Blocking pagination prevents Google from discovering products that only appear on later pages.
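Most themes already emit the canonical in the head of layout/theme.liquid via the canonical_url Liquid global; a minimal sketch of what that line looks like (check your theme before adding a duplicate):

```liquid
{%- comment -%} One canonical per page consolidates signals without hiding URLs from crawlers. {%- endcomment -%}
<link rel="canonical" href="{{ canonical_url }}">
```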
Accidentally blocking AI crawlers
Adding "User-agent: GPTBot" followed by "Disallow: /" removes your store from ChatGPT shopping suggestions, ChatGPT Search, and any future AI commerce features powered by OpenAI. Allow these bots unless you have a clear reason not to.
Crawl-delay over 5 seconds
Some merchants set crawl-delay to slow down scrapers. Major bots like Bingbot and Yandexbot will obey, dramatically reducing how often your new products and price changes get indexed. Keep crawl-delay under 5 (or omit it entirely).
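A conservative declaration looks like the snippet below (the value is in seconds and purely illustrative; note that Googlebot ignores Crawl-delay entirely, so the directive only affects other engines):

```
User-agent: Bingbot
Crawl-delay: 2
```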
Missing Sitemap: directive
A custom robots.txt.liquid that forgets to include "Sitemap: https://yourdomain.com/sitemap.xml" hides the sitemap from search engines that read robots.txt for sitemap discovery (which is most of them).
Pointing Sitemap: at a non-existent path
Some merchants migrating from WordPress paste in "Sitemap: /sitemap_index.xml" — Shopify serves the sitemap at /sitemap.xml, and the Sitemap: directive should be a full absolute URL. Always verify the path resolves before relying on it.
Frequently asked questions
Where does Shopify store robots.txt?
Shopify generates a default robots.txt at https://yourdomain.com/robots.txt. You can override it by creating a robots.txt.liquid file at Online Store → Themes → Edit code → Templates. Most stores should leave this alone.
Will blocking a page in robots.txt remove it from Google?
Not reliably. Robots.txt prevents Google from crawling the URL, but if other sites link to it Google may still show it in results with a generic snippet. To remove a page properly, allow it to be crawled and add <meta name="robots" content="noindex"> in the page head.
What does a healthy Shopify robots.txt look like?
It allows /products, /collections, /blogs, /pages. It disallows /cart, /checkout, /admin, /orders, /customers, /search, and the various JSON endpoints. It includes one Sitemap: directive pointing at https://yourdomain.com/sitemap.xml.
Should I block GPTBot, ClaudeBot or other AI crawlers?
Probably not. AI search engines like ChatGPT Search and Perplexity are increasingly important traffic sources for e-commerce. Blocking these bots removes your products from any AI-driven shopping results. Allow them unless you have a specific licensing or scraping concern.
Why is my robots.txt blocking my whole site?
Almost always a custom robots.txt.liquid that contains "User-agent: *" followed by "Disallow: /". Edit the theme file and remove that block — the default Shopify file does not have it.
How often does Google re-read robots.txt?
Google caches robots.txt and generally re-fetches it within 24 hours, so changes typically propagate within a day. Bing and other engines may take up to a week. Use Search Console → Settings → robots.txt report to see the most recent fetched copy.
Can a Shopify app change my robots.txt?
Some SEO apps (and rarely, theme apps) edit robots.txt.liquid. If your robots.txt suddenly contains rules you did not write, check Online Store → Themes → Edit code → Templates → robots.txt.liquid for recent edits and review your installed apps.
Does this tool fetch the live robots.txt?
Yes. We make a fresh GET request to /robots.txt on the host you submit, parse the rules, and check them against Shopify-specific best practices. The data shown reflects what crawlers are seeing right now, not a cache.
Want the full Shopify SEO audit?
This tool checks one thing. SEOScan checks 60+ Shopify-specific issues with AI-powered fix guides. Free scan, no signup.
Run Full Shopify SEO Scan