Robots.txt Tester
Test and validate robots.txt files. Check which URLs are blocked or allowed for any crawler.
Enter any URL — we'll automatically fetch /robots.txt from that domain
What is robots.txt and Why Does It Matter?
The robots.txt file is a standard text file placed at the root of your website (e.g., https://example.com/robots.txt) that tells search engine crawlers which pages or sections of your site they are allowed or not allowed to visit. It follows the Robots Exclusion Protocol, a standard used by all major search engines including Google, Bing, and Yahoo.
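The same allow/block questions this tool answers can be scripted. As a minimal sketch, Python's standard-library urllib.robotparser can parse a robots.txt and report whether a path is crawlable (here we parse a small inline sample; for a live site you would use rp.set_url(...) followed by rp.read()):

```python
from urllib import robotparser

# Inline sample; replace with rp.set_url("https://example.com/robots.txt")
# and rp.read() to check a live file.
sample = """\
User-agent: *
Disallow: /admin/
Allow: /
"""

rp = robotparser.RobotFileParser()
rp.parse(sample.splitlines())
print(rp.can_fetch("*", "https://example.com/admin/settings"))  # False
print(rp.can_fetch("*", "https://example.com/about"))           # True
```

Note that urllib.robotparser does only basic prefix matching; it does not implement the * and $ wildcard extensions some crawlers support.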
How does this tool work?
- Fetch & Parse: Enter any URL and the tool automatically fetches the site's robots.txt, then parses all directives grouped by user-agent.
- URL Testing: Type any URL path and select a user-agent to instantly see if that path is allowed or blocked, with the matching rule highlighted.
- Validation: The tool flags common mistakes like a missing User-agent: * catch-all, typos in directives (e.g., Dissallow), and overly broad blocking rules.
- Sitemap Discovery: Extracts and displays all Sitemap: references so you can verify your sitemaps are correctly declared.
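The "Fetch & Parse" step above boils down to grouping Allow/Disallow lines under the user-agent lines that precede them. A simplified sketch of that grouping (the function name and the tuple representation are our own choices, and edge cases like wildcards are ignored):

```python
def parse_robots(text):
    """Group Allow/Disallow rules by user-agent; collect Sitemap lines."""
    groups, sitemaps = {}, []
    agents, in_group_header = [], False
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()   # drop comments and whitespace
        if not line or ":" not in line:
            continue
        field, _, value = line.partition(":")
        field, value = field.strip().lower(), value.strip()
        if field == "user-agent":
            if not in_group_header:           # a new group starts here
                agents = []
            agents.append(value)
            groups.setdefault(value, [])
            in_group_header = True
        else:
            in_group_header = False
            if field in ("allow", "disallow"):
                for agent in agents:          # rule applies to every agent in the group
                    groups.setdefault(agent, []).append((field, value))
            elif field == "sitemap":
                sitemaps.append(value)
    return groups, sitemaps

groups, sitemaps = parse_robots(
    "User-agent: *\nDisallow: /admin/\nSitemap: https://example.com/sitemap.xml"
)
print(groups)    # {'*': [('disallow', '/admin/')]}
print(sitemaps)  # ['https://example.com/sitemap.xml']
```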
Frequently Asked Questions
How do I test my robots.txt file?
Enter your website URL in the tool above. It will fetch your robots.txt, parse all directives, and let you test specific URL paths against any user-agent to see if they are allowed or blocked.
What does "Disallow: /" mean in robots.txt?
Disallow: / tells the specified crawler not to access any page on your site. Under a User-agent: * block, this blocks all search engines from crawling your entire site. This is commonly used on staging or development environments.
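You can confirm this behavior with a couple of lines of Python using the standard-library urllib.robotparser (the staging hostname is just an illustration):

```python
from urllib import robotparser

# A catch-all group with Disallow: / blocks every path for every crawler.
rp = robotparser.RobotFileParser()
rp.parse(["User-agent: *", "Disallow: /"])
print(rp.can_fetch("AnyBot", "https://staging.example.com/index.html"))  # False
```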
Why is my page not being indexed by Google?
Your robots.txt may be blocking Googlebot from crawling the page. Use the URL Tester above to check if your page's path is blocked. Common causes include overly broad Disallow rules or missing Allow exceptions for specific paths.
What's the difference between Allow and Disallow?
Disallow tells crawlers not to access a path. Allow creates an exception within a Disallow rule. For example, you can Disallow: /private/ but Allow: /private/public-page. The most specific matching rule wins.
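The "most specific rule wins" precedence (as documented by Google: the longest matching pattern wins, and Allow wins exact ties) can be sketched in a few lines. This toy checker handles plain path prefixes only, not * or $ wildcards, and the function name and rule format are our own:

```python
def is_allowed(path, rules):
    """Longest-match precedence: the longest matching pattern wins;
    Allow wins ties. No matching rule at all means allowed."""
    best_directive, best_len = "allow", -1
    for directive, pattern in rules:
        if path.startswith(pattern):
            if len(pattern) > best_len or (
                len(pattern) == best_len and directive == "allow"
            ):
                best_directive, best_len = directive, len(pattern)
    return best_directive == "allow"

rules = [("disallow", "/private/"), ("allow", "/private/public-page")]
print(is_allowed("/private/public-page", rules))  # True: Allow is more specific
print(is_allowed("/private/secret.html", rules))  # False
```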
robots.txt Examples & Patterns
These ready-to-use robots.txt examples cover the most common configurations. Click any example to load it into the tester above.
Block all crawlers
Use on staging and development environments to stop search engine crawlers from accessing your site.
User-agent: *
Disallow: /
Test it: Enter any path like /page for user-agent * → BLOCKED
Allow only Googlebot, block everything else
A common pattern on soft-launch sites — indexed by Google but invisible to other crawlers and scrapers.
User-agent: *
Disallow: /
User-agent: Googlebot
Disallow:
Test it: Path /blog/post for Googlebot → ALLOWED. Same path for Bingbot → BLOCKED
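This per-crawler group selection is easy to verify with Python's urllib.robotparser: Googlebot matches its own (empty-Disallow, i.e. allow-everything) group, while any other crawler falls back to the User-agent: * group:

```python
from urllib import robotparser

rules = """\
User-agent: *
Disallow: /

User-agent: Googlebot
Disallow:
"""

rp = robotparser.RobotFileParser()
rp.parse(rules.splitlines())
print(rp.can_fetch("Googlebot", "https://example.com/blog/post"))  # True
print(rp.can_fetch("Bingbot", "https://example.com/blog/post"))    # False
```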
Typical website configuration
Blocks admin and internal areas while allowing all public content. Includes a Sitemap reference — best practice for SEO.
User-agent: *
Disallow: /admin/
Disallow: /login/
Disallow: /private/
Allow: /private/press-kit/
Crawl-delay: 2
Sitemap: https://example.com/sitemap.xml
Test it: Path /private/press-kit/brochure.pdf → ALLOWED (the Allow rule overrides the Disallow)
E-commerce site
Protects cart, checkout, and account pages from indexing while keeping product and category pages fully crawlable.
User-agent: *
Disallow: /cart/
Disallow: /checkout/
Disallow: /account/
Disallow: /order-confirmation/
Disallow: /search?*
Allow: /products/
Allow: /categories/
User-agent: GPTBot
Disallow: /
Sitemap: https://shop.example.com/sitemap.xml
Sitemap: https://shop.example.com/sitemap-products.xml
Test it: Path /search?q=shoes → BLOCKED (the * wildcard matches any query string). Path /products/running-shoes → ALLOWED
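Crawlers that support wildcard patterns (per Google's documented matching rules) treat * as "any run of characters" and a trailing $ as an end-of-URL anchor. One common way to implement this is to translate the pattern into a regular expression; a minimal sketch, with a function name of our own choosing:

```python
import re

def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern to a compiled regex.
    '*' matches any characters; a trailing '$' anchors the end."""
    regex = re.escape(pattern).replace(r"\*", ".*")
    if regex.endswith(r"\$"):
        regex = regex[:-2] + "$"
    return re.compile(regex)

rule = pattern_to_regex("/search?*")
print(bool(rule.match("/search?q=shoes")))   # True: blocked by the rule
print(bool(rule.match("/products/shoes")))   # False: not matched
```

Note that not every crawler supports these wildcards; plain prefix rules are the only portable form.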
Common mistakes — can you spot them? ⚠
This example contains 3 real errors that will silently break your robots.txt. Load it to see our validator flag each one.
# No User-agent: * catch-all!
Disallow: /admin/
User-agent: Googlebot
Dissallow: /private/
Allow: /private/press/
Sitemap: https://example.com/sitemap.xml
- Typo: Dissallow instead of Disallow — the rule is silently ignored
- Missing catch-all: no User-agent: * group means other crawlers get no guidance
- Rule before User-agent: Disallow: /admin/ appears outside any group — it's ignored
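All three of these mistakes are mechanically detectable. As a rough sketch of the kind of checks a validator runs (the function name, message wording, and directive whitelist are our own simplifications):

```python
KNOWN_FIELDS = {"user-agent", "disallow", "allow", "sitemap", "crawl-delay"}

def lint_robots(text):
    """Flag unknown directives (typos), rules outside any user-agent
    group, and a missing 'User-agent: *' catch-all."""
    warnings, agents, current = [], set(), None
    for n, raw in enumerate(text.splitlines(), 1):
        line = raw.split("#", 1)[0].strip()
        if not line or ":" not in line:
            continue
        field, _, value = line.partition(":")
        field, value = field.strip().lower(), value.strip()
        if field not in KNOWN_FIELDS:
            warnings.append(f"line {n}: unknown directive '{field}' (typo?)")
        elif field == "user-agent":
            current = value
            agents.add(value)
        elif field in ("allow", "disallow") and current is None:
            warnings.append(f"line {n}: rule appears before any User-agent group")
    if "*" not in agents:
        warnings.append("no 'User-agent: *' catch-all group")
    return warnings

broken = """\
# No User-agent: * catch-all!
Disallow: /admin/

User-agent: Googlebot
Dissallow: /private/
Allow: /private/press/

Sitemap: https://example.com/sitemap.xml
"""
for warning in lint_robots(broken):
    print(warning)  # flags all three mistakes above
```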