Sitemap URL Extractor

Extract all individual page URLs from any XML sitemap or sitemap index into a clean, flat list. Copy, export to CSV, or download instantly.

Advanced Filters

If set, only URLs containing at least one of these strings will be extracted.

Any URL containing these strings will be ignored.

Real World Examples

See how the URL extractor works on live websites. Click any preset to load its sitemap instantly.

Simple Sitemap

engtools.dev

A small, standard XML sitemap file containing exactly one level of URLs. Fast and straightforward.

Sitemap Index

wordpress.org

WordPress natively uses a nested sitemap index (`sitemap.xml`) that routes to separate page, generic, and post sitemaps. The extractor follows each child sitemap automatically.

Mega Sitemap

shopify.com

Giant e-commerce platforms can contain over 50,000 URLs spread across dozens of fragmented sitemap indexes. The extractor walks all of them, up to the 50,000-URL cap per extraction.

How to extract URLs from a sitemap

1. Locate the Sitemap

Paste the exact address of your sitemap.xml or sitemap_index.xml. If you enter only a domain name, we append the standard sitemap path for you.
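
The input handling described above could look roughly like the sketch below. It is an illustration only: the function name `normalize_sitemap_url`, the `https://` default scheme, and `/sitemap.xml` as the "standard path" are all assumptions, not the tool's confirmed behavior.

```python
from urllib.parse import urlparse

# Assumed default path; the real tool may append something different.
DEFAULT_SITEMAP_PATH = "/sitemap.xml"


def normalize_sitemap_url(user_input: str) -> str:
    """Turn a bare domain or a full sitemap address into a fetchable URL."""
    text = user_input.strip()
    # Assume HTTPS when the user typed only a domain name.
    if "://" not in text:
        text = "https://" + text
    parsed = urlparse(text)
    # No path (or just "/") means the user gave a domain, not a sitemap file.
    if parsed.path in ("", "/"):
        return f"{parsed.scheme}://{parsed.netloc}{DEFAULT_SITEMAP_PATH}"
    return text
```

With these assumptions, `example.com` becomes `https://example.com/sitemap.xml`, while a full address such as `https://example.com/sitemap_index.xml` is returned unchanged.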

2. Apply Filters

Use the Advanced Filters dropdown to specify substrings. For example, narrow your list to URLs containing /blog/, or exclude internal /admin/ pages.
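
As a rough illustration of the substring filters, here is a minimal sketch. The function `filter_urls` and its parameter names are hypothetical. It assumes plain case-sensitive substring matching, with the include list meaning "at least one match" and the exclude list meaning "any match drops the URL".

```python
def filter_urls(urls, include=(), exclude=()):
    """Keep URLs matching any include substring and no exclude substring."""
    kept = []
    for url in urls:
        # An empty include list means "no restriction".
        if include and not any(s in url for s in include):
            continue
        # Any exclude match removes the URL, even if it was included above.
        if any(s in url for s in exclude):
            continue
        kept.append(url)
    return kept


urls = [
    "https://example.com/blog/hello",
    "https://example.com/blog/admin/drafts",
    "https://example.com/pricing",
]
blog_only = filter_urls(urls, include=["/blog/"], exclude=["/admin/"])
```

In this sketch the exclude filter wins when a URL matches both lists, so `blog_only` keeps only the first URL.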

3. Recursive Unfolding

Once submitted, our backend server fetches your XML securely. If we detect a nested sitemap index, we automatically branch out and fetch every child sitemap recursively.
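
The recursive step can be sketched as follows. This is not the backend's actual code: `extract_urls` and the injected `fetch` callable (which takes a URL and returns the XML as bytes) are assumptions made so the example runs without network access. It relies only on the standard sitemap structure, where a `<sitemapindex>` lists `<sitemap><loc>` entries and a `<urlset>` lists `<url><loc>` entries.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip the XML namespace, e.g. '{http://...}loc' -> 'loc'."""
    return tag.rsplit("}", 1)[-1]


def entry_locs(root, entry_name):
    """Collect <loc> values that are direct children of <url>/<sitemap> entries."""
    locs = []
    for entry in root:
        if local_name(entry.tag) != entry_name:
            continue
        for field in entry:
            if local_name(field.tag) == "loc" and field.text:
                locs.append(field.text.strip())
    return locs


def extract_urls(sitemap_url, fetch, seen=None):
    """Return page URLs, descending into child sitemaps when an index is found."""
    seen = set() if seen is None else seen
    if sitemap_url in seen:  # guard against indexes that reference each other
        return []
    seen.add(sitemap_url)
    root = ET.fromstring(fetch(sitemap_url))
    if local_name(root.tag) == "sitemapindex":
        urls = []
        for child_url in entry_locs(root, "sitemap"):
            urls.extend(extract_urls(child_url, fetch, seen))
        return urls
    return entry_locs(root, "url")
```

Passing `fetch` in as a parameter keeps the parsing logic testable with in-memory XML. In real use it could be something like `lambda u: urllib.request.urlopen(u).read()`, ideally with timeouts and error handling added.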

4. Export Data

We strip away metadata such as <lastmod> tags and aggregate the bare URLs into one clean list that you can inspect, copy, or export as TXT/CSV.
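
For illustration, the two export formats could be produced along these lines. The helper names `to_txt` and `to_csv` and the single `url` column header are assumptions; the tool's real file layout may differ.

```python
import csv
import io


def to_txt(urls):
    """One URL per line."""
    return "".join(url + "\n" for url in urls)


def to_csv(urls):
    """Single-column CSV with a header row (the 'url' header is assumed)."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["url"])
    # csv.writer quotes any URL containing a comma or quote character.
    writer.writerows([url] for url in urls)
    return buffer.getvalue()
```

Using the `csv` module rather than joining strings by hand matters for URLs whose query strings contain commas, which would otherwise split into extra columns.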

Frequently Asked Questions

What is the difference between a Sitemap URL Extractor and a Sitemap Finder?

Our Sitemap URL Extractor takes a known sitemap URL and returns the individual page URLs (such as blog posts, products, and pages) listed inside it. It is designed to collect pages for migration, auditing, or crawling in tools like Screaming Frog.

The Sitemap Finder takes a root domain name and hunts down the sitemap networks themselves. It builds a map of sitemaps, not a map of pages.

Is there a limit to how many URLs I can extract?

For browser performance and stability, each extraction is hard-capped at 50,000 URLs. Many tools freeze or crash the browser tab when asked to render a million lines of text at once. If your sitemap exceeds the cap, we truncate the output and warn you.
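
A cap of this kind can be sketched as below. Only the 50,000 figure comes from the answer above; the names `MAX_URLS` and `cap_urls` and the returned "truncated" flag are hypothetical.

```python
from itertools import islice

MAX_URLS = 50_000  # the per-extraction cap stated above


def cap_urls(urls, limit=MAX_URLS):
    """Return (kept_urls, truncated) without materializing more than needed."""
    iterator = iter(urls)
    kept = list(islice(iterator, limit))
    # If anything is left after taking `limit` items, the output was cut short.
    sentinel = object()
    truncated = next(iterator, sentinel) is not sentinel
    return kept, truncated
```

The boolean lets the caller decide how to show the warning, and because the function consumes an iterator it can stop early instead of collecting every URL first.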

Can it handle multiple child sitemaps linked in an index?

Yes. Our backend uses a concurrent breadth-first crawler with a queue, which follows every child sitemap listed in a central <sitemapindex>, including nested indexes.
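
One way such a crawler might be structured is a level-by-level breadth-first traversal that fetches each level in parallel with a thread pool. This is a sketch under assumptions, not the backend's implementation: `crawl_bfs`, `parse_sitemap`, the `fetch` callable (URL in, XML bytes out), and the worker count are all illustrative.

```python
import xml.etree.ElementTree as ET
from concurrent.futures import ThreadPoolExecutor


def parse_sitemap(xml_bytes):
    """Return (root element name, list of <loc> values from its entries)."""
    root = ET.fromstring(xml_bytes)
    locs = [
        field.text.strip()
        for entry in root
        for field in entry
        if field.tag.rsplit("}", 1)[-1] == "loc" and field.text
    ]
    return root.tag.rsplit("}", 1)[-1], locs


def crawl_bfs(start_url, fetch, max_workers=8):
    """Breadth-first crawl: each level of sitemaps is fetched concurrently."""
    seen = {start_url}
    frontier = [start_url]  # sitemaps to fetch at the current depth
    pages = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while frontier:
            next_frontier = []
            # pool.map yields results in input order, so output is deterministic.
            for kind, locs in pool.map(lambda u: parse_sitemap(fetch(u)), frontier):
                if kind == "sitemapindex":
                    for loc in locs:
                        if loc not in seen:  # skip duplicates and cycles
                            seen.add(loc)
                            next_frontier.append(loc)
                else:
                    pages.extend(locs)
            frontier = next_frontier
    return pages
```

Threads suit this workload because the time is spent waiting on network responses. A production crawler would also need timeouts, retries, and handling for malformed XML, all omitted here.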