Free XML Sitemap Generator
Generate XML sitemaps for your website to improve SEO and help search engines crawl and index your pages
100% Free Sitemap Generator with Website Crawler
This advanced sitemap generator uses server-side crawling powered by your own Next.js backend: no external API costs, no subscription fees, and completely free for you and your users. Unlike other tools that require API keys or paid services, this tool runs entirely on your own infrastructure using simple, efficient web scraping.
The crawler works on both localhost (perfect for testing during development) and production websites. It uses cheerio for HTML parsing, follows internal links recursively up to your specified depth, respects robots.txt rules, and automatically extracts metadata. You can configure crawl depth (1-5 levels), set URL limits, and choose whether to include external links. The generated sitemaps follow all Google best practices and include automatic validation against the 50,000 URL and 50MB limits.
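The 50,000 URL / 50 MB validation mentioned above can be sketched as a small check before download (function and variable names here are illustrative, not the tool's actual implementation):

```typescript
// Google caps a single sitemap file at 50,000 URLs and 50 MB uncompressed.
const MAX_URLS = 50_000;
const MAX_BYTES = 50 * 1024 * 1024;

function validateSitemap(xml: string, urlCount: number): string[] {
  const problems: string[] = [];
  if (urlCount > MAX_URLS) {
    problems.push(`Too many URLs: ${urlCount} exceeds ${MAX_URLS}`);
  }
  const bytes = Buffer.byteLength(xml, "utf8");
  if (bytes > MAX_BYTES) {
    problems.push(`Sitemap too large: ${bytes} bytes exceeds ${MAX_BYTES}`);
  }
  return problems; // an empty array means the sitemap is within limits
}
```

Sites that exceed either limit should be split across multiple sitemap files referenced from a sitemap index.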
How to use the free automated crawler
- Enter your website URL: works with production sites (https://yoursite.com) or localhost URLs (http://localhost:3000) for development testing.
- Configure crawl settings: Set maximum crawl depth (1-5 levels deep), choose how many URLs to discover (50-5,000), and decide whether to include external links.
- Start crawling (100% free!): Click "Start Crawling" and the server-side crawler will discover all pages on your website. Real-time logs show progress including discovered URLs, pages crawled, and any errors.
- Review discovered URLs: Once crawling completes, review the list of discovered URLs with their depth levels, last modified dates, and auto-calculated priority scores.
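The exact formula behind the auto-calculated priority scores isn't specified; one plausible depth-based scheme, shown purely as an assumption, gives the homepage (depth 0) a priority of 1.0 and lowers each deeper level:

```typescript
// Hypothetical depth-based priority: 1.0 at the homepage, minus 0.2 per
// level of depth, floored at 0.1. Not the tool's confirmed formula.
function priorityForDepth(depth: number): number {
  return Math.max(0.1, Math.round((1.0 - depth * 0.2) * 10) / 10);
}
```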
- Configure sitemap options: Choose sitemap type and select which XML elements to include (lastmod, changefreq, priority, etc.).
- Preview and download: Preview the XML structure, then download your sitemap.xml file ready for upload.
- Upload and submit: Upload to your website's root directory and submit to Google Search Console.
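The sitemap produced by the steps above follows the standard sitemap protocol. A minimal sketch of generating that XML (the entry shape is illustrative; the element names come from the protocol itself):

```typescript
interface SitemapEntry {
  loc: string;
  lastmod?: string;    // e.g. "2024-01-15"
  changefreq?: string; // e.g. "weekly"
  priority?: number;   // 0.0 to 1.0
}

// Builds a sitemap.xml string; real code should also XML-escape each loc.
function buildSitemap(entries: SitemapEntry[]): string {
  const urls = entries
    .map((e) => {
      const parts = [`    <loc>${e.loc}</loc>`];
      if (e.lastmod) parts.push(`    <lastmod>${e.lastmod}</lastmod>`);
      if (e.changefreq) parts.push(`    <changefreq>${e.changefreq}</changefreq>`);
      if (e.priority !== undefined)
        parts.push(`    <priority>${e.priority.toFixed(1)}</priority>`);
      return `  <url>\n${parts.join("\n")}\n  </url>`;
    })
    .join("\n");
  return (
    `<?xml version="1.0" encoding="UTF-8"?>\n` +
    `<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n` +
    `${urls}\n</urlset>`
  );
}
```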
Frequently Asked Questions
Is this really 100% free with no hidden costs?
Yes! This tool runs entirely on your Next.js server using free, open-source libraries (cheerio for HTML parsing). There are no external API calls, no subscription fees, and no usage limits beyond your own server resources. It's completely free for you to deploy and free for your users to use.
Does this work on localhost during development?
Yes! You can crawl http://localhost:3000 or any local development URL. This makes it perfect for generating sitemaps while building your website, before deploying to production.
How does the server-side crawler work?
When you click "Start Crawling", your browser sends a request to your own Next.js API route (/api/crawl-website). The server fetches each page's HTML, extracts links using cheerio (a fast HTML parser), and recursively crawls discovered pages up to your specified depth. All processing happens on your server; no external services are involved.
What libraries do I need to install?
You'll need to install cheerio (for HTML parsing) and file-saver (for downloading files). Install with: npm install cheerio file-saver. Both are free, open-source libraries with no ongoing costs.
Can this crawl large websites with thousands of pages?
Yes, but be mindful of your server's resources and timeout limits. For very large sites (10,000+ pages), consider increasing the API route timeout or crawling in smaller batches. Vercel's free tier has a 60-second timeout for API routes.
Does it respect robots.txt?
The crawler has a "respect robots.txt" option (enabled by default). When you deploy to production, you may want to implement proper robots.txt parsing. For your own sites during testing, you can disable this option.
What if crawling takes too long?
Each page fetch has a 10-second timeout to prevent hanging. If you're crawling a very large site, reduce the max URLs setting or decrease the crawl depth. The crawler processes pages in batches of 5 to balance speed and server load.
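The batches-of-5 behavior described above amounts to running each batch concurrently while batches themselves run sequentially. A generic sketch (helper name is illustrative):

```typescript
// Processes items in sequential batches; items within a batch run in
// parallel via Promise.all, capping concurrent load at batchSize.
async function processInBatches<T, R>(
  items: T[],
  batchSize: number,
  worker: (item: T) => Promise<R>
): Promise<R[]> {
  const results: R[] = [];
  for (let i = 0; i < items.length; i += batchSize) {
    const batch = items.slice(i, i + batchSize);
    results.push(...(await Promise.all(batch.map(worker))));
  }
  return results;
}
```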