Free XML Sitemap Generator
Generate XML sitemaps for your website to improve SEO and help search engines crawl and index your pages
100% Free Sitemap Generator with Website Crawler
This advanced sitemap generator uses server-side crawling powered by your own Next.js backend: no external API costs, no subscription fees, and completely free for you and your users. Unlike other tools that require API keys or paid services, this tool runs entirely on your own infrastructure using simple, efficient web scraping.
The crawler works on both localhost (perfect for testing during development) and production websites. It uses cheerio for HTML parsing, follows internal links recursively up to your specified depth, respects robots.txt rules, and automatically extracts metadata. You can configure crawl depth (1-5 levels), set URL limits, and choose whether to include external links. The generated sitemaps follow all Google best practices and include automatic validation against the 50,000 URL and 50MB limits.
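The 50,000 URL / 50 MB validation mentioned above can be sketched as a small check before download (function and variable names here are illustrative, not the tool's actual implementation):

```typescript
// Google caps a single sitemap file at 50,000 URLs and 50 MB uncompressed.
const MAX_URLS = 50_000;
const MAX_BYTES = 50 * 1024 * 1024;

function validateSitemap(xml: string, urlCount: number): string[] {
  const problems: string[] = [];
  if (urlCount > MAX_URLS) {
    problems.push(`Too many URLs: ${urlCount} exceeds ${MAX_URLS}`);
  }
  const bytes = Buffer.byteLength(xml, "utf8");
  if (bytes > MAX_BYTES) {
    problems.push(`Sitemap too large: ${bytes} bytes exceeds ${MAX_BYTES}`);
  }
  return problems; // an empty array means the sitemap is within limits
}
```

Sites that exceed either limit should be split across multiple sitemap files referenced from a sitemap index.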
How to use the free automated crawler
- Enter your website URL: works with production sites (https://yoursite.com) or localhost URLs (http://localhost:3000) for development testing.
- Configure crawl settings: Set maximum crawl depth (1-5 levels deep), choose how many URLs to discover (50-5,000), and decide whether to include external links.
- Start crawling (100% free!): Click "Start Crawling" and the server-side crawler will discover all pages on your website. Real-time logs show progress including discovered URLs, pages crawled, and any errors.
- Review discovered URLs: Once crawling completes, review the list of discovered URLs with their depth levels, last modified dates, and auto-calculated priority scores.
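The exact formula behind the auto-calculated priority scores isn't specified; one plausible depth-based scheme, shown purely as an assumption, gives the homepage (depth 0) a priority of 1.0 and lowers each deeper level:

```typescript
// Hypothetical depth-based priority: 1.0 at the homepage, minus 0.2 per
// level of depth, floored at 0.1. Not the tool's confirmed formula.
function priorityForDepth(depth: number): number {
  return Math.max(0.1, Math.round((1.0 - depth * 0.2) * 10) / 10);
}
```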
- Configure sitemap options: Choose sitemap type and select which XML elements to include (lastmod, changefreq, priority, etc.).
- Preview and download: Preview the XML structure, then download your sitemap.xml file ready for upload.
- Upload and submit: Upload to your website's root directory and submit to Google Search Console.
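The sitemap produced by the steps above follows the standard sitemap protocol. A minimal sketch of generating that XML (the entry shape is illustrative; the element names come from the protocol itself):

```typescript
interface SitemapEntry {
  loc: string;
  lastmod?: string;    // e.g. "2024-01-15"
  changefreq?: string; // e.g. "weekly"
  priority?: number;   // 0.0 to 1.0
}

// Builds a sitemap.xml string; real code should also XML-escape each loc.
function buildSitemap(entries: SitemapEntry[]): string {
  const urls = entries
    .map((e) => {
      const parts = [`    <loc>${e.loc}</loc>`];
      if (e.lastmod) parts.push(`    <lastmod>${e.lastmod}</lastmod>`);
      if (e.changefreq) parts.push(`    <changefreq>${e.changefreq}</changefreq>`);
      if (e.priority !== undefined)
        parts.push(`    <priority>${e.priority.toFixed(1)}</priority>`);
      return `  <url>\n${parts.join("\n")}\n  </url>`;
    })
    .join("\n");
  return (
    `<?xml version="1.0" encoding="UTF-8"?>\n` +
    `<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n` +
    `${urls}\n</urlset>`
  );
}
```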
Frequently Asked Questions
Is this really 100% free with no hidden costs?
Yes! This tool runs entirely on your Next.js server using free, open-source libraries (cheerio for HTML parsing). There are no external API calls, no subscription fees, and no usage limits beyond your own server resources. It's completely free for you to deploy and free for your users to use.
Does this work on localhost during development?
Yes! You can crawl http://localhost:3000 or any local development URL. This makes it perfect for generating sitemaps while building your website, before deploying to production.
How does the server-side crawler work?
When you click "Start Crawling", your browser sends a request to your own Next.js API route (/api/crawl-website). The server fetches each page's HTML, extracts links using cheerio (a fast HTML parser), and recursively crawls discovered pages up to your specified depth. All processing happens on your server; no external services are involved.
What libraries do I need to install?
You'll need to install cheerio (for HTML parsing) and file-saver (for downloading files). Install with: npm install cheerio file-saver. Both are free, open-source libraries with no ongoing costs.
Can this crawl large websites with thousands of pages?
Yes, but be mindful of your server's resources and timeout limits. For very large sites (10,000+ pages), consider increasing the API route timeout or crawling in smaller batches. Vercel's free tier has a 60-second timeout for API routes.
Does it respect robots.txt?
The crawler has a "respect robots.txt" option (enabled by default). When you deploy to production, you may want to implement proper robots.txt parsing. For your own sites during testing, you can disable this option.
What if crawling takes too long?
Each page fetch has a 10-second timeout to prevent hanging. If you're crawling a very large site, reduce the max URLs setting or decrease the crawl depth. The crawler processes pages in batches of 5 to balance speed and server load.
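The batches-of-5 behavior described above amounts to running each batch concurrently while batches themselves run sequentially. A generic sketch (helper name is illustrative):

```typescript
// Processes items in sequential batches; items within a batch run in
// parallel via Promise.all, capping concurrent load at batchSize.
async function processInBatches<T, R>(
  items: T[],
  batchSize: number,
  worker: (item: T) => Promise<R>
): Promise<R[]> {
  const results: R[] = [];
  for (let i = 0; i < items.length; i += batchSize) {
    const batch = items.slice(i, i + batchSize);
    results.push(...(await Promise.all(batch.map(worker))));
  }
  return results;
}
```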