Sitemap Builder

$19

XML sitemap generator with automatic discovery, priority assignment, and changefreq configuration.

📁 9 files
JSON, Markdown, Python

📁 File Structure 9 files

sitemap-builder/
├── LICENSE
├── README.md
├── examples/
│   └── sitemap_config.json
├── free-sample.zip
├── guide/
│   ├── 01_features.md
│   ├── 02_configuration-reference.md
│   └── 03_license.md
├── index.html
└── src/
    └── sitemap_builder.py

📖 Documentation Preview README excerpt

Sitemap Builder

Part of the SEO Toolkit by CodeVault

Generate standards-compliant sitemap.xml files from URL lists or JSON configs. Supports priority rules, change frequencies, exclusion patterns, and sitemap indexes.

Features

  • Generates valid sitemap.xml per the sitemaps.org protocol
  • Configurable priority rules (e.g., /blog/ = 0.8, /legal/ = 0.1)
  • Configurable changefreq rules (e.g., blog = daily, legal = yearly)
  • URL exclusion patterns (skip admin pages, staging URLs, etc.)
  • Accepts URLs from a text file, JSON config, or stdin
  • Pretty-printed or compact XML output
  • Statistics mode: shows URL counts and output size
  • Warns when a sitemap exceeds the 50,000-URL limit
  • Python stdlib only — zero dependencies
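
To make the feature list concrete, here is a minimal sketch of how a sitemap can be produced with the standard library alone. It is not the product's code: `build_sitemap` and the entry tuple layout are hypothetical names chosen for illustration, and the pretty-printing here uses `ET.indent` (Python 3.9+), which may differ from what the script does.

```python
import sys
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_SITEMAP = 50_000


def build_sitemap(entries, pretty=True):
    """Build sitemap XML from (loc, priority, changefreq) tuples (hypothetical helper)."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, priority, changefreq in entries:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        ET.SubElement(url, "changefreq").text = changefreq
        ET.SubElement(url, "priority").text = f"{priority:.1f}"
    # Warn, rather than fail, when the per-file URL limit is exceeded.
    if len(urlset) > MAX_URLS_PER_SITEMAP:
        print(f"warning: {len(urlset)} URLs exceeds {MAX_URLS_PER_SITEMAP}", file=sys.stderr)
    if pretty:
        ET.indent(urlset, space="  ")  # requires Python 3.9+
    return ET.tostring(urlset, encoding="utf-8", xml_declaration=True).decode("utf-8")


xml_text = build_sitemap([
    ("https://example.com/blog/", 0.8, "daily"),
    ("https://example.com/legal/", 0.1, "yearly"),
])
print(xml_text)
```

Passing `pretty=False` skips the indentation step and yields compact output, similar in spirit to the `--no-pretty` flag.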

Quick Start


# From a URL list file
python src/sitemap_builder.py --urls urls.txt --output sitemap.xml

# From a JSON config (with rules and priorities)
python src/sitemap_builder.py --config examples/sitemap_config.json --output sitemap.xml

# Pipe URLs from another command
curl -s https://api.example.com/urls | python src/sitemap_builder.py --output sitemap.xml

# With stats
python src/sitemap_builder.py --config examples/sitemap_config.json --output sitemap.xml --stats

Configuration Reference

See examples/sitemap_config.json for a full example.

| Field | Type | Description |
| --- | --- | --- |
| base_url | string | Base URL to prepend to relative paths |
| default_changefreq | string | Default change frequency |
| default_priority | float | Default priority (0.0-1.0) |
| priority_rules | object | Path pattern -> priority overrides |
| changefreq_rules | object | Path pattern -> changefreq overrides |
| exclude_patterns | array | Substring patterns to exclude |
| urls | array | List of URL paths or full URLs |
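
As an illustration of how these fields might fit together, the sketch below resolves a config into final entries. The field names come from the table above; everything else is an assumption: the example values, the `resolve` helper, and in particular the matching semantics (rule patterns treated as substrings, first matching rule wins), which the excerpt does not specify.

```python
# Hypothetical config; the shipped examples/sitemap_config.json may differ.
config = {
    "base_url": "https://example.com",
    "default_changefreq": "weekly",
    "default_priority": 0.5,
    "priority_rules": {"/blog/": 0.8, "/legal/": 0.1},
    "changefreq_rules": {"/blog/": "daily", "/legal/": "yearly"},
    "exclude_patterns": ["/admin/", "staging."],
    "urls": ["/", "/blog/hello", "/legal/terms", "/admin/login"],
}


def first_match(rules, loc, default):
    """Return the value of the first rule whose pattern occurs in loc (assumed semantics)."""
    for pattern, value in rules.items():
        if pattern in loc:
            return value
    return default


def resolve(cfg):
    """Turn a config dict into (loc, priority, changefreq) tuples."""
    base = cfg.get("base_url", "").rstrip("/")
    entries = []
    for raw in cfg["urls"]:
        # Full URLs pass through; relative paths get the base URL prepended.
        is_absolute = raw.startswith(("http://", "https://"))
        loc = raw if is_absolute else base + "/" + raw.lstrip("/")
        # Exclusion is a plain substring test, as the table describes.
        if any(p in loc for p in cfg.get("exclude_patterns", [])):
            continue
        priority = first_match(cfg.get("priority_rules", {}), loc, cfg["default_priority"])
        changefreq = first_match(cfg.get("changefreq_rules", {}), loc, cfg["default_changefreq"])
        entries.append((loc, priority, changefreq))
    return entries


for entry in resolve(config):
    print(entry)
```

With this config, `/admin/login` is dropped by the exclusion pattern and `/` falls back to the defaults.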

CLI Flags

| Flag | Description |
| --- | --- |
| --urls, -u | Text file with one URL per line |
| --config, -c | JSON config file |
| --output, -o | Output file path (default: stdout) |
| --base-url, -b | Base URL for relative paths |
| --default-priority | Default priority value |
| --default-changefreq | Default change frequency |
| --no-pretty | Compact XML output |
| --stats | Print statistics to stderr |
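
For readers wiring the tool into their own scripts, the flag table maps naturally onto `argparse`. The following is a sketch of that mapping, not the script's actual parser: the flag names are taken from the table, while `build_parser`, the help strings, and the absence of explicit defaults are assumptions.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the flags listed above."""
    parser = argparse.ArgumentParser(prog="sitemap_builder.py")
    parser.add_argument("--urls", "-u", help="text file with one URL per line")
    parser.add_argument("--config", "-c", help="JSON config file")
    parser.add_argument("--output", "-o", help="output file path (stdout if omitted)")
    parser.add_argument("--base-url", "-b", help="base URL for relative paths")
    parser.add_argument("--default-priority", type=float, help="default priority value")
    parser.add_argument("--default-changefreq", help="default change frequency")
    parser.add_argument("--no-pretty", action="store_true", help="compact XML output")
    parser.add_argument("--stats", action="store_true", help="print statistics to stderr")
    return parser


# Equivalent to the "With stats" command in Quick Start.
args = build_parser().parse_args(
    ["--config", "examples/sitemap_config.json", "--output", "sitemap.xml", "--stats"]
)
print(args.config, args.output, args.stats, args.no_pretty)
```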

... continues with setup instructions, usage examples, and more.

📄 Code Sample .py preview

src/sitemap_builder.py

#!/usr/bin/env python3
"""
Sitemap Builder — SEO Toolkit by DataNest

Reads a list of URLs (from a file, JSON config, or stdin) and generates a
standards-compliant sitemap.xml with configurable priorities, change
frequencies, and last-modified dates.

Why this exists:
    Most sitemap generators require installing a full CMS plugin or a heavy
    npm package. This is a single Python script — feed it your URLs and get
    a valid sitemap.xml in seconds. Perfect for static sites, JAMStack builds,
    or any CI/CD pipeline.

Usage:
    python sitemap_builder.py --urls urls.txt --output sitemap.xml
    python sitemap_builder.py --config sitemap_config.json --output sitemap.xml
    cat urls.txt | python sitemap_builder.py --output sitemap.xml

License: MIT
"""

from __future__ import annotations

import argparse
import json
import logging
import sys
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field
from datetime import datetime, timezone
from pathlib import Path
from typing import TextIO
from xml.dom.minidom import parseString

# ---------------------------------------------------------------------------
# Constants
# ---------------------------------------------------------------------------

# Maximum URLs per sitemap (search engine limit)
MAX_URLS_PER_SITEMAP = 50_000

# Maximum sitemap file size (50 MB uncompressed)
MAX_SITEMAP_SIZE_BYTES = 50 * 1024 * 1024

# Valid changefreq values per the sitemaps.org protocol
VALID_CHANGEFREQS = {
    "always", "hourly", "daily", "weekly", "monthly", "yearly", "never",
}

# ... 359 more lines ...
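
The constants in the preview suggest two checks: validating `changefreq` values and splitting large URL sets across several files for a sitemap index. The rest of the script is not shown, so the sketch below is only one plausible way to use them. `validate_changefreq` and `chunk_urls` are hypothetical helpers, and the case-insensitive normalization is an assumption.

```python
MAX_URLS_PER_SITEMAP = 50_000

VALID_CHANGEFREQS = {
    "always", "hourly", "daily", "weekly", "monthly", "yearly", "never",
}


def validate_changefreq(value):
    """Normalize a changefreq value and reject anything outside the protocol's set."""
    normalized = value.strip().lower()
    if normalized not in VALID_CHANGEFREQS:
        raise ValueError(f"invalid changefreq: {value!r}")
    return normalized


def chunk_urls(urls, size=MAX_URLS_PER_SITEMAP):
    """Split a URL list into chunks small enough for one sitemap file each."""
    return [urls[i:i + size] for i in range(0, len(urls), size)]


urls = [f"https://example.com/page/{n}" for n in range(120_000)]
chunks = chunk_urls(urls)
print(len(chunks), [len(c) for c in chunks])
print(validate_changefreq(" Daily "))
```

Each chunk would become its own sitemap file, with a sitemap index listing them.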