Chapter 1: Building Sitemaps with the Sitemap Builder

Configuration-Driven URL Generation

The sitemap builder accepts URLs through three input modes: a raw URL list, a JSON config with priority and change frequency rules, or a stdin pipe.

JSON Config Format

```json
{
  "base_url": "https://example.com",
  "rules": [
    {"pattern": "/blog/**", "priority": 0.8, "changefreq": "daily"},
    {"pattern": "/docs/**", "priority": 0.7, "changefreq": "weekly"},
    {"pattern": "/products/**", "priority": 0.6, "changefreq": "weekly"},
    {"pattern": "/about", "priority": 0.3, "changefreq": "monthly"},
    {"pattern": "/legal/**", "priority": 0.1, "changefreq": "yearly"}
  ],
  "exclude": ["/admin/**", "/staging/**", "*/draft", "*/temp-*"],
  "default_priority": 0.5,
  "default_changefreq": "monthly"
}
```

The pattern field supports glob-style matching: ** matches any depth, while * matches within a single path segment. Exclusion patterns take precedence: a URL matching both an include rule and an exclude pattern is omitted.
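
The matching rules above can be sketched in Python. This is a minimal illustration, not the builder's actual code: the helper names `glob_to_regex` and `resolve_rule` are hypothetical, and the sketch assumes that rules are checked in order with the first match winning, which the guide does not state.

```python
import re

def glob_to_regex(pattern: str) -> re.Pattern:
    """Translate a glob where ** crosses '/' and * stays inside one segment."""
    parts = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("**", i):
            parts.append(".*")        # any depth, slashes included
            i += 2
        elif pattern[i] == "*":
            parts.append("[^/]*")     # a single path segment only
            i += 1
        else:
            parts.append(re.escape(pattern[i]))
            i += 1
    return re.compile("".join(parts))

def resolve_rule(path, rules, exclude, default):
    """Return (priority, changefreq) for path, or None if it is excluded."""
    # Exclusions are checked first, so they win over any include rule.
    if any(glob_to_regex(p).fullmatch(path) for p in exclude):
        return None
    # Assumption: the first matching rule wins.
    for rule in rules:
        if glob_to_regex(rule["pattern"]).fullmatch(path):
            return rule["priority"], rule["changefreq"]
    return default

rules = [{"pattern": "/blog/**", "priority": 0.8, "changefreq": "daily"}]
exclude = ["/admin/**"]
default = (0.5, "monthly")

print(resolve_rule("/blog/2025/deploying-nginx", rules, exclude, default))  # (0.8, 'daily')
print(resolve_rule("/admin/users", rules, exclude, default))                # None
print(resolve_rule("/contact", rules, exclude, default))                    # (0.5, 'monthly')
```

Under this reading, a pattern such as `/docs/*` matches `/docs/intro` but not `/docs/api/intro`, because `*` never crosses a slash.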

Output Structure

The generated sitemap.xml validates against the sitemaps.org schema:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://example.com/blog/deploying-nginx</loc>
    <lastmod>2025-06-15</lastmod>
    <changefreq>daily</changefreq>
    <priority>0.8</priority>
  </url>
</urlset>
```
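
For readers who want to see how a document of this shape can be produced, here is a minimal sketch using Python's standard `xml.etree.ElementTree`. The function name `build_urlset` and the entry dictionary layout are assumptions made for illustration; the builder's internals may differ.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

def build_urlset(entries):
    """Serialize entries (dicts with loc, lastmod, changefreq, priority) to XML bytes."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for entry in entries:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = entry["loc"]
        if entry.get("lastmod"):  # lastmod is optional in the sitemap protocol
            ET.SubElement(url, "lastmod").text = entry["lastmod"]
        ET.SubElement(url, "changefreq").text = entry["changefreq"]
        ET.SubElement(url, "priority").text = f"{entry['priority']:.1f}"
    if hasattr(ET, "indent"):  # pretty-printing helper, Python 3.9+
        ET.indent(urlset, space="  ")
    return ET.tostring(urlset, encoding="UTF-8", xml_declaration=True)

xml_bytes = build_urlset([{
    "loc": "https://example.com/blog/deploying-nginx",
    "lastmod": "2025-06-15",
    "changefreq": "daily",
    "priority": 0.8,
}])
print(xml_bytes.decode("utf-8"))
```

ElementTree escapes characters such as `&` in element text, so URLs with query strings stay valid XML.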

Sitemap Index (50k+ URLs)

When the URL count exceeds 50,000, the builder automatically splits the output into a sitemap index with child sitemaps:

```bash
python src/sitemap_builder.py --urls urls.txt --output sitemap.xml
# Warning: 62,340 URLs exceeds 50,000 limit
# Generated sitemap-index.xml with 2 child sitemaps
```
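
The split can be pictured as plain chunking followed by an index document that lists each child. The sketch below is an assumption about how such a split might work, not the builder's code: `split_into_children`, `build_index`, and the child URLs `sitemap-1.xml` and `sitemap-2.xml` are hypothetical names.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_SITEMAP = 50_000  # per-file limit from the sitemap protocol

def split_into_children(urls, limit=MAX_URLS_PER_SITEMAP):
    """Split urls into consecutive chunks of at most `limit` entries."""
    return [urls[i:i + limit] for i in range(0, len(urls), limit)]

def build_index(child_locs):
    """Build a <sitemapindex> document pointing at each child sitemap URL."""
    index = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for loc in child_locs:
        sitemap = ET.SubElement(index, "sitemap")
        ET.SubElement(sitemap, "loc").text = loc
    return ET.tostring(index, encoding="UTF-8", xml_declaration=True)

urls = [f"https://example.com/page/{n}" for n in range(62_340)]
chunks = split_into_children(urls)
print([len(chunk) for chunk in chunks])  # [50000, 12340]

# Hypothetical child file names; the real naming scheme may differ.
child_locs = [f"https://example.com/sitemap-{n}.xml" for n in range(1, len(chunks) + 1)]
index_bytes = build_index(child_locs)
```

With 62,340 URLs this yields two chunks, which agrees with the two child sitemaps reported in the command output above.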

The optional --gzip flag compresses each output file, and --stats prints a summary table with per-priority bucket counts, MIME type distribution, and estimated index size in kilobytes.
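
As a rough illustration of what these two flags involve, the sketch below compresses serialized output with the standard `gzip` module and tallies entries per priority value. The helper names `gzip_bytes`, `priority_buckets`, and `size_kb` are hypothetical, and the MIME type breakdown is left out because the guide does not say how it is derived.

```python
import gzip
from collections import Counter

def gzip_bytes(data: bytes) -> bytes:
    """Compress serialized sitemap bytes, e.g. before writing sitemap.xml.gz."""
    return gzip.compress(data)

def priority_buckets(entries):
    """Count entries per priority value, keyed by a one-decimal string."""
    return Counter(f"{entry['priority']:.1f}" for entry in entries)

def size_kb(data: bytes) -> float:
    """Size of the serialized output in kilobytes (1 KB = 1024 bytes)."""
    return len(data) / 1024

entries = [
    {"loc": "https://example.com/blog/a", "priority": 0.8},
    {"loc": "https://example.com/blog/b", "priority": 0.8},
    {"loc": "https://example.com/about", "priority": 0.3},
]
buckets = priority_buckets(entries)
print(buckets["0.8"], buckets["0.3"])  # 2 1
```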

For the full CLI reference and integration examples, see 02_configuration-reference.md.

Chapter 2

Configuration Reference

Follow this guide to get Sitemap Builder up and running in your environment.

See examples/sitemap_config.json for a full example.

| Field | Type | Description |
|---|---|---|
| base_url | string | Base URL to prepend to relative paths |
| default_changefreq | string | Default change frequency |
| default_priority | float | Default priority (0.0-1.0) |
| priority_rules | object | Path pattern -> priority overrides |
| changefreq_rules | object | Path pattern -> changefreq overrides |
| exclude_patterns | array | Substring patterns to exclude |
| urls | array | List of URL paths or full URLs |
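
To show how these fields might combine, here is a minimal sketch that expands a config dictionary into sitemap entries. It is an interpretation of the table, not the tool's implementation: `expand_config` and `first_match` are hypothetical names, path patterns are assumed to be `fnmatch`-style globs with the first match winning, and the fallback defaults of 0.5 and "monthly" are assumptions.

```python
from fnmatch import fnmatchcase
from urllib.parse import urlsplit

def first_match(path, rules, default):
    """Return the value of the first pattern matching path, else default."""
    for pattern, value in rules.items():
        if fnmatchcase(path, pattern):  # assumption: fnmatch-style patterns
            return value
    return default

def expand_config(config):
    """Turn a config dict into a list of {loc, priority, changefreq} entries."""
    base = config.get("base_url", "").rstrip("/")
    entries = []
    for raw in config.get("urls", []):
        # Full URLs are kept as-is; relative paths get base_url prepended.
        loc = raw if urlsplit(raw).scheme else f"{base}/{raw.lstrip('/')}"
        path = urlsplit(loc).path or "/"
        # exclude_patterns are plain substrings, per the table above.
        if any(sub in path for sub in config.get("exclude_patterns", [])):
            continue
        entries.append({
            "loc": loc,
            "priority": first_match(path, config.get("priority_rules", {}),
                                    config.get("default_priority", 0.5)),
            "changefreq": first_match(path, config.get("changefreq_rules", {}),
                                      config.get("default_changefreq", "monthly")),
        })
    return entries

config = {
    "base_url": "https://example.com",
    "default_priority": 0.5,
    "default_changefreq": "monthly",
    "priority_rules": {"/blog/*": 0.8},
    "changefreq_rules": {"/blog/*": "daily"},
    "exclude_patterns": ["/admin"],
    "urls": ["/blog/hello", "/admin/users", "about"],
}
for entry in expand_config(config):
    print(entry)
```

Here `/admin/users` is dropped by the substring exclusion, `/blog/hello` picks up both overrides, and `about` falls back to the defaults.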

CLI Flags

| Flag | Description |
|---|---|
| --urls, -u | Text file with one URL per line |
| --config, -c | JSON config file |
| --output, -o | Output file path (default: stdout) |
| --base-url, -b | Base URL for relative paths |
| --default-priority | Default priority value |
| --default-changefreq | Default change frequency |
| --no-pretty | Compact XML output |
| --stats | Print statistics to stderr |
| --verbose, -v | Debug logging |
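
When wrapping the tool or writing a compatible script, the flag table above maps naturally onto `argparse`. The following is a hypothetical reconstruction from the table only: the argument types, the defaults, and the function name `build_parser` are assumptions, not the tool's actual source.

```python
import argparse

def build_parser() -> argparse.ArgumentParser:
    """Parser mirroring the documented flags; types and defaults are assumed."""
    parser = argparse.ArgumentParser(
        prog="sitemap_builder.py",
        description="Build a sitemap from a URL list or a JSON config.",
    )
    parser.add_argument("--urls", "-u", help="Text file with one URL per line")
    parser.add_argument("--config", "-c", help="JSON config file")
    parser.add_argument("--output", "-o", default=None,
                        help="Output file path (default: stdout)")
    parser.add_argument("--base-url", "-b", help="Base URL for relative paths")
    parser.add_argument("--default-priority", type=float,
                        help="Default priority value")
    parser.add_argument("--default-changefreq", help="Default change frequency")
    parser.add_argument("--no-pretty", action="store_true",
                        help="Compact XML output")
    parser.add_argument("--stats", action="store_true",
                        help="Print statistics to stderr")
    parser.add_argument("--verbose", "-v", action="store_true",
                        help="Debug logging")
    return parser

args = build_parser().parse_args(
    ["-u", "urls.txt", "--no-pretty", "--default-priority", "0.7"]
)
print(args.urls, args.no_pretty, args.default_priority)  # urls.txt True 0.7
```

Leaving `--output` as `None` lets the caller fall back to stdout, which matches the documented default.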

Sitemap Builder v1.0.0 — Free Preview