← Back to all products

Canonical URL Planner

$19

Canonical URL planning and audit tool for preventing duplicate content issues.

📁 10 files
JSONMarkdownPythonCI/CD

📄 Product Preview

Try the interactive reader and demo tools below, or get the full product with all content unlocked.

📖 Interactive Reader (Free Preview) ⚙ Try Demo Tools 📦 Download Free Sample

📁 File Structure 10 files

canonical-url-planner/ ├── LICENSE ├── README.md ├── examples/ │ ├── pages.csv │ └── pages.json ├── free-sample.zip ├── guide/ │ ├── 01_features.md │ ├── 02_input-formats.md │ └── 03_license.md ├── index.html └── src/ └── canonical_url_planner.py

📖 Documentation Preview README excerpt

Canonical URL Planner

Part of the SEO Toolkit by CodeVault

Plan and validate canonical URLs across your entire site. Detect chains, loops, duplicate content clusters, missing canonicals, protocol mismatches, and cross-domain references — all from a single CSV or JSON audit.

Features

  • Load page/canonical pairs from CSV or JSON
  • Detect canonical chains (A → B → C) and recommend direct canonicals
  • Detect canonical loops (A → B → A)
  • Find missing canonical tags
  • Flag protocol mismatches (HTTP page with HTTPS canonical)
  • Identify cross-domain canonicals
  • Warn about query parameters in canonical URLs
  • Group pages by their canonical target to visualize duplicate content clusters
  • Track self-referencing canonicals (usually correct)
  • Multiple output formats: text, CSV, or JSON
  • Strict mode for CI/CD pipelines
  • Python stdlib only — zero dependencies

Quick Start


# Validate canonical URLs from CSV
python src/canonical_url_planner.py --input examples/pages.csv

# Validate from JSON
python src/canonical_url_planner.py --input examples/pages.json

# JSON output for CI/CD
python src/canonical_url_planner.py --input examples/pages.csv --format json

# Strict mode
python src/canonical_url_planner.py --input examples/pages.csv --strict

# Write report to file
python src/canonical_url_planner.py --input examples/pages.csv --output report.txt

Input Formats

CSV Format


page_url,canonical_url
https://www.example.com/,https://www.example.com/
https://www.example.com/old-page,https://www.example.com/new-page
https://www.example.com/no-canonical,

JSON Format


[
    {"page_url": "https://www.example.com/", "canonical_url": "https://www.example.com/"},
    {"page_url": "https://www.example.com/old-page", "canonical_url": "https://www.example.com/new-page"}
]

... continues with setup instructions, usage examples, and more.

📄 Code Sample .py preview

src/canonical_url_planner.py #!/usr/bin/env python3 """ Canonical URL Planner — SEO Toolkit by DataNest Plan and validate canonical URLs across your site. Detect duplicates, conflicting canonicals, canonical chains, self-referencing issues, HTTP/HTTPS mismatches, and pages missing canonical tags entirely. Why this exists: Canonical tags are deceptively simple — one <link rel="canonical"> per page. But at scale, they become a nightmare. Pages point to the wrong canonical, chains form (A→B→C), duplicate content clusters emerge, and protocol mismatches silently leak crawl budget. This tool audits your entire canonical map in one pass and tells you exactly what to fix. Usage: python canonical_url_planner.py --input pages.csv python canonical_url_planner.py --input pages.json --format json python canonical_url_planner.py --input pages.csv --strict License: MIT """ from __future__ import annotations import argparse import csv import json import logging import sys from collections import defaultdict from dataclasses import dataclass, field from pathlib import Path from typing import Any from urllib.parse import urlparse # --------------------------------------------------------------------------- # Constants # --------------------------------------------------------------------------- LOG = logging.getLogger("canonical-url-planner") # --------------------------------------------------------------------------- # Data Models # --------------------------------------------------------------------------- @dataclass class PageCanonical: """A page and its declared canonical URL.""" # ... 478 more lines ...
Buy Now — $19 Back to Products