$29
Structured Data Validator
Validate JSON-LD, Microdata, and RDFa structured data against Schema.org standards.
JSON · Markdown · Python · CI/CD
📄 Product Preview
📁 File Structure 10 files
structured-data-validator/
├── LICENSE
├── README.md
├── examples/
│   ├── sample_jsonld.json
│   └── sample_page.html
├── free-sample.zip
├── guide/
│   ├── 01_features.md
│   ├── 02_schema-types-supported.md
│   └── 03_license.md
├── index.html
└── src/
    └── structured_data_validator.py
📖 Documentation Preview README excerpt
Structured Data Validator
Part of the SEO Toolkit by CodeVault
Validate JSON-LD structured data against Schema.org type specs — locally, offline, and CI/CD-ready. Catches missing required fields, type mismatches, invalid contexts, and common markup errors that break Google rich results.
Features
- Validates JSON-LD files (.json) and HTML files with embedded <script type="application/ld+json"> blocks
- Checks 16+ Schema.org types: Article, Product, FAQPage, BreadcrumbList, Organization, LocalBusiness, WebSite, Event, Recipe, VideoObject, and more
- Validates required fields per type (e.g., Article needs headline, datePublished, author)
- Checks recommended fields and suggests additions for richer results
- Validates field values: URL formats, ISO 8601 dates, empty strings
- Recursively validates nested objects (publisher, author, etc.)
- Multiple output formats: human-readable text or JSON report
- Strict mode for CI/CD pipelines (exit 1 on any warning)
- Python stdlib only — zero dependencies
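The value checks and the recursive walk over nested objects listed above could look roughly like the following. This is a minimal sketch, not the product's actual code: the names `URL_FIELDS`, `DATE_FIELDS`, `is_valid_url`, `is_iso8601`, and `check_values` are hypothetical, and the choice of which fields count as URLs or dates is an assumption based on the field names in this README.

```python
from datetime import date, datetime
from urllib.parse import urlparse

# Hypothetical field groupings, assumed for illustration only.
URL_FIELDS = {"url", "logo", "thumbnailUrl", "contentUrl", "embedUrl"}
DATE_FIELDS = {"datePublished", "dateModified", "uploadDate", "startDate", "endDate"}


def is_valid_url(value: str) -> bool:
    """Accept absolute http(s) URLs that include a host."""
    parsed = urlparse(value)
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


def is_iso8601(value: str) -> bool:
    """Accept ISO 8601 dates or date-times that the standard library can parse."""
    # datetime.fromisoformat rejects a trailing "Z" before Python 3.11.
    candidate = value[:-1] + "+00:00" if value.endswith("Z") else value
    for parser in (date.fromisoformat, datetime.fromisoformat):
        try:
            parser(candidate)
            return True
        except ValueError:
            continue
    return False


def check_values(node, path="$"):
    """Walk dicts and lists recursively and collect value-level problems."""
    problems = []
    if isinstance(node, dict):
        for key, value in node.items():
            child = f"{path}.{key}"
            if isinstance(value, str):
                if not value.strip():
                    problems.append(f"{child}: empty string")
                elif key in URL_FIELDS and not is_valid_url(value):
                    problems.append(f"{child}: not an absolute URL")
                elif key in DATE_FIELDS and not is_iso8601(value):
                    problems.append(f"{child}: not an ISO 8601 date")
            else:
                # Nested objects such as publisher or author are checked the same way.
                problems.extend(check_values(value, child))
    elif isinstance(node, list):
        for index, item in enumerate(node):
            problems.extend(check_values(item, f"{path}[{index}]"))
    return problems
```

Each problem carries a path such as `$.publisher.logo`, so a report can point at the exact nested field. ISO 8601 durations (for example `prepTime` values) are not covered by this sketch.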
Quick Start
# Validate a JSON-LD file
python src/structured_data_validator.py --input examples/sample_jsonld.json
# Validate JSON-LD embedded in an HTML page
python src/structured_data_validator.py --input examples/sample_page.html
# Get a JSON report (great for CI/CD)
python src/structured_data_validator.py --input examples/sample_jsonld.json --format json
# Strict mode — fail on any warning
python src/structured_data_validator.py --input examples/sample_jsonld.json --strict
# Write report to file
python src/structured_data_validator.py --input examples/sample_jsonld.json --output report.txt
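For a CI job that checks several files, the commands above can be wrapped in a small driver script. The sketch below is an assumption-laden illustration: `build_command` and `validate_all` are hypothetical helper names, it uses only the `--input` and `--strict` flags shown in this section, and it treats any non-zero exit status as a failure.

```python
import subprocess
import sys

# Path taken from the Quick Start commands; adjust if the layout differs.
VALIDATOR = "src/structured_data_validator.py"


def build_command(path, strict=True):
    """Build the validator command line for one file."""
    command = [sys.executable, VALIDATOR, "--input", str(path)]
    if strict:
        command.append("--strict")
    return command


def validate_all(paths, run=subprocess.run):
    """Run the validator on each path and return the paths that failed.

    `run` is injectable so the function can be exercised without the
    real script; by default it is subprocess.run.
    """
    failed = []
    for path in paths:
        result = run(build_command(path))
        if result.returncode != 0:
            failed.append(str(path))
    return failed


# Example use from the product root (not executed here):
#   failed = validate_all(["examples/sample_jsonld.json", "examples/sample_page.html"])
#   sys.exit(1 if failed else 0)
```

Collecting all failing paths before exiting lets a single CI run report every broken file rather than stopping at the first one.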
Schema Types Supported
| Type | Required Fields | Recommended Fields |
|---|---|---|
| Article | headline, datePublished, author | dateModified, image, publisher, description |
| Product | name | description, image, sku, brand, offers |
| FAQPage | mainEntity | — |
| BreadcrumbList | itemListElement | — |
| Organization | name | url, logo, contactPoint, sameAs |
| LocalBusiness | name, address | telephone, openingHours, geo, url |
| WebSite | name, url | potentialAction, description |
| Event | name, startDate, location | endDate, description, image, organizer |
| Recipe | name, recipeIngredient, recipeInstructions | image, author, prepTime, cookTime |
| VideoObject | name, description, thumbnailUrl, uploadDate | contentUrl, duration, embedUrl |
| Person | name | url, image, jobTitle |
| SoftwareApplication | name | operatingSystem, applicationCategory, offers |
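The table maps naturally onto a lookup from type name to a pair of field lists. The following sketch shows one way such a check might work; `SCHEMA_TYPES` and `check_fields` are hypothetical names, only three rows of the table are copied in, and reporting missing recommended fields as warnings is an assumption about how the tool separates severity.

```python
# type -> (required_fields, recommended_fields), copied from three table rows.
SCHEMA_TYPES = {
    "Article": (
        ["headline", "datePublished", "author"],
        ["dateModified", "image", "publisher", "description"],
    ),
    "Product": (
        ["name"],
        ["description", "image", "sku", "brand", "offers"],
    ),
    "Event": (
        ["name", "startDate", "location"],
        ["endDate", "description", "image", "organizer"],
    ),
}


def check_fields(node):
    """Return (errors, warnings) for a single JSON-LD object."""
    schema_type = node.get("@type")
    if not isinstance(schema_type, str) or schema_type not in SCHEMA_TYPES:
        return [], [f"unsupported or missing @type: {schema_type!r}"]

    required, recommended = SCHEMA_TYPES[schema_type]
    errors = [
        f"{schema_type}: missing required field '{name}'"
        for name in required
        if name not in node
    ]
    warnings = [
        f"{schema_type}: consider adding recommended field '{name}'"
        for name in recommended
        if name not in node
    ]
    return errors, warnings


errors, warnings = check_fields(
    {"@type": "Article", "headline": "Hello", "author": "Ada"}
)
print(errors)
print(warnings)
```

Keeping the definitions in plain data means a new type can be supported by adding a row, without touching the checking logic.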
CLI Flags
| Flag | Description |
|---|---|
| --input, -i | Path to JSON-LD or HTML file (required) |
| --format, -f | Output format: text or json (default: text) |
... continues with setup instructions, usage examples, and more.
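The flags seen in this excerpt could be declared with `argparse`, which the source file imports. This is only a sketch of a compatible parser: `build_parser` is a hypothetical name, the help strings are paraphrased, and the real tool may define additional flags or short forms that the excerpt does not show.

```python
import argparse


def build_parser():
    """Declare the flags that appear in the README excerpt."""
    parser = argparse.ArgumentParser(
        description="Validate JSON-LD structured data against Schema.org type specs."
    )
    parser.add_argument(
        "--input", "-i", required=True,
        help="Path to JSON-LD or HTML file",
    )
    parser.add_argument(
        "--format", "-f", choices=["text", "json"], default="text",
        help="Output format",
    )
    # --strict and --output appear in Quick Start; short forms are unknown,
    # so none are declared here.
    parser.add_argument(
        "--strict", action="store_true",
        help="Fail on any warning",
    )
    parser.add_argument(
        "--output",
        help="Write the report to this file instead of stdout",
    )
    return parser


args = build_parser().parse_args(["-i", "examples/sample_jsonld.json", "-f", "json"])
print(args.input, args.format, args.strict, args.output)
```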
📄 Code Sample .py preview
src/structured_data_validator.py
#!/usr/bin/env python3
"""
Structured Data Validator — SEO Toolkit by DataNest
Validate JSON-LD structured data against common Schema.org type specs.
Catches missing required fields, type mismatches, invalid @context values,
and dozens of common markup errors that break rich results in Google Search.

Why this exists:
    Google's Rich Results Test is great, but it's online-only, requires
    a live URL, and doesn't integrate into CI/CD. This tool validates
    JSON-LD locally: feed it a .json file or an HTML page with embedded
    <script type="application/ld+json"> blocks, and get instant feedback
    on what's wrong and how to fix it.

Usage:
    python structured_data_validator.py --input schema.json
    python structured_data_validator.py --input page.html
    python structured_data_validator.py --input schema.json --format json
    python structured_data_validator.py --input schema.json --strict

License: MIT
"""
from __future__ import annotations
import argparse
import json
import logging
import re
import sys
from dataclasses import dataclass, field
from html.parser import HTMLParser
from pathlib import Path
from typing import Any
# ---------------------------------------------------------------------------
# Constants
# ---------------------------------------------------------------------------
SCHEMA_CONTEXT = "https://schema.org"
VALID_CONTEXTS = {
    "https://schema.org",
    "https://schema.org/",
    "http://schema.org",
    "http://schema.org/",
}
# Schema.org type definitions: type -> (required_fields, recommended_fields)
# These cover the most common rich-result types Google supports.
# ... 578 more lines ...
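The rest of the file is not shown in the preview. As an illustration of the HTML handling the docstring describes, here is a minimal sketch of extracting `<script type="application/ld+json">` blocks with the standard library's `HTMLParser` and checking `@context` against the set above. `JsonLdExtractor`, `extract_jsonld`, and `has_valid_context` are hypothetical names, and handling only string-valued `@context` is a simplifying assumption.

```python
import json
from html.parser import HTMLParser

VALID_CONTEXTS = {
    "https://schema.org",
    "https://schema.org/",
    "http://schema.org",
    "http://schema.org/",
}


class JsonLdExtractor(HTMLParser):
    """Collect the raw text of every application/ld+json script block."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        script_type = (dict(attrs).get("type") or "").strip().lower()
        if tag == "script" and script_type == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self.blocks.append("".join(self._buffer))
            self._in_jsonld = False


def extract_jsonld(html):
    """Return (documents, errors) parsed from the JSON-LD blocks in html."""
    extractor = JsonLdExtractor()
    extractor.feed(html)
    extractor.close()
    documents, errors = [], []
    for index, raw in enumerate(extractor.blocks):
        try:
            documents.append(json.loads(raw))
        except json.JSONDecodeError as exc:
            errors.append(f"block {index}: invalid JSON ({exc.msg})")
    return documents, errors


def has_valid_context(document):
    """Check a string-valued @context against the accepted Schema.org URLs."""
    context = document.get("@context")
    return isinstance(context, str) and context in VALID_CONTEXTS
```

Blocks that fail to parse are reported rather than dropped, so a page with one malformed script still yields results for the others.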