Chapter 1

Features

This chapter covers the core features and capabilities of Data Sync.

  • Hash-based diff engine — SHA-256 content hashing detects added, changed, deleted, and unchanged records
  • Multi-format endpoints — Read and write JSON files, CSV files, and SQLite tables
  • Conflict resolution — Three strategies: source-wins, dest-wins, newest-wins (by timestamp field)
  • Incremental sync — Persist state between runs to detect true conflicts (both sides changed)
  • Dry-run mode — Preview the sync plan without writing any changes
  • Bidirectional sync — Propagate dest-only records back to source instead of deleting them
  • Detailed reporting — Diff report with per-record change breakdown and summary statistics
  • Config file support — Define sync jobs in JSON for repeatable, scriptable execution
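
The hash-based diff described in the first bullet can be sketched as follows. This is a minimal illustration, not Data Sync's actual implementation: the function names `record_hash` and `diff_records` and the canonical-JSON encoding are assumptions made for the example.

```python
import hashlib
import json


def record_hash(record: dict) -> str:
    """SHA-256 of a canonical JSON encoding, so field order does not affect the hash."""
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":"), default=str)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def diff_records(source: list[dict], dest: list[dict], key: str = "id") -> dict[str, list]:
    """Classify record keys as added, changed, deleted, or unchanged (source vs. dest)."""
    src = {r[key]: record_hash(r) for r in source}
    dst = {r[key]: record_hash(r) for r in dest}
    return {
        "added": sorted(k for k in src if k not in dst),      # only in source
        "deleted": sorted(k for k in dst if k not in src),    # only in dest
        "changed": sorted(k for k in src if k in dst and src[k] != dst[k]),
        "unchanged": sorted(k for k in src if k in dst and src[k] == dst[k]),
    }


source = [{"id": 1, "name": "Ada"}, {"id": 2, "name": "Grace"}, {"id": 3, "name": "Alan"}]
dest = [{"name": "Ada", "id": 1}, {"id": 2, "name": "G. Hopper"}, {"id": 4, "name": "Edsger"}]
plan = diff_records(source, dest)
print(plan)
```

Record 1 counts as unchanged even though its fields appear in a different order, because the keys are sorted before hashing.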

Requirements

  • Python 3.10+
  • No external dependencies (stdlib only)

Chapter 2

Quick Start

Follow this guide to get Data Sync up and running in your environment.

bash
# Sync two JSON files by a key field
python src/data_sync.py --source customers_a.json --dest customers_b.json --key customer_id

# Sync CSV to SQLite with conflict strategy
python src/data_sync.py --source orders.csv --dest warehouse.db --key order_id --strategy source-wins

# Preview changes without writing (dry-run)
python src/data_sync.py --source new_data.json --dest master.json --key id --dry-run

# Incremental sync with state tracking
python src/data_sync.py --source export.json --dest mirror.json --key id --state sync_state.json

# Full pipeline from a config file
python src/data_sync.py --config examples/data_sync_config.json
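
To illustrate what the `--state` option makes possible, the sketch below compares both sides against the hash remembered from the previous run. The state layout (a mapping from record key to last-synced hash) and the `classify` helper are hypothetical; the real state file format is not documented in this preview.

```python
import hashlib
import json


def record_hash(record: dict) -> str:
    """SHA-256 over a sorted-key JSON encoding of the record."""
    canonical = json.dumps(record, sort_keys=True, default=str)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def classify(key, source: dict, dest: dict, state: dict) -> str:
    """Decide which side changed since the last run, using the stored hash."""
    src_h = record_hash(source[key])
    dst_h = record_hash(dest[key])
    if src_h == dst_h:
        return "unchanged"
    last = state.get(str(key))  # JSON object keys are always strings
    if last is None:
        return "conflict"       # assumption: no history means the strategy must decide
    if dst_h == last:
        return "source-changed"  # only the source moved away from the last sync
    if src_h == last:
        return "dest-changed"    # only the destination moved
    return "conflict"            # both sides changed: a true conflict


# Hypothetical state from a previous run, keyed by record id.
state = {str(k): record_hash({"id": k, "v": v}) for k, v in [(1, "a"), (2, "b"), (3, "c")]}
source = {1: {"id": 1, "v": "a2"}, 2: {"id": 2, "v": "b"}, 3: {"id": 3, "v": "c2"}}
dest = {1: {"id": 1, "v": "a"}, 2: {"id": 2, "v": "b2"}, 3: {"id": 3, "v": "c3"}}
results = {k: classify(k, source, dest, state) for k in source}
print(results)
```

Without stored state, a differing record cannot be attributed to one side, which is why this sketch falls back to treating it as a conflict.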

Configuration Reference

Define a sync job in JSON:

json
{
    "source": "customers_export.json",
    "dest": "customers_master.json",
    "key": "customer_id",
    "strategy": "newest-wins",
    "timestamp_field": "updated_at",
    "state": "sync_state.json",
    "dry_run": false,
    "bidirectional": false,
    "source_table": "customers",
    "dest_table": "customers"
}
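
The `strategy` and `timestamp_field` settings above could drive a resolver along these lines. This is a sketch under stated assumptions: timestamps are taken to be ISO 8601 strings without a timezone suffix, ties go to the source, and the `resolve` function is hypothetical rather than part of Data Sync.

```python
from datetime import datetime


def resolve(src: dict, dst: dict, strategy: str = "source-wins",
            timestamp_field: str = "updated_at") -> dict:
    """Pick the winning version of a record that differs on both sides."""
    if strategy == "source-wins":
        return src
    if strategy == "dest-wins":
        return dst
    if strategy == "newest-wins":
        # Assumption: both records carry a parseable ISO 8601 timestamp.
        src_ts = datetime.fromisoformat(src[timestamp_field])
        dst_ts = datetime.fromisoformat(dst[timestamp_field])
        # Assumption: on equal timestamps the source version is kept.
        return src if src_ts >= dst_ts else dst
    raise ValueError(f"unknown strategy: {strategy!r}")


src = {"customer_id": 7, "email": "new@example.com", "updated_at": "2024-05-02T09:30:00"}
dst = {"customer_id": 7, "email": "old@example.com", "updated_at": "2024-05-01T17:00:00"}
winner = resolve(src, dst, strategy="newest-wins")
print(winner["email"])
```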

CLI Flags

| Flag | Default | Description |
|------|---------|-------------|
| --source, -s | (none) | Source file path (.json, .csv, .db, .sqlite) |
| --dest, -d | (none) | Destination file path (.json, .csv, .db, .sqlite) |
| --key, -k | id | Record key field used for matching |
| --strategy | source-wins | Conflict resolution: source-wins, dest-wins, newest-wins |
| --timestamp-field | updated_at | Timestamp field for newest-wins strategy |
| --state | (none) | State file path for incremental sync (.json) |
| --dry-run | off | Show the sync plan without writing changes |
| --bidirectional | off | Sync changes in both directions |
| --source-table | data | SQLite table name for source |
| --dest-table | data | SQLite table name for destination |
| --config, -c | (none) | JSON config file (overrides other flags) |
| --log-level | INFO | DEBUG, INFO, WARNING, ERROR |
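
For readers who want to see how the flags and defaults above fit together, here is one way they could be declared with `argparse`. It is an illustrative reconstruction from the table, not the parser shipped in `src/data_sync.py`; the `build_parser` name is an assumption.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Declare the documented flags with the documented defaults."""
    p = argparse.ArgumentParser(prog="data_sync.py")
    # Not marked required here, since a --config file may supply them instead.
    p.add_argument("--source", "-s", help="Source file path")
    p.add_argument("--dest", "-d", help="Destination file path")
    p.add_argument("--key", "-k", default="id", help="Record key field")
    p.add_argument("--strategy", default="source-wins",
                   choices=["source-wins", "dest-wins", "newest-wins"])
    p.add_argument("--timestamp-field", default="updated_at")
    p.add_argument("--state", help="State file path for incremental sync")
    p.add_argument("--dry-run", action="store_true")
    p.add_argument("--bidirectional", action="store_true")
    p.add_argument("--source-table", default="data")
    p.add_argument("--dest-table", default="data")
    p.add_argument("--config", "-c", help="JSON config file")
    p.add_argument("--log-level", default="INFO",
                   choices=["DEBUG", "INFO", "WARNING", "ERROR"])
    return p


args = build_parser().parse_args(
    ["-s", "orders.csv", "-d", "warehouse.db", "-k", "order_id", "--dry-run"]
)
print(args.strategy, args.dest_table, args.dry_run)  # source-wins data True
```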

Config Schema

| Key | Type | Required | Description |
|-----|------|----------|-------------|
| source | string | yes | Source file path |
| dest | string | yes | Destination file path |
| key | string | no | Key field (default: id) |
| strategy | string | no | source-wins / dest-wins / newest-wins |
| timestamp_field | string | no | Field for newest-wins comparison |
| state | string | no | Path to state file for incremental sync |
| dry_run | bool | no | Preview-only mode |
| bidirectional | bool | no | Two-way sync |
| source_table | string | no | SQLite table name (source) |
| dest_table | string | no | SQLite table name (dest) |
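
A config file can be checked against the schema above before a run. The sketch below transcribes the table into a dictionary and reports problems as a list of messages; the `validate_config` helper and the choice to reject unknown keys are assumptions, not documented behavior of Data Sync.

```python
import json

# (expected type, required) per key, transcribed from the Config Schema table.
SCHEMA = {
    "source": (str, True),
    "dest": (str, True),
    "key": (str, False),
    "strategy": (str, False),
    "timestamp_field": (str, False),
    "state": (str, False),
    "dry_run": (bool, False),
    "bidirectional": (bool, False),
    "source_table": (str, False),
    "dest_table": (str, False),
}


def validate_config(config: dict) -> list[str]:
    """Return a list of human-readable problems; an empty list means the config passes."""
    errors = []
    for name, (expected, required) in SCHEMA.items():
        if name not in config:
            if required:
                errors.append(f"missing required key: {name}")
            continue
        value = config[name]
        if not isinstance(value, expected):
            errors.append(
                f"{name}: expected {expected.__name__}, got {type(value).__name__}"
            )
    # Assumption: keys outside the schema are treated as mistakes.
    for name in sorted(config.keys() - SCHEMA.keys()):
        errors.append(f"unknown key: {name}")
    return errors


good = json.loads('{"source": "a.json", "dest": "b.json", "dry_run": false}')
bad = {"dest": 5, "dry_run": "no", "extra": 1}
print(validate_config(good))
print(validate_config(bad))
```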

Chapter 3

Output

Available in the full product.

Chapter 4

License

Available in the full product.

Data Sync v1.0.0 — Free Preview