Databricks Starter Kit
€29
Production-ready templates for building data platforms on Databricks with Unity Catalog and Delta Lake. Skip the months of trial and error — get the same patterns used by data platform teams at scale.
Python · Databricks · Delta Lake · Unity Catalog · Azure DevOps · GitHub Actions
📁 File Structure (22 files)
databricks-starter-kit/
├── README.md
├── LICENSE
│
├── config/
│   ├── environment.py
│   ├── secrets.py
│   └── logging_config.py
│
├── medallion_bootstrap/
│   ├── config.py
│   ├── 01_create_catalogs.py
│   ├── 02_create_schemas.py
│   └── 03_grant_permissions.py
│
├── ingestion_templates/
│   ├── base_pipeline.py
│   ├── api_ingestion.py
│   ├── database_ingestion.py
│   ├── file_ingestion.py
│   └── streaming_ingestion.py
│
├── cicd_templates/
│   ├── azure_devops_pipeline.yml
│   ├── github_actions_workflow.yml
│   ├── deploy_notebooks.py
│   └── run_tests.py
│
└── unity_catalog_setup/
    ├── setup_catalogs.sql
    ├── setup_external_locations.sql
    ├── setup_credentials.sql
    └── data_governance_policies.md
📖 Documentation Preview (README excerpt)
What's Inside
- config/ — Environment detection, secret management, structured logging
- medallion_bootstrap/ — One-command setup of your bronze/silver/gold catalog structure with RBAC
- ingestion_templates/ — Battle-tested pipelines for APIs, databases, files, and streaming
- cicd_templates/ — Azure DevOps & GitHub Actions pipelines, deployment scripts, test runner
- unity_catalog_setup/ — SQL scripts for catalogs, external locations, credentials, and governance
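The medallion bootstrap boils down to a handful of SQL statements per layer. As a rough sketch of the kind of statements `01_create_catalogs.py` generates (the `build_catalog_statements` helper and layer names below are illustrative, not the shipped code):

```python
# Hypothetical sketch: build the CREATE CATALOG statements for a
# bronze/silver/gold medallion layout, prefixed by environment.
# The real kit would run these through spark.sql() in a notebook.

MEDALLION_LAYERS = ("bronze", "silver", "gold")

def build_catalog_statements(env_prefix: str, layers=MEDALLION_LAYERS):
    """Return one CREATE CATALOG IF NOT EXISTS statement per layer."""
    return [
        f"CREATE CATALOG IF NOT EXISTS {env_prefix}_{layer}"
        for layer in layers
    ]

# Inside a Databricks notebook you would execute each statement:
# for stmt in build_catalog_statements("dev"):
#     spark.sql(stmt)
```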
Quick Start
# 1. Upload to a Databricks Repo
databricks repos create \
  --url https://github.com/your-org/databricks-starter-kit \
  --provider github
# 2. Configure environment.py with your workspace IDs
# 3. Bootstrap the medallion architecture
%run ./medallion_bootstrap/01_create_catalogs
%run ./medallion_bootstrap/02_create_schemas
%run ./medallion_bootstrap/03_grant_permissions
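For step 2, the workspace-to-environment mapping in `environment.py` could be as simple as a dict keyed by workspace ID. A minimal sketch (the IDs and the `environment_for_workspace` helper are placeholders, not the shipped configuration):

```python
# Hypothetical mapping of Databricks workspace IDs (placeholder values)
# to environment names; the detection logic can look up the current
# workspace's ID and resolve its environment from this table.
WORKSPACE_ENVIRONMENTS = {
    "1111111111111111": "dev",
    "2222222222222222": "staging",
    "3333333333333333": "prod",
}

def environment_for_workspace(workspace_id: str) -> str:
    """Resolve an environment name, falling back to dev for unknown IDs."""
    return WORKSPACE_ENVIRONMENTS.get(workspace_id, "dev")
```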
Requirements
- Databricks Runtime 13.x or later
- Python 3.10+
- Unity Catalog enabled on workspace
- Delta Lake (included with DBR 13.x+)
📄 Code Sample (.py preview)
config/environment.py
"""
Environment Configuration for Databricks Pipelines
====================================================
Provides environment-aware configuration using dataclasses.
Automatically detects the current Databricks workspace and
returns the appropriate environment settings (dev, staging, prod).
Usage (in a Databricks notebook):

    from config.environment import get_environment

    env = get_environment()
    print(env.name)            # "dev"
    print(env.catalog_prefix)  # "dev"
    print(env.storage_root)    # "abfss://raw@styourorgdev..."
"""
from __future__ import annotations
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional
class EnvironmentName(str, Enum):
    """Supported environment names."""

    DEV = "dev"
    STAGING = "staging"
    PROD = "prod"


@dataclass(frozen=True)
class StorageConfig:
    # ... remaining implementation in full product
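One plausible shape for the rest of the module, based only on the docstring above: detect the environment from the workspace URL and return a frozen dataclass. This is a sketch, not the shipped code; the shipped `get_environment()` takes no arguments and presumably reads the URL from Spark configuration, the URL hints are assumptions, and the storage account name is a labeled placeholder:

```python
from __future__ import annotations

from dataclasses import dataclass
from enum import Enum


class EnvironmentName(str, Enum):
    """Supported environment names."""

    DEV = "dev"
    STAGING = "staging"
    PROD = "prod"


@dataclass(frozen=True)
class Environment:
    """Resolved settings for one environment (illustrative fields)."""

    name: EnvironmentName
    catalog_prefix: str
    storage_root: str


# Hypothetical URL fragments mapped to environments; the real module
# might key on workspace IDs or tags instead.
_URL_HINTS = {
    "-dev": EnvironmentName.DEV,
    "-staging": EnvironmentName.STAGING,
}


def get_environment(workspace_url: str) -> Environment:
    """Map a workspace URL to its environment, defaulting to prod."""
    name = EnvironmentName.PROD
    for hint, env in _URL_HINTS.items():
        if hint in workspace_url:
            name = env
            break
    return Environment(
        name=name,
        catalog_prefix=name.value,
        # Placeholder storage account; substitute your real account name.
        storage_root=f"abfss://raw@storage{name.value}.example",
    )
```

Freezing the dataclass keeps environment settings immutable once resolved, so a notebook cannot accidentally repoint `storage_root` mid-run.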