Databricks Starter Kit
€29
Production-ready templates for building data platforms on Databricks with Unity Catalog and Delta Lake. Skip the months of trial and error — get the same patterns used by data platform teams at scale.
Python · Databricks · Delta Lake · Unity Catalog · Azure DevOps · GitHub Actions
📁 File Structure (22 files)
databricks-starter-kit/
├── README.md
├── LICENSE
│
├── config/
│   ├── environment.py
│   ├── secrets.py
│   └── logging_config.py
│
├── medallion_bootstrap/
│   ├── config.py
│   ├── 01_create_catalogs.py
│   ├── 02_create_schemas.py
│   └── 03_grant_permissions.py
│
├── ingestion_templates/
│   ├── base_pipeline.py
│   ├── api_ingestion.py
│   ├── database_ingestion.py
│   ├── file_ingestion.py
│   └── streaming_ingestion.py
│
├── cicd_templates/
│   ├── azure_devops_pipeline.yml
│   ├── github_actions_workflow.yml
│   ├── deploy_notebooks.py
│   └── run_tests.py
│
└── unity_catalog_setup/
    ├── setup_catalogs.sql
    ├── setup_external_locations.sql
    ├── setup_credentials.sql
    └── data_governance_policies.md
📖 Documentation Preview (README excerpt)
What's Inside
- config/ — Environment detection, secret management, structured logging
- medallion_bootstrap/ — One-command setup of your bronze/silver/gold catalog structure with RBAC
- ingestion_templates/ — Battle-tested pipelines for APIs, databases, files, and streaming
- cicd_templates/ — Azure DevOps & GitHub Actions pipelines, deployment scripts, test runner
- unity_catalog_setup/ — SQL scripts for catalogs, external locations, credentials, and governance
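The medallion bootstrap boils down to a handful of SQL statements per layer. As a rough sketch of the kind of statements `01_create_catalogs.py` generates (the `build_catalog_statements` helper and layer names below are illustrative, not the shipped code):

```python
# Hypothetical sketch: build the CREATE CATALOG statements for a
# bronze/silver/gold medallion layout, prefixed by environment.
# The real kit would run these through spark.sql() in a notebook.

MEDALLION_LAYERS = ("bronze", "silver", "gold")

def build_catalog_statements(env_prefix: str, layers=MEDALLION_LAYERS):
    """Return one CREATE CATALOG IF NOT EXISTS statement per layer."""
    return [
        f"CREATE CATALOG IF NOT EXISTS {env_prefix}_{layer}"
        for layer in layers
    ]

# Inside a Databricks notebook you would execute each statement:
# for stmt in build_catalog_statements("dev"):
#     spark.sql(stmt)
```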
Quick Start
# 1. Upload to a Databricks Repo
databricks repos create \
  --url https://github.com/your-org/databricks-starter-kit \
  --provider github
# 2. Configure environment.py with your workspace IDs
# 3. Bootstrap the medallion architecture
%run ./medallion_bootstrap/01_create_catalogs
%run ./medallion_bootstrap/02_create_schemas
%run ./medallion_bootstrap/03_grant_permissions
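For step 2, the workspace-to-environment mapping in `environment.py` could be as simple as a dict keyed by workspace ID. A minimal sketch (the IDs and the `environment_for_workspace` helper are placeholders, not the shipped configuration):

```python
# Hypothetical mapping of Databricks workspace IDs (placeholder values)
# to environment names; the detection logic can look up the current
# workspace's ID and resolve its environment from this table.
WORKSPACE_ENVIRONMENTS = {
    "1111111111111111": "dev",
    "2222222222222222": "staging",
    "3333333333333333": "prod",
}

def environment_for_workspace(workspace_id: str) -> str:
    """Resolve an environment name, falling back to dev for unknown IDs."""
    return WORKSPACE_ENVIRONMENTS.get(workspace_id, "dev")
```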
Requirements
- Databricks Runtime 13.x or later
- Python 3.10+
- Unity Catalog enabled on workspace
- Delta Lake (included with DBR 13.x+)
📄 Code Sample (.py preview)
config/environment.py
"""
Environment Configuration for Databricks Pipelines
====================================================
Provides environment-aware configuration using dataclasses.
Automatically detects the current Databricks workspace and
returns the appropriate environment settings (dev, staging, prod).
Usage (in a Databricks notebook):

    from config.environment import get_environment

    env = get_environment()
    print(env.name)            # "dev"
    print(env.catalog_prefix)  # "dev"
    print(env.storage_root)    # "abfss://raw@styourorgdev..."
"""
from __future__ import annotations
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional
class EnvironmentName(str, Enum):
    """Supported environment names."""

    DEV = "dev"
    STAGING = "staging"
    PROD = "prod"


@dataclass(frozen=True)
class StorageConfig:
    # ... remaining implementation in full product
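One plausible shape for the rest of the module, based only on the docstring above: detect the environment from the workspace URL and return a frozen dataclass. This is a sketch, not the shipped code; the shipped `get_environment()` takes no arguments and presumably reads the URL from Spark configuration, the URL hints are assumptions, and the storage account name is a labeled placeholder:

```python
from __future__ import annotations

from dataclasses import dataclass
from enum import Enum


class EnvironmentName(str, Enum):
    """Supported environment names."""

    DEV = "dev"
    STAGING = "staging"
    PROD = "prod"


@dataclass(frozen=True)
class Environment:
    """Resolved settings for one environment (illustrative fields)."""

    name: EnvironmentName
    catalog_prefix: str
    storage_root: str


# Hypothetical URL fragments mapped to environments; the real module
# might key on workspace IDs or tags instead.
_URL_HINTS = {
    "-dev": EnvironmentName.DEV,
    "-staging": EnvironmentName.STAGING,
}


def get_environment(workspace_url: str) -> Environment:
    """Map a workspace URL to its environment, defaulting to prod."""
    name = EnvironmentName.PROD
    for hint, env in _URL_HINTS.items():
        if hint in workspace_url:
            name = env
            break
    return Environment(
        name=name,
        catalog_prefix=name.value,
        # Placeholder storage account; substitute your real account name.
        storage_root=f"abfss://raw@storage{name.value}.example",
    )
```

Freezing the dataclass keeps environment settings immutable once resolved, so a notebook cannot accidentally repoint `storage_root` mid-run.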