$49
ML Pipeline Templates
End-to-end ML pipelines: data ingestion, preprocessing, training, evaluation, and deployment orchestration.
Markdown · YAML · JSON · Docker · CI/CD · Airflow
📁 File Structure (8 files)
ml-pipeline-templates/
├── LICENSE
├── README.md
├── config.example.yaml
├── docs/
│   ├── checklists/
│   │   └── pre-deployment.md
│   ├── overview.md
│   └── patterns/
│       └── pattern-01-train-evaluate-deploy.md
└── templates/
    └── config.yaml
📖 Documentation Preview (README excerpt)
ML Pipeline Templates
End-to-end ML pipeline templates covering the full lifecycle: data ingestion, preprocessing, feature engineering, model training, evaluation, and deployment. Built for reproducibility and automation.
What's Included
- Data ingestion pipeline templates (batch and streaming)
- Preprocessing and feature engineering stages
- Training pipeline with hyperparameter configuration
- Model evaluation and comparison stages
- Deployment pipeline with validation gates
- Orchestration configs for Airflow, Prefect, and Kubeflow
- Pipeline testing and CI/CD integration
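The preprocessing stages above are driven by named steps in the config (see `data.preprocessing.steps` in config.example.yaml). As a minimal sketch of how such string-named steps could be dispatched — the step implementations and the `STEPS` registry here are illustrative stand-ins, not the templates' actual code:

```python
def remove_nulls(rows):
    """Drop any row containing a None value (illustrative implementation)."""
    return [r for r in rows if all(v is not None for v in r.values())]

# Registry mapping config step names to callables; the real templates
# presumably cover normalize_features and encode_categoricals as well.
STEPS = {"remove_nulls": remove_nulls}

def run_preprocessing(rows, step_names):
    """Apply each configured step, in order, to the row list."""
    for name in step_names:
        rows = STEPS[name](rows)
    return rows

data = [{"x": 1, "y": 2}, {"x": None, "y": 3}]
print(run_preprocessing(data, ["remove_nulls"]))  # [{'x': 1, 'y': 2}]
```

Keeping steps as plain strings in YAML keeps the pipeline definition declarative: changing the preprocessing order is a config edit, not a code change.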
Quick Start
# 1. Copy the example config
cp config.example.yaml config.yaml
# 2. Install dependencies
pip install -r requirements.txt
# 3. Run the example pipeline locally
python -m pipelines.train_pipeline --config config.yaml
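A minimal sketch of how an entrypoint like `pipelines.train_pipeline` might parse its `--config` flag and sanity-check the loaded YAML — `build_parser` and `load_config` are assumed names for illustration, and PyYAML is assumed as the parser; this is not the product's actual code:

```python
import argparse
import yaml  # PyYAML


def build_parser():
    """CLI surface matching the Quick Start invocation above."""
    parser = argparse.ArgumentParser(description="Run the training pipeline")
    parser.add_argument("--config", default="config.yaml",
                        help="Path to a YAML pipeline config")
    return parser


def load_config(path):
    """Load the config and apply a basic sanity check on the data split."""
    with open(path) as f:
        cfg = yaml.safe_load(f)
    split = cfg["data"]["split"]
    total = split["train"] + split["validation"] + split["test"]
    if abs(total - 1.0) > 1e-9:
        raise ValueError("data.split fractions must sum to 1.0")
    return cfg
```

With this shape, the Quick Start command amounts to `load_config(build_parser().parse_args().config)` followed by the training run.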
Prerequisites
- Python 3.9+
- Pipeline orchestrator (Airflow, Prefect, or Kubeflow)
- Cloud storage for artifacts (S3/GCS)
- Docker (for containerized pipeline steps)
Contents
ml-pipeline-templates/
  config.example.yaml
  docs/
    overview.md
    patterns/
      pattern-01-*.md
    checklists/
      pre-deployment.md
  templates/
    config.yaml
Support
For questions or issues, contact: megafolder122122@hotmail.com
License
MIT License - Copyright 2026 Jesse Mikkola. See LICENSE for details.
📄 Code Sample (.yaml preview)
config.example.yaml
# ML Pipeline Templates - Example Configuration
# Copy this file to config.yaml and update values for your environment
pipeline:
  name: "training-pipeline"
  orchestrator: "prefect"  # airflow, prefect, kubeflow
  schedule: null           # "0 0 * * *" for daily

data:
  source:
    type: "file"  # file, database, api, s3
    path: "./data/raw/"
  preprocessing:
    steps:
      - "remove_nulls"
      - "normalize_features"
      - "encode_categoricals"
  split:
    train: 0.7
    validation: 0.15
    test: 0.15
    seed: 42

training:
  model_type: "sklearn"  # sklearn, pytorch, tensorflow
  algorithm: "random_forest"
  params:
    n_estimators: 100
    max_depth: 10
    epochs: null  # For deep learning models

evaluation:
  metrics:
    - "accuracy"
    - "f1_score"
    - "precision"
    - "recall"
  threshold:
    min_accuracy: 0.85

deployment:
  enabled: false
  target: "local"  # local, kubernetes, sagemaker

logging:
  level: "INFO"
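The `evaluation.threshold` block pairs with the deployment "validation gates" mentioned in the README: a run should only promote a model if every configured minimum is met. A hedged sketch of that gating logic — the `passes_thresholds` function and the `min_<metric>` key convention are illustrative assumptions mirroring the config shape, not the templates' actual code:

```python
def passes_thresholds(metrics, thresholds):
    """Return True only if every min_<metric> threshold is satisfied.

    metrics    -- computed evaluation results, e.g. {"accuracy": 0.91}
    thresholds -- the config's evaluation.threshold mapping,
                  e.g. {"min_accuracy": 0.85}
    Missing metrics count as failures rather than silent passes.
    """
    for key, minimum in thresholds.items():
        metric_name = key.removeprefix("min_")  # Python 3.9+
        if metrics.get(metric_name, 0.0) < minimum:
            return False
    return True


config_threshold = {"min_accuracy": 0.85}
run_metrics = {"accuracy": 0.91, "f1_score": 0.88}
print(passes_thresholds(run_metrics, config_threshold))  # True
```

Treating an absent metric as a failure is a deliberately conservative choice for a deployment gate: a typo in a metric name blocks promotion instead of waving a model through.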