This chapter covers the core features and capabilities of Fine-Tuning Pipeline.
# Run demo with sample data
python src/fine_tuning_pipeline.py --demo
# Run the full pipeline: clean → convert → validate → split
python src/fine_tuning_pipeline.py --input raw_data.jsonl --output prepared/
# Validate a dataset
python src/fine_tuning_pipeline.py --validate dataset.jsonl
# Show dataset statistics
python src/fine_tuning_pipeline.py --stats dataset.jsonl
# Split with custom ratio
python src/fine_tuning_pipeline.py --split dataset.jsonl --ratio 0.9 --output prepared/Follow this guide to get Fine-Tuning Pipeline up and running in your environment.
fine-tuning-pipeline/
├── README.md
├── LICENSE
├── src/
│ └── fine_tuning_pipeline.py # Core engine (~430 lines)
└── examples/
├── basic_usage.py # Programmatic usage example
└── sample_training_data.jsonl # Sample data in mixed formats
| Flag | Description | |
|---|---|---|
--demo | Run demo with sample data | |
--input FILE | Input data file (JSONL) | |
--output DIR | Output directory (default: ./prepared) | |
--validate FILE | Validate a dataset file | |
--stats FILE | Show dataset statistics | |
--split FILE | Split a dataset into train/test | |
--ratio FLOAT | Train/test split ratio (default: 0.8) | |
| `--format chat\ | completion` | Target output format (default: chat) |
--no-clean | Skip text cleaning | |
--seed INT | Random seed for reproducible splits |
Get the full Fine-Tuning Pipeline and unlock everything.
Get the complete guide with every chapter unlocked, including code samples, diagrams, and best practices.
Access all interactive tools with complete data, all workload profiles, and the full scenario library.
Downloadable source code, configuration files, and working examples from every chapter.
Free updates for life. Every new chapter, tool, and improvement included.