A configurable Extract-Transform-Load (ETL) pipeline framework built on the Python standard library. Pull data from JSON files, CSV files, SQLite databases, or HTTP APIs, transform it with a composable plugin architecture, and load the results into JSON, CSV, SQLite, or stdout.
Key features of ETL Pipeline
Multi-source extraction — JSON files, CSV files, SQLite databases, HTTP APIs
Plugin-based transforms — Chain filters, mappers, renamers, and custom transforms
Multi-target loading — Write to JSON, CSV, SQLite, or stdout
Config file support — Define complete pipelines in JSON for repeatable ETL jobs
Batch processing — Configurable batch sizes for memory-efficient large datasets
Pipeline stats — Track records extracted, transformed, loaded, and errors
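To make the transform chaining, batching, and stats features above concrete, here is a minimal sketch in plain Python. It is not the toolkit's actual API: the names `make_filter`, `make_mapper`, `make_renamer`, `batched`, and `run_pipeline`, and the convention that a transform returns `None` to drop a record, are all assumptions made for illustration.

```python
from itertools import islice
from typing import Callable, Iterable, Iterator, List, Optional

Record = dict
# Hypothetical convention: a transform returns a record, or None to drop it.
Transform = Callable[[Record], Optional[Record]]


def make_filter(predicate: Callable[[Record], bool]) -> Transform:
    """Keep only the records for which predicate(record) is true."""
    return lambda rec: rec if predicate(rec) else None


def make_mapper(field: str, func: Callable) -> Transform:
    """Replace one field's value with func(value)."""
    return lambda rec: {**rec, field: func(rec[field])}


def make_renamer(mapping: dict) -> Transform:
    """Rename keys according to mapping; other keys are left unchanged."""
    return lambda rec: {mapping.get(k, k): v for k, v in rec.items()}


def batched(records: Iterable[Record], size: int) -> Iterator[List[Record]]:
    """Yield lists of at most `size` records so only one batch is held at a time."""
    it = iter(records)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def run_pipeline(records, transforms, batch_size=2):
    """Apply the transform chain batch by batch and collect simple stats."""
    stats = {"extracted": 0, "transformed": 0, "loaded": 0, "errors": 0}
    loaded = []  # stand-in for a real target such as a JSON file or SQLite table
    for batch in batched(records, batch_size):
        stats["extracted"] += len(batch)
        out = []
        for rec in batch:
            try:
                for transform in transforms:
                    rec = transform(rec)
                    if rec is None:  # dropped by a filter
                        break
            except Exception:
                stats["errors"] += 1
                continue
            if rec is not None:
                stats["transformed"] += 1
                out.append(rec)
        loaded.extend(out)
        stats["loaded"] += len(out)
    return loaded, stats


rows = [
    {"name": "Ada", "age": "36"},
    {"name": "Bob", "age": "17"},
    {"name": "Cy", "age": "n/a"},  # int("n/a") fails and is counted as an error
]
transforms = [
    make_mapper("age", int),
    make_filter(lambda r: r["age"] >= 18),
    make_renamer({"name": "full_name"}),
]
result, stats = run_pipeline(rows, transforms)
print(result)
print(stats)
```

In this sketch a failing record is counted and skipped rather than aborting the run, which is one plausible reading of the "errors" counter; the real framework may handle failures differently.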
Example usage of ETL Pipeline from the command line:
# CSV to JSON conversion
python src/etl_pipeline.py --source data.csv --dest output.json

# JSON API to SQLite
python src/etl_pipeline.py --source https://api.example.com/v1/users --dest users.db --table users

# Full pipeline from config
python src/etl_pipeline.py --config examples/pipeline_config.
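For readers who want a feel for what a CSV-to-JSON conversion and a load into SQLite involve, the following standard-library sketch performs both steps on a small temporary file. It does not reproduce the internals of `src/etl_pipeline.py`; the helper names `csv_to_json` and `records_to_sqlite` are hypothetical, and the file and table names only mirror the commands above.

```python
import csv
import json
import sqlite3
import tempfile
from pathlib import Path


def csv_to_json(src: Path, dest: Path) -> int:
    """Read a CSV file with a header row and write it as a JSON array of objects."""
    with src.open(newline="", encoding="utf-8") as f:
        rows = list(csv.DictReader(f))
    dest.write_text(json.dumps(rows, indent=2), encoding="utf-8")
    return len(rows)


def records_to_sqlite(rows, db_path: Path, table: str) -> int:
    """Create the table if needed (untyped columns) and insert all records."""
    columns = list(rows[0].keys())
    col_sql = ", ".join(f'"{c}"' for c in columns)
    placeholders = ", ".join("?" for _ in columns)
    conn = sqlite3.connect(str(db_path))
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(f'CREATE TABLE IF NOT EXISTS "{table}" ({col_sql})')
            conn.executemany(
                f'INSERT INTO "{table}" ({col_sql}) VALUES ({placeholders})',
                [tuple(r.get(c) for c in columns) for r in rows],
            )
    finally:
        conn.close()
    return len(rows)


with tempfile.TemporaryDirectory() as tmp:
    tmp_dir = Path(tmp)

    # Sample input standing in for data.csv
    src = tmp_dir / "data.csv"
    src.write_text("id,name\n1,Ada\n2,Bob\n", encoding="utf-8")

    # CSV -> JSON
    dest = tmp_dir / "output.json"
    n_json = csv_to_json(src, dest)
    rows = json.loads(dest.read_text(encoding="utf-8"))

    # JSON records -> SQLite table "users"
    db = tmp_dir / "users.db"
    n_db = records_to_sqlite(rows, db, "users")

    conn = sqlite3.connect(str(db))
    stored = conn.execute('SELECT "id", "name" FROM "users" ORDER BY "id"').fetchall()
    conn.close()

print(n_json, n_db)
print(stored)
```

Note that `csv.DictReader` yields every value as a string, so any type conversion (for example turning `id` into an integer) would be a separate transform step.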