Data Automation Engine is an operational data automation workflow built with Python, SQL, PostgreSQL, and Docker, backed by automated tests and a CI/CD pipeline. It demonstrates spreadsheet ingestion, business rule validation, data transformation, reporting, logging, and error summaries, with optional PowerShell support for Windows/Azure-style operational environments. No private client data is exposed.
Overview
The project models a practical data workflow: receive input files, validate business rules, normalize records, generate reports, export logs, and produce an actionable summary for operational teams.
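The stages above can be sketched as a small pipeline. This is a minimal illustration using only the standard library, not the repository's actual code: the column names (`order_id`, `amount`), the function names, and the single rule that `amount` must be a positive number are all assumptions made for the example.

```python
import csv
import io


def read_rows(text):
    """Receive input: parse CSV text into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text)))


def normalize(row):
    """Normalize a record: trim whitespace around every value."""
    return {key: (value or "").strip() for key, value in row.items()}


def validate(row):
    """Hypothetical business rule: amount must be a positive number."""
    try:
        return float(row["amount"]) > 0
    except (KeyError, ValueError):
        return False


def run_pipeline(text):
    """Run all stages and return valid rows, rejected rows, and a summary."""
    rows = [normalize(row) for row in read_rows(text)]
    valid = [row for row in rows if validate(row)]
    rejected = [row for row in rows if not validate(row)]
    summary = {"total": len(rows), "valid": len(valid), "rejected": len(rejected)}
    return valid, rejected, summary


# Assumed sample input: one clean row, one negative amount, one non-numeric amount.
SAMPLE = "order_id,amount\nA1, 10.5 \nA2,-3\nA3,abc\n"
valid, rejected, summary = run_pipeline(SAMPLE)
print(summary)
```

Keeping each stage as a separate function means a real implementation could swap the CSV reader for a Pandas or Polars spreadsheet loader without touching validation or reporting.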
Tech Stack
- Python
- Pandas or Polars
- SQL
- PostgreSQL
- Docker
- Automated tests
- GitHub Actions
- PowerShell scripts when useful for Windows/Azure workflows
Architecture
- CLI/API entry point for operational jobs.
- Spreadsheet ingestion and schema validation.
- Rule engine for data quality checks.
- Report generation and simulated asset output.
- Error summary for support and reprocessing.
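One way the rule engine and error summary could fit together is shown below. This is a sketch under stated assumptions rather than the project's implementation: the `Rule` class, the two rules, and the record fields (`customer`, `quantity`) are hypothetical names chosen for illustration.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Rule:
    name: str
    check: Callable[[dict], bool]  # returns True when the record passes


# Hypothetical rules; real ones would come from the business domain.
RULES = [
    Rule("customer_required", lambda r: bool(r.get("customer", "").strip())),
    Rule("quantity_positive", lambda r: r.get("quantity", 0) > 0),
]


def apply_rules(records, rules=RULES):
    """Return (passed, errors); each error names the row and the failed rule."""
    passed, errors = [], []
    for index, record in enumerate(records, start=1):
        failed = [rule.name for rule in rules if not rule.check(record)]
        if failed:
            errors.extend({"row": index, "rule": name} for name in failed)
        else:
            passed.append(record)
    return passed, errors


def summarize(errors):
    """Count failures per rule so support can see what to fix first."""
    return dict(Counter(error["rule"] for error in errors))


# Assumed sample records: one clean, three with rule violations.
records = [
    {"customer": "Acme", "quantity": 2},
    {"customer": "", "quantity": 5},
    {"customer": "Globex", "quantity": 0},
    {"customer": " ", "quantity": -1},
]
passed, errors = apply_rules(records)
print(summarize(errors))
```

Expressing rules as named data rather than inline `if` statements keeps the error summary tied to rule names, which is what makes it usable for support and reprocessing.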
Production Practices
- CI/CD pipeline with GitHub Actions.
- Docker-based local execution.
- Relational database examples with PostgreSQL.
- Logs and error files for traceability.
- Clear separation between input, processing, output, and operational reporting.
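The logging and error-file practice can be illustrated with a short sketch. The logger name, the `errors.csv` file name, the column layout, and the `output` directory are assumptions for this example, not names taken from the repository.

```python
import csv
import logging
import tempfile
from pathlib import Path

logging.basicConfig(level=logging.INFO, format="%(asctime)s %(levelname)s %(message)s")
logger = logging.getLogger("data_automation")  # hypothetical logger name


def write_error_file(errors, output_dir):
    """Write rejected rows to errors.csv so they can be fixed and reprocessed."""
    output_dir = Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)  # output kept apart from input
    path = output_dir / "errors.csv"
    with path.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=["row", "rule", "value"])
        writer.writeheader()
        writer.writerows(errors)
    logger.info("Wrote %d error(s) to %s", len(errors), path)
    return path


# Assumed sample errors, shaped like the output of a validation step.
errors = [
    {"row": 2, "rule": "customer_required", "value": ""},
    {"row": 3, "rule": "quantity_positive", "value": "0"},
]
with tempfile.TemporaryDirectory() as tmp:
    error_path = write_error_file(errors, Path(tmp) / "output")
    lines = error_path.read_text(encoding="utf-8").splitlines()
print(lines)
```

Writing errors to a file in a dedicated output directory, and logging the path, gives operators one place to look when a job needs to be rerun.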
Operational Notes
The repository is designed as a public stand-in for confidential automation work. It demonstrates SQL, data automation, backend and data engineering, workflow automation, operational data handling, and production-ready delivery practices.