A collection of data engineering pipelines, machine learning systems, and full-stack applications I’ve built and documented.
Each project prioritizes production-grade engineering: reproducible environments, modular codebases, and real-world data problems. The work spans MLOps workflows, ELT pipelines, data warehousing, and full-stack automation.
Mexico Air Quality Monitoring Pipeline
Production-grade ELT pipeline centralizing air quality data from 300+ monitoring stations across 121 Mexican cities into a BigQuery analytical warehouse with a Looker Studio dashboard.
Airline Passenger Satisfaction ML
End-to-end MLOps pipeline predicting passenger satisfaction in real time using XGBoost, enabling airline operations teams to trigger proactive customer service interventions.
Accounts Payable Automation System
Full-stack AP automation pipeline for Mexican CFDI invoice processing, with structured extraction from XML/PDF/image files and a React frontend for invoice lifecycle management.
NYC Taxi ELT Pipeline
ELT pipeline centralizing NYC taxi trip data from a paginated REST API into DuckDB, with interactive analytical reporting on fare patterns, tip behavior, and temporal demand.
Open Library DLT Pipeline
ELT pipeline centralizing bibliographic data from the Open Library public API into DuckDB, with an Ibis query layer and Altair visualizations in a Marimo interactive notebook.
Media Hit Prediction ML
End-to-end MLOps pipeline predicting whether a film will be a commercial and critical hit, enabling data-driven decisions on marketing budgets and production resource allocation.