A full-stack accounts-payable (AP) automation pipeline that eliminates manual invoice processing for Mexican CFDI (Comprobante Fiscal Digital por Internet) compliance. It extracts structured vendor data from multi-format files (XML, PDF, images) and centralizes invoice lifecycle management, reducing processing latency and human error for finance teams.
Architecture & Stack
Multi-service containerized architecture orchestrated via Docker Compose.
React (TypeScript) + Nginx
  ↓
FastAPI REST API ──→ CFDIExtractionService
  ↓
PostgreSQL (SQLAlchemy ORM)
  ↓
Docker Compose (dev & prod) + GitHub Actions CI/CD

- Backend — FastAPI (Python 3.11) with async file handling via aiofiles
- Database — PostgreSQL (SQLite fallback for dev) via SQLAlchemy 2.0 ORM
- Validation — Pydantic v2 schemas enforcing the full Mexican CFDI tax invoice structure
- Frontend — React 19 + TypeScript SPA served through Nginx
- CI/CD — GitHub Actions (lint → test → Trivy security scan → Docker build)
- Optional — Redis caching layer
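The PostgreSQL-with-SQLite-fallback choice in the stack above could be resolved along these lines. This is a minimal sketch, not the project's actual configuration code: the `DATABASE_URL` variable name, the default SQLite path, and the `resolve_engine_config` helper are all assumptions made for illustration.

```python
import os
from typing import Any, Mapping, Optional

# Hypothetical default; the project's real SQLite path is not given in the text.
DEFAULT_SQLITE_URL = "sqlite:///./invoices_dev.db"


def resolve_engine_config(env: Optional[Mapping[str, str]] = None) -> dict[str, Any]:
    """Return the URL and keyword arguments to pass to SQLAlchemy's create_engine()."""
    env = os.environ if env is None else env
    # Use the configured database if present, otherwise fall back to SQLite.
    url = env.get("DATABASE_URL") or DEFAULT_SQLITE_URL
    config: dict[str, Any] = {"url": url}
    if url.startswith("sqlite"):
        # SQLite connections are bound to their creating thread by default;
        # web apps that touch a session from several threads usually relax this.
        config["connect_args"] = {"check_same_thread": False}
    return config
```

With SQLAlchemy installed, the result could be consumed as `cfg = resolve_engine_config(); engine = create_engine(cfg.pop("url"), **cfg)`, so dev and prod differ only in their environment variables.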
Key Technical Achievements
Multi-layer CFDI validation — Enforces RFC presence (emisor/receptor), required fiscal fields (uuid, folio, fecha_emision, total), and file type/size constraints, with graceful fallbacks and structured HTTP error responses to prevent corrupt records from reaching the database.
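The layering described above can be sketched in plain Python. The function names, the allowed extensions, the 10 MB limit, and the 422 status code are assumptions for illustration; only the checked fields (emisor/receptor RFC, uuid, folio, fecha_emision, total) come from the description.

```python
import os
from typing import Any, Optional

# Assumed limits; the real values are not stated in the text.
ALLOWED_EXTENSIONS = {".xml", ".pdf", ".png", ".jpg", ".jpeg"}
MAX_FILE_BYTES = 10 * 1024 * 1024
REQUIRED_FIELDS = ("uuid", "folio", "fecha_emision", "total")


def validate_upload(filename: str, size_bytes: int) -> list[str]:
    """Layer 1: reject files by type and size before any parsing happens."""
    errors = []
    ext = os.path.splitext(filename)[1].lower()
    if ext not in ALLOWED_EXTENSIONS:
        errors.append(f"unsupported file type: {ext or '(none)'}")
    if size_bytes > MAX_FILE_BYTES:
        errors.append(f"file exceeds {MAX_FILE_BYTES} bytes")
    return errors


def validate_cfdi(data: dict[str, Any]) -> list[str]:
    """Layer 2: require both RFCs and the core fiscal fields."""
    errors = []
    for party in ("emisor", "receptor"):
        party_data = data.get(party)
        rfc = party_data.get("rfc") if isinstance(party_data, dict) else None
        if not rfc:
            errors.append(f"{party}.rfc is required")
    for field in REQUIRED_FIELDS:
        if data.get(field) in (None, ""):
            errors.append(f"{field} is required")
    return errors


def error_response(errors: list[str]) -> Optional[dict[str, Any]]:
    """Layer 3: turn collected errors into one structured payload, or None if clean."""
    if not errors:
        return None
    return {
        "status_code": 422,
        "detail": {"message": "CFDI validation failed", "errors": errors},
    }
```

In a FastAPI route, a non-empty payload would typically be raised as `HTTPException(status_code=..., detail=...)` before any database write, which is what keeps incomplete records out of the invoices table.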
Clean Architecture — Strict separation between Models (SQLAlchemy), Schemas (Pydantic), Services (business logic), and API routes; dependency injection via FastAPI’s Depends(get_db) pattern; 32 pytest tests (unit + integration, async-aware via pytest-asyncio).
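The `Depends(get_db)` pattern relies on a generator dependency: the framework takes the yielded session, runs the route, then finalizes the generator so the session is always closed. The sketch below imitates that flow without FastAPI or SQLAlchemy so it runs standalone; `StubSession` and `call_with_dependency` are hypothetical stand-ins, and only the name `get_db` comes from the text.

```python
from typing import Any, Callable, Iterator


class StubSession:
    """Stand-in for a SQLAlchemy Session so the sketch needs no database."""

    def __init__(self) -> None:
        self.closed = False

    def close(self) -> None:
        self.closed = True


def get_db() -> Iterator[StubSession]:
    """Yield one session per request and close it once the request is done."""
    db = StubSession()  # in a real app this would be SessionLocal()
    try:
        yield db
    finally:
        db.close()


def call_with_dependency(handler: Callable[[StubSession], Any]) -> tuple[Any, StubSession]:
    """Imitate what the framework does around a route that declares Depends(get_db)."""
    gen = get_db()
    db = next(gen)  # run get_db up to its yield and obtain the session
    try:
        return handler(db), db
    finally:
        gen.close()  # resumes get_db at the yield, so its finally block runs
```

Because the handler receives the session instead of creating it, tests can swap in a different provider (FastAPI exposes `app.dependency_overrides` for this) without touching route code.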
Pydantic-validated CFDI schema — Models the full Mexican tax invoice structure: EmisorSchema, ReceptorSchema, line-item ConceptoSchema with tax breakdowns (traslados/retenciones), and digital seals. The schema maps to an auditable invoices table with a four-state status lifecycle (PENDING → PROCESSING → COMPLETED or FAILED).
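The four-state lifecycle can be enforced with a small transition table. The sketch below assumes that PROCESSING ends in either COMPLETED or FAILED and that both are terminal; the text does not say whether failed invoices can be retried, so treat the table and the `advance` helper as illustrative only.

```python
from enum import Enum


class InvoiceStatus(str, Enum):
    PENDING = "PENDING"
    PROCESSING = "PROCESSING"
    COMPLETED = "COMPLETED"
    FAILED = "FAILED"


# Assumed transitions: COMPLETED and FAILED are treated as terminal states.
ALLOWED_TRANSITIONS: dict[InvoiceStatus, set[InvoiceStatus]] = {
    InvoiceStatus.PENDING: {InvoiceStatus.PROCESSING},
    InvoiceStatus.PROCESSING: {InvoiceStatus.COMPLETED, InvoiceStatus.FAILED},
    InvoiceStatus.COMPLETED: set(),
    InvoiceStatus.FAILED: set(),
}


def advance(current: InvoiceStatus, target: InvoiceStatus) -> InvoiceStatus:
    """Return the new status, or raise if the move is not in the table."""
    if target not in ALLOWED_TRANSITIONS[current]:
        raise ValueError(f"illegal transition {current.value} -> {target.value}")
    return target
```

Routing every status change through one function like this keeps the audit trail consistent: a row can never jump from PENDING straight to COMPLETED without having been processed.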
Production resilience — Non-root Docker user (appuser), container health checks (30s interval, 3 retries), and environment isolation via uv, with secrets managed through .env / .env.production files.