IMTI

Architecting, Developing, SRE, DevOps, AI/ML

PostgreSQL to OpenSearch with PySpark on Kubernetes

Date-windowed ETL, idempotent upserts, and CronJob scheduling

Moving analytics data from a PostgreSQL cluster managed by the Zalando Postgres Operator into OpenSearch for full-text search and dashboarding looks simple on paper. In practice, it requires date-windowed reads over JDBC, deterministic document IDs so that reruns upsert rather than duplicate, CronJob scheduling, and Kubernetes secret injection. This article documents the full pattern.
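
To make the first two ingredients concrete before going further, here is a minimal sketch in plain Python (no Spark required) of how a date window and a deterministic document ID might be computed. The function names, the `events` table, and the `created_at` column are hypothetical and used only for illustration; the choice of SHA-256 over the table name and primary key is one possible scheme, not necessarily the one a given pipeline uses.

```python
import hashlib
from datetime import date, timedelta
from typing import Tuple


def date_window(run_date: date, days: int = 1) -> Tuple[date, date]:
    """Return the half-open window [start, end) of `days` full days before run_date."""
    end = run_date
    start = end - timedelta(days=days)
    return start, end


def pushdown_query(table: str, column: str, start: date, end: date) -> str:
    """Build a parenthesized subquery usable as Spark's JDBC `dbtable` option.

    The date filter is evaluated by PostgreSQL, so only rows inside the
    window travel over JDBC. Table and column names are assumed to be trusted.
    """
    return (
        f"(SELECT * FROM {table} "
        f"WHERE {column} >= DATE '{start.isoformat()}' "
        f"AND {column} < DATE '{end.isoformat()}') AS windowed"
    )


def document_id(table: str, primary_key: object) -> str:
    """Derive a stable document ID from the source table and primary key (assumed scheme)."""
    key = f"{table}:{primary_key}".encode("utf-8")
    return hashlib.sha256(key).hexdigest()


if __name__ == "__main__":
    start, end = date_window(date(2024, 3, 10))
    print(pushdown_query("events", "created_at", start, end))
    print(document_id("events", 42))
```

Because the ID depends only on the source row's identity, re-running the job for the same window produces the same IDs, and indexing a document under an existing `_id` replaces it instead of creating a duplicate. The half-open window means consecutive daily runs neither overlap nor leave gaps.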