Every platform needs a database, and for most of what you build it should be PostgreSQL. The managed versions, RDS and Cloud SQL and the rest, are convenient, and they bill you for that convenience while tying your data to one provider’s backup format and failover behavior. I can run it on the storage I just set up instead: a highly available Postgres that fails over on its own, backs itself up, and runs entirely on infrastructure I own. The operator that makes this boring is CloudNativePG.
This series rebuilds my 2020 Apress book, Advanced Platform Development with Kubernetes, for 2026. The approach behind it comes from building and running data platforms in production for more than twenty years.
§Why Postgres, and Why an Operator
The original book used MySQL for its relational needs. I replaced MySQL with Postgres years ago and have not looked back. Postgres is my default RDBMS, and it is flexible enough that it quietly absorbs jobs you might otherwise stand up a whole separate system for. The pgvector extension makes it a capable vector database when a solution does not warrant a full Milvus. LISTEN/NOTIFY, paired with a jobs table for durability, gives you a real job queue when you do not need a full Kafka, which I have written about. Reaching for Postgres first means one database I know well covers a lot of ground.
It does not cover everything, and this platform does not pretend it does. Cassandra is still the answer when you need extreme write throughput with bulletproof high availability, and OpenSearch when you need a strong search index, especially for aggregations across large datasets. Both show up later in the series for those jobs. Postgres is where you start.
Running a database on Kubernetes used to make people nervous, and fairly so. Databases are stateful, with strong opinions about disk, identity, and the order in which things start, and early Kubernetes was built for the stateless opposite. That objection is the reason the 2020 book ran its relational database by hand. It is also obsolete. CloudNativePG is a CNCF operator that treats a Postgres cluster as a first-class Kubernetes resource and handles the hard parts: streaming replication, automated failover with leader election, rolling minor-version upgrades, connection pooling, and continuous backup with point-in-time recovery. You describe the cluster you want in one manifest, and the operator runs it the way a good DBA would. This is the pattern the series leans on, an expert encoded as an operator, with an agent on top of that. The thing managed RDS sells you, someone competent operating the database, is now software you run yourself.
§Install the Operator
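CloudNativePG installs from a single release manifest. Pin an exact version rather than tracking a branch; this post uses 1.25.0, so check the project’s releases for the latest before you apply it. The --server-side flag matters because the CRDs are too large for the last-applied annotation that client-side apply writes.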
kubectl apply --server-side -f \
https://raw.githubusercontent.com/cloudnative-pg/cloudnative-pg/release-1.25/releases/cnpg-1.25.0.yaml
It installs into the cnpg-system namespace. Wait for the controller before continuing.
kubectl -n cnpg-system rollout status deployment/cnpg-controller-manager
While you are at it, install the cnpg kubectl plugin, which is the idiomatic way to operate these clusters. It is a krew plugin (kubectl krew install cnpg) and it turns common operations into one-liners you will use constantly.
§Declare a Cluster
Give the platform its own namespace, then declare the database. The minimal form is three lines of spec, but a database is worth describing properly, so this manifest sets the things you actually want in production: three instances spread across nodes, real resource requests, a couple of tuned parameters, and supervised upgrades so a primary switchover only happens when you say so.
kubectl create namespace data
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: platform-pg
  namespace: data
spec:
  instances: 3 # one primary, two streaming replicas
  imageName: ghcr.io/cloudnative-pg/postgresql:17.2
  primaryUpdateStrategy: supervised # you approve the switchover on upgrades
  storage:
    size: 20Gi
    storageClass: rook-ceph-block # the Ceph storage from the last post
  # keep the replicas on different nodes, so losing a node loses one instance
  affinity:
    enablePodAntiAffinity: true
    topologyKey: kubernetes.io/hostname
  resources:
    requests:
      cpu: "500m"
      memory: 1Gi
    limits:
      memory: 1Gi
  postgresql:
    parameters:
      max_connections: "200"
      shared_buffers: "256MB"
  monitoring:
    enablePodMonitor: true # exposes metrics for Prometheus
kubectl apply -f platform-pg.yaml
The operator provisions three Postgres pods on Ceph-backed volumes, schedules them on different nodes where it can (pod anti-affinity is a preference by default, not a hard rule), initializes the primary, brings the replicas into streaming replication, and stands up the services that route to them. By default it bootstraps an application database named app with its own user and writes those credentials into a Kubernetes secret. It does not enable superuser access by default, which is the safe choice; set enableSuperuserAccess: true only when you have a reason to.
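Both of those defaults can be tightened or loosened in the same manifest. The fragment below is a minimal sketch, assuming the podAntiAffinityType and enableSuperuserAccess fields behave as I understand them from the CloudNativePG API reference; check the documentation for the version you installed before relying on it.

```yaml
# Fragment of the platform-pg Cluster spec; merge it into the manifest above.
spec:
  affinity:
    enablePodAntiAffinity: true
    topologyKey: kubernetes.io/hostname
    # Anti-affinity is "preferred" unless told otherwise. "required" makes it
    # a hard rule: a pod stays Pending rather than sharing a node.
    podAntiAffinityType: required
  # Off by default. Turn it on only when you need the postgres superuser
  # from outside the pod; the operator then generates a superuser secret.
  enableSuperuserAccess: true
```

With required anti-affinity and three instances you need at least three schedulable nodes, otherwise the third pod never starts. That is the trade: a hard guarantee of separation in exchange for a cluster that can refuse to schedule.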
§Watch It Come Up
kubectl -n data get cluster platform-pg
NAME          AGE   INSTANCES   READY   STATUS                     PRIMARY
platform-pg   3m    3           3       Cluster in healthy state   platform-pg-1
Three instances, three ready, a healthy cluster, and a named primary. The operator also created three services, and the names tell you what they are for:
kubectl -n data get svc -l cnpg.io/cluster=platform-pg
NAME             TYPE        CLUSTER-IP    PORT(S)    AGE
platform-pg-r    ClusterIP   10.96.10.11   5432/TCP   3m
platform-pg-ro   ClusterIP   10.96.10.12   5432/TCP   3m
platform-pg-rw   ClusterIP   10.96.10.13   5432/TCP   3m
platform-pg-rw always points at the current primary; this is the one applications write to. platform-pg-ro load-balances across the read-only replicas, for read-heavy traffic you want to spread. platform-pg-r reaches any instance. The application connects to a name, and the operator keeps that name pointed at the right pod no matter what happens underneath, which is what makes failover invisible to the app.
§Connect, in the Cluster and From Your Laptop
The application credentials live in a secret the operator generated, including a ready connection string.
kubectl -n data get secret platform-pg-app \
-o jsonpath='{.data.uri}' | base64 -d
postgresql://app:<password>@platform-pg-rw.data:5432/app
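An application in the cluster does not need to copy that string anywhere; it can read it straight from the secret. Here is a minimal sketch of a Deployment doing that. The signup-api name and its image are hypothetical placeholders, and the Deployment lives in the data namespace because a pod can only reference secrets in its own namespace.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: signup-api                 # hypothetical application
  namespace: data                  # same namespace as the platform-pg-app secret
spec:
  replicas: 2
  selector:
    matchLabels:
      app: signup-api
  template:
    metadata:
      labels:
        app: signup-api
    spec:
      containers:
        - name: signup-api
          image: registry.example.com/signup-api:1.0.0   # hypothetical image
          env:
            # The full connection string, pointing at platform-pg-rw
            - name: DATABASE_URL
              valueFrom:
                secretKeyRef:
                  name: platform-pg-app   # generated by the operator
                  key: uri
```

The password never appears in the manifest or in version control, only a reference to the secret that holds it.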
For a quick session as an operator, the cnpg plugin opens psql straight into the primary:
kubectl cnpg psql platform-pg -n data -- app
app=# SELECT version();
version
-----------------------------------------------------------
PostgreSQL 17.2 on x86_64-pc-linux-gnu, compiled by gcc...
For developing against it from your workstation, reach the service by name with kubefwd, the same tool the rest of the series uses. One command forwards the data namespace, and platform-pg-rw resolves on your laptop exactly as it does in the cluster.
sudo kubefwd svc -n data
# in another terminal, your local code connects as if it were in-cluster
psql "postgresql://app:<password>@platform-pg-rw:5432/app"
§Pool the Connections
Postgres handles a bounded number of connections well and falls over when something opens thousands of them, which is exactly what a fleet of web pods or serverless functions does. The answer is a connection pooler in front of the database, and CloudNativePG runs one as another declarative resource. A Pooler stands up PgBouncer pointed at your cluster, and apps connect to the pooler instead of the database directly.
apiVersion: postgresql.cnpg.io/v1
kind: Pooler
metadata:
  name: platform-pg-pooler
  namespace: data
spec:
  cluster:
    name: platform-pg
  instances: 2
  type: rw
  pgbouncer:
    poolMode: transaction
    parameters:
      max_client_conn: "1000"
      default_pool_size: "25"
Now a thousand client connections collapse onto a small pool of real database connections, in transaction mode, and the database stays healthy under load it would otherwise refuse. This is one more thing managed Postgres charges for that is a single manifest here.
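Pointing an application at the pooler is a matter of changing the host it connects to. The sketch below is a container env fragment for a hypothetical app in the data namespace. It assumes the operator exposes the Pooler through a Service with the same name, platform-pg-pooler, and that the platform-pg-app secret carries username, password, and dbname keys alongside uri; confirm both with kubectl before you rely on them.

```yaml
# Container env fragment; the variables are the standard libpq ones.
env:
  - name: PGHOST
    value: platform-pg-pooler        # PgBouncer, not the database directly
  - name: PGPORT
    value: "5432"
  - name: PGDATABASE
    valueFrom:
      secretKeyRef:
        name: platform-pg-app
        key: dbname
  - name: PGUSER
    valueFrom:
      secretKeyRef:
        name: platform-pg-app
        key: username
  - name: PGPASSWORD
    valueFrom:
      secretKeyRef:
        name: platform-pg-app
        key: password
```

Transaction pooling hands the server connection back after every transaction, so anything that depends on session state, LISTEN and session-level advisory locks among them, does not work through the pooler. Workers that need those should connect to platform-pg-rw directly.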
§Operating the Database
A managed database service silently does four things you are paying for: it fails over when the primary dies, it backs up continuously so you can recover to any second, it runs your version upgrades without losing data, and it shows you metrics. Own the database and those are yours, and CloudNativePG does each of them as a matter of course.
§Automatic Failover
This is the reason to run a database under an operator rather than a hand-rolled StatefulSet. Delete the primary pod and watch.
kubectl -n data delete pod platform-pg-1
kubectl -n data get cluster platform-pg -w
The operator detects the loss, promotes the most up-to-date replica to primary, and repoints the platform-pg-rw service at it, in seconds, with no action from you. The application, connected to the service name, never knew which pod was primary and does not need to. When the old primary comes back it rejoins as a replica and catches up. That failover logic, the part most hand-rolled setups get wrong, is the operator’s job, and it is tested better than anything you would write under deadline.
§Continuous Backup and Point-in-Time Recovery
A database without backups is a liability, and snapshots alone are not enough; you want to recover to the moment before a bad migration, not just to last night. CloudNativePG does continuous backup to S3-compatible object storage through Barman, archiving WAL alongside periodic base backups, which together let you restore to any point in time. You add a backup stanza to the cluster pointing at a bucket, then schedule it. The in-tree barmanObjectStore stanza shown here matches the 1.25 release pinned above; newer releases deprecate it in favor of the Barman Cloud plugin, so check the documentation for the version you install.
spec:
  backup:
    barmanObjectStore:
      destinationPath: s3://platform-backups/postgres
      endpointURL: http://seaweedfs.data:8333
      s3Credentials:
        accessKeyId:
          name: backup-creds
          key: ACCESS_KEY_ID
        secretAccessKey:
          name: backup-creds
          key: ACCESS_SECRET_KEY
    retentionPolicy: "30d"
apiVersion: postgresql.cnpg.io/v1
kind: ScheduledBackup
metadata:
  name: platform-pg-nightly
  namespace: data
spec:
  schedule: "0 0 2 * * *" # 02:00 daily
  cluster:
    name: platform-pg
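Two supporting pieces are implied by those manifests: the backup-creds secret the cluster reads its S3 keys from, and a way to take a backup on demand, for example right before a risky migration. Here is a minimal sketch of both. The key values are placeholders to fill in, and the Backup name is hypothetical.

```yaml
# The secret referenced by s3Credentials in the backup stanza.
apiVersion: v1
kind: Secret
metadata:
  name: backup-creds
  namespace: data
type: Opaque
stringData:
  ACCESS_KEY_ID: "<access-key>"        # placeholder
  ACCESS_SECRET_KEY: "<secret-key>"    # placeholder
---
# A one-off base backup, in addition to the nightly schedule.
apiVersion: postgresql.cnpg.io/v1
kind: Backup
metadata:
  name: platform-pg-pre-migration      # hypothetical name
  namespace: data
spec:
  cluster:
    name: platform-pg
```

Note the schedule format on the ScheduledBackup: it has six fields with seconds first, which is why 02:00 is written as 0 0 2 rather than the five-field 0 2 you would put in a crontab.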
Recovery is its own manifest: a new Cluster that bootstraps from the backup with a recoveryTarget, so restoring to “just before 14:32 today” is declarative rather than frantic. The object storage these point at is SeaweedFS, which the platform stands up in a later post; the hook is built in now and costs nothing until you aim it at a bucket. This is the storage-snapshot idea from the Rook post taken to the database layer, so the platform owns its recovery along with its state.
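A recovery manifest along those lines might look like the sketch below. It reuses the bucket, endpoint, and secret from the backup stanza; the platform-pg-restore name and the target timestamp are illustrative. The external cluster entry is named platform-pg on purpose, on the assumption that the backups were written under the original cluster’s name.

```yaml
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: platform-pg-restore              # illustrative name for the new cluster
  namespace: data
spec:
  instances: 3
  imageName: ghcr.io/cloudnative-pg/postgresql:17.2   # same major version as the backup
  storage:
    size: 20Gi
    storageClass: rook-ceph-block
  bootstrap:
    recovery:
      source: platform-pg                # must match an externalClusters entry
      recoveryTarget:
        targetTime: "2026-06-27T14:31:59Z"   # illustrative: just before 14:32 UTC
  externalClusters:
    - name: platform-pg
      barmanObjectStore:
        destinationPath: s3://platform-backups/postgres
        endpointURL: http://seaweedfs.data:8333
        s3Credentials:
          accessKeyId:
            name: backup-creds
            key: ACCESS_KEY_ID
          secretAccessKey:
            name: backup-creds
            key: ACCESS_SECRET_KEY
```

The sketch leaves out a backup stanza deliberately. A restored cluster should not archive into the same path as the cluster it was restored from, so give it its own destination before turning backups on.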
§Rolling Upgrades
You move Postgres to a new minor version by changing one field, imageName, and applying it. The operator upgrades the replicas first, then, because you set primaryUpdateStrategy: supervised, waits for you to approve the switchover before promoting an upgraded replica and retiring the old primary. It is the one-step-at-a-time upgrade discipline from the cluster post, applied to the database, with the operator doing the work and you holding the final approval.
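In manifest terms the whole upgrade is a one-line diff. The 17.4 tag below is illustrative; use whichever minor release is current when you read this.

```yaml
# Fragment of the platform-pg Cluster spec: only the image tag changes.
spec:
  imageName: ghcr.io/cloudnative-pg/postgresql:17.4   # was 17.2
  primaryUpdateStrategy: supervised                   # unchanged; the primary waits for you
```

As I understand the cnpg plugin, the approval step is a manual switchover to an already-upgraded replica, issued with kubectl cnpg promote and the name of the instance to promote. Treat that as an assumption and confirm the exact command in the plugin’s help output for your version.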
§Metrics
CloudNativePG exposes Prometheus metrics for connections, replication lag, transaction rates, and storage on every instance. Because the cluster set enablePodMonitor: true, the operator also creates a PodMonitor for them, so a Prometheus Operator based monitoring stack, like the one later in this series, scrapes them with no extra wiring. The cluster is visible by default, which is what you need to operate it.
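Once the metrics are being scraped, alerting on them is another manifest. The rule below is a sketch that assumes the Prometheus Operator is installed and that replication lag is exported as cnpg_pg_replication_lag, in seconds, which is my reading of the operator’s default metrics. The 60-second threshold and the rule names are arbitrary choices for illustration.

```yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: platform-pg-alerts               # hypothetical name
  namespace: data
spec:
  groups:
    - name: platform-pg
      rules:
        - alert: PostgresReplicationLag
          # Assumed metric name; verify it against the pod's metrics endpoint.
          expr: cnpg_pg_replication_lag{namespace="data"} > 60
          for: 5m
          labels:
            severity: warning
          annotations:
            summary: "Replica {{ $labels.pod }} is more than 60s behind the primary"
```

Depending on how your Prometheus is configured, the rule may need extra labels in its metadata before the rule selector picks it up.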
§Put It to Work
Open a session and create something to confirm it does real work.
CREATE TABLE signups (id bigserial PRIMARY KEY, email text, created timestamptz DEFAULT now());
INSERT INTO signups (email) VALUES ('[email protected]');
SELECT * FROM signups;
id | email | created
----+------------------+-------------------------------
1 | [email protected] | 2026-06-27 14:30:11.482+00
That row is now replicated across the cluster and will survive a failover. The extra jobs are one statement away: CREATE EXTENSION vector; turns this same database into a vector store for embeddings, and LISTEN/NOTIFY turns it into a job queue, both without standing up another system.
§When Something Is Wrong
The cluster never reaches READY and pods are Pending. The PVCs cannot bind, which points back at Rook; check kubectl -n data get pvc and confirm rook-ceph-block is healthy. A database cannot start without its disk.
Applications get “too many connections.” You are opening more connections than max_connections, usually from a pod fleet. Put the Pooler in front and connect through it; raising max_connections instead just moves the cliff.
Writes fail after a failover, or you see read-only errors. The application is connected to platform-pg-ro or platform-pg-r instead of platform-pg-rw. Only the -rw service points at the primary; writes go there.
A replica lags or will not catch up. Check replication status with kubectl cnpg status platform-pg -n data, which shows each instance’s role and lag. Persistent lag is usually disk or network pressure on that node, the kind of thing the metrics make obvious.
§What You Have
A production-grade PostgreSQL cluster: three instances spread across nodes, automatic failover, separate read and write endpoints with a pooler in front of the primary, continuous backup with point-in-time recovery, metrics, and generated credentials, all on storage you own, declared in a handful of manifests and operated for you. It is the equivalent of managed Postgres without the bill or the lock-in, and it is the data backbone much of the rest of this series plugs into.
Next I add the other half of a data platform’s intake, real-time streaming with Kafka, running in KRaft mode with the ZooKeeper the old book required finally gone.