Professional notes by Craig Johnston
FOLIO LIII 2020-08-30 · 7 MIN · SHORT-FORM

Advanced Platform Development with Kubernetes

Enabling Data Management, the Internet of Things, Blockchain, and Machine Learning

Diagram · folio liii
gitGraph
   commit id: "Why Kubernetes"
   branch concepts
   commit id: "containers"
   commit id: "k8s primitives"
   commit id: "operators"
   checkout main
   merge concepts
   branch data
   commit id: "Cassandra"
   commit id: "Elasticsearch"
   commit id: "Kafka"
   checkout main
   merge data
   branch iot
   commit id: "MQTT"
   commit id: "edge nodes"
   checkout main
   merge iot
   branch ml
   commit id: "TensorFlow"
   commit id: "model serving"
   checkout main
   merge ml
   branch chain
   commit id: "Ethereum"
   commit id: "smart contracts"
   checkout main
   merge chain
   commit id: "Platform" tag: "v1.0"

I’ve been distracted for over a year now, writing a roughly 500-page, end-to-end tutorial on constructing data-centric platforms with Kubernetes. The book is titled “Advanced Platform Development with Kubernetes: Enabling Data Management, the Internet of Things, Blockchain, and Machine Learning.”


§2026 Update

I wrote this in 2020, and the book it announces has aged the way any hands-on Kubernetes book does: the architecture and the approach hold up, but a lot of the specific tooling has moved. If you work through it today, the same shifts I have been documenting across this blog apply. Ingress Nginx is retired, Elasticsearch is the one I would now swap for OpenSearch, Kafka dropped ZooKeeper for KRaft, and the Ethereum chapter predates the move to Proof of Stake. The source on GitHub still describes the platform as written; just expect to update versions and swap a few components.

The bigger thing is where this led. The book was a tour of “how to run X on Kubernetes,” ten weekend projects stitched into a data platform. That turned out to be the prototype for what I build now. The mcp-data-platform and Plexara are the same idea carried forward: a data-centric platform on Kubernetes, except the integration layer is now built for AI agents through MCP rather than hand-wired between services. The chapters on data lakes, warehouses, and routing are recognizable underneath it.

And the bet at the center of the book aged well. In 2020 I asked whether machine learning would produce something capable of replacing legacy decision systems. Six years later that is no longer a hypothetical; it is the dominant story in software, and the kind of platform this book describes is the substrate that AI workloads actually run on.


Original article below. Everything from here down is the 2020 announcement as originally written. The 2026 Update above covers what’s changed since.

A little more than a year ago, Apress reached out and asked if I would write a book on Kubernetes for them, mirroring the wide range of projects I develop (and write about) for my clients. I have been building data-centric platforms for almost twenty years, spanning everything from aggregating massive volumes of international log files for Disney in my early days to fan-driven location data for Nine Inch Nails. In the last decade, that work has covered retailers with point-of-sale, logistics, and inventory systems; marketers leveraging social media metrics; fleet operators with demanding telematics platforms; and manufacturers with advanced IIoT (industrial internet of things) networks.

Furthermore, the clients behind these verticals have begun looking for an edge, often found beyond the standard PaaS offerings available today. Yet these clients, from established organizations and age-old industries, have little tolerance for the risk associated with multi-year development on technologies whose value is speculative. Will blockchain revolutionize logistics and finance? Will machine learning produce artificial intelligence capable of replacing legacy decision-making systems? The safe bet is to wait. If there is gold in any of these mountains, the hyper clouds will find it, and they will happily sell it to everyone, by the hour, megabyte, or IOPS.

Not every organization should be developing custom software, let alone advanced data platforms. However, the barrier to entry is lowering every day. I have been building platforms for twenty years, and never has a single technology increased productivity in this practice more than Kubernetes has in the last five. Kubernetes is truly a platform for building platforms, capable of harnessing the wide breadth of new technologies released into the open-source landscape almost daily.

On October 28, 2018, IBM announced a $34 billion deal to buy Red Hat, the company behind Red Hat Enterprise Linux (RHEL) and, more recently, Red Hat OpenShift, an enterprise Kubernetes-based application platform. What we see is $34 billion of evidence that cloud-native and open source technologies, centered on the Linux ecosystem and empowered by Kubernetes, are leading disruption in enterprise software application development. IBM sells platforms, and yet it looks to capitalize on the value of a system tailored to platform development. IBM is speculating that the next trend in PaaS offerings is itself a platform for developing platforms. An organization does not spend $34 billion to on-board a brand and its technology without a market. Who is demanding the capabilities of OpenShift/Kubernetes? Systems, solutions, and software architects, full-stack developers, programmers, integrators, and DevOps engineers. These are the people and roles responsible for the technology that powers the business logic.

I did not want to write a book on administering Kubernetes or how it works (although you’ll likely learn this along the way). There are a ton of excellent books and tutorials on the matter. I wanted to write about what gets the lion’s share of hits on my blog: “How to run X on Kubernetes.” I don’t know why you want to run a blockchain network on Kubernetes; I have my reasons for doing so. I don’t know why you need to interconnect Kafka, NiFi, MinIO, Hive, Keycloak, Cassandra, MySQL, ZooKeeper, Mosquitto, Elasticsearch, Logstash, Kibana, Presto, OpenFaaS, Ethereum, Jupyter, MLflow, and Seldon Core. You don’t need Kubernetes to run these excellent applications. You don’t need Kubernetes to run containers networked across the globe on various cloud providers communicating over secure VPNs, on-premises, on a mix of bare metal servers and virtual machines. I know that I, and the many roles mentioned above, want to (as evidenced by Kubernetes’ enormous success), and it’s likely because they are building sophisticated, modern platforms and see the value Kubernetes brings to this endeavor.

“Advanced Platform Development with Kubernetes: Enabling Data Management, the Internet of Things, Blockchain, and Machine Learning” is a 500-page tutorial on the work I do daily. The examples are scaled-down yet real and fully functional. If you are an entrepreneur with dreams of building your own AWS or Azure, or if constructing enterprise-capable data-centric platforms is what you do for work or hobby, my book is looking to inspire you and give you traction where I have found it myself.

§Weekend Projects

All source code and configuration manifests are open source and available online at: https://github.com/apk8s/book-source

“Advanced Platform Development with Kubernetes: Enabling Data Management, the Internet of Things, Blockchain, and Machine Learning” will give you the equivalent of ten weekend projects, covering all the technology mentioned above and more, broken down as follows:

  • Week 1: DevOps Infrastructure
  • Week 2: Development Environment
  • Week 3: In-Platform CI/CD
  • Week 4: Pipeline
  • Week 5: Indexing and Analytics
  • Week 6: Data Lakes
  • Week 7: Data Warehouses
  • Week 8: Routing and Transformation
  • Week 9: Platforming Blockchain
  • Week 10: Platforming AIML

§Custom Kubernetes

Quite a few books and how-tos specialize in AKS, EKS, and GKE. “Advanced Platform Development with Kubernetes: Enabling Data Management, the Internet of Things, Blockchain, and Machine Learning” focuses on building custom Kubernetes clusters and illustrates this by using generic (and cheap) compute instances (VMs) offered by Vultr, DigitalOcean, Linode, Hetzner, and Scaleway.

§Technology

“Advanced Platform Development with Kubernetes: Enabling Data Management, the Internet of Things, Blockchain, and Machine Learning” covers the following technology:

