The prize of ubiquity is invisibility
I have an idea about the trajectory of Kubernetes. Instinct tells me it will both win and, simultaneously, become irrelevant to software delivery organisations.
I’ve been listening to conversations about Kubernetes (k8s) and trying to identify the recurring themes that polarise debate on whether it’s a “good” or “bad” idea. There are sensible points of view on both sides. As with most of our ambiguous collective debates, these seem to be parallel conversations rather than a single argument. Moving beyond binary opposition to a “yes, and” integration is one of the greatest human skills. The question isn’t binary, and it’s worth considering what happens if both things are true.
Everyone’s got an opinion, so I’m looking for trends and points that resonate, even where I don’t personally agree. I’ve seen some good thinking that I nonetheless have reservations about. For example, other than the word “practical” (for me it glosses over hard problems) both these points seem valid:
In short, it’s a “universal language” and a “declarative” (rather than imperative) way to build systems. That’s interesting.
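To make the declarative/imperative distinction concrete, here is a toy sketch of the idea. It is not Kubernetes code and uses no Kubernetes API: the names `DesiredState` and `reconcile` are my own, invented for illustration. You state what you want (three replicas) and a loop works out the steps, rather than you scripting the steps yourself.

```python
from dataclasses import dataclass


@dataclass
class DesiredState:
    """Hypothetical declaration: how many replicas we want running."""
    replicas: int


def reconcile(desired: DesiredState, running: list[str]) -> list[str]:
    """Return the actions needed to move `running` towards `desired`.

    The caller never says "start two more"; it only declares the target.
    """
    diff = desired.replicas - len(running)
    if diff > 0:
        # Too few replicas: plan to start the missing ones.
        return [f"start replica-{len(running) + i}" for i in range(diff)]
    if diff < 0:
        # Too many replicas: plan to stop the most recently listed ones.
        return [f"stop {name}" for name in running[diff:]]
    # Already converged: nothing to do.
    return []


if __name__ == "__main__":
    print(reconcile(DesiredState(replicas=3), ["replica-0"]))
```

The imperative equivalent would be a script that says “start replica-1, then start replica-2”, which is only correct if you already know exactly one replica is running. The declarative version stays correct whatever the starting point.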
So, here are some key perspectives I’ve seen expressed.
The enterprise/architect view of Kubernetes is that a generic, common standard, across multiple clouds, plus private datacentres, means Kubernetes is “good”. I agree in theory.
This view hinges on a belief that “generic” is good
This is sometimes true. However, to my ears, given the scale of effort required to actually achieve this, especially across multiple clouds, it is like saying that nuclear fusion is the best source of clean energy. I agree in theory. But I don’t believe anyone has yet built a reactor that reliably and consistently generates more power than it consumes.
The view in developer teams is characterised by two beliefs.
One set of people I talk to believe that having a generic, reliable platform to deploy software to is “good”. They’re not wrong. They see the potential and it’s tantalising. But there’s a yin to this yang.
The other belief, held perhaps by those who’ve had to deal with production problems (especially ones of their own making, where there’s no one else to blame), is that simplicity is the prime directive for workability. To quote Baz Luhrmann:
The real troubles in your life are apt to be things that never crossed your worried mind; the kind that blindside you at 4 PM on some idle Tuesday.
This is close to my own instinct: that complication is intrinsically a killer, in and of itself an exponential risk to your chances of success. From this perspective, k8s is “bad” because the complicatedness will absorb more than all of your energy. Unless you have deep pockets and a dedicated platform team, time, budget and stakeholder patience will run out before meaningful value can be delivered.
The operations team
I sense the operations view might be the most grounded. After all, these are the people who tend to be up at stupid o’clock, dealing with the fallout of the cans that architecture and delivery teams kicked down the road under pressure from senior stakeholders. The buck stops at operations. It’s rarely of their making and there’s often too little of an empathy feedback loop to achieve workability.
In that situation, a generic platform that maintains healthy separation of workloads from infrastructure is “good” because it creates a clearer separation of root causes and helps to push back. Standardising the way we package, run and monitor workloads is pain-relief. Simultaneously, there’s an acknowledgement that complicated systems are “bad”: they’re a recurring nightmare to keep going and, critically, create nebulous, multi-layered nests of unclarity that can comfortably obscure thundering security risks for undefined periods.
When a cluster is working, it feels like magic.
The problem is understandability when a cluster isn’t behaving as expected. Being able to comprehend it is like reading The Matrix code and seeing “the woman in the red dress”. A swirling maelstrom of intricate, verbose, interlaced YAML that drives an Alice-in-Wonderland-like rabbit hole of master and worker, control plane and data plane behaviours. Sure, it’s declarative, but it can feel like a riddle, wrapped in a mystery, inside an enigma.
It’s clear that Kubernetes is big. It’s both complex and complicated. That’s one thing everyone agrees on. If your team can understand and manage that, then it’s probably going to be “good” for you. Using GKE or EKS means you’ll be able to externalise a proportion, but a rump of the cognitive load will remain in your court. There’s still a lot of yak to shave.
If you’ve got a full-time platform team of a dozen people dedicated to running Kubernetes, you’ll do pretty well. But here’s the thing: running generic platforms and services adds no specific value. It’s an externality and as such will ultimately be externalised. We know this because that’s exactly what cloud is: externalising the hard problems of running reliable, fault-tolerant generic infrastructure.
Infrastructure is the endangered species of software delivery organisations. Where once there were racks of computers in locked rooms with impressive and mysterious blinking lights and lots of whirring fans, now a co-working space and a laptop are all you need to conduct an orchestra of thousands.
As the need for IT infrastructure has ballooned, so has it disappeared from our places of work.
That very need is what has driven externalisation. Building infrastructure was too hard, too slow and too complicated: it was constrained by the basic physics of office and data centre space and by the mechanics of buying, racking, networking and tending to machines whilst handling failures with grace.
And this is why I think Kubernetes will disappear. It’s so generic that there’s no reason to do it yourself. Few organisations operate on a scale where it makes sense to run datacentres. The practical friction of running Kubernetes creates a similar dynamic. Like reliable infrastructure, it’s too hard, slow or expensive to justify doing it really well yourself, but there probably is value in paying for that as a service from a cloud provider.
In the end, precisely because it’s generic and because running a deployment platform is an undifferentiated hard problem, it can and will be commoditised. Fargate and Cloud Run (Knative) are already barrelling down this road. If you remember what Maven did for the Java world, you’ll understand that accepting a little opinionation delivers a lot of productivity.
There will always be exceptions, but I think they’ll prove the rule: if Kubernetes manages to conquer the mainstream then, for the majority of software delivery organisations, it will quietly slip below the waterline of commodity.
In the end technology is about solving practical problems in service of worthwhile missions. Workability is Occam’s Razor for technical architecture.