Kubernetes Podcast from Google

By Kubernetes Podcast from Google

ADAM GLICK: [CHUCKLES] Ba-doom-tish.

CRAIG BOX: Over the weekend was the 50th anniversary of the moon landing. And I must say, it was interesting to think about the change of technology, given that I was able to, while I was out, stream live video to my phone and watch what was 50 years old at that time -- the replay of the CBS News coverage of that day.

ADAM GLICK: It was a pretty dramatic piece of footage. I think you forwarded it on to me when you were looking at it. And it was fun to watch what TV was like back then.

CRAIG BOX: Yeah. And so obviously, just digging around looking at what was in the news, I got to the very interesting story of a guy who in the '70s had bought a couple of tons of scrap from what turned out to be NASA. And he owns, now, one of the original Apollo guidance computers. So these are the computers that were flown on the command module and the lunar module. This one was used for testing beforehand, so it's not actually space-flown.

But it turns out that that's good because the ones that they put in space, they took all the wiring and basically covered it with epoxy. And the ones that they left on Earth, they didn't do that. So this one had the chance of being repaired. There was a hugely interesting documentary series by four people who have taken this on as a project. And they've actually repaired this machine to full working order. It's hard to describe exactly what this is in terms that make sense today.

But as it's explained, it's basically the predecessor of almost all computing that exists. It is an octal-based machine. It was the first computer to use integrated circuits. I think it used something like 75% of the world's supply of ICs in the '60s or something. It must have cost countless millions of dollars to build. And it had about the same power as a personal computer you might have bought at the end of the '70s, so your Commodore PET or your Apple II, or something like that.

It's an amazing story. There's hours and hours of documentary footage. There's a couple of summaries that I might link to in the show notes that summarize it in a few minutes, but really interesting story.

ADAM GLICK: Wow. That's great. You took me back there with the remembering electronics that were encased in epoxy. I remember I had an old computer that had the same thing, and the power supply died on it. I was like, oh, I can open this up and take a look at it. You're expecting to see some sort of Wheatstone bridge or whatever. And you open it up, it's just a block of epoxy. You realize, nope.

CRAIG BOX: It gave it that nice heft, though. You want to pick this thing up and say, yeah, this feels like it's solid. It's made of good stuff. Turns out it's just full of epoxy.

ADAM GLICK: [CHUCKLES] True enough.

CRAIG BOX: One of the things about the machinery of that day is it had to be reliable in every situation. And it's interesting to see as time has gone by how the circumstances that you need to protect your system against change. And you sent me a link this week which was very interesting.

ADAM GLICK: Yes, so I discovered this website, which is just absolutely brilliant, called Cyber Squirrel 1, which is, if you ever wanted to know what is causing various power outages around the world, it will show you where the power outages are and what caused them. And it'll let you filter by the cause.

CRAIG BOX: Ooh, was it hackers?

ADAM GLICK: The most brilliant one is-- one of the categories is squirrel or not squirrel. Because it turns out, almost half of the power outages are just caused by squirrels wandering into power stations and getting very, very unlucky.

CRAIG BOX: So, not hackers.

ADAM GLICK: No, not hackers. In this case, there was a power station that had to be shut down because of what was listed as a jellyfish attack. And the number of jellyfish had literally flooded the waters, and therefore had been pulled into the cooling system and clogged the cooling system so they could no longer run the power station. And it reminds you that when you're thinking about what could go wrong and what caused it, there are just things that you are probably not thinking about. And that's why designing for high availability is always good.

CRAIG BOX: And with that, let's get to the news.

[MUSIC PLAYING]

Traditional enterprise vendors continue to embrace Kubernetes and Cloud Native, with two big names making announcements last week. First, IBM, who officially launched Kabanero, with a K. Kabanero is designed to offer a development and administrative experience on top of the Kubernetes, Istio, and Knative stack, but with a flavor that will be familiar to enterprise developers.

Administrators can define application stacks, a number of which are provided by another new project called Appsody, the first being Java, Node.js, and Swift. You can then build your service locally with IDE integration for VS Code, IntelliJ, and Eclipse, using integrations from a third new project called Codewind.

ADAM GLICK: The other big enterprise announcement was from Pivotal, who announced an alpha of Pivotal Application Service, P-A-S, running on Kubernetes. PAS is a PaaS, P-A-A-S, based on the popular Cloud Foundry runtime for 12-factor apps. They also announced some other tools designed to make running Kubernetes easier, including a build service and RabbitMQ on Kubernetes. Pivotal made the surprise move of pre-announcing PAS on Kubernetes on their earnings call last month after their stock dropped almost 50%. Pivotal's CEO cited a complex technology landscape lengthening their sales cycle and suggested that Kubernetes product updates will grow their addressable market.

CRAIG BOX: Weaveworks has donated their Flux continuous delivery tool to the CNCF as the latest project to enter the CNCF sandbox. Flux is an operator that runs in your Kubernetes cluster and ensures the state of objects in that cluster matches a configuration defined in Git. Since its creation three years ago this month, it has grown to support Helm deployments and canaries with Istio, amongst other features.
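The idea of keeping a cluster in line with a configuration held in Git can be pictured as a comparison between a desired state and an actual state. The sketch below is a toy illustration of that comparison only; it is not Flux's implementation or API, and the function name `plan_reconcile` and the sample objects are hypothetical.

```python
def plan_reconcile(desired, actual):
    """Return the actions needed to make `actual` match `desired`.

    Both arguments map an object name to its spec (any comparable value).
    `desired` stands in for what is declared in Git, `actual` for what is
    currently running in the cluster. Both are hypothetical stand-ins.
    """
    actions = []
    for name, spec in desired.items():
        if name not in actual:
            actions.append(("create", name))   # declared but not running
        elif actual[name] != spec:
            actions.append(("update", name))   # running with a drifted spec
    for name in actual:
        if name not in desired:
            actions.append(("delete", name))   # running but no longer declared
    return actions


# Hypothetical example data.
desired_state = {"web": {"replicas": 3}, "api": {"replicas": 2}}
actual_state = {"web": {"replicas": 1}, "old-job": {"replicas": 1}}

plan = plan_reconcile(desired_state, actual_state)
print(plan)
```

A real operator would run a comparison like this repeatedly and apply the resulting actions through the Kubernetes API; the sketch stops at producing the plan.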

ADAM GLICK: This week is the first Windows Container Unconference to be held at Microsoft in Redmond, Washington. The meeting aims to connect the community around emerging use cases for running Windows workloads on Kubernetes. Sign-ups are still open. If you have any questions for the community, there's a doc that you can add to, and it's linked in the show notes.

CRAIG BOX: Google Cloud announced the Spinnaker for GCP solution, which lets you install Spinnaker with a couple of clicks and start creating pipelines for continuous delivery. It integrates with other Google tools, like Cloud Build for creating Docker containers, Container Registry for vulnerability scanning, and Binary Authorization to only deploy trusted container images to a cluster.

ADAM GLICK: Linkerd recently launched version 2.4. New in this version is traffic splitting, support for Service Mesh Interface, and production-ready high availability.

CRAIG BOX: Last week, we mentioned a new GKE course. And our friends in Google Cloud reached out with an offer to all of our listeners. You can study all four modules in the Architecting with Google Kubernetes Engine course for one month and get certified, all for free. Find the special link just for you in our show notes.

ADAM GLICK: Brian Goff from Microsoft Azure posted a deep dive into the Virtual Kubelet project, which went 1.0 recently. Virtual Kubelet is a CNCF project that allows providers to build services that act like a node in a cluster, just like a Kubelet does, but are instead backed by a container-as-a-service implementation. The service has recently changed from a single runnable container with support for all providers to a library which lets each provider build their own Virtual Kubelet for their own service. And Brian talks you through how that was done.

CRAIG BOX: A new Kubernetes Special Interest Group, or SIG, has formed. SIG Usability is looking for help to make Kubernetes more accessible and easier to use and will look at projects in areas including internationalization and tooling development. If you want to get involved, links to their Google group and Slack are in the show notes.

ADAM GLICK: In other SIG news, as cloud providers have moved out of the main Kubernetes tree, so have their SIGs. Groups dedicated to providers like GCP, AWS, IBM, and VMware have moved from being top-level SIGs to being sub-projects of SIG Cloud Provider.

CRAIG BOX: Microsoft's Azure Monitor for containers has added support for scraping Prometheus metrics as a preview feature. This means you can now monitor your AKS stack from top to bottom using the same tool.

ADAM GLICK: Finally, Kubernetes has announced that they are removing some deprecated versions of common APIs in the upcoming 1.16 release. The APIs include some of the most commonly used objects, including NetworkPolicy and PodSecurityPolicy, Ingress, and the application controllers including Deployments and StatefulSets.

These APIs have received newer versions. And if you haven't yet updated your YAML files, you might be depending on old versions. The newer APIs have been available for several releases, so make sure you update your resource definitions, custom integrations or controllers, and any third-party tools before upgrading to 1.16 in September.
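As a rough aid for the update described above, the following sketch checks a manifest's `apiVersion` and `kind` against a small table of replacements. The table covers only a few of the affected kinds and is not exhaustive; the function name `suggested_api_version` and the sample manifests are hypothetical, so treat the upstream deprecation notice as the authority.

```python
# A partial, illustrative mapping of (old apiVersion, kind) to the newer
# apiVersion. It is an assumption-labelled subset, not a complete list.
OLD_TO_NEW = {
    ("extensions/v1beta1", "Deployment"): "apps/v1",
    ("apps/v1beta1", "Deployment"): "apps/v1",
    ("apps/v1beta2", "Deployment"): "apps/v1",
    ("apps/v1beta1", "StatefulSet"): "apps/v1",
    ("apps/v1beta2", "StatefulSet"): "apps/v1",
    ("extensions/v1beta1", "NetworkPolicy"): "networking.k8s.io/v1",
    ("extensions/v1beta1", "PodSecurityPolicy"): "policy/v1beta1",
}


def suggested_api_version(manifest):
    """Return the replacement apiVersion, or None if the table has no entry."""
    key = (manifest.get("apiVersion"), manifest.get("kind"))
    return OLD_TO_NEW.get(key)


# Hypothetical manifests, already parsed from YAML into dictionaries.
old_deployment = {"apiVersion": "extensions/v1beta1", "kind": "Deployment"}
new_deployment = {"apiVersion": "apps/v1", "kind": "Deployment"}

print(suggested_api_version(old_deployment))
print(suggested_api_version(new_deployment))
```

Changing the `apiVersion` string is not always sufficient on its own, because the newer versions can have different required fields or defaults, so each updated manifest still needs to be validated.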

CRAIG BOX: And that's the news.

[MUSIC PLAYING]

CRAIG BOX: Owen Rogers is a Research Vice President co-leading the cloud team at 451 Research and is the architect of the Cloud Price Index. Owen has previously held product management positions at Cable & Wireless and Claranet, and has developed a number of hosting and cloud services. He is a Chartered Engineer, a Member of the British Computer Society, and the Institute of Analyst Relations' Innovative Analyst of the Year. In 2013, he completed his PhD thesis on the economics of cloud computing at the University of Bristol. Welcome to the show, Owen.

OWEN ROGERS: My pleasure to be here. Good to be on.

CRAIG BOX: Many of our listeners are the hands-on keyboard technical type. And this might be the first time they've heard from an analyst. What do analyst firms do?

OWEN ROGERS: Analyst firms really represent the intersection between the buyer and the supplier side. So let's say you're selling a cloud service and you want to know what buyers want. But you obviously can't ask every buyer out there in the market. And talking to your own customers would obviously give you a distorted view of reality. So analysts really provide information on what customers want. And we collect this information by talking to enterprises, by survey, and by engaging with enterprises at events and things like that.

And if you're a buyer, you might want to know what your options are, what technology would be best for your particular use case. And in a similar way, we talk to service providers and vendors to understand their offerings. And we can provide advice and recommendations to those buyers so they can make a better, more informed decision.

ADAM GLICK: Like many of us, you have a technical background as well. How did you get into cloud economics?

OWEN ROGERS: I was a product manager in London for a hosting firm. And to be honest with you, my girlfriend dumped me. And I decided to start again. So I saw an advert for a funded PhD, and I got talking to the supervisor, who was Professor Dave Cliff. And he essentially was a specialist in algorithms to do financial trading. So I suggested to him we could use those algorithms or develop some capability to trade using cloud resources. And that's how my PhD thesis came about, and pretty much how I got into this field of cloud economics.

ADAM GLICK: Proof that every cloud has a silver lining.

OWEN ROGERS: It does. And in fact, at the University of Bristol, then I met my lovely wife and had kids and life moved on. So it was definitely a good decision.

CRAIG BOX: How would you explain the value proposition Kubernetes provides to someone who wasn't familiar with it?

OWEN ROGERS: Cloud economics is obviously about understanding supply and demand. But in particular, I'm interested in two aspects, which are cost and value. And they are highly related, but different. The value of something depends on how much something costs and also how much of a benefit you get from it. So why don't I look at the economics of Kubernetes by focusing on the cost element first?

So let's say you were going on vacation and you packed your suitcase. You've bought some new clothes, which are in fixed-size boxes straight from the store. And rather than unpacking the boxes, you pop them straight into the suitcase. There's lots of spare space in the boxes and in the suitcase, but you can't use this space because everything is in a fixed-size box.

Now, to take your suitcase on the plane would cost, let's say, $100. But as you failed to pack in as much because you've used these fixed-size boxes, you need to take two suitcases at a total cost of $200. Now each of these fixed boxes is essentially a virtual machine. Each virtual machine requires significant overhead in the form of an operating system. And that's why you're getting all this waste and you have to take more suitcases because you can't put as much in one.

Now imagine you open these boxes from the store and you just take out all the clothes and squeeze them wherever they fit in every nook and cranny so that now you only need to take one suitcase. And that would cost you only $100. Essentially, you've lowered your unit cost per item of clothing just by packing better.

Now, this concept represents a container. With less overhead due to less duplication, you can cram more in. And in theory, an application built using containers should cost less than using virtual machines, as long as you're packing those containers into the suitcase of the server or the virtual machine in the correct way. Now saving money is, of course, valuable. But I don't think that's the dominant reason people choose them or choose to use Kubernetes.
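The suitcase analogy can be worked through with a small packing sketch. Only the $100 suitcase fee comes from the conversation; the suitcase capacity, the item sizes, and the per-item overhead standing in for a fixed-size box (the virtual machine) are assumptions chosen to reproduce the one-suitcase versus two-suitcase outcome.

```python
SUITCASE_FEE = 100       # dollars per suitcase, from the analogy
SUITCASE_CAPACITY = 10   # hypothetical volume units per suitcase


def suitcases_needed(item_sizes, overhead_per_item):
    """First-fit decreasing packing: put each item, plus its packaging
    overhead, into the first suitcase that still has room."""
    free = []  # remaining capacity of each suitcase opened so far
    for size in sorted(item_sizes, reverse=True):
        need = size + overhead_per_item
        for i, room in enumerate(free):
            if room >= need:
                free[i] -= need
                break
        else:
            # No open suitcase has room, so open a new one.
            free.append(SUITCASE_CAPACITY - need)
    return len(free)


def cost_per_item(suitcases, items):
    return suitcases * SUITCASE_FEE / items


clothes = [1] * 8  # eight hypothetical items of one unit each

boxed = suitcases_needed(clothes, overhead_per_item=1)  # fixed-size boxes
loose = suitcases_needed(clothes, overhead_per_item=0)  # packed directly

print(boxed, cost_per_item(boxed, len(clothes)))
print(loose, cost_per_item(loose, len(clothes)))
```

With these assumed numbers the boxed clothes need two suitcases and the loose clothes need one, so the unit cost per item halves purely because of tighter packing.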

CRAIG BOX: What if instead you were to hold each item of clothing in your hand and then decide if it brang you joy before packing it?

OWEN ROGERS: Well, I would suggest wearing every item of clothing you own so you can get on the plane making sure you're nice and cozy and then have more to put in your luggage.

ADAM GLICK: Funny aside on that-- I was on a vacation. And we ran into that where they weighed our luggage and we were something like half a pound over what they would allow. And they wouldn't let us on the plane because our luggage was overweight. So we literally stood in front of the counter, unzipped the bag, and then put on the extra clothing until we were layered up enough to have taken a half a pound of clothing out of the bag, at which point then the bag was considered, OK, it was a carry-on bag. And then we took it on the plane, promptly took the clothing off, and put it back into the bag.

OWEN ROGERS: How many layers did you have on?

ADAM GLICK: By the end of it, I think I ended up with three layers of clothing.

OWEN ROGERS: Half a pound of clothing is quite a lot.

CRAIG BOX: And you can lose half a pound of body weight through sweat.

[LAUGHTER]

CRAIG BOX: So what you said before is very much what we got out of systems like Kubernetes at Google we were using before we open sourced them. We thought that the value that people would see is you can pack more into fewer machines. But that's not actually the value that we think most people got out of the system. What is it that you've seen people getting value out of Kubernetes above and beyond the container optimization?

OWEN ROGERS: I think with the advent of greater technology and more innovation and a greater range of cloud services, we are starting to see enterprises and developers and ops and everybody else suffer from something which I would call cloud entropy. So you make your beautiful application. It's fantastically architected. But over time, it becomes more disorganized. Virtual machines are left on. Containers are orphaned. Patches aren't applied uniformly. Versions become out of date.

And something that was once pristine and lovely slowly starts to become disordered, a bit like my daughter's bedroom. You tidy it up, and then a few hours later, it's in complete disarray. And this is happening because the application evolves. People are tweaking it, adding to it, more capability is added. And obviously, this is what should be happening. Any application that is set in the way it was done 10 years ago isn't going to be relevant anymore. Developers want it to be secure, performant, well-designed, and attractive. It's what the users want. So it should be doing that.

But evolution naturally brings complexity. Now look at a single-celled amoeba compared to humans. Developers want to use the best packages and tools and services so that their applications become better. And for me, the value in Kubernetes is enabling this complexity to take place while ensuring it doesn't get out of hand. So allowing applications to scale, to be developed, to evolve in the correct manner across venues such that they don't become so complex that the entropy means they start failing in terms of performance, of availability, of security.

So for me, Kubernetes means companies and developers can fully grab hold of this complexity and all the range of options they have available to them while knowing that there is something keeping track of all this complexity so that you can make sure that performance, availability, security, scalability, and all these great characteristics stay in place.

ADAM GLICK: Can you put a dollar figure on that?

OWEN ROGERS: Quantifying this value is difficult. And I get asked these types of questions all the time. And I have been thinking about it, but the challenge is that there is no upper limit to how great the saving or the benefit can be. Now in the cost-saving elements, you can quantify the value relatively easily, as there's an advantage compared to previously.

So let's say a company started using Kubernetes and they saved $100,000 compared to the way they did it before. That's very easy to quantify. But it's very difficult to understand business value, because there's no way of knowing what the situation was like before Kubernetes. Let's say, for example, you have a cloud application and it's running via Kubernetes Engine. And because of this, it demonstrates reliability, which wins a $10 billion contract for the company.

Now, you could argue the value of Kubernetes there was winning that $10 billion. So it's very difficult to come up with a definitive measure of how valuable something like Kubernetes is.

CRAIG BOX: Kubernetes is a toolkit that allows you to scale systems horizontally and to do distributed systems very easily. But there is equally an argument to be made for just buying the biggest machine that the workload can use. And especially today with multicore CPUs and RAM being very cheap, it is possible to spec out an individual machine such that you don't actually have to scale your workload across more than one, except in the case of high availability.

Stack Overflow were an early example of this. They had a lot of posts talking about how they would just build everything on the biggest possible box there is and use what was traditional monolithic software, things like Microsoft SQL Server. How do you factor in the cost differential for scaling out versus scaling up?

OWEN ROGERS: I'm not a huge fan of buying the biggest virtual machine you can and just having the monolithic approach. The challenge is that-- well, we can go back to the suitcase analogy. Let's say you buy the biggest suitcase you could possibly want to buy. And slowly you start filling it up with everything you want to take on your vacation.

Now, if you only take one pair of socks, you've bought this huge suitcase, and you only need a tiny part of it. So essentially, you're paying that $100 just to take a pair of socks because you've paid for the whole suitcase. And I think the same thing can happen with virtual machines, containers, or any other cloud resource. You can buy this huge, massive workload but actually only end up consuming a very small amount of it. And all of that investment you've made is a sunk cost, a sunk cost you can never get back.

Now if you were to distribute that, it would be a lot better value because you only need to consume exactly the small amount you need. And for me, this is why I think serverless is so interesting, or functions as a service. Because you can spin up those functions as a service for fractions of a second, consume a tiny bit of memory, and as soon as it's done, it's closed down and it disappears. So there's no need to invest in anything long term or decide the sizes in advance. You can just take what you need when you want it.

So for me, it's generally always better to scale in a distributed fashion than to buy a very large instance. Although it obviously all depends on if it's a legacy application and if the application benefits by being in one place. So for example, if you're determined to run your own in-memory database, it might make a lot of sense to put it on a single VM.

CRAIG BOX: If I could push the clothing analogy a little further, though, what do you think about the case where, let's say that you are charged a flat fee per item of clothing? And maybe this is a case like maybe your dry cleaner charges you a fee per item of clothing. So you can dress your baby in a onesie, which covers them head to toe, or you can have an individual shirt and skirt and hats and socks and so on. And those are individual items, and it will cost more overall. Do you think that there is a trade-off to be made there in terms of fixed cost versus elastic cost?

OWEN ROGERS: Yes. So essentially you're talking about a bundled offering versus a distinct offering. And there's really a break-even point between the two. Now let's say that you could take a whole bundle of your little baby's clothing for $50, or you could charge $5 for taking each individual item. Well, there'd be a point where it just wouldn't be worth paying for each individual item because the bundle would be better value.

And I think that is one of the cost considerations that needs to be made. Is it better to pay a fixed cost in this particular use case, or is it better to pay for each individual item? And there's a lot of uncertainty in that calculation because you might not know how much you're going to need to use in 6 months' or 12 months' time. And for me, that is part of the benefit of cloud in general, in that you don't need to make those predictions for the future. You can just scale up and down as required.
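The break-even point between the bundled and per-item prices can be computed directly. The $50 bundle and $5 per item are the figures used in the conversation; the function name `cheaper_option` is hypothetical.

```python
BUNDLE_PRICE = 50    # dollars for the whole bundle, from the example
PER_ITEM_PRICE = 5   # dollars per individual item, from the example

# Number of items at which both options cost the same.
break_even_items = BUNDLE_PRICE / PER_ITEM_PRICE


def cheaper_option(items):
    """Say which pricing model costs less for a given number of items."""
    per_item_total = items * PER_ITEM_PRICE
    if per_item_total < BUNDLE_PRICE:
        return "per-item"
    if per_item_total > BUNDLE_PRICE:
        return "bundle"
    return "either"


for items in (4, 10, 12):
    print(items, cheaper_option(items))
```

Below ten items the per-item price wins and above ten the bundle wins, which is why uncertainty about future usage matters so much to the choice.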

ADAM GLICK: I'm wondering if any of the economics in these scenarios change depending if someone is running in an on-premises data center versus running in the cloud. For instance, historically, people tried to buy the biggest box they could, especially for something like a database, because it only scaled vertically. And so you had to scale up.

CRAIG BOX: And there was a fixed one-off cost in installing it.

ADAM GLICK: Plus there was the amount of space that it took, that if you had a really powerful box, and let's say that was 2U or maybe a 4U box, that was how much space it would take, versus if you were going to buy independent smaller machines, you're talking about rack space. You're talking about physical space, heating and cooling. There's a lot of other pieces. In the cloud, people can spin up whatever they want. Do the economics change as you're talking to your customers about how they think about it depending on where they're deploying their Kubernetes workloads?

OWEN ROGERS: Yes. So I did a huge Monte Carlo simulation. So I wrote a project called the Cloud Price Index. And we collected pricing from public cloud providers and private cloud providers. And I wanted to understand where the break-even point difference was between the two. So when would on-prem be cheaper than public cloud?

And to do this, I didn't want to pollute our actual data on pricing with unfair assumptions on labor efficiency and utilization. So I built a Python simulation that went through every possible combination of labor efficiency and utilization. So labor efficiency essentially measures how well managed the on-prem deployment is. And utilization measures how well used the on-prem deployment is.

And we found a break-even point where on-prem would be lower cost. But what essentially has to happen is that on-prem workload, on-prem deployment has to have a much greater level of utilization, typically above 60%, but also has to have a higher level of labor efficiency, typically around-- I think it was 30%.

So the economics changed just because in public cloud, you don't need to necessarily think about the utilization because you can just consume what you want to use when you use it. But in private cloud, that is a major concern because you have to buy that fixed capacity usually at day one.
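The kind of sweep described above, stepping through combinations of utilization and labor efficiency, might look roughly like the following. This is not the Cloud Price Index model: every price, the cost formula, and the definition of labor efficiency as a fraction that divides a base labor cost are assumptions made purely for illustration, and the sketch uses a fixed grid rather than random sampling.

```python
PUBLIC_PRICE = 1.00      # hypothetical public cloud price per consumed unit
HARDWARE_COST = 0.40     # hypothetical on-prem cost per unit of capacity bought
BASE_LABOR_COST = 0.10   # hypothetical labor cost per unit at 100% efficiency


def on_prem_unit_cost(utilization, labor_efficiency):
    """Assumed cost per consumed unit on-prem.

    Idle capacity still has to be paid for, so hardware cost is divided by
    utilization. Less efficient operations need more labor per unit, so
    labor cost is divided by labor efficiency. Both inputs are in (0, 1].
    """
    return HARDWARE_COST / utilization + BASE_LABOR_COST / labor_efficiency


def sweep(steps=10):
    """Return every (utilization, labor_efficiency) pair on the grid where
    on-prem is cheaper than the assumed public cloud price."""
    grid = [i / steps for i in range(1, steps + 1)]
    return [
        (u, e)
        for u in grid
        for e in grid
        if on_prem_unit_cost(u, e) < PUBLIC_PRICE
    ]


winners = sweep()
print(len(winners), "of 100 combinations favor on-prem under these assumptions")
```

The useful output is the shape of the winning region rather than any single number: on-prem only comes out ahead when both utilization and labor efficiency are high enough at the same time.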

Now there are some on-prem procurement models which mean you can buy more capacity as and when you need it so you can achieve a higher level of utilization. But my recommendation would really be to do your sums and look at how you predict your future usage will be and to do some what-if analysis. Don't be ashamed to get Excel or the Google equivalent up on your screen and start playing around to see what would happen to your cost base if you were unexpectedly going to spike in demand or if things weren't going to go as well as you planned.

CRAIG BOX: A lot of companies say that they get to a scale where they believe it becomes cheaper to run their own infrastructure. I think Dropbox is an example of that. They've scaled up to the point where they say, I'm paying so much to a vendor. I'm going to bring that all back and I'm going to take that labor cost and bring it on myself. Do you think that's something that many companies will consider? Or will it be only for people who are dealing truly at that scale of data?

OWEN ROGERS: In terms of virtual machines and perhaps the containers and the microservices or anything built upon those virtual machines, I think if you're operating it very efficiently and you have tools and automation in place and you know what you're doing, then, yeah, it might be cheaper to run that infrastructure on your own premises compared to using the public cloud. And that's especially true if you have a huge deployment and you're very good at maximizing the estate.

However, I think what people forget is the cloud is so much more now than just virtual machines and compute. So even if you were able to do that and maximize your efficiency such that your on-prem is cheaper, how about if you want to run AI in the cloud or analytics or IoT or all these other technologies which cloud providers are bringing to market? It's completely feasible that you might be able to hire infrastructure experts to make infrastructure costs lower. But how about experts in all these other services which are now offered by cloud providers, from databases to whatever else you can imagine?

So for me, I would never say either or. I think realistically, users of cloud need to be open minded about where they're going to host their workloads and what services they're going to use to do them.

ADAM GLICK: You mentioned labor efficiency there. I wonder if you can talk about some of the other non-direct costs that people pay and how they impact the economics of Kubernetes and how you measure them.

OWEN ROGERS: In terms of non-direct costs, it's interesting. We talk about this term, "digital transformation" and this idea that companies are super advanced and they know exactly how much each department is spending and how much each employee is making, and they use this data to do incredible things. But I think most of the time, companies are still very much a bit of a disorganized nightmare in terms of who is paying for what.

Now if we look at a typical data center, if you're running an on-prem application, you probably know how much that server costs. You might have an idea of perhaps even the bandwidth costs associated with that. What you probably don't know is how much the company is paying from a centralized perspective for that data center, for the power, for the space, for the hands-on support and these type of things.

And for me, this is the ongoing challenge of on-prem deployments, is it's so difficult to have a real view of all your costs, that it's difficult to do a comparison at all.

ADAM GLICK: What do you think is the most misunderstood thing about the economics of Kubernetes?

OWEN ROGERS: For me, the most misunderstood issue is about complexity. And I'm hot on this complexity topic at the moment. In fact, my PhD was part of a research group called the Large-scale Complex IT systems group. And essentially, the group investigated how complexity was born out of IT and how this complexity would be managed. And I think Kubernetes is really a way of handling all this complexity.

But complexity begets more complexity. I mean, if you think of Babbage's difference engine, which was the first mechanical computer, that led to valves, which led to transistors. And at each stage of complexity, solutions were being found to manage that complexity. But those solutions were creating more complexity in their own right.

And I think that's what we're seeing happening with Kubernetes. Kubernetes is this way of solving complexity in the cloud and virtual machine and container space. But if you look at all the different projects and different vendors who are involved in the Cloud Native Computing Foundation landscape, you'll see that there is this whole new world of complexity being born as a result of the need to resolve the complexity of Kubernetes.

So I think thinking about Kubernetes needs to be part of the journey. And you can't just say, oh, we moved to Kubernetes. That's it. Now our project is done. Because things are going to evolve, and companies and developers and ops need to be aware that things are going to evolve and work out how all of these things are going to play together as things become more complex.

CRAIG BOX: Does the economic model change as more people come onto the market with experience and there are more prepackaged solutions around the space?

OWEN ROGERS: That is such a great question. So let's split this into two parts. So as more experienced people enter the market, salaries for those people should go down because there is a greater labor force and each individual is less differentiated. However, that's not true if the demand for those skills rises.

Now, I think demand for Kubernetes skills is probably rising faster than the labor force is supplying them. So even though there are more experienced people coming out into the market, I think that it's getting harder to find those people. Now, prepackaged solutions are a way of dealing with some of this issue. Because if you have a managed service provider who offers to orchestrate and manage your Kubernetes estate, then they're going to get the benefit of aggregating many users' requirements into a smaller amount of people.

But I think it is going to be challenging to find the skills. And if I was going to a university now-- let's say I was 18, just out of high school and I was choosing my options-- I would have to wonder if it might be better training myself using cloud technology, following some cloud certifications, and learning about Kubernetes rather than investing a lot of money in a university education.

CRAIG BOX: Given that you could have said the same thing 15 years ago with WSDL or SOAP, for example, and some of these things might not look so good in hindsight, do you feel that people should be investing in a technology of the day versus the fundamental underpinnings of all technology?

OWEN ROGERS: Cloud is very much an enterprise reality. I would argue that pretty much every company out there is doing something in cloud, whether it be infrastructure, platform, or software. But just because enterprises and companies are using a bit of cloud doesn't mean there isn't still a huge amount of cloud out there yet to be adopted.

And I think that is the difference. With older technologies, there was a lot of talk about them and a lot of excitement about them. But it wasn't shown that they were being adopted now. And there also wasn't the evidence that there would be a need for adoption in the future. And I think the difference with cloud is it is being adopted. It has been adopted. But there are still lots of brownfield and greenfield opportunities for using cloud to add value to businesses. And for me, that is the power of cloud. It's a real thing that's here that has a lot of potential.

Now of course, you could argue about the fundamentals. And I think the fundamentals do matter. For me, I code scripts and things like that to work on data. And coding is a big deal. But also during my university education, I learned about the history of computing, which is interesting. But would that time have been better spent learning about AWS, Google, Microsoft, Oracle, IBM, or Alibaba, or anyone else out there? Probably yeah.

ADAM GLICK: Many of the folks that listen to this podcast are looking at the Kubernetes deployments that they have and looking to grow them. We know that almost all organizations have some sort of Kubernetes deployment, but many of them are still fairly small in comparison, like you're talking about, to the entire IT department and entire IT spend. How should they be thinking about designing a cost model to help people understand what it looks like and where it makes sense for them to expand that?

OWEN ROGERS: The cost model is an interesting issue because I would argue that the first step to making a cost model would be to do the value model. So try and model the return on the investment of moving to Kubernetes. Now, if you are a small company and you want to start using containers, you have to work out how much is it going to cost to train those people? How much is it going to cost to get things up and running and to use it in practice?

Now, because a lot of it is open source, it might be really inexpensive. And in that case, I would argue maybe you don't need to do a cost model because the value proposition is so obvious in that you already have the skilled people. The technology is already there. You can adopt it with very little risk. And it will give you a lot of advantages.

Now if you're a bit larger or you're migrating something from traditional to containers, you might want to understand a bit more. And this is where a cost model is important. But again, don't be deflected from the value. What value will you gain from moving to containers? Will you be able to penetrate new markets quicker? Will you be able to improve your reputation by being able to update applications quicker? Work out what are the benefits of your proposed course of action and try and quantify them, even though I know it's incredibly difficult.

And then when it comes down to the cost elements, work out where your containers, where your Kubernetes deployments are going to sit. Now, most hyperscaler service providers, to a limited degree, offer some kind of container engines for free. You just pay for the underlying virtual machines. And it's crucial to understand all of the components that will go into that cost model. So, for example, in that one, you'd need to look at the cost of virtual machines, as well as bandwidth and storage.

But for me, the most critical part is to do a what-if analysis. Because you can do a cost model and make predictions. But how will those predictions change if things go better than planned or if they go worse than planned? So understand the costs in those different scenarios, but also understand the value. Because if your costs go out of control because things are going well, that doesn't really matter if you're making a huge amount of revenue in return.
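
The what-if analysis described above can be sketched in a few lines of Python. This is a minimal illustration, not a pricing tool: the unit prices, the three scenarios, and every name in the code (`Scenario`, `monthly_cost`, `monthly_margin`) are hypothetical assumptions, and the only cost components modeled are the ones mentioned in the conversation (virtual machines, bandwidth, and storage).

```python
from dataclasses import dataclass

# Hypothetical unit prices; substitute your provider's actual rates.
VM_PRICE_PER_MONTH = 70.0      # per worker virtual machine
EGRESS_PRICE_PER_GB = 0.10     # outbound bandwidth
STORAGE_PRICE_PER_GB = 0.05    # persistent storage


@dataclass
class Scenario:
    name: str
    vm_count: int            # worker virtual machines in the cluster
    egress_gb: float         # outbound bandwidth per month, in GB
    storage_gb: float        # persistent storage, in GB
    monthly_revenue: float   # value attributed to the workload


def monthly_cost(s: Scenario) -> float:
    """Sum the three cost components for one scenario."""
    return (s.vm_count * VM_PRICE_PER_MONTH
            + s.egress_gb * EGRESS_PRICE_PER_GB
            + s.storage_gb * STORAGE_PRICE_PER_GB)


def monthly_margin(s: Scenario) -> float:
    """Value minus cost: the number that matters when costs grow."""
    return s.monthly_revenue - monthly_cost(s)


# Made-up scenarios: things go worse, as planned, or better than planned.
scenarios = [
    Scenario("worse than planned", 3, 200, 500, 400),
    Scenario("as planned", 5, 1000, 1000, 2000),
    Scenario("better than planned", 20, 10000, 4000, 12000),
]

for s in scenarios:
    print(f"{s.name:>20}: cost {monthly_cost(s):>10,.2f}  "
          f"margin {monthly_margin(s):>10,.2f}")
```

With these made-up numbers, the better-than-planned scenario has by far the highest cost but also the largest margin, which is the point being made: rising costs are not a problem in themselves if the value rises with them.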

CRAIG BOX: One of the ways you can trade cost for value is by choosing to make a system more or less reliable. You can scale it from one machine to many, from running in one zone to running in many, from running within only one region to running globally. And a lot of that comes down to how many nines you would like the service to be available for. There are exponential trade-offs required or extra cost required to get each extra nine. How should the business make the decision of how reliable they want to offer a service?

OWEN ROGERS: Risk is a measure, essentially, of the probability of something happening and the result of that small probability taking place. So let's say I had a test workload or a development workload, and the probability of failing was 1 in 10. But if it failed, it would cost me nothing. Well, in that case, I think the risk assessment would be that it's not actually worth spending a huge amount of money on making it available because even if it's not available, then it's not the end of the world.

Now let's say you have a mission-critical application with an ill-advised single point of failure. And if that single point of failure goes down, you'll lose a million dollars every day. Well, in that case, it probably is worth investing in because you should make sure you're getting the best range of availability without having to risk what's going to happen to your mission-critical application.

Now as you've said, the best way of dealing with this is to build across multiple regions, multiple availability zones. But I would say try and look at each workload in its individuality. Some workloads need that high level of availability and should be duplicated, made redundant. Others don't need it and can just be run wherever is necessary.

For me, it's all about the hybrid understanding of workloads. I'm not saying applications need to run across on-prem and cloud environments. I'm just saying that workloads on an individual level will have different requirements, and it's best to work out which venue is best for each of them.
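
To make the risk reasoning above concrete, here is a minimal sketch that treats risk as probability multiplied by impact. It assumes, as a simplification, that "N nines" of availability means the service is down for a fraction 10^-N of the year and that losses accrue evenly per day of downtime. The function names are hypothetical, and the million-dollars-per-day figure is the one used in the conversation.

```python
HOURS_PER_YEAR = 365 * 24  # ignoring leap years


def downtime_hours_per_year(nines: int) -> float:
    """Expected downtime for an availability of N nines (2 -> 99%, 3 -> 99.9%)."""
    unavailability = 10 ** -nines
    return HOURS_PER_YEAR * unavailability


def expected_annual_loss(nines: int, loss_per_day: float) -> float:
    """Probability-weighted impact: downtime in days times the daily loss."""
    return downtime_hours_per_year(nines) / 24 * loss_per_day


# A test workload that costs nothing when it is down, and a
# mission-critical one that loses a million dollars per day of downtime.
for label, loss_per_day in [("test workload", 0.0),
                            ("mission-critical", 1_000_000.0)]:
    for nines in (2, 3, 4):
        loss = expected_annual_loss(nines, loss_per_day)
        print(f"{label:>17} at {nines} nines: "
              f"{downtime_hours_per_year(nines):7.3f} h down, "
              f"expected loss {loss:>12,.0f}")
```

Under these assumptions, moving the mission-critical workload from two nines to three cuts the expected annual loss from about 3.65 million to about 365,000, so an extra nine is worth buying if it costs less than the difference. For the test workload the expected loss is zero at every level, so no spend on availability is justified.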

ADAM GLICK: Is the economics of Kubernetes different than that of previous IT infrastructure?

OWEN ROGERS: I think so. And that's partly because of the suitcase analogy. So in traditional infrastructure, you have a single operating system, typically. And on that operating system would run an application or, at best, a couple of applications. And virtual machines were the first step in breaking up that server to get better utilization.

With Kubernetes, I would say it is a different model because you can squeeze more resources or squeeze more applications or microservices on a single server. But also those microservices can be more dynamic. They can grow and shrink and disappear from existence, meaning that you can compress more on the server, but also compress more at different times of the workload's requirements.
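
The packing idea can be illustrated with a toy bin-packing sketch. It uses first-fit decreasing, a textbook heuristic chosen here for brevity; it is not how the Kubernetes scheduler places pods, and the CPU requests and server capacity below are hypothetical.

```python
def first_fit_decreasing(requests: list[int], capacity: int) -> int:
    """Return how many servers are needed to pack the given CPU requests.

    requests: CPU request of each microservice, in millicores.
    capacity: CPU capacity of one server, in millicores.
    """
    free = []  # remaining capacity of each server opened so far
    for request in sorted(requests, reverse=True):
        if request > capacity:
            raise ValueError(f"request {request} exceeds server capacity")
        for i, remaining in enumerate(free):
            if request <= remaining:
                free[i] = remaining - request  # place on an existing server
                break
        else:
            free.append(capacity - request)    # open a new server
    return len(free)


# Hypothetical microservice CPU requests (millicores) and server size.
cpu_requests = [500, 250, 1500, 750, 1000, 250, 2000, 500, 1250]
server_capacity = 4000

one_app_per_server = len(cpu_requests)
packed = first_fit_decreasing(cpu_requests, server_capacity)
print(f"one application per server: {one_app_per_server} servers")
print(f"packed onto shared servers: {packed} servers")
```

In this example the nine services request 8,000 millicores in total, so they fit on two 4,000-millicore servers instead of nine dedicated ones. The sketch is static; the dynamic part of the argument, services growing and shrinking over time, would mean re-running the packing as requests change.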

CRAIG BOX: As we start moving to a world where you can calculate the cost of a workload by simply saying, this is its size versus an amount you pay per usage, so functions as a service and then vendors charging you per container, for example, we're starting to see people offer services for optimizing the cost of Kubernetes environments. Some of it is looking at running a workload in a slightly cheaper place. And some of it is looking at condensing work into smaller numbers of machines in order to drive that cost down. Do you see that as being particularly relevant to the customers you talk to?

OWEN ROGERS: The customers I talk to generally have a desire to be able to move workloads and containers and pretty much any resources from venue to venue depending on specifications for cost against performance versus a bunch of other attributes. But most of those companies recognize that this is a distant dream and that there might be a time in the future when they can broker these resources across different venues to cost optimize. But at the moment, they're more worried about getting things running and trying to make it the most cost effective from day one.

So I think this cost optimization is relevant. But it's only relevant in terms of thinking of optimizing the whole estate. So how is the application's performance best balanced against cost? And how can this be done on a regular rather than on an ongoing basis?

CRAIG BOX: What about systems for analyzing bills or charging back to individual departments? Do you think that they will start driving people's behavior?

OWEN ROGERS: Most companies I talk to, again, have an aspiration of a chargeback system. So being able to say, well, marketing department, you spent this much on your IT. This is how much you owe us. I think in reality, very few companies have that culture of scalability. Now, some of that is down to just knowledge and expertise. The marketing department, even if they understand that they have to pay for resources, they probably don't know how to turn that technical resource knowledge into deliverables and into valuable things to the marketing department.

And also, there's the issue of cultural scalability-- so convincing all these different departments, including the finance department, that it is worth making a spontaneous decision to invest far more in virtual machines to address a retail option than to cut back on costs immediately when you see things spiraling out of control. So there is an aspiration for this chargeback. I'd say most companies have some idea of showback, but I wouldn't say it's hugely accurate at the moment.
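
As a rough illustration of showback, the sketch below splits a shared cluster bill across departments in proportion to the CPU each one requests. Proportional allocation by CPU request is just one simple convention assumed here; it is not something prescribed in the conversation, and the department names, figures, and the `showback` function are hypothetical.

```python
def showback(total_bill: float, cpu_by_department: dict[str, int]) -> dict[str, float]:
    """Split a shared bill in proportion to each department's CPU requests."""
    total_cpu = sum(cpu_by_department.values())
    if total_cpu == 0:
        raise ValueError("no CPU requests to allocate the bill against")
    return {
        department: total_bill * cpu / total_cpu
        for department, cpu in cpu_by_department.items()
    }


# Hypothetical monthly cluster bill and CPU requests in millicores.
monthly_bill = 10_000.0
cpu_requests = {"marketing": 2000, "engineering": 6000, "finance": 2000}

for department, share in showback(monthly_bill, cpu_requests).items():
    print(f"{department:>12}: {share:>10,.2f}")
```

A report like this only shows each department what it consumed. Turning it into chargeback, where the department actually pays, is the cultural step that the answer above describes as still aspirational for most companies.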

ADAM GLICK: You've done a lot of work on cloud economics and Kubernetes economics. When you're working with companies, how do you help those people understand the cost of their IT? What is it that you do with them to help them understand their situation?

OWEN ROGERS: I would say the first thing to do is to understand what they are doing now. And I would say there's a general lack of clarity. And some of that is due to the entropy of cloud and technology in that they've had these estates for decades. All the people who have managed them have moved on. No one really knows how to manage them. And then to talk about migrating them to the cloud is a big step further in the first place.

And then you have the entropy of companies who are already in the cloud who've started up some test and dev workloads and then it spiraled out of control into production. So I would say the first step in optimizing and understanding their costs is just to understand what is out there already, who's in charge of it, and how much it matters to the business.

ADAM GLICK: Gotcha. And now a quick economics lightning round. Macro or micro?

OWEN ROGERS: Definitely micro, I think.

ADAM GLICK: Keynes or Hayek?

OWEN ROGERS: Definitely Keynes.

ADAM GLICK: Supply or demand side?

OWEN ROGERS: Supply side.

ADAM GLICK: CAPEX or OPEX?

OWEN ROGERS: That's a tough one. Would I rather buy a car or lease a car? OPEX, I think.

ADAM GLICK: Fixed price or spot?

OWEN ROGERS: Spot, definitely. In fact, I don't think that users are taking as much advantage of spot as they could.

ADAM GLICK: Gotcha. Thank you very much for coming on the show. And it's been absolutely fabulous to talk to you this week.

OWEN ROGERS: My pleasure. I hope it was useful.

ADAM GLICK: You can find Owen on Twitter, @owenrog, and you can find his latest work and the work from his firm at 451research.com.

[MUSIC PLAYING]

Thanks for listening. As always, if you've enjoyed the show, please help us spread the word and tell a friend. If you have any feedback for us, you can find us on Twitter @kubernetespod or reach us by email at kubernetespodcast@google.com.

CRAIG BOX: You can also check out our website at KubernetesPodcast.com where you will find transcripts and show notes. Until next time, take care.

ADAM GLICK: Catch you next week.

[MUSIC PLAYING]