Cost optimisation on Kubernetes
Key takeaways from Google's report
I read Google's new report on Kubernetes cost management and took some notes. We all have to work on cost optimisation, so I'm sharing what I gathered in this post.
It talks about why saving on costs is so critical right now, given the economic climate. Then it goes deep into how Kubernetes, if not managed well, can quietly drain your budget.
Sounds about right. So, what solutions does the report suggest?
The report mentions four golden signals for cost management on Kubernetes. First up, workload rightsizing: make sure the resources you allocate match what the workload actually needs. No more, no less. This alone can save a ton.
But how do I rightsize the workloads?
You have to look at the workload's actual resource consumption, understand its patterns, and then configure the “requests” in your deployment configuration to allocate “just the right amount” of resources.
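Here's a minimal sketch of what that looks like in practice. The workload name, image, and the numbers themselves are all hypothetical; in reality you'd derive the values from observed usage (metrics-server, your monitoring stack, or a recommender like VPA in recommendation mode).

```yaml
# Illustrative Deployment fragment: values are placeholders, not recommendations.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api-server          # hypothetical workload
spec:
  replicas: 2
  selector:
    matchLabels:
      app: api-server
  template:
    metadata:
      labels:
        app: api-server
    spec:
      containers:
        - name: api-server
          image: example.com/api-server:1.0   # placeholder image
          resources:
            requests:
              cpu: "250m"       # sized from observed usage, plus some headroom
              memory: "256Mi"
            limits:
              memory: "512Mi"   # memory limit guards against runaway leaks
```

The requests are what the scheduler uses to place pods, so these numbers directly drive how many nodes you end up paying for.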
Makes sense. What's the next signal?
They talk about demand-based downscaling. It's like having a smart home that turns off the AC when you're not around. Kubernetes can scale down resources automatically when the demand is low. It's a neat trick to keep costs in check.
How do I do this?
You downscale them when they're not in use. You may have workloads with hardly any users at certain times of the day or week; during those times, you scale down your workloads AND your cluster as well. In the context of Kubernetes, the Horizontal Pod Autoscaler (HPA), Vertical Pod Autoscaler (VPA), and Cluster Autoscaler (CA) are your levers for implementing this scaling.
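As a sketch, here's what demand-based downscaling looks like with an HPA. The target deployment name and the thresholds are assumptions; the `autoscaling/v2` API shown is the current stable HPA API.

```yaml
# Illustrative HPA: scales a hypothetical "api-server" Deployment between
# 1 and 10 replicas based on average CPU utilisation across its pods.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: api-server-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: api-server        # hypothetical target workload
  minReplicas: 1            # floor during quiet hours
  maxReplicas: 10           # ceiling during peak demand
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70   # add replicas when average CPU exceeds 70%
```

As the HPA removes replicas during quiet periods, the Cluster Autoscaler can then drain and remove the now-empty nodes, which is where the actual cost saving lands.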
I like that. What else?
Then there's cluster bin packing. It's about arranging your workloads to use as few nodes as possible. Think of it like packing for a vacation: pack well and you'll fit more stuff into fewer bags. In the context of K8s, you have to understand your workloads and then configure proper requests and limits, so that pods fit onto your nodes optimally and you need fewer nodes as a result.
That sounds like a reliability risk. Aren't we packing things too tightly?
That's a fair concern. You will of course have to use Pod Disruption Budgets, affinity rules, topology spread constraints, etc. to ensure that pod placement doesn't create downtime or reliability risks.
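For example, a Pod Disruption Budget can guarantee a minimum number of replicas stays up while the Cluster Autoscaler drains underutilised nodes. The workload name and replica count below are hypothetical:

```yaml
# Illustrative PDB: keeps at least 2 replicas of a hypothetical "api-server"
# running during voluntary disruptions such as node drains.
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: api-server-pdb
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: api-server
```

You'd pair this with `topologySpreadConstraints` in the pod spec (e.g. with `topologyKey: kubernetes.io/hostname`) so that tight packing can't land every replica on the same node.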
Hmmm. And the last signal?
Discount coverage. Talk to your cloud provider about discounted rates. On AWS, for example, you can use Spot or Reserved Instances after analysing your usage, and save costs that way.
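Once you have Spot capacity, you still have to steer the right workloads onto it. A sketch, assuming an EKS managed Spot node group (which sets the `eks.amazonaws.com/capacityType: SPOT` label) and a taint you've applied yourself to keep other workloads off those nodes:

```yaml
# Illustrative pod-spec fragment: pins a fault-tolerant workload to Spot
# nodes. The nodeSelector label is set by EKS managed node groups; other
# platforms use different labels (e.g. GKE's cloud.google.com/gke-spot).
nodeSelector:
  eks.amazonaws.com/capacityType: SPOT
tolerations:
  - key: "spot"            # assumed taint you applied to your Spot node group
    operator: "Equal"
    value: "true"
    effect: "NoSchedule"
```

Reserve this for workloads that tolerate interruption (batch jobs, stateless replicas behind a PDB); keep anything interruption-sensitive on on-demand or reserved capacity.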
Got it. It's about being strategic with your resources and spending.
Exactly. The key takeaway they highlight is that your resource “request” configurations should be optimal.
Summary
Understand your Kubernetes workloads and then use Kubernetes features to optimally allocate resources.
Understand the usage patterns of your services and down-scale whenever you can.
Work with your cloud vendor to get discounted prices.
There is more in the report and I recommend you read it.

