diff --git a/index.html b/index.html
index c80470b..24f6e0f 100644
--- a/index.html
+++ b/index.html
@@ -223,5 +223,5 @@
When optimizing our cluster costs we want to focus on all of these areas iteratively - in order to keep our clusters as cost-effective and performant as needed.
Now let's explain each of these focus areas in more detail.
Pinpointing the exact memory and CPU requests for our pods is hard - it requires observing the application behaviour under production load over a significant time period.
-Therefore most engineers prefer to err towards overprovisioning - i.e setting requests much higher than the application will ever use.
-This delta between what an app actually uses and the number set in the pod container requests is wasted resources.
-This leads to a large amount of allocated but unutilized resources all across the cluster. Just imagine your cluster runs 200 pods and each of them requests 100Mb more memory than it actually uses. Altogether you'll have 20Gb of wasted RAM across the cluster. These resources will be provisioned, paid for, but never actually used.
+Wasted resources are the resources that have been allocated but not utilized.
+In most unoptimized clusters we're observing up to 50% of waste, which translates to thousands of dollars or euros monthly.
+This waste comes from over-provisioning the containers in our pods.
+Read on to understand the reasons and perils of over-provisioning.
Kubernetes comes with a promise of automatic bin packing. I.e - it is supposed to fit the largest possible amount of pods on every node in the cluster. But this is again dependent on engineers correctly defining 1) resource requests and 2) node sizes. Even with smart and well-tuned autoscaling tools like Karpenter this doesn't always work and we find ourselves with nodes that are more than half empty - with resources that were neither requested nor utilized. All these are idle resources and taking care of reducing them is an important focus area of Kubernetes Cost Optimization.
Pinpointing the exact memory and CPU requests for our pods is hard - it requires observing the application behaviour under production load over a significant time period.
+Therefore most engineers prefer to err towards overprovisioning - i.e. setting requests much higher than the application will ever use.
+This leads to a large amount of allocated but unutilized resources all across the cluster. Just imagine your cluster runs 200 pods and each of them requests 100MB more memory than it actually uses. Altogether you'll have 20GB of wasted RAM across the cluster. These resources will be provisioned, paid for, but never actually used.
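The 200-pod figure added in the hunk above is easy to sanity-check. The following sketch reproduces the arithmetic; the pod count and per-pod delta come from the text, and decimal megabytes are assumed:

```python
# Sanity check of the waste example: 200 pods, each requesting
# 100 MB more memory than it actually uses.
pods = 200
overprovision_mb_per_pod = 100  # requested minus actually-used, per pod

wasted_mb = pods * overprovision_mb_per_pod
wasted_gb = wasted_mb / 1000  # decimal GB, matching the text's figure
print(f"{wasted_gb:.0f} GB of allocated-but-unused memory")  # prints "20 GB ..."
```

The same delta multiplied across every workload is why per-pod over-provisioning compounds into cluster-wide waste.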
diff --git a/search/search_index.json b/search/search_index.json
index 0c618fb..b41cb6f 100644
--- a/search/search_index.json
+++ b/search/search_index.json
@@ -1 +1 @@
-{"config":{"indexing":"full","lang":["en"],"min_search_length":3,"prebuild_index":false,"separator":"[\\s\\-]+"},"docs":[{"location":"","text":"Kubernetes Cost Optimization Welcome to the Kubernetes Cost Optimization Guide This website provides the learning materials for engineers wishing to optimize their K8s cluster costs without compromizing performance and reliability. If you find issues with any of the website materials - please send us a note. Browse our guides: The Golden Signals of Kubernetes Cost Optimization The 4 Areas of Kubernetes Cost Optimization Balancing Cost with Performance and Reliability Kubernetes Workload Rightsizing Pod Autoscaling Cluster Autoscaling Leveraging Cloud Discounts","title":"Home"},{"location":"#kubernetes-cost-optimization","text":"","title":"Kubernetes Cost Optimization"},{"location":"#welcome-to-the-kubernetes-cost-optimization-guide","text":"This website provides the learning materials for engineers wishing to optimize their K8s cluster costs without compromizing performance and reliability. If you find issues with any of the website materials - please send us a note. 
Browse our guides: The Golden Signals of Kubernetes Cost Optimization The 4 Areas of Kubernetes Cost Optimization Balancing Cost with Performance and Reliability Kubernetes Workload Rightsizing Pod Autoscaling Cluster Autoscaling Leveraging Cloud Discounts","title":"Welcome to the Kubernetes Cost Optimization Guide"},{"location":"cloud-discounts/","text":"Leveraging Cloud Discounts Spot VMs and Best Effort Pods Combining Spot and On-Demand Fleets Applying Saving Plans to your Kubernetes clusters","title":"Leveraging Cloud Discounts"},{"location":"cloud-discounts/#leveraging-cloud-discounts","text":"Spot VMs and Best Effort Pods Combining Spot and On-Demand Fleets Applying Saving Plans to your Kubernetes clusters","title":"Leveraging Cloud Discounts"},{"location":"cluster-autoscaling/","text":"Cluster Autoscaling Cluster-autoscaler Karpenter Improving Kubernetes Bin Packing","title":"Cluster Autoscaling"},{"location":"cluster-autoscaling/#cluster-autoscaling","text":"Cluster-autoscaler Karpenter Improving Kubernetes Bin Packing","title":"Cluster Autoscaling"},{"location":"cost-perf-r9y/","text":"Balancing Cost with Performance and Reliability Kubernetes cost optimization comes down to pinpointing the correct resource allocations and auto-scaling factors for our workloads. But \"correct\" in this context doesn't mean \"the least possible amount of resources\". It's a delicate interplay of cost vs. performance vs.reliability. In order to run our clusters in the most cost-effective way without compromising either performance or reliability it's vitally inportant to understand the Pod QoS model and the implications of PodDisruptionBudget. 
Understanding the Pod QoS Model https://services.google.com/fh/files/misc/state_of_kubernetes_cost_optimization.pdf PodDisruptionBudget and application disruption","title":"Balancing Cost with Performance and Reliability"},{"location":"cost-perf-r9y/#balancing-cost-with-performance-and-reliability","text":"Kubernetes cost optimization comes down to pinpointing the correct resource allocations and auto-scaling factors for our workloads. But \"correct\" in this context doesn't mean \"the least possible amount of resources\". It's a delicate interplay of cost vs. performance vs.reliability. In order to run our clusters in the most cost-effective way without compromising either performance or reliability it's vitally inportant to understand the Pod QoS model and the implications of PodDisruptionBudget. Understanding the Pod QoS Model https://services.google.com/fh/files/misc/state_of_kubernetes_cost_optimization.pdf PodDisruptionBudget and application disruption","title":"Balancing Cost with Performance and Reliability"},{"location":"golden-signals/","text":"The Golden Signals The 4 \"golden signals\" of Kubernetes Cost Optimization as defined in a whitepaper released by Google Cloud in June 2023. Signal Group 1.Workload Rightsizing Resources 2.Demand-based Downscaling 3.Cluster Bin Packing 4.Cloud Provider Discount Coverage Cloud Discounts These signals help us apply and measure cost optimization for Kubernets clusters. The 3 signals in the resources group apply to all clusters - be it on-prem or on-cloud. The cloud discounts naturally only apply to cloud-based managed clusters, where it is very important to pinpoint the instance types and reservation level of our cluster nodes. Let's explain each signal in a bit more detail. The Resources Group Signal Explanation Workload Rightsizing Refers to our ability to allocate the amount of resources that the workloads actually need and adapt resource requests and limits as application requirements change. 
Demand based autoscaling Measures the capacity of developers and platform admins to make clusters scale down during off-peak hours. Cluster bin packing Refers to our ability to measure and utilize the CPU and memory of each node in the most effective and reliable way through correct Pod placement. The Cloud Discounts group Signal Explanation Cloud Discount Coverage Refers to leveraging cloud VM instances that offer discounts, such as Spot VMs, as well as the ability of budget owners and FinOps professionals to take advantage of long-term continuous use discounts offered by cloud providers.","title":"The Golden Signals of Kubernetes Cost Optimization"},{"location":"golden-signals/#the-golden-signals","text":"The 4 \"golden signals\" of Kubernetes Cost Optimization as defined in a whitepaper released by Google Cloud in June 2023. Signal Group 1.Workload Rightsizing Resources 2.Demand-based Downscaling 3.Cluster Bin Packing 4.Cloud Provider Discount Coverage Cloud Discounts These signals help us apply and measure cost optimization for Kubernets clusters. The 3 signals in the resources group apply to all clusters - be it on-prem or on-cloud. The cloud discounts naturally only apply to cloud-based managed clusters, where it is very important to pinpoint the instance types and reservation level of our cluster nodes. Let's explain each signal in a bit more detail.","title":"The Golden Signals"},{"location":"golden-signals/#the-resources-group","text":"Signal Explanation Workload Rightsizing Refers to our ability to allocate the amount of resources that the workloads actually need and adapt resource requests and limits as application requirements change. Demand based autoscaling Measures the capacity of developers and platform admins to make clusters scale down during off-peak hours. 
Cluster bin packing Refers to our ability to measure and utilize the CPU and memory of each node in the most effective and reliable way through correct Pod placement.","title":"The Resources Group"},{"location":"golden-signals/#the-cloud-discounts-group","text":"Signal Explanation Cloud Discount Coverage Refers to leveraging cloud VM instances that offer discounts, such as Spot VMs, as well as the ability of budget owners and FinOps professionals to take advantage of long-term continuous use discounts offered by cloud providers.","title":"The Cloud Discounts group"},{"location":"over-under-idle-waste/","text":"The 4 Focus Areas When starting out with Kubernetes Cost Optimization it's important to understand what to focus on. Redundant costs come from 2 main sources: wasted resources and idle resources . Both of these are usually caused by over-provisioning , intentional or unintentional. On the other hand - thoughtless cost reduction activity can lead to under-provisioning , which causes performance and reliability issues. When optimizing our cluster costs we want to focus on all of these areas iteratively - in order to keep our clusters as cost-effective and performant as needed. Now let's explain each of these focus areas in more detail. Wasted Resources Pinpointing the exact memory and CPU requests for our pods is hard - it requires observing the application behaviour under production load over a significant time period. Therefore most engineers prefer to err towards overprovisioning - i.e setting requests much higher than the application will ever use. This delta between what an app actually uses and the number set in the pod container requests is wasted resources . This leads to a large amount of allocated but unutilized resources all across the cluster. Just imagine your cluster runs 200 pods and each of them requests 100Mb more memory than it actually uses. Altogether you'll have 20Gb of wasted RAM across the cluster. 
These resources will be provisioned, paid for, but never actually used. Idle Resources Kubernetes comes with a promise of automatic bin packing. I.e - it is supposed to fit the largest possible amount of pods on every node in the cluster. But this is again dependent on engineers correctly defining 1) resource requests and 2) node sizes. Even with smart and well-tuned autoscaling tools like Karpenter this doesn't always work and we find ourselves with nodes that are more than half empty - with resources that were neither requested nor utilized. All these are idle resources and taking care of reducing them is an important focus area of Kubernetes Cost Optimization. Over Provisioning Under Provisioning","title":"The 4 Focus Areas of Kubernetes Cost Optimization"},{"location":"over-under-idle-waste/#the-4-focus-areas","text":"When starting out with Kubernetes Cost Optimization it's important to understand what to focus on. Redundant costs come from 2 main sources: wasted resources and idle resources . Both of these are usually caused by over-provisioning , intentional or unintentional. On the other hand - thoughtless cost reduction activity can lead to under-provisioning , which causes performance and reliability issues. When optimizing our cluster costs we want to focus on all of these areas iteratively - in order to keep our clusters as cost-effective and performant as needed. Now let's explain each of these focus areas in more detail.","title":"The 4 Focus Areas"},{"location":"over-under-idle-waste/#wasted-resources","text":"Pinpointing the exact memory and CPU requests for our pods is hard - it requires observing the application behaviour under production load over a significant time period. Therefore most engineers prefer to err towards overprovisioning - i.e setting requests much higher than the application will ever use. This delta between what an app actually uses and the number set in the pod container requests is wasted resources . 
This leads to a large amount of allocated but unutilized resources all across the cluster. Just imagine your cluster runs 200 pods and each of them requests 100Mb more memory than it actually uses. Altogether you'll have 20Gb of wasted RAM across the cluster. These resources will be provisioned, paid for, but never actually used.","title":"Wasted Resources"},{"location":"over-under-idle-waste/#idle-resources","text":"Kubernetes comes with a promise of automatic bin packing. I.e - it is supposed to fit the largest possible amount of pods on every node in the cluster. But this is again dependent on engineers correctly defining 1) resource requests and 2) node sizes. Even with smart and well-tuned autoscaling tools like Karpenter this doesn't always work and we find ourselves with nodes that are more than half empty - with resources that were neither requested nor utilized. All these are idle resources and taking care of reducing them is an important focus area of Kubernetes Cost Optimization.","title":"Idle Resources"},{"location":"over-under-idle-waste/#over-provisioning","text":"","title":"Over Provisioning"},{"location":"over-under-idle-waste/#under-provisioning","text":"","title":"Under Provisioning"},{"location":"pod-autoscaling/","text":"Pod Autoscaling Horizontal HPA CPU Memory Custom metrics KEDA Vertical VPA Goldilocks","title":"Pod Autoscaling"},{"location":"pod-autoscaling/#pod-autoscaling","text":"Horizontal HPA CPU Memory Custom metrics KEDA Vertical VPA Goldilocks","title":"Pod Autoscaling"},{"location":"rightsizing/","text":"Kubernetes Workload Rightsizing Requests Memory CPU Limits Memory CPU Understanding CPU throttling Defining resource guardrails LimitRange NamespaceQuota","title":"Kubernetes Workload Rightsizing"},{"location":"rightsizing/#kubernetes-workload-rightsizing","text":"Requests Memory CPU Limits Memory CPU Understanding CPU throttling Defining resource guardrails LimitRange NamespaceQuota","title":"Kubernetes Workload Rightsizing"}]}
\ No newline at end of file
+{"config":{"indexing":"full","lang":["en"],"min_search_length":3,"prebuild_index":false,"separator":"[\\s\\-]+"},"docs":[{"location":"","text":"Kubernetes Cost Optimization Welcome to the Kubernetes Cost Optimization Guide This website provides the learning materials for engineers wishing to optimize their K8s cluster costs without compromising performance and reliability. If you find issues with any of the website materials - please send us a note. Browse our guides: The Golden Signals of Kubernetes Cost Optimization The 4 Areas of Kubernetes Cost Optimization Balancing Cost with Performance and Reliability Kubernetes Workload Rightsizing Pod Autoscaling Cluster Autoscaling Leveraging Cloud Discounts","title":"Home"},{"location":"#kubernetes-cost-optimization","text":"","title":"Kubernetes Cost Optimization"},{"location":"#welcome-to-the-kubernetes-cost-optimization-guide","text":"This website provides the learning materials for engineers wishing to optimize their K8s cluster costs without compromising performance and reliability. If you find issues with any of the website materials - please send us a note. 
Browse our guides: The Golden Signals of Kubernetes Cost Optimization The 4 Areas of Kubernetes Cost Optimization Balancing Cost with Performance and Reliability Kubernetes Workload Rightsizing Pod Autoscaling Cluster Autoscaling Leveraging Cloud Discounts","title":"Welcome to the Kubernetes Cost Optimization Guide"},{"location":"cloud-discounts/","text":"Leveraging Cloud Discounts Spot VMs and Best Effort Pods Combining Spot and On-Demand Fleets Applying Savings Plans to your Kubernetes clusters","title":"Leveraging Cloud Discounts"},{"location":"cloud-discounts/#leveraging-cloud-discounts","text":"Spot VMs and Best Effort Pods Combining Spot and On-Demand Fleets Applying Savings Plans to your Kubernetes clusters","title":"Leveraging Cloud Discounts"},{"location":"cluster-autoscaling/","text":"Cluster Autoscaling Cluster-autoscaler Karpenter Improving Kubernetes Bin Packing","title":"Cluster Autoscaling"},{"location":"cluster-autoscaling/#cluster-autoscaling","text":"Cluster-autoscaler Karpenter Improving Kubernetes Bin Packing","title":"Cluster Autoscaling"},{"location":"cost-perf-r9y/","text":"Balancing Cost with Performance and Reliability Kubernetes cost optimization comes down to pinpointing the correct resource allocations and auto-scaling factors for our workloads. But \"correct\" in this context doesn't mean \"the least possible amount of resources\". It's a delicate interplay of cost vs. performance vs. reliability. In order to run our clusters in the most cost-effective way without compromising either performance or reliability it's vitally important to understand the Pod QoS model and the implications of PodDisruptionBudget. 
Understanding the Pod QoS Model https://services.google.com/fh/files/misc/state_of_kubernetes_cost_optimization.pdf PodDisruptionBudget and application disruption","title":"Balancing Cost with Performance and Reliability"},{"location":"cost-perf-r9y/#balancing-cost-with-performance-and-reliability","text":"Kubernetes cost optimization comes down to pinpointing the correct resource allocations and auto-scaling factors for our workloads. But \"correct\" in this context doesn't mean \"the least possible amount of resources\". It's a delicate interplay of cost vs. performance vs. reliability. In order to run our clusters in the most cost-effective way without compromising either performance or reliability it's vitally important to understand the Pod QoS model and the implications of PodDisruptionBudget. Understanding the Pod QoS Model https://services.google.com/fh/files/misc/state_of_kubernetes_cost_optimization.pdf PodDisruptionBudget and application disruption","title":"Balancing Cost with Performance and Reliability"},{"location":"golden-signals/","text":"The Golden Signals The 4 \"golden signals\" of Kubernetes Cost Optimization as defined in a whitepaper released by Google Cloud in June 2023. Signal Group 1.Workload Rightsizing Resources 2.Demand-based Downscaling 3.Cluster Bin Packing 4.Cloud Provider Discount Coverage Cloud Discounts These signals help us apply and measure cost optimization for Kubernetes clusters. The 3 signals in the resources group apply to all clusters - be it on-prem or on-cloud. The cloud discounts naturally only apply to cloud-based managed clusters, where it is very important to pinpoint the instance types and reservation level of our cluster nodes. Let's explain each signal in a bit more detail. The Resources Group Signal Explanation Workload Rightsizing Refers to our ability to allocate the amount of resources that the workloads actually need and adapt resource requests and limits as application requirements change. 
Demand based autoscaling Measures the capacity of developers and platform admins to make clusters scale down during off-peak hours. Cluster bin packing Refers to our ability to measure and utilize the CPU and memory of each node in the most effective and reliable way through correct Pod placement. The Cloud Discounts group Signal Explanation Cloud Discount Coverage Refers to leveraging cloud VM instances that offer discounts, such as Spot VMs, as well as the ability of budget owners and FinOps professionals to take advantage of long-term continuous use discounts offered by cloud providers.","title":"The Golden Signals of Kubernetes Cost Optimization"},{"location":"golden-signals/#the-golden-signals","text":"The 4 \"golden signals\" of Kubernetes Cost Optimization as defined in a whitepaper released by Google Cloud in June 2023. Signal Group 1.Workload Rightsizing Resources 2.Demand-based Downscaling 3.Cluster Bin Packing 4.Cloud Provider Discount Coverage Cloud Discounts These signals help us apply and measure cost optimization for Kubernetes clusters. The 3 signals in the resources group apply to all clusters - be it on-prem or on-cloud. The cloud discounts naturally only apply to cloud-based managed clusters, where it is very important to pinpoint the instance types and reservation level of our cluster nodes. Let's explain each signal in a bit more detail.","title":"The Golden Signals"},{"location":"golden-signals/#the-resources-group","text":"Signal Explanation Workload Rightsizing Refers to our ability to allocate the amount of resources that the workloads actually need and adapt resource requests and limits as application requirements change. Demand based autoscaling Measures the capacity of developers and platform admins to make clusters scale down during off-peak hours. 
Cluster bin packing Refers to our ability to measure and utilize the CPU and memory of each node in the most effective and reliable way through correct Pod placement.","title":"The Resources Group"},{"location":"golden-signals/#the-cloud-discounts-group","text":"Signal Explanation Cloud Discount Coverage Refers to leveraging cloud VM instances that offer discounts, such as Spot VMs, as well as the ability of budget owners and FinOps professionals to take advantage of long-term continuous use discounts offered by cloud providers.","title":"The Cloud Discounts group"},{"location":"over-under-idle-waste/","text":"The 4 Focus Areas When starting out with Kubernetes Cost Optimization it's important to understand what to focus on. Redundant costs come from 2 main sources: wasted resources and idle resources . Both of these are usually caused by over-provisioning , intentional or unintentional. On the other hand - thoughtless cost reduction activity can lead to under-provisioning , which causes performance and reliability issues. When optimizing our cluster costs we want to focus on all of these areas iteratively - in order to keep our clusters as cost-effective and performant as needed. Now let's explain each of these focus areas in more detail. Wasted Resources Wasted resources are the resources that have been allocated but not utilized. In most unoptimized clusters we're observing up to 50% of waste, which translates to thousands of dollars or euros monthly. This waste comes from over-provisioning the containers in our pods. Read on to understand the reasons and perils of over-provisioning. Idle Resources Kubernetes comes with a promise of automatic bin packing. I.e - it is supposed to fit the largest possible amount of pods on every node in the cluster. But this is again dependent on engineers correctly defining 1) resource requests and 2) node sizes. 
Even with smart and well-tuned autoscaling tools like Karpenter this doesn't always work and we find ourselves with nodes that are more than half empty - with resources that were neither requested nor utilized. All these are idle resources and taking care of reducing them is an important focus area of Kubernetes Cost Optimization. Over Provisioning Pinpointing the exact memory and CPU requests for our pods is hard - it requires observing the application behaviour under production load over a significant time period. Therefore most engineers prefer to err towards overprovisioning - i.e. setting requests much higher than the application will ever use. This leads to a large amount of allocated but unutilized resources all across the cluster. Just imagine your cluster runs 200 pods and each of them requests 100MB more memory than it actually uses. Altogether you'll have 20GB of wasted RAM across the cluster. These resources will be provisioned, paid for, but never actually used. Under Provisioning","title":"The 4 Focus Areas of Kubernetes Cost Optimization"},{"location":"over-under-idle-waste/#the-4-focus-areas","text":"When starting out with Kubernetes Cost Optimization it's important to understand what to focus on. Redundant costs come from 2 main sources: wasted resources and idle resources . Both of these are usually caused by over-provisioning , intentional or unintentional. On the other hand - thoughtless cost reduction activity can lead to under-provisioning , which causes performance and reliability issues. When optimizing our cluster costs we want to focus on all of these areas iteratively - in order to keep our clusters as cost-effective and performant as needed. Now let's explain each of these focus areas in more detail.","title":"The 4 Focus Areas"},{"location":"over-under-idle-waste/#wasted-resources","text":"Wasted resources are the resources that have been allocated but not utilized. 
In most unoptimized clusters we're observing up to 50% of waste, which translates to thousands of dollars or euros monthly. This waste comes from over-provisioning the containers in our pods. Read on to understand the reasons and perils of over-provisioning.","title":"Wasted Resources"},{"location":"over-under-idle-waste/#idle-resources","text":"Kubernetes comes with a promise of automatic bin packing. I.e - it is supposed to fit the largest possible amount of pods on every node in the cluster. But this is again dependent on engineers correctly defining 1) resource requests and 2) node sizes. Even with smart and well-tuned autoscaling tools like Karpenter this doesn't always work and we find ourselves with nodes that are more than half empty - with resources that were neither requested nor utilized. All these are idle resources and taking care of reducing them is an important focus area of Kubernetes Cost Optimization.","title":"Idle Resources"},{"location":"over-under-idle-waste/#over-provisioning","text":"Pinpointing the exact memory and CPU requests for our pods is hard - it requires observing the application behaviour under production load over a significant time period. Therefore most engineers prefer to err towards overprovisioning - i.e. setting requests much higher than the application will ever use. This leads to a large amount of allocated but unutilized resources all across the cluster. Just imagine your cluster runs 200 pods and each of them requests 100MB more memory than it actually uses. Altogether you'll have 20GB of wasted RAM across the cluster. 
These resources will be provisioned, paid for, but never actually used.","title":"Over Provisioning"},{"location":"over-under-idle-waste/#under-provisioning","text":"","title":"Under Provisioning"},{"location":"pod-autoscaling/","text":"Pod Autoscaling Horizontal HPA CPU Memory Custom metrics KEDA Vertical VPA Goldilocks","title":"Pod Autoscaling"},{"location":"pod-autoscaling/#pod-autoscaling","text":"Horizontal HPA CPU Memory Custom metrics KEDA Vertical VPA Goldilocks","title":"Pod Autoscaling"},{"location":"rightsizing/","text":"Kubernetes Workload Rightsizing Requests Memory CPU Limits Memory CPU Understanding CPU throttling Defining resource guardrails LimitRange NamespaceQuota","title":"Kubernetes Workload Rightsizing"},{"location":"rightsizing/#kubernetes-workload-rightsizing","text":"Requests Memory CPU Limits Memory CPU Understanding CPU throttling Defining resource guardrails LimitRange NamespaceQuota","title":"Kubernetes Workload Rightsizing"}]}
\ No newline at end of file
diff --git a/sitemap.xml.gz b/sitemap.xml.gz
index 6af75e6..e16a519 100644
Binary files a/sitemap.xml.gz and b/sitemap.xml.gz differ
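The idle-resources discussion in the docs above - nodes left half empty because pod placement is driven by requests, not actual usage - can be illustrated with a toy first-fit bin-packing sketch. This is not the real kube-scheduler, and all pod and node sizes here are hypothetical:

```python
def first_fit(requests_mb, node_capacity_mb):
    """Toy bin packing: place each pod's memory *request* on the first
    node with enough free capacity, opening a new node when none fits.
    Returns the free MB remaining on each provisioned node."""
    free = []  # free capacity per provisioned node
    for req in requests_mb:
        for i, f in enumerate(free):
            if f >= req:
                free[i] -= req
                break
        else:
            free.append(node_capacity_mb - req)
    return free

# Six pods that each *use* ~400 MB but *request* 1000 MB on 4000 MB nodes:
print(first_fit([1000] * 6, node_capacity_mb=4000))  # [0, 2000]
# Two nodes get provisioned, one of them half empty. Right-sized 400 MB
# requests would fit all six pods on a single node with room to spare.
```

The gap between requested and used capacity is exactly what the "wasted resources" and "idle resources" sections describe: the scheduler can only pack what the requests claim.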