Kubernetes
DevOps
Infrastructure
Should You Use CPU Limits in Kubernetes Production?
February 18, 2026
4 min read
The post is basically asking:
- Is the “stop using CPU limits” advice still valid?
- Do you use CPU limits in production?
Short answer?
Yes — the core mechanics behind that article are still valid. But whether you should use CPU limits depends heavily on your workload and your cluster philosophy.
Let’s break this down properly.
---
## 🔧 What CPU Limits Actually Do
In Kubernetes:
- **CPU request** → affects scheduling (how much CPU the pod is guaranteed for placement).
- **CPU limit** → enforces a hard cap via CFS quota (Linux Completely Fair Scheduler).
If a container hits its CPU limit, it gets throttled.
Not slowed down gently.
Throttled.
That’s the key detail.
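Here is what those two knobs look like in a spec. A minimal sketch; the pod name, image, and values are illustrative:

```yaml
# Minimal illustration (names and values are hypothetical).
apiVersion: v1
kind: Pod
metadata:
  name: demo
spec:
  containers:
    - name: app
      image: nginx                # placeholder image
      resources:
        requests:
          cpu: "250m"             # the scheduler uses this for placement
        limits:
          cpu: "500m"             # CFS quota: ~50ms of CPU time per 100ms period
```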
---
## 💥 Why People Say “Stop Using CPU Limits”
The argument from the Robusta article is based on real behavior:
When you set CPU limits:
- Pods can get throttled even when the node still has idle CPU
- Latency spikes happen
- Performance becomes unpredictable under burst
That hasn’t changed. The kernel mechanics are the same.
And in the thread, one responder confirms:
> “Yes, still valid. The described mechanics have not changed.”
That’s accurate.
If your workload is bursty (which most web services are), CPU limits can hurt more than help.
---
## 🧠 The Production Reality
The real answer is not “always use” or “never use.”
It’s about intent.
### Case 1 — Burstable Web Services (Most Apps)
Example:
- APIs
- Frontends
- Event consumers
- Typical SaaS workloads
These usually:
- Idle most of the time
- Spike under load
- Benefit from borrowing unused CPU
For these?
Many production clusters run:
- ✅ CPU requests set
- ❌ No CPU limits
Why?
Because you want pods to burst when CPU is available.
Memory is different — you almost always set memory limits.
CPU is elastic.
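In a spec, that philosophy looks like this. A hedged sketch with placeholder values:

```yaml
# Hypothetical resources block for a bursty web service:
# CPU request only, memory both requested and capped.
resources:
  requests:
    cpu: "500m"        # guaranteed share for scheduling
    memory: "512Mi"
  limits:
    memory: "512Mi"    # memory is not elastic, so cap it
    # no cpu limit: the container may borrow idle node CPU
```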
---
### Case 2 — CPU-Heavy Batch Jobs
Example:
- Data processing
- Video encoding
- ML workloads
- CI/CD runners
These:
- Hammer CPU constantly
- Can starve neighbors
- Don’t benefit from bursting
In this case?
CPU limits can absolutely make sense.
Even one commenter says they use limits for specific workloads or CI runners.
That’s a common pattern.
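A sketch of that pattern as a batch Job; the name, image, and sizes are assumptions:

```yaml
# Hypothetical CPU-heavy Job with a deliberate cap.
apiVersion: batch/v1
kind: Job
metadata:
  name: video-encode
spec:
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: encoder
          image: example/encoder:1.0   # placeholder image
          resources:
            requests:
              cpu: "2"
              memory: "4Gi"
            limits:
              cpu: "2"                 # requests == limits: Guaranteed QoS
              memory: "4Gi"
```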
---
## 🏷 What About QoS?
Someone asked:
> any real reasons for qos in prod?
QoS classes matter when nodes are under pressure.
If you set:
- requests == limits (CPU and memory, every container) → Guaranteed
- requests only, or requests < limits → Burstable
- neither requests nor limits → BestEffort
In most real-world clusters:
- Burstable is perfectly fine
- Guaranteed is useful for critical workloads
- BestEffort should be avoided in prod
But QoS isn’t a reason to blindly add CPU limits.
QoS matters more for memory eviction than for CPU throttling.
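For completeness, this is the shape that lands in the Guaranteed class, sketched with placeholder values:

```yaml
# Guaranteed QoS: every container sets requests equal to limits
# for both CPU and memory (values are illustrative).
resources:
  requests:
    cpu: "1"
    memory: "1Gi"
  limits:
    cpu: "1"         # equal to the request
    memory: "1Gi"    # equal to the request
```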
---
## 🚨 The Subtle Gotcha: Throttling ≠ Fairness
People often think CPU limits “protect” the cluster.
They don’t.
They only throttle the container itself.
They don’t redistribute fairly across workloads in the way people imagine.
If your cluster is only using 30% CPU (like one commenter mentioned), limits are doing nothing except potentially harming bursts.
---
## 📊 So What Do Mature Teams Actually Do?
Common production patterns in 2026:
### Pattern A — Modern SaaS
- CPU requests set
- No CPU limits
- Memory requests + limits set
- HPA based on CPU utilization or custom metrics
This is extremely common now.
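A sketch of the HPA piece; the target name and thresholds are assumptions. Note that CPU utilization is measured against the request, which is one more reason requests must always be set:

```yaml
# Hypothetical HPA scaling a Deployment on CPU utilization.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web-api
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web-api              # placeholder target
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70   # % of the CPU request
```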
---
### Pattern B — Mixed Workloads
- No CPU limits for web workloads
- CPU limits for:
  - Batch jobs
  - Data processing
  - CI runners
Selective usage.
---
### Pattern C — Multi-Tenant Clusters
If multiple teams share a cluster and trust is low:
- CPU limits used as guardrails
- Often combined with ResourceQuota
This is more governance-driven than performance-driven.
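A sketch of the quota side; the namespace and numbers are placeholders. One caveat worth knowing: once a quota constrains `limits.cpu`, every pod in that namespace must declare a CPU limit:

```yaml
# Hypothetical per-team guardrail.
apiVersion: v1
kind: ResourceQuota
metadata:
  name: team-a-quota
  namespace: team-a
spec:
  hard:
    requests.cpu: "20"       # aggregate CPU requests allowed
    limits.cpu: "40"         # aggregate CPU limits allowed
    requests.memory: 64Gi
    limits.memory: 64Gi
```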
---
## 🎯 My Direct Answer
Is the article still valid?
Yes.
The kernel behavior has not changed.
Do I use CPU limits?
- Not for bursty production services.
- Yes for heavy, long-running CPU jobs.
- Always set memory limits.
- Always set CPU requests.
The key mistake isn’t using limits.
It’s using them everywhere by default without understanding throttling.
---