Container Resources: The Kitchen Budget Story 🍳
Imagine you’re running a busy restaurant kitchen. Every chef (container) needs ingredients (CPU), counter space (memory), and fridge storage (ephemeral storage). Without proper budgets, chaos ensues—one chef hogs all the eggs, another takes the entire counter, and suddenly your kitchen grinds to a halt!
Kubernetes Resource Management is like being the smart kitchen manager who makes sure every chef gets what they need—and nobody takes more than their fair share.
🎯 The Big Picture
graph TD
  A["Container"] --> B["Requests"]
  A --> C["Limits"]
  B --> D["Guaranteed minimum"]
  C --> E["Maximum allowed"]
  D --> F["Scheduling decision"]
  E --> G["Enforcement"]
Requests = “I need at least THIS much”
Limits = “I will NEVER use more than THIS”
1. Resource Requests and Limits
What’s a Request?
A request is your container saying: “Hey Kubernetes, I need at least this much to run properly.”
Think of it like reserving a table at a restaurant. You’re saying “I need a table for 4” — the restaurant won’t seat you at a 2-person table!
What’s a Limit?
A limit is the maximum your container can ever use. It’s like a credit card spending limit — even if you want to spend more, you can’t!
Simple Example
resources:
  requests:
    memory: "128Mi"
    cpu: "250m"
  limits:
    memory: "256Mi"
    cpu: "500m"
Translation:
- “I need at least 128 MiB of memory and 0.25 CPU cores”
- “Never let me use more than 256 MiB of memory or 0.5 CPU cores”
Why Both?
| Scenario | What Happens |
|---|---|
| No request | Kubernetes might put you on a crowded node |
| No limit | Your app could eat all resources and crash others |
| Both set | Predictable, fair resource sharing! |
2. CPU and Memory Resources
CPU: Measured in Millicores
One CPU core = 1000 millicores (m)
| Value | Meaning |
|---|---|
| 100m | 0.1 CPU (10% of one core) |
| 500m | 0.5 CPU (half a core) |
| 1 | 1 full CPU core |
| 2 | 2 CPU cores |
Real Life Example:
cpu: "250m" # Your app gets 25% of a CPU
If your node has 4 CPUs, up to 16 containers each requesting 250m can fit! (In practice slightly fewer, because the node reserves some CPU for system daemons.)
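The millicore arithmetic can be sketched in a few lines of Python. The helpers `parse_cpu_millicores` and `max_containers` are hypothetical names made up for this illustration, not part of any Kubernetes client library. The sketch only handles the plain-core and `m` forms shown in the table, and it assumes the node's full capacity is available to containers.

```python
from decimal import Decimal


def parse_cpu_millicores(quantity: str) -> int:
    """Convert a CPU quantity such as "250m", "1", or "0.5" to millicores.

    Simplified: only the plain-core and "m" forms used in this article.
    """
    quantity = quantity.strip()
    if quantity.endswith("m"):
        return int(quantity[:-1])
    # Decimal avoids float rounding surprises for values like "0.1".
    return int(Decimal(quantity) * 1000)


def max_containers(node_cpu: str, request_cpu: str) -> int:
    """How many containers with this CPU request fit, by request alone."""
    return parse_cpu_millicores(node_cpu) // parse_cpu_millicores(request_cpu)


print(parse_cpu_millicores("250m"))  # 250
print(parse_cpu_millicores("0.5"))   # 500
print(max_containers("4", "250m"))   # 16
```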
Memory: Measured in Bytes
| Suffix | Meaning | Example |
|---|---|---|
| Ki | Kibibytes | 256Ki = 262,144 bytes |
| Mi | Mebibytes | 128Mi ≈ 134 MB |
| Gi | Gibibytes | 1Gi ≈ 1.07 GB |
Real Life Example:
memory: "512Mi" # About half a gigabyte
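To see where the byte counts in the table come from, here is a small Python sketch. `parse_memory_bytes` is a hypothetical helper written for this article, and it only understands the three binary suffixes listed above (real Kubernetes quantities accept more forms).

```python
# Binary suffixes are powers of 1024, not 1000.
BINARY_SUFFIXES = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30}


def parse_memory_bytes(quantity: str) -> int:
    """Convert a memory quantity such as "128Mi" to a number of bytes."""
    quantity = quantity.strip()
    for suffix, factor in BINARY_SUFFIXES.items():
        if quantity.endswith(suffix):
            return int(quantity[: -len(suffix)]) * factor
    return int(quantity)  # no suffix: already plain bytes


print(parse_memory_bytes("256Ki"))  # 262144
print(parse_memory_bytes("128Mi"))  # 134217728
print(parse_memory_bytes("1Gi"))    # 1073741824
```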
What Happens When Limits Are Hit?
| Resource | Behavior |
|---|---|
| CPU | Container gets throttled (slowed down) |
| Memory | Container gets OOMKilled (terminated!) |
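⚠️ Important: Memory limits are HARD. If your app tries to use 300Mi when the limit is 256Mi, the kernel's OOM killer terminates the container as soon as it crosses 256Mi, and Kubernetes then restarts it according to the Pod's restartPolicy!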
3. Ephemeral Storage Resources
What Is It?
Ephemeral storage = temporary disk space your container uses. It includes:
- Container’s writable layer
- Logs
- Temporary files
- emptyDir volumes
Why Care?
Without limits, one container could fill up the entire node’s disk!
Example
resources:
  requests:
    ephemeral-storage: "1Gi"
  limits:
    ephemeral-storage: "2Gi"
If your container writes more than 2Gi of temp files, the kubelet evicts the whole Pod!
4. QoS Classes
Kubernetes automatically assigns a Quality of Service class to every Pod. Think of it like airline seating!
graph TD
  A["QoS Classes"] --> B["Guaranteed"]
  A --> C["Burstable"]
  A --> D["BestEffort"]
  B --> E["First Class ✈️"]
  C --> F["Economy Plus"]
  D --> G["Standby"]
The Three Classes
| Class | Requirements | Eviction Priority |
|---|---|---|
| Guaranteed | Every container sets CPU and memory requests equal to its limits | Last to be evicted |
| Burstable | At least one container sets a CPU or memory request or limit, but the Pod does not qualify as Guaranteed | Middle priority |
| BestEffort | No requests or limits at all | First to be evicted |
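The rules in the table can be written down as a short classifier. This is a simplified sketch, not the actual Kubernetes implementation: `qos_class` is a hypothetical function, each container is represented by a plain dict shaped like its `resources` block, and quantities are compared as strings (so `"1"` and `"1000m"` would wrongly count as different).

```python
def qos_class(containers: list[dict]) -> str:
    """Classify a Pod from its containers' `resources` dicts (simplified)."""
    any_set = False         # did any container set any CPU/memory value?
    all_guaranteed = True   # does every container have request == limit?
    for resources in containers:
        requests = resources.get("requests", {})
        limits = resources.get("limits", {})
        for name in ("cpu", "memory"):
            limit = limits.get(name)
            # Kubernetes defaults a missing request to the limit.
            request = requests.get(name, limit)
            if request is not None or limit is not None:
                any_set = True
            if limit is None or request != limit:
                all_guaranteed = False
    if not any_set:
        return "BestEffort"
    return "Guaranteed" if all_guaranteed else "Burstable"


same = {"memory": "256Mi", "cpu": "500m"}
print(qos_class([{"requests": same, "limits": same}]))      # Guaranteed
print(qos_class([{"requests": {"cpu": "250m"}}]))           # Burstable
print(qos_class([{}]))                                      # BestEffort
```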
How to Get Guaranteed QoS
resources:
  requests:
    memory: "256Mi"
    cpu: "500m"
  limits:
    memory: "256Mi"  # Same as request!
    cpu: "500m"      # Same as request!
💡 Pro Tip: For critical workloads, always aim for Guaranteed QoS!
When Does Eviction Happen?
When a node runs low on memory, Kubernetes starts evicting Pods:
- BestEffort pods go first
- Burstable pods using more than requested go next
- Guaranteed pods are protected longest
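The three-step ordering above can be modeled as a sort. This sketch only mirrors the simplified list in this article; the kubelet's real ranking also takes Pod priority into account. `PodUsage` and `eviction_order` are hypothetical names, and the memory figures are made-up sample values in MiB.

```python
from dataclasses import dataclass


@dataclass
class PodUsage:
    name: str
    qos: str          # "Guaranteed", "Burstable", or "BestEffort"
    request_mib: int  # memory request in MiB (0 for BestEffort)
    usage_mib: int    # current memory usage in MiB


def eviction_order(pods: list[PodUsage]) -> list[str]:
    """Return pod names from first-evicted to last-evicted (simplified)."""

    def tier(pod: PodUsage) -> int:
        if pod.qos == "BestEffort":
            return 0
        if pod.qos == "Burstable":
            # Burstable pods above their request go before those within it.
            return 1 if pod.usage_mib > pod.request_mib else 2
        return 3  # Guaranteed

    # Within a tier, the pod furthest above its request goes first.
    ranked = sorted(pods, key=lambda p: (tier(p), p.request_mib - p.usage_mib))
    return [p.name for p in ranked]


pods = [
    PodUsage("db", "Guaranteed", 256, 250),
    PodUsage("api", "Burstable", 128, 200),
    PodUsage("cache", "Burstable", 128, 100),
    PodUsage("batch", "BestEffort", 0, 50),
]
print(eviction_order(pods))  # ['batch', 'api', 'cache', 'db']
```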
5. LimitRanges
The Problem
What if a developer forgets to set limits? Or sets crazy high values?
The Solution
LimitRange is like a template that enforces rules at the namespace level.
apiVersion: v1
kind: LimitRange
metadata:
  name: mem-limit-range
  namespace: dev-team
spec:
  limits:
  - default:
      memory: "512Mi"
      cpu: "500m"
    defaultRequest:
      memory: "256Mi"
      cpu: "250m"
    max:
      memory: "1Gi"
      cpu: "1"
    min:
      memory: "64Mi"
      cpu: "50m"
    type: Container
What This Does
| Setting | Effect |
|---|---|
| default | Auto-applied limit if none specified |
| defaultRequest | Auto-applied request if none specified |
| max | Maximum allowed limit |
| min | Minimum required resources |
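To make the four settings concrete, here is a rough Python model of what happens to a container's resources in a namespace with this LimitRange. It is a sketch under stated assumptions, not the real admission code: `apply_limit_range` is a hypothetical function, CPU is expressed in millicores and memory in MiB as plain integers, and the values in `LIMIT_RANGE` are copied from the manifest above.

```python
LIMIT_RANGE = {
    "default": {"memory": 512, "cpu": 500},         # MiB, millicores
    "defaultRequest": {"memory": 256, "cpu": 250},
    "max": {"memory": 1024, "cpu": 1000},
    "min": {"memory": 64, "cpu": 50},
}


def apply_limit_range(resources: dict, limit_range: dict) -> dict:
    """Fill in defaults, then reject values outside min/max (simplified)."""
    requests = dict(resources.get("requests", {}))
    limits = dict(resources.get("limits", {}))
    for name in ("cpu", "memory"):
        # A limit without a request: the request is defaulted to the limit.
        if name in limits and name not in requests:
            requests[name] = limits[name]
        limits.setdefault(name, limit_range["default"][name])
        requests.setdefault(name, limit_range["defaultRequest"][name])
        if limits[name] > limit_range["max"][name]:
            raise ValueError(
                f"{name} limit {limits[name]} exceeds max {limit_range['max'][name]}"
            )
        if requests[name] < limit_range["min"][name]:
            raise ValueError(
                f"{name} request {requests[name]} is below min {limit_range['min'][name]}"
            )
        if requests[name] > limits[name]:
            raise ValueError(f"{name} request {requests[name]} exceeds its limit")
    return {"requests": requests, "limits": limits}


# A container with no resources at all picks up the namespace defaults.
print(apply_limit_range({}, LIMIT_RANGE)["limits"]["memory"])  # 512

# A 100Gi memory limit is rejected because max is 1Gi (1024 MiB).
try:
    apply_limit_range({"limits": {"memory": 100 * 1024}}, LIMIT_RANGE)
except ValueError as err:
    print(err)  # memory limit 102400 exceeds max 1024
```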
Real-World Scenario
Without LimitRange:
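- Developer creates Pod with a 100Gi memory limit and a tiny request
- Pod gets scheduled, because the scheduler only looks at requests
- Pod's memory use balloons and its neighbors on the node get evicted!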
With LimitRange:
- Developer tries 100Gi limit
- Kubernetes says “Nope! Max is 1Gi”
- Node stays healthy!
6. ResourceQuotas
LimitRange vs ResourceQuota
| LimitRange | ResourceQuota |
|---|---|
| Per container/pod | Per namespace total |
| “Each pizza slice max 2 toppings” | “Kitchen has only 50 toppings total” |
What Can You Quota?
apiVersion: v1
kind: ResourceQuota
metadata:
  name: compute-quota
  namespace: dev-team
spec:
  hard:
    requests.cpu: "4"
    requests.memory: "8Gi"
    limits.cpu: "8"
    limits.memory: "16Gi"
    pods: "20"
Translation
This namespace can have:
- Max 20 pods
- Total CPU requests up to 4 cores
- Total memory requests up to 8Gi
- Total CPU limits up to 8 cores
- Total memory limits up to 16Gi
What Happens When Quota Is Hit?
Error: exceeded quota: compute-quota
requested: requests.cpu=500m
used: requests.cpu=4
limited: requests.cpu=4
New pods are rejected until existing ones are deleted!
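The quota check boils down to "used + requested must not exceed hard". The following Python sketch illustrates that bookkeeping; `admit_pod` is a hypothetical function, the numbers in `QUOTA_HARD` mirror the manifest above with CPU in millicores and memory in MiB, and the error text is only loosely modeled on the message shown earlier.

```python
QUOTA_HARD = {
    "requests.cpu": 4000,      # millicores
    "requests.memory": 8192,   # MiB
    "limits.cpu": 8000,
    "limits.memory": 16384,
    "pods": 20,
}


def admit_pod(used: dict, pod: dict, hard: dict) -> dict:
    """Return the namespace's new usage if the pod fits, else raise ValueError."""
    new_used = {}
    for key, cap in hard.items():
        requested = pod.get(key, 0)
        current = used.get(key, 0)
        if current + requested > cap:
            raise ValueError(
                f"exceeded quota: {key} requested={requested} "
                f"used={current} limited={cap}"
            )
        new_used[key] = current + requested
    return new_used


# The namespace already uses all 4 cores of its CPU request quota.
used = {"requests.cpu": 4000, "pods": 8}
try:
    admit_pod(used, {"requests.cpu": 500, "pods": 1}, QUOTA_HARD)
except ValueError as err:
    print(err)  # exceeded quota: requests.cpu requested=500 used=4000 limited=4000
```

Once an existing Pod is deleted and `used` drops, the same call succeeds and returns the updated totals.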
7. Requests vs Limits: The Complete Picture
Side-by-Side Comparison
| Aspect | Request | Limit |
|---|---|---|
| Purpose | Scheduling guarantee | Resource cap |
| When used | Deciding where to place pod | Enforcing maximum usage |
| What if exceeded | N/A (it’s a minimum) | Throttle (CPU) or Kill (Memory) |
| Affects QoS | Yes | Yes |
| Required? | No (but recommended) | No (but recommended) |
The Sweet Spot
graph LR
  A["Too Low Request"] --> B["Pod starves"]
  C["Too High Request"] --> D["Wasted resources"]
  E["No Limit"] --> F["Resource hog risk"]
  G["Limit = Request"] --> H["Guaranteed QoS"]
Best Practices
- Always set both requests AND limits
- Start conservative, increase based on monitoring
- Request = typical usage, Limit = peak usage
- Use LimitRanges to enforce team defaults
- Use ResourceQuotas to protect shared clusters
Quick Decision Guide
Is it production critical?
├── Yes → Set Request = Limit (Guaranteed QoS)
└── No → Set Request < Limit (allow bursting)
Is it a shared cluster?
├── Yes → Always use ResourceQuotas
└── No → Still use them (good habit!)
🎉 You Made It!
You now understand:
✅ Requests = minimum guaranteed resources
✅ Limits = maximum allowed resources
✅ CPU = measured in millicores (throttled if exceeded)
✅ Memory = measured in Mi/Gi (killed if exceeded)
✅ Ephemeral Storage = temp disk space (evicted if exceeded)
✅ QoS Classes = Guaranteed > Burstable > BestEffort
✅ LimitRanges = per-container rules in a namespace
✅ ResourceQuotas = total resource caps for a namespace
You’re now ready to manage resources like a pro Kubernetes administrator! 🚀
