🏥 Health Probes: Your Container’s Doctor
The Story of the Sleepy Restaurant
Imagine you own a restaurant. Every morning, you need to know three things:
- Is the chef alive? (Can they breathe and move?)
- Is the chef ready to cook? (Have they warmed up the kitchen?)
- Did the chef just wake up? (Are they still getting ready?)
Kubernetes asks these same questions about your containers. These questions are called Health Probes.
🔍 What Are Health Probes?
Health Probes are like doctors that check on your containers.
They ask: “Are you okay? Can you work?”
If the container says “No,” Kubernetes helps fix the problem.
```mermaid
graph TD
    A["🐳 Container Running"] --> B{Health Check}
    B -->|Healthy| C["✅ Keep Running"]
    B -->|Unhealthy| D["🔄 Take Action"]
```
💓 Liveness Probes: “Are You Alive?”
The Simple Idea
A Liveness Probe checks if your container is still alive.
Think of it like this:
- A doctor checks if a patient is breathing
- If not breathing → Give CPR (restart)
- If breathing → Everything is fine
What Happens When It Fails?
Kubernetes restarts the container. Like rebooting a frozen computer.
Real Example
```yaml
livenessProbe:
  httpGet:
    path: /healthz
    port: 8080
  initialDelaySeconds: 3
  periodSeconds: 5
```
What this says:
- Check the `/healthz` page every 5 seconds
- Wait 3 seconds before the first check
- If the page keeps failing to respond (3 checks in a row by default), restart me
When Do You Need This?
Is your app deadlocked or stuck in a loop? Still running but no longer responding? A Liveness Probe catches this.
🚦 Readiness Probes: “Are You Ready to Work?”
The Simple Idea
A Readiness Probe checks if your container can handle requests.
Think of it like this:
- A restaurant opens at 9 AM
- But the chef needs time to prep
- Don’t send customers until food is ready!
What Happens When It Fails?
Kubernetes stops sending traffic to this container. It doesn’t restart the container; it keeps it running and resumes traffic once the check passes again.
Real Example
```yaml
readinessProbe:
  httpGet:
    path: /ready
    port: 8080
  initialDelaySeconds: 5
  periodSeconds: 10
```
What this says:
- Check `/ready` every 10 seconds
- Wait 5 seconds before first check
- If not ready, don’t send me any work
When Do You Need This?
Your app needs to load data first? Connect to a database? Readiness Probe handles this.
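To see where the traffic comes from, here is a minimal sketch of a Service sitting in front of the app. The name `my-app`, the `app: my-app` label, and port 80 are assumptions made for illustration. By default, a Service only routes to Pods whose Readiness Probe is currently passing, so a Pod that fails `/ready` is skipped until it recovers.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-app            # hypothetical Service name
spec:
  selector:
    app: my-app           # assumed label; must match the label on your Pods
  ports:
  - port: 80              # port clients connect to
    targetPort: 8080      # container port that also serves /ready
```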
🐣 Startup Probes: “Are You Done Starting?”
The Simple Idea
A Startup Probe gives slow-starting containers extra time.
Think of it like this:
- Some people wake up fast
- Others need 30 minutes and coffee
- Startup Probe is the snooze button
Why Is This Special?
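Some apps, especially older ones, take a long time to start. Without a Startup Probe, Liveness checks might kill them before they finish booting! While the Startup Probe is running, Liveness and Readiness checks are put on hold.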
Real Example
```yaml
startupProbe:
  httpGet:
    path: /healthz
    port: 8080
  failureThreshold: 30
  periodSeconds: 10
```
What this says:
- Try 30 times, every 10 seconds
- That’s 5 minutes to start up!
- After startup succeeds, the Liveness and Readiness Probes take over
When Do You Need This?
Legacy apps? Big data loads at startup? Startup Probe protects them.
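As a rough sizing sketch, assume a hypothetical app that needs up to 2 minutes to boot. The startup budget is `failureThreshold × periodSeconds`, so one way to cover 120 seconds is:

```yaml
startupProbe:
  httpGet:
    path: /healthz
    port: 8080
  periodSeconds: 10       # probe every 10 seconds
  failureThreshold: 12    # 12 tries x 10 s = 120 s before the container is killed
```

Leave some headroom in practice. The first passing check ends the startup phase right away, so a generous budget does not slow down apps that start quickly.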
⚙️ Probe Configuration: The Settings
Every probe has settings you can adjust. Like tuning a radio.
The Key Settings
| Setting | What It Does | Example |
|---|---|---|
| `initialDelaySeconds` | Wait before first check | 10 |
| `periodSeconds` | Time between checks | 5 |
| `timeoutSeconds` | How long to wait for response | 3 |
| `successThreshold` | Passes needed to be “healthy” | 1 |
| `failureThreshold` | Fails needed to be “unhealthy” | 3 |
Complete Example
```yaml
livenessProbe:
  httpGet:
    path: /healthz
    port: 8080
  initialDelaySeconds: 10
  periodSeconds: 5
  timeoutSeconds: 3
  successThreshold: 1
  failureThreshold: 3
```
Translation:
- Wait 10 seconds, then start checking
- Check every 5 seconds
- Wait max 3 seconds for a reply
- 1 success = healthy
- 3 failures in a row = unhealthy
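The thresholds can also be asymmetric. The sketch below is a readiness probe with assumed values: it takes the container out of rotation after 3 failed checks in a row and puts it back only after 2 passes in a row. Note that a `successThreshold` above 1 is only allowed on Readiness Probes; Liveness and Startup Probes must use 1.

```yaml
readinessProbe:
  httpGet:
    path: /ready
    port: 8080
  periodSeconds: 5
  timeoutSeconds: 2
  failureThreshold: 3     # 3 consecutive failures: stop sending traffic
  successThreshold: 2     # 2 consecutive passes: start sending traffic again
```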
🔧 Probe Handlers: Three Ways to Check
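Kubernetes can check health in three common ways. Pick what works for your app. (A fourth handler, gRPC, exists for services that implement the gRPC health checking protocol.)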
1. HTTP Handler (Most Common)
Ask a web page: “Are you there?” Any status code from 200 to 399 counts as a pass.
```yaml
httpGet:
  path: /healthz
  port: 8080
  httpHeaders:
  - name: Custom-Header
    value: Awesome
```
✅ Best for: Web apps, APIs, anything with HTTP
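A small variation, sketched here with an assumed port name `http`: the `port` field can refer to a named container port instead of a number, so the number lives in one place. Both keys sit at the container level of the Pod spec.

```yaml
ports:
- name: http              # assumed name for illustration
  containerPort: 8080
livenessProbe:
  httpGet:
    path: /healthz
    port: http            # resolves to containerPort 8080
```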
2. TCP Handler
Just knock on the door: “Is anyone home?” If the connection opens, the check passes.
```yaml
tcpSocket:
  port: 3306
```
✅ Best for: Databases, message queues, non-HTTP services
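In context, a TCP check might look like the sketch below for a hypothetical database container listening on 3306. The timing values are assumptions; tune them to how long your service takes to open its port.

```yaml
readinessProbe:
  tcpSocket:
    port: 3306
  initialDelaySeconds: 15   # assumed: give the server time to open the port
  periodSeconds: 10
livenessProbe:
  tcpSocket:
    port: 3306
  initialDelaySeconds: 15
  periodSeconds: 10
```

Keep in mind that an open port only proves something is listening, not that queries actually succeed.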
3. Exec Handler
Run a command inside the container. Exit code 0 means healthy; anything else counts as a failure.
```yaml
exec:
  command:
  - cat
  - /tmp/healthy
```
✅ Best for: Custom checks, file-based health, scripts
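To watch an exec check fail on purpose, here is a small demo Pod modeled on the well-known example from the Kubernetes documentation. The Pod name, container name, and image tag are assumptions. The container creates `/tmp/healthy`, deletes it after 30 seconds, and then just sleeps, so `cat /tmp/healthy` starts failing and the container gets restarted.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: liveness-exec-demo    # hypothetical name
spec:
  containers:
  - name: demo
    image: busybox:1.36       # assumed image tag
    args:
    - /bin/sh
    - -c
    - touch /tmp/healthy; sleep 30; rm -f /tmp/healthy; sleep 600
    livenessProbe:
      exec:
        command:
        - cat
        - /tmp/healthy        # exits non-zero once the file is gone
      initialDelaySeconds: 5
      periodSeconds: 5
```

With the default `failureThreshold` of 3, the restart should follow a few failed checks after the file disappears.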
```mermaid
graph TD
    A["Pick a Handler"] --> B["HTTP"]
    A --> C["TCP"]
    A --> D["Exec"]
    B --> E["Web Apps & APIs"]
    C --> F["Databases"]
    D --> G["Custom Scripts"]
```
⚔️ Liveness vs Readiness: The Big Difference
This is where people get confused. Let’s make it crystal clear.
Side-by-Side Comparison
| | Liveness | Readiness |
|---|---|---|
| Question | “Are you alive?” | “Can you work?” |
| On Failure | Restart container | Stop sending traffic |
| Use Case | App is stuck/frozen | App is busy/loading |
| Severity | Something is broken | Just not ready yet |
The Restaurant Analogy
Liveness: Is the chef conscious?
- ❌ No → Call an ambulance (restart)
- ✅ Yes → Great!
Readiness: Can the chef take orders?
- ❌ No → Don’t seat customers
- ✅ Yes → Open for business!
When to Use Each
```mermaid
graph TD
    A["Container Started"] --> B{Startup Probe}
    B -->|Pass| C["Liveness + Readiness Active"]
    C --> D{Liveness Check}
    D -->|Fail| E["🔄 Restart Container"]
    D -->|Pass| F{Readiness Check}
    F -->|Fail| G["📵 No Traffic"]
    F -->|Pass| H["✅ Receive Traffic"]
```
Golden Rule
Liveness = “Should I restart?” Readiness = “Should I send traffic?”
🎯 Putting It All Together
Here’s a complete example with all three probes:
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: my-app
spec:
  containers:
  - name: app
    image: my-app:1.0
    ports:
    - containerPort: 8080
    startupProbe:
      httpGet:
        path: /healthz
        port: 8080
      failureThreshold: 30
      periodSeconds: 10
    livenessProbe:
      httpGet:
        path: /healthz
        port: 8080
      initialDelaySeconds: 0
      periodSeconds: 10
      failureThreshold: 3
    readinessProbe:
      httpGet:
        path: /ready
        port: 8080
      initialDelaySeconds: 0
      periodSeconds: 5
      failureThreshold: 3
```
What happens:
- Startup Probe gives the app up to 5 minutes to start (30 tries × 10 seconds)
- Once started, Liveness checks every 10 seconds
- Readiness checks every 5 seconds
- Container receives traffic only when ready
🌟 Key Takeaways
- Liveness = Is it alive? (Restart if dead)
- Readiness = Is it ready? (No traffic if busy)
- Startup = Is it done booting? (Extra time for slow apps)
- HTTP/TCP/Exec = Three ways to check health
- Configure wisely = Tune delays and thresholds for your app
You now understand Health Probes! Your containers will thank you for keeping them healthy. 🏥✨
