🛡️ Kubernetes Backup & Recovery: Your Cluster’s Safety Net
The Story of the Sandcastle
Imagine you spent all day building the most amazing sandcastle on the beach. Towers, moats, bridges—everything! 🏰
Then a wave comes and… whoosh… it’s gone.
Now imagine if you had a magic camera that could:
- Take a perfect picture of your sandcastle
- Rebuild it exactly the same way whenever you want
That’s what Velero does for your Kubernetes cluster!
🌊 What is Backup and Disaster Recovery?
The Simple Truth
Backup = Taking a “snapshot” of everything important
Recovery = Using that snapshot to bring everything back
Think of it like:
- 📸 Backup = Taking photos of your toys
- 🔄 Recovery = Using photos to find and rebuild lost toys
Why Do We Need This?
Bad things happen. Even to computers:
graph TD
    A["Your Cluster"] --> B{Disaster Strikes!}
    B --> C["💥 Accidental Delete"]
    B --> D["🔥 Hardware Failure"]
    B --> E["🐛 Software Bug"]
    B --> F["👾 Security Attack"]
    C --> G["😱 Data Lost Forever?"]
    D --> G
    E --> G
    F --> G
    G --> H{Do you have backup?}
    H -->|Yes ✅| I["😊 Restore & Relax"]
    H -->|No ❌| J["😭 Start Over"]
Without backup: Start from scratch
With backup: Run a restore, and everything in the backup returns!
🦸 Meet Velero: Your Backup Superhero
What is Velero?
Velero (once called “Heptio Ark”) is like a superhero for your Kubernetes cluster.
Its superpower? Making perfect copies of:
- ✅ All your applications
- ✅ All your configurations
- ✅ All your data (volumes)
Then storing them safely somewhere else!
The Velero Analogy: The Librarian 📚
Think of Velero as a super-organized librarian:
| What the Librarian Does | What Velero Does |
|---|---|
| Makes copies of books | Backs up resources |
| Stores copies safely | Sends to cloud storage |
| Finds any book fast | Restores what you need |
| Keeps everything organized | Labels and schedules |
Why Velero is Special
graph TD
    A["Velero"] --> B["📦 Backs Up Everything"]
    A --> C["☁️ Stores Anywhere"]
    A --> D["⏰ Runs Automatically"]
    A --> E["🎯 Restores Precisely"]
    B --> F["Deployments"]
    B --> G["Services"]
    B --> H["ConfigMaps"]
    B --> I["Persistent Volumes"]
    C --> J["AWS S3"]
    C --> K["Google Cloud"]
    C --> L["Azure Blob"]
    C --> M["MinIO"]
🔧 Velero Backup Configuration
The Three Magic Ingredients
To make Velero work, you need three things:
- Velero itself (the superhero)
- A storage place (where to keep backups)
- Backup rules (when and what to back up)
Step 1: Installing Velero
First, we tell Kubernetes to bring in our superhero:
# velero-install.yaml
apiVersion: v1
kind: Namespace
metadata:
  name: velero
---
# The Velero server runs in this namespace
# (the velero install command sets it up),
# watching and protecting
Think of it like: Hiring a security guard 👮
Step 2: Setting Up Storage (BackupStorageLocation)
Where should Velero store your backups? You need a safe place!
apiVersion: velero.io/v1
kind: BackupStorageLocation
metadata:
  name: default
  namespace: velero
spec:
  provider: aws
  objectStorage:
    bucket: my-backup-bucket
  config:
    region: us-east-1
Breaking it down:
| Part | Meaning | Analogy |
|---|---|---|
| provider: aws | Using Amazon’s storage | Which bank? |
| bucket: my-backup-bucket | The storage container | Your safe deposit box |
| region: us-east-1 | Where it lives | Which branch? |
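The same recipe should work for any S3-compatible store. As a rough sketch, here is what a location for the MinIO option mentioned earlier might look like. It assumes the AWS plugin is installed (MinIO speaks the S3 API), and the location name, bucket name, and in-cluster URL are made up for illustration:

```yaml
apiVersion: velero.io/v1
kind: BackupStorageLocation
metadata:
  name: minio-local              # hypothetical name
  namespace: velero
spec:
  provider: aws                  # MinIO is S3-compatible, so the aws provider is reused
  objectStorage:
    bucket: velero-backups       # hypothetical bucket
  config:
    region: minio                # placeholder; MinIO does not need a real AWS region
    s3ForcePathStyle: "true"     # config values are strings, so the boolean is quoted
    s3Url: http://minio.minio.svc:9000   # hypothetical in-cluster MinIO address
```

Think of it like: same bank paperwork, different bank. 🏦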
Step 3: Creating a Backup
Now the magic happens! Tell Velero what to save:
apiVersion: velero.io/v1
kind: Backup
metadata:
  name: my-first-backup
  namespace: velero
spec:
  includedNamespaces:
    - production
    - staging
  excludedResources:
    - secrets
  ttl: 720h
What each part means:
| Setting | What it Does |
|---|---|
| includedNamespaces | Which namespaces to back up |
| excludedResources | What to skip (here, Secrets are left out of the backup) |
| ttl: 720h | Keep backup for 30 days |
Step 4: Schedule Automatic Backups
Don’t want to remember? Make it automatic!
apiVersion: velero.io/v1
kind: Schedule
metadata:
  name: daily-backup
  namespace: velero
spec:
  schedule: "0 2 * * *"
  template:
    includedNamespaces:
      - production
    ttl: 168h
The schedule "0 2 * * *" means:
- Every day at 2:00 AM
- Like setting an alarm clock! ⏰
graph LR
    A["2 AM Every Day"] --> B["Velero Wakes Up"]
    B --> C["Takes Backup"]
    C --> D["Sends to Storage"]
    D --> E["Goes Back to Sleep"]
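The five cron fields are minute, hour, day of month, month, and day of week. To show how they combine with the ttl, here is a sketch of a second, weekly schedule that keeps its backups longer. The name weekly-backup and the 90-day retention are assumptions chosen for illustration:

```yaml
apiVersion: velero.io/v1
kind: Schedule
metadata:
  name: weekly-backup            # hypothetical name
  namespace: velero
spec:
  # minute hour day-of-month month day-of-week
  # "0 3 * * 0" = at 03:00 every Sunday
  schedule: "0 3 * * 0"
  template:
    includedNamespaces:
      - production
    ttl: 2160h                   # 90 days x 24 hours = 2160 hours
```

The same arithmetic explains the daily schedule above: 168h is 7 days x 24 hours. Backups made by a schedule are named after the schedule plus a timestamp, so this one would produce names starting with weekly-backup-.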
🔄 Restoring: Bringing Things Back
When Disaster Strikes
Oh no! Someone deleted the production namespace! 😱
Don’t panic. Here’s how to restore:
apiVersion: velero.io/v1
kind: Restore
metadata:
  name: restore-production
  namespace: velero
spec:
  # Scheduled backups are named <schedule-name>-<YYYYMMDDhhmmss>
  backupName: daily-backup-20231215020000
  includedNamespaces:
    - production
That’s it! Velero will:
- Find your backup
- Read what was there
- Recreate the resources saved in the backup (by default, anything that already exists in the cluster is left untouched)
Restore Options
You have choices:
| Option | Use When |
|---|---|
| Full restore | Everything is gone |
| Partial restore | Only some things missing |
| Restore to new namespace | Testing recovery |
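A partial restore might look something like the sketch below. It reuses my-first-backup from Step 3 and brings back only two resource types; the restore name and the choice of resource types are assumptions for illustration:

```yaml
apiVersion: velero.io/v1
kind: Restore
metadata:
  name: restore-production-partial   # hypothetical name
  namespace: velero
spec:
  backupName: my-first-backup        # the backup created in Step 3
  includedNamespaces:
    - production
  includedResources:                 # only these resource types are restored
    - deployments
    - configmaps
```

Leaving out includedResources turns this back into a full restore of the namespace.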
💾 Volume Snapshots
What About My Data?
Applications often store data on disks (Persistent Volumes).
Velero can back up these too!
apiVersion: velero.io/v1
kind: VolumeSnapshotLocation
metadata:
  name: default
  namespace: velero
spec:
  provider: aws
  config:
    region: us-east-1
Think of this like:
- Regular backup = Copying your homework 📝 (the Kubernetes resource definitions)
- Volume snapshot = Copying your whole notebook 📓 (the data stored on the disks)
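To tie the two together, a backup can ask for snapshots and point at the snapshot location by name. This is a minimal sketch, assuming the default locations defined above; the backup name is hypothetical:

```yaml
apiVersion: velero.io/v1
kind: Backup
metadata:
  name: production-with-volumes   # hypothetical name
  namespace: velero
spec:
  includedNamespaces:
    - production
  snapshotVolumes: true           # also snapshot the Persistent Volumes
  storageLocation: default        # BackupStorageLocation from Step 2
  volumeSnapshotLocations:
    - default                     # VolumeSnapshotLocation defined above
  ttl: 720h
```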
🎯 Best Practices: The Golden Rules
Rule 1: Test Your Backups! 🧪
A backup you’ve never tested is just hope.
Regularly:
- Create a test namespace
- Restore your backup there
- Check everything works
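One way to run such a drill is to restore into a differently named namespace, so the real one is never touched. The sketch below assumes the my-first-backup backup from Step 3; the restore name and the restore-test namespace are made up for illustration:

```yaml
apiVersion: velero.io/v1
kind: Restore
metadata:
  name: restore-drill             # hypothetical name
  namespace: velero
spec:
  backupName: my-first-backup
  includedNamespaces:
    - production
  namespaceMapping:
    production: restore-test      # source namespace: target namespace (hypothetical)
```

Once the check is done, deleting the restore-test namespace cleans up the drill.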
Rule 2: Follow the 3-2-1 Rule
graph TD
    A["3-2-1 Rule"] --> B["3 Copies of Data"]
    A --> C["2 Different Storage Types"]
    A --> D["1 Copy Offsite"]
    B --> E["Original + 2 backups"]
    C --> F["Disk + Cloud"]
    D --> G["Different location"]
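With Velero, the offsite copy could be sketched as a second storage location plus a schedule that writes to it. Everything named here (the offsite location, the bucket, the second region, the schedule name and time) is an assumption for illustration:

```yaml
apiVersion: velero.io/v1
kind: BackupStorageLocation
metadata:
  name: offsite                        # hypothetical name
  namespace: velero
spec:
  provider: aws
  objectStorage:
    bucket: my-offsite-backup-bucket   # hypothetical bucket
  config:
    region: us-west-2                  # a different region from us-east-1
---
apiVersion: velero.io/v1
kind: Schedule
metadata:
  name: daily-backup-offsite           # hypothetical name
  namespace: velero
spec:
  schedule: "0 4 * * *"                # 04:00 daily, after the 02:00 backup
  template:
    includedNamespaces:
      - production
    storageLocation: offsite           # write to the second location
    ttl: 168h
```

A second region covers the "different location" part; using a different provider altogether would come closer to the "2 different storage types" part of the rule.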
Rule 3: Label Everything
Use labels to organize backups:
metadata:
  labels:
    environment: production
    app: my-app
    backup-type: daily
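Labels can work in two directions: labels on the Backup object help you find it later, and a label selector inside the spec picks which resources go into it. A small sketch, assuming your workloads carry the app: my-app label; the backup name is hypothetical:

```yaml
apiVersion: velero.io/v1
kind: Backup
metadata:
  name: my-app-daily              # hypothetical name
  namespace: velero
  labels:                         # labels on the backup itself, for organizing
    environment: production
    app: my-app
    backup-type: daily
spec:
  includedNamespaces:
    - production
  labelSelector:                  # only resources with this label are backed up
    matchLabels:
      app: my-app
  ttl: 168h
```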
Rule 4: Monitor Your Backups
Set up alerts for:
- ❌ Backup failures
- ⚠️ Storage running low
- 📅 Backups too old
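If you run Prometheus with the Prometheus Operator, the first and last alerts could be sketched roughly like this. Treat it as a starting point, not a recipe: it assumes Velero's metrics are being scraped, and the metric names and the schedule label are assumptions you should check against the metrics endpoint of your Velero version. The rule name, thresholds, and severities are made up for illustration:

```yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: velero-backup-alerts      # hypothetical name
  namespace: velero
spec:
  groups:
    - name: velero.backups
      rules:
        - alert: VeleroBackupFailed
          # assumed metric name: counter of failed backups
          expr: increase(velero_backup_failure_total[1h]) > 0
          labels:
            severity: warning
          annotations:
            summary: "A Velero backup failed in the last hour"
        - alert: VeleroBackupTooOld
          # assumed metric name: Unix time of the last successful backup
          # 172800 seconds = 48 hours
          expr: time() - velero_backup_last_successful_timestamp{schedule="daily-backup"} > 172800
          labels:
            severity: critical
          annotations:
            summary: "No successful daily-backup in over 48 hours"
```

Storage running low is not covered here, since that is usually watched on the storage provider's side.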
🎬 Real-World Example: The E-commerce Site
Let’s see Velero in action!
The Setup
my-shop-cluster/
├── frontend/ (React app)
├── backend/ (API server)
├── database/ (PostgreSQL)
└── redis/ (Cache)
The Backup Strategy
# Complete daily backup
apiVersion: velero.io/v1
kind: Schedule
metadata:
  name: full-daily-backup
  namespace: velero
spec:
  schedule: "0 3 * * *"
  template:
    includedNamespaces:
      - frontend
      - backend
      - database
      - redis
    snapshotVolumes: true
    ttl: 720h
The Disaster
Tuesday morning: Database namespace accidentally deleted! 😱
The Recovery
apiVersion: velero.io/v1
kind: Restore
metadata:
  name: restore-database
  namespace: velero
spec:
  # Scheduled backups are named <schedule-name>-<YYYYMMDDhhmmss>
  backupName: full-daily-backup-20231219030000
  includedNamespaces:
    - database
  restorePVs: true
Result: Database back in 5 minutes! 🎉
📝 Quick Reference: Velero Commands
| Task | Command |
|---|---|
| Install Velero | velero install |
| Create backup | velero backup create NAME |
| List backups | velero backup get |
| Restore backup | velero restore create --from-backup NAME |
| Check backup status | velero backup describe NAME |
| Delete old backups | velero backup delete NAME |
🌟 Your Backup Journey Begins!
You now know:
- ✅ Why backup matters (waves and sandcastles!)
- ✅ What Velero does (your backup superhero)
- ✅ How to configure it (storage, backups, schedules)
- ✅ When to restore (disasters happen, you’re ready!)
Remember
“The best time to set up backups was yesterday. The second best time is right now.” 🚀
Your Kubernetes cluster is precious. Protect it with Velero!
