📊 Matplotlib Histograms: The Story of Data Buckets
🎬 The Big Picture
Imagine you’re sorting a giant pile of colorful LEGO bricks by color. You grab one bucket for red, one for blue, one for yellow… and you toss each brick into the right bucket. When you’re done, you can SEE which colors you have the most of just by looking at how FULL each bucket is!
That’s exactly what a histogram does with numbers!
A histogram takes your data, sorts it into “buckets” (called bins), and shows you how many items landed in each bucket. It’s like magic glasses that let you SEE your data’s story!
🪣 Histogram Basics
What Is a Histogram?
Think of a histogram like a game where you’re:
- Drawing lines on the ground (these are your bins)
- Tossing balls (your data) into the spaces between lines
- Stacking them up to see which space got the most balls
import matplotlib.pyplot as plt
import numpy as np
# Let's say these are test scores
scores = [65, 72, 78, 82, 85, 88, 90, 92, 95, 98]
# Create the histogram!
plt.hist(scores)
plt.xlabel('Scores')
plt.ylabel('How Many Students')
plt.title('Test Scores Distribution')
plt.show()
The Magic Function: plt.hist()
The hist() function needs just ONE thing: your data!
plt.hist(data)
That’s it! Matplotlib does the rest:
- Figures out the range of your numbers
- Creates 10 buckets (bins) by default
- Counts how many values go in each bucket
- Draws the bars for you!
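Want to peek at those steps? Here is a minimal sketch, assuming the same scores list as above, that captures what plt.hist() hands back. The Agg backend line is an assumption for running without a screen; remove it if you want the plot window to pop up.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; remove to see the plot window
import matplotlib.pyplot as plt

scores = [65, 72, 78, 82, 85, 88, 90, 92, 95, 98]

# plt.hist returns the count in each bin, the bin edges, and the drawn bars
counts, edges, bars = plt.hist(scores)

print(len(edges) - 1)        # number of bins (10 by default)
print(edges[0], edges[-1])   # edges run from the smallest to the largest score
print(counts.sum())          # every score landed in exactly one bin
```

Notice that 10 bins need 11 edges: each bin sits between two neighboring edges.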
Understanding Bins
Bins = Buckets for your numbers
Imagine you have ages: 5, 7, 12, 15, 22, 25, 28, 35, 42, 55
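With 3 bins, Matplotlib splits the range from the smallest value (5) to the largest (55) into three equal buckets:
- Bucket 1 (5 to about 21.7): 4 people 📦📦📦📦
- Bucket 2 (about 21.7 to about 38.3): 4 people 📦📦📦📦
- Bucket 3 (about 38.3 to 55): 2 people 📦📦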
With 6 bins, you see MORE detail!
ages = [5, 7, 12, 15, 22, 25, 28, 35, 42, 55]
# Try different bin counts
plt.hist(ages, bins=3)
plt.title('Ages with 3 Bins')
plt.show()
plt.hist(ages, bins=6)
plt.title('Ages with 6 Bins')
plt.show()
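If you'd rather read the bucket counts as numbers, here is a small sketch using np.histogram, the NumPy function that does the same counting without drawing anything. The variable names are just for illustration.

```python
import numpy as np

ages = [5, 7, 12, 15, 22, 25, 28, 35, 42, 55]

# Same data, two different bin counts
counts_3, edges_3 = np.histogram(ages, bins=3)
counts_6, edges_6 = np.histogram(ages, bins=6)

print(counts_3.tolist())  # [4, 4, 2]
print(counts_6.tolist())  # [3, 1, 3, 1, 1, 1]
print(np.round(edges_3, 1).tolist())  # edges start at 5 and end at 55
```

With 6 bins, the big first bucket of 4 splits into 3 and 1, so you can see that most of the young people are under about 13.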
🎨 Histogram Customization
Now let’s make our histograms BEAUTIFUL! Like decorating your room 🎪
Changing Colors
data = np.random.randn(1000)
# Pick your favorite color!
plt.hist(data, color='coral')
plt.show()
Popular colors: 'coral', 'skyblue', 'lightgreen', 'gold', 'violet'
Adding Edges (Outlines)
Without edges, bars can blur together. Add outlines!
plt.hist(data,
         color='lightblue',
         edgecolor='navy')
plt.show()
Controlling Bins
You can tell matplotlib EXACTLY how many bins you want:
# I want exactly 20 bins!
plt.hist(data, bins=20)
plt.show()
# Or specify the exact edges (7 edges make 6 bins)
plt.hist(data, bins=[-3, -2, -1, 0, 1, 2, 3])
plt.show()
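One thing to watch with exact edges: any value that falls outside them is simply left out. A minimal sketch with np.histogram (the seed and variable names are assumptions for illustration) shows where the numbers go:

```python
import numpy as np

rng = np.random.default_rng(0)      # seeded so the run is repeatable
data = rng.standard_normal(1000)

edges = [-3, -2, -1, 0, 1, 2, 3]    # 7 edges
counts, _ = np.histogram(data, bins=edges)

# Values below the first edge or above the last edge are not counted
outside = int(np.sum((data < edges[0]) | (data > edges[-1])))

print(len(counts))                  # 6 bins, one fewer than the number of edges
print(int(counts.sum()) + outside)  # binned plus left-out values: 1000
```

If the left-out values matter to you, widen the first and last edges until they cover all your data.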
Transparency (Alpha)
Make bars see-through! This helps when comparing multiple datasets.
plt.hist(data, alpha=0.7) # 0 = invisible, 1 = solid
All Together Now!
data = np.random.randn(1000)
plt.hist(data,
         bins=25,
         color='mediumseagreen',
         edgecolor='darkgreen',
         alpha=0.8,
         linewidth=1.2)
plt.xlabel('Values')
plt.ylabel('Frequency')
plt.title('My Beautiful Histogram!')
plt.show()
📊 Multiple Histograms
What if you want to compare TWO groups? Like comparing test scores from Class A vs Class B?
Overlapping Histograms
class_a = np.random.normal(75, 10, 100)
class_b = np.random.normal(80, 8, 100)
plt.hist(class_a, label='Class A', alpha=0.5)
plt.hist(class_b, label='Class B', alpha=0.5)
plt.legend()
plt.show()
🎯 Pro Tip: Use alpha (transparency) so you can see where they overlap!
Stacked Histograms
Stack one on top of the other:
plt.hist([class_a, class_b],
         stacked=True,
         label=['Class A', 'Class B'],
         color=['skyblue', 'salmon'])
plt.legend()
plt.show()
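What do the stacked bar heights actually mean? Here is a hedged sketch that compares the values plt.hist() returns with separate counts for each class. The seed, the Agg backend line, and the names tops, count_a, and count_b are assumptions for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; remove to see the plot window
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(1)
class_a = rng.normal(75, 10, 100)
class_b = rng.normal(80, 8, 100)

tops, edges, _ = plt.hist([class_a, class_b], stacked=True,
                          label=['Class A', 'Class B'])

# Count each class separately on the same edges
count_a, _ = np.histogram(class_a, bins=edges)
count_b, _ = np.histogram(class_b, bins=edges)

# The top of each stacked bar should be Class A plus Class B in that bin
print(np.array_equal(tops[-1], count_a + count_b))
print(tops[-1].sum())  # total number of students across both classes
```

So in a stacked histogram, the full bar height is the total for both groups, and the colors show how that total splits.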
Same Bins for Fair Comparison
When comparing, use the SAME bins for both:
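# Define the bin edges once (20 edges make 19 bins)
# Scores below 50 or above 100 fall outside the edges and are not drawn
my_bins = np.linspace(50, 100, 20)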
plt.hist(class_a, bins=my_bins, alpha=0.6)
plt.hist(class_b, bins=my_bins, alpha=0.6)
plt.show()
This makes the comparison FAIR! 🎯
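If you don't want to guess the range by hand, one option is to build the edges from both groups together. This is a minimal sketch using np.histogram_bin_edges; the seed and the name shared_edges are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
class_a = rng.normal(75, 10, 100)
class_b = rng.normal(80, 8, 100)

# One set of edges that spans the lowest to the highest score in either class
both = np.concatenate([class_a, class_b])
shared_edges = np.histogram_bin_edges(both, bins=20)

count_a, _ = np.histogram(class_a, bins=shared_edges)
count_b, _ = np.histogram(class_b, bins=shared_edges)

print(len(shared_edges) - 1)          # 20 bins (21 edges)
print(count_a.sum(), count_b.sum())   # 100 100: no score was left out
```

You can then pass bins=shared_edges to both plt.hist() calls, just like my_bins above.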
Multiple Histograms in Subplots
Sometimes overlapping is messy. Put them side by side!
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))
ax1.hist(class_a, color='coral')
ax1.set_title('Class A Scores')
ax2.hist(class_b, color='teal')
ax2.set_title('Class B Scores')
plt.tight_layout()
plt.show()
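Separate panels are easier to compare when they use the same axis limits. A small sketch, assuming you want both panels locked together, adds sharex and sharey to plt.subplots(). The seed and the Agg backend line are assumptions for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; remove to see the plot window
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(3)
class_a = rng.normal(75, 10, 100)
class_b = rng.normal(80, 8, 100)

# sharex/sharey make both panels use the same axis limits
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4),
                               sharex=True, sharey=True)
ax1.hist(class_a, color='coral')
ax1.set_title('Class A Scores')
ax2.hist(class_b, color='teal')
ax2.set_title('Class B Scores')
fig.tight_layout()

print(ax1.get_xlim() == ax2.get_xlim())  # same score range on both panels
print(ax1.get_ylim() == ax2.get_ylim())  # same count range on both panels
```

Without shared axes, each panel picks its own scale, and a short bar in one panel can look as tall as a long bar in the other.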
🗺️ 2D Histogram (Heatmap Style!)
Now here’s where it gets COOL! 🌟
Regular histograms work with ONE list of numbers. But what if you have TWO related measurements?
Example: Height AND Weight of people
A 2D Histogram creates a GRID and colors each cell based on how many data points fall there. It’s like a treasure map showing where your data clusters!
Creating a 2D Histogram
# Height and weight data
height = np.random.normal(170, 10, 1000)
weight = np.random.normal(70, 15, 1000)
plt.hist2d(height, weight, bins=20)
plt.colorbar(label='Count')
plt.xlabel('Height (cm)')
plt.ylabel('Weight (kg)')
plt.title('Height vs Weight Distribution')
plt.show()
Understanding the Colors
- Bright colors (yellow in the default colormap) = LOTS of data points there! 🔥
- Dark colors (deep purple in the default colormap) = Few data points 🧊
The colorbar on the side tells you what each color means!
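The colors are just counts in disguise. Here is a minimal sketch that grabs the grid of counts plt.hist2d() returns; the seed, the Agg backend line, and the name grid are assumptions for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; remove to see the plot window
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(4)
height = rng.normal(170, 10, 1000)
weight = rng.normal(70, 15, 1000)

# hist2d returns the grid of counts, the edges on each axis, and the image
grid, xedges, yedges, image = plt.hist2d(height, weight, bins=20)
plt.colorbar(image, label='Count')

print(grid.shape)        # one count per cell of the 20 x 20 grid
print(int(grid.sum()))   # all 1000 points fall in some cell
print(int(grid.max()))   # the count in the fullest (brightest) cell
```

The largest number in the grid matches the top of the colorbar, and empty cells sit at the bottom.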
Customizing 2D Histograms
plt.hist2d(height, weight,
           bins=30,
           cmap='plasma')  # Try: viridis, hot, cool
plt.colorbar()
plt.show()
Fun colormaps to try:
- 'viridis' - Purple to yellow (default, beautiful!)
- 'hot' - Black to red to yellow to white
- 'cool' - Cyan to magenta
- 'plasma' - Purple to orange to yellow
Different Bin Counts for Each Axis
You can have different detail levels for X and Y:
plt.hist2d(height, weight, bins=[20, 30])
# 20 bins for height, 30 bins for weight
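To check which number goes with which axis, here is a small sketch using np.histogram2d, NumPy's counting counterpart to plt.hist2d(). The seed and variable names are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(5)
height = rng.normal(170, 10, 1000)
weight = rng.normal(70, 15, 1000)

# First number is for the first argument (height), second for weight
grid, xedges, yedges = np.histogram2d(height, weight, bins=[20, 30])

print(grid.shape)                 # (20, 30)
print(len(xedges), len(yedges))   # 21 31
```

The first number in bins always belongs to the first data list you pass in, so swapping height and weight swaps the grid shape too.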
🧭 Quick Mental Map
graph TD
    A["Your Data"] --> B{How many variables?}
    B -->|One variable| C["Regular Histogram<br>plt.hist"]
    B -->|Two variables| D["2D Histogram<br>plt.hist2d"]
    C --> E["Customize!<br>bins, color, alpha"]
    C --> F["Compare Multiple<br>overlay or subplots"]
    D --> G["Customize!<br>bins, cmap, colorbar"]
🎯 The Golden Rules
- Start Simple: plt.hist(data) - see what happens first!
- Adjust Bins: Too few = lose detail. Too many = too noisy.
- Use Alpha: When comparing multiple histograms
- Same Bins: For fair comparisons between groups
- 2D for Pairs: When you have two related measurements
🚀 You Did It!
You now know how to:
- ✅ Create basic histograms with plt.hist()
- ✅ Customize colors, edges, bins, and transparency
- ✅ Compare multiple datasets with overlapping or stacked histograms
- ✅ Create 2D histograms to see patterns in paired data
Remember: Histograms are your data’s autobiography. They show the story of where your numbers like to hang out! 📖✨
