📊 Data Visualization with Pandas
Turn Numbers Into Pictures That Tell Stories
🎨 The Painting Analogy
Imagine you have a box of crayons. Numbers are boring by themselves—but when you draw pictures with them, suddenly everyone understands!
That’s exactly what data visualization does. Pandas lets you turn rows and columns of numbers into beautiful charts that tell stories at a glance.
🖼️ Think of it this way: If your data is a story, a plot is the picture book version. Even a 5-year-old can understand a picture!
🚀 Plotting with Pandas
The Magic Behind the Curtain
Pandas doesn’t draw charts by itself. It has a secret helper called Matplotlib. When you tell pandas “make me a chart,” pandas whispers to Matplotlib, and Matplotlib does the actual drawing.
The Good News? You don’t need to learn Matplotlib first! Pandas makes it super easy.
Your First Plot in 2 Lines
import pandas as pd
# Some simple data
data = pd.Series([10, 20, 15, 30, 25])
# Draw it!
data.plot()
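That’s it! One line to create the data, one line to draw it. In a notebook the chart appears right away. In a plain Python script, also import matplotlib.pyplot as plt and call plt.show() to open the chart window.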
Diagram: Your Data → .plot() → Beautiful Chart!
The .plot() Method
Every DataFrame and Series has a .plot() method built in. It’s like every piece of data already knows how to draw itself!
| What You Have | What You Write | What You Get |
|---|---|---|
| Series | series.plot() | Line chart |
| DataFrame | df.plot() | Multiple lines |
| Any data | df.plot(kind='bar') | Bar chart |
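To make the "secret helper" idea concrete, here is a small sketch showing that .plot() hands back a Matplotlib object. The matplotlib.use("Agg") line is an assumption added so the snippet runs without a screen; leave it out in a notebook.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: no display available; remove in a notebook
import matplotlib.pyplot as plt
import pandas as pd

data = pd.Series([10, 20, 15, 30, 25])

# .plot() returns the Matplotlib Axes that pandas drew on
ax = data.plot()

print(type(ax))        # some Matplotlib Axes class
print(len(ax.lines))   # number of lines drawn for this Series

# Because it is a Matplotlib object, Matplotlib methods work on it
ax.set_ylabel("Value")

plt.close(ax.figure)   # free the figure when done
```

Keeping the returned ax around is handy later, because labels, limits, and saving can all be done through it.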
📈 Basic Line and Bar Plots
Line Plots: Connect the Dots
Line plots are perfect when you want to show change over time. Like tracking how tall you grow each year!
import pandas as pd
# Monthly sales data
months = ['Jan', 'Feb', 'Mar', 'Apr']
sales = [100, 120, 90, 150]
df = pd.DataFrame({
    'Month': months,
    'Sales': sales
})
# Line plot
df.plot(x='Month', y='Sales')
When to use line plots:
- 📅 Tracking things over time
- 📈 Showing trends (going up or down)
- 🔗 When your data points connect naturally
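A DataFrame with more than one numeric column gets one line per column. The sketch below assumes a made-up Returns column (not part of the sales data above) and puts the months in the index so pandas uses them for the x-axis.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: no display available; remove in a notebook
import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame(
    {
        "Sales": [100, 120, 90, 150],
        "Returns": [5, 8, 4, 10],  # hypothetical second column for illustration
    },
    index=["Jan", "Feb", "Mar", "Apr"],
)

# One line per numeric column; the index supplies the x-axis labels
ax = df.plot()
ax.set_xlabel("Month")

legend_labels = [text.get_text() for text in ax.get_legend().get_texts()]
print(legend_labels)

plt.close(ax.figure)
```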
Bar Plots: Compare Side by Side
Bar plots are like a race—which bar is tallest wins!
# Same data, but as bars
df.plot(x='Month', y='Sales', kind='bar')
Horizontal bars are great when labels are long:
df.plot(x='Month', y='Sales', kind='barh')
Diagram: Line Plot → Shows CHANGE | Bar Plot → Shows COMPARISON
Quick Customization
Make your plots prettier with simple options:
df.plot(
    x='Month',
    y='Sales',
    kind='bar',
    color='coral',
    title='Monthly Sales Report'
)
| Option | What It Does | Example |
|---|---|---|
| color | Changes bar/line color | 'coral', 'blue' |
| title | Adds a title | 'My Chart' |
| figsize | Sets size as (width, height) in inches | (8, 4) |
| legend | Show/hide legend | True or False |
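Here is a sketch that uses all four options from the table in one call. The rot=0 argument and the set_ylabel call are extras not covered above: rot is a pandas plotting option that sets the tick-label rotation, and set_ylabel comes from Matplotlib.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: no display available; remove in a notebook
import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame({
    "Month": ["Jan", "Feb", "Mar", "Apr"],
    "Sales": [100, 120, 90, 150],
})

ax = df.plot(
    x="Month",
    y="Sales",
    kind="bar",
    color="coral",
    title="Monthly Sales Report",
    figsize=(8, 4),   # width and height in inches
    legend=False,     # only one series, so the legend adds nothing
    rot=0,            # keep month labels horizontal
)
ax.set_ylabel("Sales")

fig = ax.get_figure()
print(fig.get_size_inches())

plt.close(fig)
```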
📊 Distribution Plots
What’s a Distribution?
Imagine you asked 100 kids their favorite ice cream flavor. A distribution shows you how answers are spread out—how many said chocolate, vanilla, strawberry, etc.
Histograms: Counting Buckets
A histogram groups your data into “buckets” and counts how many fall into each bucket.
import pandas as pd
import numpy as np
# Test scores of 100 students
scores = pd.Series(np.random.normal(75, 10, 100))
# How are scores distributed?
scores.plot(kind='hist', bins=10)
bins = how many buckets to use
- More bins = more detail
- Fewer bins = simpler picture
Diagram: Raw Scores → Group into Bins → Count Each Bin → Draw Bars
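The bucket-and-count steps can be sketched without drawing anything, using NumPy's histogram function. The fixed seed is an assumption added so the numbers repeat from run to run. The counts printed here should mirror the bar heights a 10-bin histogram of the same scores would show.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(seed=0)  # assumption: fixed seed for repeatable output
scores = pd.Series(rng.normal(75, 10, 100))

# Split the score range into 10 equal-width buckets and count each one
counts, edges = np.histogram(scores, bins=10)

for left, right, count in zip(edges[:-1], edges[1:], counts):
    print(f"{left:5.1f} to {right:5.1f}: {count} students")
```

Ten bins need eleven edges, and the counts always add up to the number of scores.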
Box Plots: The 5-Number Summary
A box plot is like a data X-ray. It shows you:
- 📍 The middle (median)
- 📦 Where the middle half of the data lives (the box, from the 25th to the 75th percentile)
- 📏 How spread out things are (the whiskers)
- ⭐ Any outliers (weird values)
df = pd.DataFrame({
    'Math': [85, 90, 78, 92, 88, 45, 91],
    'Science': [70, 75, 80, 72, 68, 95, 74]
})
df.plot(kind='box')
Pro tip: Box plots are amazing for spotting unusual values (outliers). See that dot sitting far beyond the whiskers? In the Math column it is the score of 45. Something special happened there!
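The numbers behind a box plot can be computed by hand. This sketch uses the Math scores from above and the common 1.5 × IQR rule for flagging outliers, which is also Matplotlib's default whisker setting. The variable names are just for illustration.

```python
import pandas as pd

math_scores = pd.Series([85, 90, 78, 92, 88, 45, 91], name="Math")

# The five-number summary: min, Q1, median, Q3, max
q1 = math_scores.quantile(0.25)
median = math_scores.median()
q3 = math_scores.quantile(0.75)
print(math_scores.min(), q1, median, q3, math_scores.max())

# Anything more than 1.5 box-lengths beyond the box counts as an outlier
iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = math_scores[(math_scores < lower_fence) | (math_scores > upper_fence)]

print(outliers.tolist())  # [45]
```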
🔗 Relationship Plots
Scatter Plots: Finding Connections
Do taller people weigh more? Do students who study more get better grades? Scatter plots help you see if two things are connected.
df = pd.DataFrame({
    'Study_Hours': [1, 2, 3, 4, 5, 6, 7, 8],
    'Test_Score': [50, 55, 60, 65, 70, 80, 85, 90]
})
df.plot(
    kind='scatter',
    x='Study_Hours',
    y='Test_Score'
)
Each dot is one student. See how the dots go up and to the right? That means: more study = better scores!
Diagram: Dots Going UP ↗ → Positive Relationship | Dots Going DOWN ↘ → Negative Relationship | Dots Everywhere → No Relationship
Reading Scatter Patterns
| Pattern | Meaning | Example |
|---|---|---|
| Dots slope up ↗ | More X = More Y | Study hours vs grades |
| Dots slope down ↘ | More X = Less Y | Screen time vs sleep |
| Random dots | No connection | Shoe size vs grades |
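If you want a number to go with the picture, the correlation coefficient is one option. It is close to +1 when dots slope up, close to -1 when they slope down, and near 0 when they are scattered. A minimal sketch with the study data from above:

```python
import pandas as pd

df = pd.DataFrame({
    "Study_Hours": [1, 2, 3, 4, 5, 6, 7, 8],
    "Test_Score": [50, 55, 60, 65, 70, 80, 85, 90],
})

# Pearson correlation between the two columns (ranges from -1 to +1)
r = df["Study_Hours"].corr(df["Test_Score"])
print(round(r, 3))
```

A value near +1 matches what the scatter plot shows. Keep in mind that correlation only measures how well the dots follow a straight line; it does not prove that one thing causes the other.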
Adding Color to Show Groups
df = pd.DataFrame({
    'Height': [150, 160, 170, 155, 165],
    'Weight': [50, 60, 70, 55, 65],
    'Gender': ['F', 'M', 'M', 'F', 'M']
})
# Color by gender
colors = df['Gender'].map({'F': 'pink', 'M': 'blue'})
df.plot(
    kind='scatter',
    x='Height',
    y='Weight',
    c=colors
)
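The colored dots above have no legend, so a reader cannot tell which color is which. One possible way around that, sketched here, is to draw each group separately on the same axes and give it a label. The palette dictionary is a name made up for this example.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: no display available; remove in a notebook
import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame({
    "Height": [150, 160, 170, 155, 165],
    "Weight": [50, 60, 70, 55, 65],
    "Gender": ["F", "M", "M", "F", "M"],
})
palette = {"F": "pink", "M": "blue"}  # hypothetical color choice per group

fig, ax = plt.subplots()
for gender, group in df.groupby("Gender"):
    # ax=ax draws every group onto the same chart; label feeds the legend
    group.plot(
        kind="scatter",
        x="Height",
        y="Weight",
        color=palette[gender],
        label=gender,
        ax=ax,
    )

labels = ax.get_legend_handles_labels()[1]
print(labels)

plt.close(fig)
```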
💾 Export Plot to File
Saving Your Masterpiece
What good is a beautiful chart if you can’t share it? Pandas (through Matplotlib) lets you save plots as image files.
The Magic Formula
import matplotlib.pyplot as plt
# Create your plot
df.plot(kind='bar', x='Month', y='Sales')
# Save it!
plt.savefig('my_chart.png')
That’s it! Your chart is now an image file.
File Formats
| Format | Best For | Extension |
|---|---|---|
| PNG | Web, presentations | .png |
| PDF | Reports, printing | .pdf |
| SVG | Scalable graphics | .svg |
| JPG | Photos (not charts) | .jpg |
# Save as PDF for a report
plt.savefig('sales_report.pdf')
# Save as SVG for a website
plt.savefig('chart.svg')
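Matplotlib picks the file format from the extension, so one chart can be saved several ways in a loop. This sketch saves through the figure that belongs to the plot rather than through plt.savefig, which always targets whichever figure is current. The temporary output folder is an assumption so the example does not clutter your working directory.

```python
import tempfile
from pathlib import Path

import matplotlib
matplotlib.use("Agg")  # assumption: no display available; remove in a notebook
import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame({
    "Month": ["Jan", "Feb", "Mar", "Apr"],
    "Sales": [100, 120, 90, 150],
})

ax = df.plot(kind="bar", x="Month", y="Sales")
fig = ax.get_figure()  # the figure this particular chart lives on

out_dir = Path(tempfile.mkdtemp())  # hypothetical output folder
saved = []
for ext in ("png", "pdf", "svg"):
    path = out_dir / f"sales_report.{ext}"
    fig.savefig(path, bbox_inches="tight")  # format chosen from the extension
    saved.append(path)

plt.close(fig)
print([p.name for p in saved])
```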
Pro Tips for Saving
Problem: Chart looks cut off?
plt.savefig('chart.png', bbox_inches='tight')
Problem: Image looks blurry?
plt.savefig('chart.png', dpi=300)
Problem: Want transparent background?
plt.savefig('chart.png', transparent=True)
Diagram: Create Plot → Customize It → plt.savefig → PNG/PDF/SVG → Share Everywhere!
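Why does a higher dpi fix blurriness? The pixel size of a saved PNG is the figure size in inches multiplied by the dpi. The sketch below checks this by saving the same small chart twice and reading the images back. The 4 × 3 inch figure size and the temporary folder are assumptions chosen for the example, and bbox_inches='tight' is left out on purpose because it trims the image and changes the pixel count.

```python
import tempfile
from pathlib import Path

import matplotlib
matplotlib.use("Agg")  # assumption: no display available; remove in a notebook
import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame({
    "Product": ["A", "B", "C"],
    "Sales": [100, 150, 80],
})

ax = df.plot(x="Product", y="Sales", kind="bar", figsize=(4, 3))
fig = ax.get_figure()

out_dir = Path(tempfile.mkdtemp())  # hypothetical output folder
sizes = {}
for dpi in (100, 300):
    path = out_dir / f"chart_{dpi}.png"
    fig.savefig(path, dpi=dpi)
    height, width = plt.imread(str(path)).shape[:2]
    sizes[dpi] = (width, height)

plt.close(fig)
print(sizes)  # {100: (400, 300), 300: (1200, 900)}
```

Three times the dpi means three times the pixels in each direction, which is why the chart stays sharp when it is enlarged or printed.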
Complete Save Example
import pandas as pd
import matplotlib.pyplot as plt
# Create data and plot
df = pd.DataFrame({
    'Product': ['A', 'B', 'C'],
    'Sales': [100, 150, 80]
})
df.plot(
    x='Product',
    y='Sales',
    kind='bar',
    title='Product Sales',
    color='teal'
)
# Save with all the best settings
plt.savefig(
    'product_sales.png',
    dpi=300,
    bbox_inches='tight'
)
🎯 Quick Reference
| Task | Code |
|---|---|
| Line plot | df.plot() |
| Bar plot | df.plot(kind='bar') |
| Horizontal bar | df.plot(kind='barh') |
| Histogram | df.plot(kind='hist') |
| Box plot | df.plot(kind='box') |
| Scatter plot | df.plot(kind='scatter', x='col1', y='col2') |
| Save plot | plt.savefig('filename.png') |
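Each kind in the table also has a matching shortcut method on .plot, such as df.plot.bar() or df.plot.scatter(). The sketch below draws the same bar chart both ways to show they are interchangeable. Which spelling to use is a matter of taste.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: no display available; remove in a notebook
import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame({
    "Month": ["Jan", "Feb", "Mar", "Apr"],
    "Sales": [100, 120, 90, 150],
})

# Two spellings of the same chart
ax_kind = df.plot(kind="bar", x="Month", y="Sales")
ax_method = df.plot.bar(x="Month", y="Sales")

# Both produce one bar per row
heights_kind = [bar.get_height() for bar in ax_kind.patches]
heights_method = [bar.get_height() for bar in ax_method.patches]
print(heights_kind == heights_method)

plt.close("all")
```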
🌟 You Did It!
You now know how to:
- ✅ Turn boring numbers into beautiful charts
- ✅ Choose the right plot type for your story
- ✅ Find patterns and relationships in data
- ✅ Save and share your visualizations
Remember: Every great data scientist started exactly where you are now. Keep practicing, keep plotting, and soon you’ll be creating visualizations that make people say “Wow!”
🎨 “A picture is worth a thousand numbers.”
