🎯 SQL Analytical Functions: Your Data Superpower
Imagine you’re a detective looking at a huge list of clues. You don’t just want to see the clues—you want to compare them, rank them, and find patterns. That’s exactly what SQL Analytical Functions do!
🏆 What Are Analytical Functions?
Think of a classroom of students with their test scores. Normal SQL can tell you “the average score.” But what if you want to:
- Rank each student from 1st to last?
- Compare each student’s score to the one before them?
- Group students by their grade level and rank within each group?
Analytical Functions let you do all of this—without losing any rows!
graph TD A["📊 Your Data"] --> B["🔧 Analytical Function"] B --> C["✨ Same Rows + New Info"] C --> D["Rankings, Comparisons, Groups"]
🥇 SQL Ranking Functions
The Big Three: ROW_NUMBER, RANK, DENSE_RANK
Imagine a race with these finish times:
| Runner | Time |
|---|---|
| Alice | 10s |
| Bob | 12s |
| Carol | 12s |
| Dave | 15s |
How do we rank them?
1. ROW_NUMBER() — “Everyone Gets a Unique Number”
Like giving out numbered tickets. No sharing!
SELECT name, time,
ROW_NUMBER() OVER (
ORDER BY time
) AS ticket
FROM runners;
Result:
| name | time | ticket |
|---|---|---|
| Alice | 10s | 1 |
| Bob | 12s | 2 |
| Carol | 12s | 3 |
| Dave | 15s | 4 |
🎫 Bob and Carol have the same time, but get different numbers!
2. RANK() — “Ties Share, Then Skip”
Like an Olympic podium. Ties get the same medal, but the next person skips ahead.
SELECT name, time,
RANK() OVER (
ORDER BY time
) AS medal
FROM runners;
Result:
| name | time | medal |
|---|---|---|
| Alice | 10s | 1 |
| Bob | 12s | 2 |
| Carol | 12s | 2 |
| Dave | 15s | 4 |
🏅 Bob and Carol both get 2nd! Dave is 4th (not 3rd).
3. DENSE_RANK() — “Ties Share, No Skipping”
Like a video game leaderboard. Ties share, but the next person is right after.
SELECT name, time,
DENSE_RANK() OVER (
ORDER BY time
) AS level
FROM runners;
Result:
| name | time | level |
|---|---|---|
| Alice | 10s | 1 |
| Bob | 12s | 2 |
| Carol | 12s | 2 |
| Dave | 15s | 3 |
🎮 Dave is 3rd because we never skip numbers!
🧠 Quick Comparison
graph TD A["Same Score: 12s"] --> B["ROW_NUMBER"] A --> C["RANK"] A --> D["DENSE_RANK"] B --> E["2, 3 ➡️ Different"] C --> F["2, 2 ➡️ Skip to 4"] D --> G["2, 2 ➡️ Next is 3"]
🔀 SQL Offset Functions
LAG & LEAD: Time Travel in Your Data
Imagine watching your piggy bank grow each week:
| Week | Savings |
|---|---|
| 1 | $10 |
| 2 | $25 |
| 3 | $40 |
| 4 | $55 |
What if you want to see last week’s savings next to this week’s?
LAG() — “Look Behind”
Like checking your rearview mirror while driving.
SELECT week, savings,
LAG(savings, 1) OVER (
ORDER BY week
) AS last_week
FROM piggy_bank;
Result:
| week | savings | last_week |
|---|---|---|
| 1 | $10 | NULL |
| 2 | $25 | $10 |
| 3 | $40 | $25 |
| 4 | $55 | $40 |
🪞 Week 1 has NULL because there’s no “before” yet!
LEAD() — “Look Ahead”
Like peeking at the next page of a book.
SELECT week, savings,
LEAD(savings, 1) OVER (
ORDER BY week
) AS next_week
FROM piggy_bank;
Result:
| week | savings | next_week |
|---|---|---|
| 1 | $10 | $25 |
| 2 | $25 | $40 |
| 3 | $40 | $55 |
| 4 | $55 | NULL |
🔮 Week 4 has NULL because there’s no “after” yet!
📈 Calculate Growth!
Now you can find how much you saved each week:
SELECT week, savings,
savings - LAG(savings) OVER (
ORDER BY week
) AS added
FROM piggy_bank;
| week | savings | added |
|---|---|---|
| 1 | $10 | NULL |
| 2 | $25 | $15 |
| 3 | $40 | $15 |
| 4 | $55 | $15 |
💰 You added $15 each week! Pattern discovered!
📦 PARTITION BY: Grouping Magic
What’s PARTITION BY?
Think of a school with multiple classrooms. You want to rank students—but within their own class, not the whole school.
PARTITION BY creates invisible walls between groups!
Example: Top Student Per Class
SELECT class, student, score,
RANK() OVER (
PARTITION BY class
ORDER BY score DESC
) AS class_rank
FROM students;
Input:
| class | student | score |
|---|---|---|
| A | Mia | 95 |
| A | Jake | 88 |
| B | Emma | 92 |
| B | Leo | 92 |
| B | Zoe | 85 |
Result:
| class | student | score | class_rank |
|---|---|---|---|
| A | Mia | 95 | 1 |
| A | Jake | 88 | 2 |
| B | Emma | 92 | 1 |
| B | Leo | 92 | 1 |
| B | Zoe | 85 | 3 |
🏫 Each class has its own #1!
PARTITION BY with LAG
Compare each student to the previous one in their class:
SELECT class, student, score,
LAG(score) OVER (
PARTITION BY class
ORDER BY score DESC
) AS prev_score
FROM students;
🔑 The comparison only happens within each class!
graph TD A["All Data"] --> B["PARTITION BY class"] B --> C["Class A"] B --> D["Class B"] C --> E["Rank within A"] D --> F["Rank within B"]
🔀 CASE Statements: The Decision Maker
What’s a CASE Statement?
Think of a sorting hat from Harry Potter. It looks at something and decides: “You go here, you go there!”
CASE
WHEN condition THEN result
WHEN another THEN another_result
ELSE default
END
Example: Grading System
Turn scores into letter grades:
SELECT student, score,
CASE
WHEN score >= 90 THEN 'A'
WHEN score >= 80 THEN 'B'
WHEN score >= 70 THEN 'C'
ELSE 'F'
END AS grade
FROM students;
| student | score | grade |
|---|---|---|
| Mia | 95 | A |
| Jake | 88 | B |
| Zoe | 65 | F |
🎓 SQL looks at each condition top-to-bottom!
CASE + Analytical Functions = 🔥
Rank students, but only show “Top 3” or “Others”:
SELECT student, score,
CASE
WHEN RANK() OVER (
ORDER BY score DESC
) <= 3 THEN '🏆 Top 3'
ELSE 'Keep Trying!'
END AS status
FROM students;
🌟 Combine CASE with rankings for powerful results!
Conditional Counting
Count how many passed vs failed:
SELECT
SUM(CASE
WHEN score >= 70 THEN 1
ELSE 0
END) AS passed,
SUM(CASE
WHEN score < 70 THEN 1
ELSE 0
END) AS failed
FROM students;
📊 CASE turns conditions into numbers you can count!
🎯 Putting It All Together
Real-World Example: Sales Dashboard
Find each salesperson’s rank in their region, compare to their last sale, and categorize performance:
SELECT
region,
salesperson,
sale_amount,
RANK() OVER (
PARTITION BY region
ORDER BY sale_amount DESC
) AS region_rank,
LAG(sale_amount) OVER (
PARTITION BY salesperson
ORDER BY sale_date
) AS previous_sale,
CASE
WHEN sale_amount > 1000 THEN '⭐ Star'
WHEN sale_amount > 500 THEN '👍 Good'
ELSE '📈 Growing'
END AS performance
FROM sales;
🧭 The OVER() Clause Pattern
Every analytical function follows this pattern:
FUNCTION() OVER (
PARTITION BY column -- Optional
ORDER BY column -- Usually needed
)
graph TD A["FUNCTION"] --> B["OVER#40;#41;"] B --> C["PARTITION BY"] B --> D["ORDER BY"] C --> E["Divides into groups"] D --> F["Sorts within groups"]
🚀 You Did It!
You now understand:
| Function | What It Does |
|---|---|
| ROW_NUMBER | Unique numbers for all |
| RANK | Ties share, then skip |
| DENSE_RANK | Ties share, no skip |
| LAG | Look at previous row |
| LEAD | Look at next row |
| PARTITION BY | Create groups |
| CASE | Make decisions |
💪 You’re now ready to analyze data like a pro! These functions turn raw data into insights.
Remember: Start simple, combine pieces, and you’ll be amazed at what you can discover! 🌟
