🎯 SQL: The Art of Counting & Grouping
Imagine you have a giant toy box. How do you know how many toys you have? How do you find which color has the most toys? That’s what SQL aggregation is all about!
🍪 The Cookie Jar Analogy
Think of a database like a magical cookie jar that can answer questions:
- “How many cookies do I have?” → COUNT
- “How many chocolate chips total?” → SUM
- “What’s the average chips per cookie?” → AVG
- “Which cookie has the most chips?” → MAX
- “Which has the least?” → MIN
And GROUP BY is like sorting cookies by flavor before counting!
📊 Aggregate Functions: Your Counting Helpers
COUNT: The Head Counter 🔢
COUNT is like a teacher taking attendance. It counts rows.
SELECT COUNT(*)
FROM students;
This counts ALL students. Simple!
Two flavors:
COUNT(*)→ Counts everything, even empty seatsCOUNT(column)→ Only counts seats with students (ignores NULL)
SELECT COUNT(email)
FROM users;
Only counts users who have an email!
SUM & AVG: The Math Wizards ➕
SUM adds up all the numbers:
SELECT SUM(price)
FROM orders;
“What’s the total of all orders?”
AVG finds the middle ground:
SELECT AVG(age)
FROM employees;
“What’s the average age at our company?”
💡 Remember: SUM and AVG only work with numbers!
MIN & MAX: The Extreme Seekers 📉📈
MIN finds the smallest:
SELECT MIN(price)
FROM products;
“What’s our cheapest product?”
MAX finds the biggest:
SELECT MAX(score)
FROM games;
“What’s the highest score ever?”
🗂️ GROUP BY: Sorting Before Counting
Here’s where magic happens! What if you want to count cookies by flavor?
graph TD A[All Cookies] --> B[GROUP BY flavor] B --> C[Chocolate: 5] B --> D[Vanilla: 3] B --> E[Strawberry: 4]
SELECT flavor, COUNT(*)
FROM cookies
GROUP BY flavor;
Result:
| flavor | count |
|---|---|
| Chocolate | 5 |
| Vanilla | 3 |
| Strawberry | 4 |
GROUP BY Multiple Columns: Double Sorting
What about counting by flavor AND size?
SELECT flavor, size, COUNT(*)
FROM cookies
GROUP BY flavor, size;
Result:
| flavor | size | count |
|---|---|---|
| Chocolate | Small | 2 |
| Chocolate | Large | 3 |
| Vanilla | Small | 3 |
It’s like organizing toys by color, then by shape!
🚦 HAVING: The Group Bouncer
Problem: You only want groups that meet a condition.
Wrong approach:
SELECT flavor, COUNT(*)
FROM cookies
WHERE COUNT(*) > 3 -- ❌ ERROR!
GROUP BY flavor;
Right approach:
SELECT flavor, COUNT(*)
FROM cookies
GROUP BY flavor
HAVING COUNT(*) > 3; -- ✅ Works!
HAVING checks groups AFTER they’re made.
🆚 HAVING vs WHERE: The Big Difference
graph TD A[All Data] --> B[WHERE filters rows] B --> C[GROUP BY makes groups] C --> D[HAVING filters groups] D --> E[Final Result]
| WHERE | HAVING |
|---|---|
| Filters rows | Filters groups |
| Works BEFORE grouping | Works AFTER grouping |
| Can’t use COUNT, SUM | Can use COUNT, SUM |
Example combining both:
SELECT category, AVG(price)
FROM products
WHERE stock > 0
GROUP BY category
HAVING AVG(price) > 50;
First, only products in stock. Then group. Then only expensive categories.
🔄 Query Processing Order
SQL doesn’t run top-to-bottom. Here’s the real order:
graph TD A[1. FROM] --> B[2. WHERE] B --> C[3. GROUP BY] C --> D[4. HAVING] D --> E[5. SELECT] E --> F[6. ORDER BY]
Memory trick: “From Where Groups Having Selected Order”
SELECT category, SUM(sales) -- 5th
FROM orders -- 1st
WHERE year = 2024 -- 2nd
GROUP BY category -- 3rd
HAVING SUM(sales) > 1000 -- 4th
ORDER BY SUM(sales); -- 6th
🎲 ROLLUP: Subtotals Made Easy
Want totals AND subtotals? ROLLUP is your friend!
SELECT region, city, SUM(sales)
FROM stores
GROUP BY ROLLUP(region, city);
Result:
| region | city | sum |
|---|---|---|
| North | NYC | 100 |
| North | Boston | 80 |
| North | NULL | 180 |
| South | Miami | 120 |
| South | NULL | 120 |
| NULL | NULL | 300 |
NULL rows = subtotals and grand total!
🧊 CUBE: All Possible Subtotals
CUBE is like ROLLUP on steroids. It gives EVERY combination!
SELECT region, product, SUM(sales)
FROM stores
GROUP BY CUBE(region, product);
You get:
- Sales by region + product
- Sales by region alone
- Sales by product alone
- Grand total
graph TD A[CUBE] --> B[By Region + Product] A --> C[By Region Only] A --> D[By Product Only] A --> E[Grand Total]
🎯 Quick Summary
| Function | Does What | Example |
|---|---|---|
| COUNT | Counts rows | COUNT(*) |
| SUM | Adds numbers | SUM(price) |
| AVG | Average | AVG(score) |
| MIN | Smallest | MIN(age) |
| MAX | Largest | MAX(salary) |
| GROUP BY | Makes groups | GROUP BY city |
| HAVING | Filters groups | HAVING COUNT > 5 |
| ROLLUP | Subtotals | ROLLUP(a, b) |
| CUBE | All combos | CUBE(a, b) |
🚀 You Did It!
You now understand how to:
- ✅ Count, sum, and average data
- ✅ Find minimums and maximums
- ✅ Group data in powerful ways
- ✅ Filter groups with HAVING
- ✅ Know the real query order
- ✅ Create reports with ROLLUP and CUBE
You’re not just reading data anymore—you’re summarizing it like a pro! 🎉