Stream Processing: Collectors and Streams 🌊
The Water Park Adventure
Imagine you’re the manager of a magical water park! Water flows through pipes, and you have special machines that can:
- Collect the water into different containers
- Group swimmers by their favorite rides
- Count how many people visited each attraction
- Speed up everything with parallel water slides
- And work with special number-only streams!
That’s exactly what Java Streams with Collectors do. Let’s dive in!
1. Basic Collectors: Your Collection Machines 📦
Think of Collectors as special buckets at the end of a water slide. The water (data) slides down, and the bucket catches it in exactly the way you want.
What Are Basic Collectors?
Basic Collectors take your stream elements and put them into familiar containers like Lists, Sets, or even join them into a single String!
// Our swimmers for today
List<String> swimmers = List.of(
"Emma", "Liam", "Emma", "Noah"
);
// Collect into a List (keeps order, allows duplicates)
List<String> list = swimmers.stream()
.collect(Collectors.toList());
// Result: [Emma, Liam, Emma, Noah]
// Collect into a Set (no duplicates!)
Set<String> set = swimmers.stream()
.collect(Collectors.toSet());
// Result: Emma, Liam, Noah in some order (a Set does not guarantee iteration order)
// Join into one String
String names = swimmers.stream()
.collect(Collectors.joining(", "));
// Result: "Emma, Liam, Emma, Noah"
Simple Example: The Ice Cream Counter 🍦
List<String> orders = List.of(
"vanilla", "chocolate", "vanilla"
);
// Count all orders
long total = orders.stream()
.collect(Collectors.counting());
// Result: 3
graph TD
  A["Stream of Items"] --> B{Collector}
  B --> C["toList - Ordered List"]
  B --> D["toSet - Unique Items"]
  B --> E["joining - Single String"]
  B --> F["counting - Just a Number"]
Key Point
Collectors are the “catchers” at the end of your stream. They decide how your data gets packaged up!
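Two more "buckets" are handy when you want predictable output. The sketch below reuses the swimmer names from above; the class name `BasicCollectorsDemo` and its method names are made up for illustration. It assumes you want a sorted, duplicate-free set (via `Collectors.toCollection(TreeSet::new)`) and a joined string with a prefix and suffix (via the three-argument `Collectors.joining`).

```java
import java.util.List;
import java.util.Set;
import java.util.TreeSet;
import java.util.stream.Collectors;

// Hypothetical demo class; names are illustrative, not from the original text.
class BasicCollectorsDemo {

    // Collect into a TreeSet: duplicates are dropped and iteration order is sorted.
    static Set<String> sortedUnique(List<String> names) {
        return names.stream()
                .collect(Collectors.toCollection(TreeSet::new));
    }

    // Join with a delimiter, a prefix, and a suffix.
    static String bracketed(List<String> names) {
        return names.stream()
                .collect(Collectors.joining(", ", "[", "]"));
    }

    public static void main(String[] args) {
        List<String> swimmers = List.of("Emma", "Liam", "Emma", "Noah");
        System.out.println(sortedUnique(swimmers)); // [Emma, Liam, Noah]
        System.out.println(bracketed(swimmers));    // [Emma, Liam, Emma, Noah]
    }
}
```

Unlike plain `toSet()`, the `TreeSet` version always iterates in sorted order, which makes results easier to print and compare.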
2. Collectors for Grouping: Sorting Hats! 🎩
Remember the Sorting Hat from Harry Potter? Grouping collectors work the same way—they look at each item and put it in the right group!
What Is Grouping?
Grouping takes a stream and organizes items into a Map where:
- The key = the category
- The value = list of items in that category
// Swimmers with their favorite ride
record Swimmer(String name, String ride) {}
List<Swimmer> guests = List.of(
new Swimmer("Emma", "WavePool"),
new Swimmer("Liam", "Slide"),
new Swimmer("Noah", "WavePool"),
new Swimmer("Olivia", "Slide")
);
// Group by favorite ride
Map<String, List<Swimmer>> byRide =
guests.stream()
.collect(Collectors.groupingBy(
Swimmer::ride
));
// Result (names shown for brevity; each value is a List<Swimmer>):
// WavePool → [Emma, Noah]
// Slide → [Liam, Olivia]
Partitioning: Just Two Groups!
Sometimes you just need true or false groups. That’s partitioning!
List<Integer> ages = List.of(5, 12, 8, 15, 7);
// Kids under 10 vs 10 and older
Map<Boolean, List<Integer>> groups =
ages.stream()
.collect(Collectors.partitioningBy(
age -> age < 10
));
// true → [5, 8, 7] (under 10)
// false → [12, 15] (10 and older)
graph TD
  A["Stream of Swimmers"] --> B["groupingBy ride"]
  B --> C["WavePool Group"]
  B --> D["Slide Group"]
  B --> E["Lazy River Group"]
  F["Stream of Ages"] --> G["partitioningBy age < 10"]
  G --> H["true: Young Kids"]
  G --> I["false: Older Kids"]
Key Point
- groupingBy = many groups, by any category
- partitioningBy = exactly 2 groups (true/false)
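Grouping can also reshape what lands in each group. As a rough sketch (the class `GroupingDemo` and its method names are assumptions for illustration), the code below groups swimmer names rather than whole records, keeps the ride keys sorted with a `TreeMap`, and counts each side of a partition.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Hypothetical demo class; the record mirrors the Swimmer record used above.
class GroupingDemo {
    record Swimmer(String name, String ride) {}

    // Group by ride, keep keys sorted (TreeMap), and store only the names.
    static Map<String, List<String>> namesByRide(List<Swimmer> guests) {
        return guests.stream()
                .collect(Collectors.groupingBy(
                        Swimmer::ride,
                        TreeMap::new,
                        Collectors.mapping(Swimmer::name, Collectors.toList())));
    }

    // Partition into WavePool fans (true) and everyone else (false), counting each side.
    static Map<Boolean, Long> wavePoolSplit(List<Swimmer> guests) {
        return guests.stream()
                .collect(Collectors.partitioningBy(
                        s -> s.ride().equals("WavePool"),
                        Collectors.counting()));
    }

    public static void main(String[] args) {
        List<Swimmer> guests = List.of(
                new Swimmer("Emma", "WavePool"),
                new Swimmer("Liam", "Slide"),
                new Swimmer("Noah", "WavePool"),
                new Swimmer("Olivia", "Slide"));
        System.out.println(namesByRide(guests));             // {Slide=[Liam, Olivia], WavePool=[Emma, Noah]}
        System.out.println(wavePoolSplit(guests).get(true)); // 2
    }
}
```

A partitioned map always contains both the `true` and `false` keys, even when one side is empty, so `get(true)` never returns null here.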
3. Collectors for Aggregation: The Math Wizards 🧙‍♂️
Aggregation means combining many values into one summary number—like adding up all the coins in your piggy bank!
Common Aggregation Collectors
List<Integer> scores = List.of(85, 92, 78, 96, 88);
// Sum all scores
int total = scores.stream()
.collect(Collectors.summingInt(s -> s));
// Result: 439
// Average score
double avg = scores.stream()
.collect(Collectors.averagingInt(s -> s));
// Result: 87.8
// Find the highest score
Optional<Integer> max = scores.stream()
.collect(Collectors.maxBy(
Integer::compareTo
));
// Result: Optional[96] (an empty Optional if the stream has no elements)
// Get ALL stats at once!
IntSummaryStatistics stats = scores.stream()
.collect(Collectors.summarizingInt(s -> s));
// stats.getSum() → 439
// stats.getAverage() → 87.8
// stats.getMax() → 96
// stats.getMin() → 78
// stats.getCount() → 5
Downstream Collectors: Combo Moves! 🎮
You can combine grouping WITH aggregation!
// Count swimmers per ride
Map<String, Long> rideCounts =
guests.stream()
.collect(Collectors.groupingBy(
Swimmer::ride,
Collectors.counting()
));
// WavePool → 2
// Slide → 2
graph TD
  A["Stream of Numbers"] --> B{Aggregation}
  B --> C["summingInt → Total"]
  B --> D["averagingInt → Average"]
  B --> E["maxBy → Largest"]
  B --> F["minBy → Smallest"]
  B --> G["summarizingInt → All Stats"]
Key Point
Aggregation collectors crunch numbers: sums, averages, min, max—all in one smooth flow!
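Aggregation works on records too, not just bare numbers. The sketch below is a minimal illustration, assuming a made-up `Visit` record and class `AggregationDemo`: it finds the longest visit with `maxBy` and a `Comparator`, handles the empty case through the returned `Optional`, and sums minutes per ride with a downstream `summingInt`.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.stream.Collectors;

// Hypothetical demo class and record, invented for this example.
class AggregationDemo {
    record Visit(String ride, int minutes) {}

    // maxBy returns an Optional because an empty stream has no maximum.
    static Optional<Visit> longest(List<Visit> visits) {
        return visits.stream()
                .collect(Collectors.maxBy(Comparator.comparingInt(Visit::minutes)));
    }

    // Grouping plus a downstream sum: total minutes spent on each ride.
    static Map<String, Integer> minutesPerRide(List<Visit> visits) {
        return visits.stream()
                .collect(Collectors.groupingBy(
                        Visit::ride,
                        Collectors.summingInt(Visit::minutes)));
    }

    public static void main(String[] args) {
        List<Visit> visits = List.of(
                new Visit("WavePool", 30),
                new Visit("Slide", 10),
                new Visit("WavePool", 15));
        System.out.println(longest(visits).map(Visit::ride).orElse("none")); // WavePool
        System.out.println(longest(List.of()).isPresent());                  // false
        System.out.println(minutesPerRide(visits).get("WavePool"));          // 45
    }
}
```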
4. Parallel Streams: The Speed Boost! 🚀
Imagine instead of ONE water slide, you have TEN slides running at the same time! That’s parallel streams—doing work simultaneously to finish faster.
When To Use Parallel Streams
List<Integer> bigList = IntStream.rangeClosed(1, 1000000)
.boxed()
.toList();
// Regular stream (one lane)
long sum1 = bigList.stream()
.mapToLong(n -> n)
.sum();
// Parallel stream (many lanes!)
long sum2 = bigList.parallelStream()
.mapToLong(n -> n)
.sum();
// Same result; often faster on large inputs, but measure to be sure
Converting Between Sequential and Parallel
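// Assume names is a List<String>, e.g. List.of("Emma", "Liam", "Noah", "Olivia")
// Start parallel, then call sequential()
List<String> result = names.parallelStream()
.filter(n -> n.length() > 3)
.sequential() // The last mode call wins: the WHOLE pipeline runs in a single lane
.collect(Collectors.toList());
// Start sequential, then call parallel()
List<String> result2 = names.stream()
.parallel() // The whole pipeline now uses multiple lanes
.map(String::toUpperCase)
.collect(Collectors.toList());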
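A stream pipeline has a single execution mode, decided by whichever of `parallel()` or `sequential()` was called last; you cannot run one stage in parallel and the next one sequentially. A small sketch that checks this with `isParallel()` (the class `ModeSwitchDemo` and its method names are assumptions for illustration):

```java
import java.util.List;
import java.util.stream.Stream;

// Hypothetical demo class; it only inspects the mode and never runs the pipeline.
class ModeSwitchDemo {

    // parallelStream() followed by sequential(): the pipeline ends up sequential.
    static boolean parallelThenSequential(List<String> names) {
        Stream<String> pipeline = names.parallelStream()
                .filter(n -> n.length() > 3)
                .sequential();
        return pipeline.isParallel();
    }

    // stream() followed by parallel(): the pipeline ends up parallel.
    static boolean sequentialThenParallel(List<String> names) {
        Stream<String> pipeline = names.stream()
                .parallel()
                .map(String::toUpperCase);
        return pipeline.isParallel();
    }

    public static void main(String[] args) {
        List<String> names = List.of("Emma", "Liam", "Noah", "Olivia");
        System.out.println(parallelThenSequential(names)); // false
        System.out.println(sequentialThenParallel(names)); // true
    }
}
```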
⚠️ Warning: Parallel Isn’t Always Faster!
| Good for Parallel | Bad for Parallel |
|---|---|
| Huge datasets (millions) | Small lists |
| Independent operations | Operations that depend on order |
| CPU-heavy work | Simple operations |
graph TD
  A["Big Data Stream"] --> B{Choose Mode}
  B --> C["stream - One Worker"]
  B --> D["parallelStream - Many Workers"]
  D --> E["Worker 1"]
  D --> F["Worker 2"]
  D --> G["Worker 3"]
  E --> H["Combine Results"]
  F --> H
  G --> H
Key Point
Parallel streams split work across multiple CPU cores. They can help with big data, but splitting the work and merging the results adds overhead, so measure before using them!
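"Independent operations" also means no shared mutable state: adding to a plain `ArrayList` from inside a parallel `forEach` is a classic bug, because several workers write to it at once. A safer pattern, sketched here with a made-up class `ParallelSafetyDemo`, is to let the collector or a built-in reduction such as `sum()` combine the partial results.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;
import java.util.stream.LongStream;

// Hypothetical demo class showing parallel-safe ways to produce results.
class ParallelSafetyDemo {

    // The collector merges per-worker results; encounter order is preserved.
    static List<Integer> squares(int n) {
        return IntStream.rangeClosed(1, n)
                .parallel()
                .map(i -> i * i)
                .boxed()
                .collect(Collectors.toList());
    }

    // The same reduction in either mode; both give the same answer.
    static long sumTo(long n, boolean parallel) {
        LongStream numbers = LongStream.rangeClosed(1, n);
        return (parallel ? numbers.parallel() : numbers).sum();
    }

    public static void main(String[] args) {
        System.out.println(squares(5));              // [1, 4, 9, 16, 25]
        System.out.println(sumTo(1_000_000, false)); // 500000500000
        System.out.println(sumTo(1_000_000, true));  // 500000500000
    }
}
```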
5. Primitive Streams: The Number Specialists 🔢
Regular streams wrap numbers in boxes (Integer, Double, Long). Primitive streams work directly with raw numbers—no boxes, no wasted memory!
The Three Primitive Stream Types
| Type | For Numbers Like | Example Values |
|---|---|---|
| IntStream | int | 1, 42, -7 |
| LongStream | long | 1L, 9999999999L |
| DoubleStream | double | 3.14, 2.718 |
Creating Primitive Streams
// IntStream examples
IntStream numbers = IntStream.of(1, 2, 3, 4, 5);
IntStream range = IntStream.range(1, 10); // 1 to 9
IntStream closed = IntStream.rangeClosed(1, 10); // 1 to 10
// From a regular stream (assumes a record Person(String name, int age) and a List<Person> people)
IntStream ages = people.stream()
.mapToInt(Person::age);
// Generate infinite streams (add limit(n) before a terminal operation, or it never finishes)
IntStream ones = IntStream.generate(() -> 1);
IntStream counting = IntStream.iterate(0, n -> n + 1);
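Infinite streams need a stopping rule before any terminal operation runs. Here is a minimal sketch of two options, assuming Java 9 or later for the three-argument `IntStream.iterate`; the class `InfiniteStreamDemo` and its method names are invented for illustration.

```java
import java.util.stream.IntStream;

// Hypothetical demo class: two ways to bound an otherwise endless IntStream.
class InfiniteStreamDemo {

    // Option 1: cap the number of elements with limit().
    static int[] firstEvens(int count) {
        return IntStream.iterate(0, n -> n + 2)
                .limit(count)
                .toArray();
    }

    // Option 2: three-argument iterate (Java 9+) stops when the condition fails.
    static int[] countUpTo(int maxExclusive) {
        return IntStream.iterate(0, n -> n < maxExclusive, n -> n + 1)
                .toArray();
    }

    public static void main(String[] args) {
        System.out.println(java.util.Arrays.toString(firstEvens(5))); // [0, 2, 4, 6, 8]
        System.out.println(java.util.Arrays.toString(countUpTo(3)));  // [0, 1, 2]
    }
}
```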
Special Primitive Methods
int[] scores = {85, 92, 78, 96, 88};
IntStream stream = Arrays.stream(scores);
// Direct aggregation - no collectors needed!
int sum = IntStream.of(1, 2, 3, 4, 5).sum(); // 15
double avg = IntStream.of(1, 2, 3, 4, 5).average()
.orElse(0); // 3.0
int max = IntStream.of(1, 2, 3, 4, 5).max()
.orElse(0); // 5
int min = IntStream.of(1, 2, 3, 4, 5).min()
.orElse(0); // 1
// Get all stats at once
IntSummaryStatistics stats =
IntStream.of(1, 2, 3, 4, 5).summaryStatistics();
Converting Between Stream Types
// Primitive → Boxed (Integer, Long, Double)
Stream<Integer> boxed = IntStream.of(1, 2, 3).boxed();
// Boxed → Primitive
IntStream primitive = Stream.of(1, 2, 3)
.mapToInt(n -> n);
// IntStream → LongStream
LongStream longs = IntStream.of(1, 2, 3).asLongStream();
// IntStream → DoubleStream
DoubleStream doubles = IntStream.of(1, 2, 3).asDoubleStream();
graph TD
  A["Regular Stream"] --> B["Boxed: Integer, Long, Double"]
  B --> C["Extra Memory"]
  D["Primitive Stream"] --> E["Raw: int, long, double"]
  E --> F["Fast & Light!"]
  B <--> |mapToInt, boxed| E
Key Point
Primitive streams (IntStream, LongStream, DoubleStream) skip boxing for better performance with numbers!
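Two practical details are worth a quick sketch. `mapToObj` turns primitives into any object type, not just boxed numbers, and `IntStream.sum()` returns an `int`, so a large total can silently overflow unless you widen first with `asLongStream()`. The class `PrimitiveConversionDemo` and its method names are assumptions for illustration.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

// Hypothetical demo class for converting and summing primitive streams.
class PrimitiveConversionDemo {

    // int -> String via mapToObj, then collect like any object stream.
    static List<String> slideLabels(int n) {
        return IntStream.rangeClosed(1, n)
                .mapToObj(i -> "Slide " + i)
                .collect(Collectors.toList());
    }

    // Widen to long BEFORE summing so large totals do not overflow.
    static long safeSum(int n) {
        return IntStream.rangeClosed(1, n)
                .asLongStream()
                .sum();
    }

    // Sums as int; wraps around silently once the total exceeds Integer.MAX_VALUE.
    static int intSum(int n) {
        return IntStream.rangeClosed(1, n).sum();
    }

    public static void main(String[] args) {
        System.out.println(slideLabels(3));   // [Slide 1, Slide 2, Slide 3]
        System.out.println(safeSum(100_000)); // 5000050000
        System.out.println(intSum(100_000) == safeSum(100_000)); // false (int overflow)
    }
}
```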
Putting It All Together: The Water Park Report 📊
record Visitor(String name, int age, String ride, int visits) {}
List<Visitor> visitors = List.of(
new Visitor("Emma", 8, "WavePool", 5),
new Visitor("Liam", 12, "Slide", 3),
new Visitor("Noah", 7, "WavePool", 4),
new Visitor("Olivia", 10, "Slide", 6)
);
// 1. Basic Collector: All names
String allNames = visitors.stream()
.map(Visitor::name)
.collect(Collectors.joining(", "));
// "Emma, Liam, Noah, Olivia"
// 2. Grouping: Visitors by ride
Map<String, List<Visitor>> byRide = visitors.stream()
.collect(Collectors.groupingBy(Visitor::ride));
// 3. Aggregation: Average visits per ride
Map<String, Double> avgVisits = visitors.stream()
.collect(Collectors.groupingBy(
Visitor::ride,
Collectors.averagingInt(Visitor::visits)
));
// WavePool → 4.5, Slide → 4.5
// 4. Parallel: Fast total visits count
int totalVisits = visitors.parallelStream()
.mapToInt(Visitor::visits)
.sum();
// 18
// 5. Primitive: Age statistics
IntSummaryStatistics ageStats = visitors.stream()
.mapToInt(Visitor::age)
.summaryStatistics();
// min=7, max=12, avg=9.25, sum=37, count=4
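The pieces above also combine into a single pass. As a rough sketch (the class `RideReportDemo` and the report wording are assumptions, and the record mirrors `Visitor` from the snippet above), grouping by ride with a downstream `summarizingInt` yields count, sum, and max per ride at once.

```java
import java.util.IntSummaryStatistics;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Hypothetical demo class: one collector producing a per-ride summary.
class RideReportDemo {
    record Visitor(String name, int age, String ride, int visits) {}

    // Sorted by ride name (TreeMap); each value holds count/sum/min/max/average of visits.
    static Map<String, IntSummaryStatistics> visitStatsByRide(List<Visitor> visitors) {
        return visitors.stream()
                .collect(Collectors.groupingBy(
                        Visitor::ride,
                        TreeMap::new,
                        Collectors.summarizingInt(Visitor::visits)));
    }

    public static void main(String[] args) {
        List<Visitor> visitors = List.of(
                new Visitor("Emma", 8, "WavePool", 5),
                new Visitor("Liam", 12, "Slide", 3),
                new Visitor("Noah", 7, "WavePool", 4),
                new Visitor("Olivia", 10, "Slide", 6));
        // Prints:
        // Slide: visitors=2, totalVisits=9, busiest=6
        // WavePool: visitors=2, totalVisits=9, busiest=5
        visitStatsByRide(visitors).forEach((ride, stats) ->
                System.out.println(ride + ": visitors=" + stats.getCount()
                        + ", totalVisits=" + stats.getSum()
                        + ", busiest=" + stats.getMax()));
    }
}
```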
Quick Reference Card 🗂️
| Collector | What It Does | Returns |
|---|---|---|
| toList() | Collects to List | List |
| toSet() | Collects to Set | Set |
| joining(", ") | Joins strings | String |
| counting() | Counts elements | Long |
| groupingBy(fn) | Groups by key | Map |
| partitioningBy(p) | Splits true/false | Map |
| summingInt(fn) | Adds numbers | Integer |
| averagingInt(fn) | Calculates average | Double |
| maxBy(cmp) | Finds maximum | Optional |
| minBy(cmp) | Finds minimum | Optional |
You Did It! 🎉
You’ve learned how Java’s Collectors and Streams work together:
- Basic Collectors → Package your data (Lists, Sets, Strings)
- Grouping Collectors → Organize by categories
- Aggregation Collectors → Crunch the numbers
- Parallel Streams → Speed boost for big data
- Primitive Streams → Lightweight number handling
Now go build something amazing! The data is flowing, and you know exactly how to catch it. 🌊
