Apply Functions

🎯 R Apply Functions: Your Army of Helpful Assistants

Imagine you have a magic helper who can do the same task on many things at once. That’s what Apply Functions are in R!


🌟 The Big Picture: Why Apply Functions?

Think about this: You have 100 lunchboxes, and you need to open each one and count the candies inside. Would you rather:

  1. Open each lunchbox one by one (boring, slow, tiring) 🥱
  2. Tell a magical assistant to open ALL lunchboxes and count candies in one go! ✨

Apply functions are your magical assistants! They take a task (like counting) and carry it out across many items in a single call.


🧺 The apply() Function: The Matrix Master

What Is It?

apply() works on matrices (think of them as tables with rows and columns). It applies a function to either:

  • All rows (going across ➡️)
  • All columns (going down ⬇️)

The Magic Word (Syntax)

apply(matrix, MARGIN, FUN)
  • matrix = Your table of numbers
  • MARGIN = 1 for rows, 2 for columns
  • FUN = What you want to do (sum, mean, etc.)

🎨 Example: A Candy Box Grid

Imagine a table showing how many candies 3 kids got on 4 days:

candy_box <- matrix(
  c(5,3,7,2, 4,6,8,1, 3,5,9,4),
  nrow = 3, byrow = TRUE
)
rownames(candy_box) <- c("Amy", "Bob", "Cat")
colnames(candy_box) <- c("Mon","Tue","Wed","Thu")

Find total candies per kid (rows):

apply(candy_box, 1, sum)
# Amy: 17, Bob: 19, Cat: 21

Find average candies per day (columns):

apply(candy_box, 2, mean)
# Mon: 4, Tue: 4.67, Wed: 8, Thu: 2.33

Matrix/Table → MARGIN?

  • MARGIN = 1 → apply to each ROW → get one result per row
  • MARGIN = 2 → apply to each COLUMN → get one result per column
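
The examples above pass built-in functions like sum and mean, but FUN can be any function, including one you write on the spot. The sketch below rebuilds the same candy_box grid so it runs on its own. The names spread_per_kid, candy_missing, and avg_per_day are illustrative and not part of the original lesson, and the missing value is made up to show how extra arguments are handed on to FUN.

```r
# Rebuild the candy grid from the example above so this block is self-contained
candy_box <- matrix(
  c(5, 3, 7, 2,  4, 6, 8, 1,  3, 5, 9, 4),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("Amy", "Bob", "Cat"), c("Mon", "Tue", "Wed", "Thu"))
)

# An anonymous function: gap between each kid's best and worst day
spread_per_kid <- apply(candy_box, 1, function(counts) max(counts) - min(counts))
print(spread_per_kid)
# Amy Bob Cat 
#   5   7   6 

# Hypothetical twist: Bob's Tuesday count went missing
candy_missing <- candy_box
candy_missing["Bob", "Tue"] <- NA

# Anything written after FUN is passed along to it (here: na.rm for mean)
avg_per_day <- apply(candy_missing, 2, mean, na.rm = TRUE)
# Tue is now the mean of 3 and 5, which is 4
```

Without na.rm = TRUE, the Tuesday average would come back as NA, so passing extra arguments through apply() is often the simplest fix.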

📋 lapply() and sapply(): The List Twins

Meet the Twins!

These two work on lists (think of a list as a bag that can hold anything: numbers, words, even other bags!). Plain vectors work too.

Function | What It Returns
lapply() | Always a list
sapply() | Simplified (vector or matrix if possible)

🎈 Example: Birthday Balloons

You have lists of balloon counts for 3 parties:

parties <- list(
  party1 = c(5, 8, 3),
  party2 = c(10, 12),
  party3 = c(7, 7, 7, 7)
)

Count balloons at each party:

lapply(parties, sum)
# Returns a list:
# $party1 = 16
# $party2 = 22
# $party3 = 28

sapply(parties, sum)
# Returns a simple vector:
# party1 party2 party3
#     16     22     28

🤔 Which Twin to Choose?

  • Use lapply() when you need a list (safer, predictable)
  • Use sapply() when you want simpler output (convenient)
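
To see what "simplified if possible" means in practice, here is a small sketch using the same parties list. The variable names (totals, ranges, big_bunches) and the "more than 7 balloons" filter are made up for illustration.

```r
parties <- list(
  party1 = c(5, 8, 3),
  party2 = c(10, 12),
  party3 = c(7, 7, 7, 7)
)

# One number per party: sapply() simplifies to a named numeric vector
totals <- sapply(parties, sum)

# Two numbers per party (min and max): sapply() simplifies to a matrix,
# with one column per party
ranges <- sapply(parties, range)
print(ranges)
#      party1 party2 party3
# [1,]      3     10      7
# [2,]      8     12      7

# Results of different lengths cannot be simplified, so sapply() quietly
# falls back to a list, just like lapply()
big_bunches <- sapply(parties, function(counts) counts[counts > 7])
is.list(big_bunches)  # TRUE
```

The same function call can hand back a vector, a matrix, or a list depending on the data, which is why lapply() is the more predictable twin inside larger programs.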

🛡️ vapply(): The Careful Guardian

Why Be Careful?

sapply() is convenient, but the shape of its result depends on what the function happens to return, so it sometimes gives surprises. vapply() is like saying:

“I expect THIS type of answer, and if I get something different, STOP!”

The Safety Spell (Syntax)

vapply(list, FUN, FUN.VALUE)
  • FUN.VALUE = A template of what you expect to get back

🔒 Example: Expecting Numbers

ages <- list(
  class1 = c(10, 11, 10),
  class2 = c(11, 12, 11, 12)
)

# Safely get the average age per class
vapply(ages, mean, FUN.VALUE = numeric(1))
# class1 class2
#  10.33  11.50

If a result does not match the template (wrong type or wrong length), vapply() will stop with an error and tell you!
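
Here is a minimal sketch of the guardian in action, using the same ages list. The tryCatch() wrapper and the names outcome and age_ranges are only there for illustration, so the error can be shown without halting the script.

```r
ages <- list(
  class1 = c(10, 11, 10),
  class2 = c(11, 12, 11, 12)
)

# range() returns two numbers, but the template promises exactly one,
# so vapply() raises an error instead of returning an unexpected shape
outcome <- tryCatch(
  vapply(ages, range, FUN.VALUE = numeric(1)),
  error = function(e) "vapply stopped with an error"
)
print(outcome)
# [1] "vapply stopped with an error"

# Changing the template to match (two numbers) makes the call succeed
age_ranges <- vapply(ages, range, FUN.VALUE = numeric(2))
```

With sapply() the first call would have silently returned a matrix, so the mismatch would only show up later, if at all.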


🏷️ tapply(): The Grouper

What Makes It Special?

tapply() groups your data by categories and then applies a function to each group.

Think of sorting toys by color, then counting each pile!

The Sorting Hat Spell

tapply(values, groups, FUN)

🍎 Example: Fruit Counting

fruits <- c(3, 5, 2, 7, 4, 6)
types <- c("apple","apple","banana",
           "banana","apple","banana")

tapply(fruits, types, sum)
# apple banana
#    12     15

Data + Groups → tapply

  • Group 1: apples → apply function to apples → result for apples: 12
  • Group 2: bananas → apply function to bananas → result for bananas: 15

🎭 mapply(): The Parallel Performer

The Multiverse Helper

What if you want to apply a function to multiple lists at the same time? Like adding ingredients from two recipes together?

mapply() = multiple + apply

The Parallel Dance

mapply(FUN, list1, list2, ...)

🎁 Example: Gift Wrapping

boxes <- c(2, 3, 4)
ribbons_per_box <- c(3, 2, 1)

# Total ribbons needed for each type
mapply(function(b, r) b * r,
       boxes, ribbons_per_box)
# Result: 6, 6, 4

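This pairs the first box count with the first ribbon requirement, the second with the second, and so on, multiplying each pair. "Parallel" here means walking through the inputs side by side, not running on several processors.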
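
Sometimes one ingredient stays the same for every pair. The sketch below assumes a made-up rule (one spare ribbon per box type, called spare) to show mapply()'s MoreArgs argument, which holds values that do not change from call to call.

```r
boxes <- c(2, 3, 4)
ribbons_per_box <- c(3, 2, 1)

# `spare` is a hypothetical extra: one spare ribbon added to every total.
# MoreArgs holds arguments that stay the same for every call.
ribbons_with_spare <- mapply(
  function(b, r, spare) b * r + spare,
  boxes, ribbons_per_box,
  MoreArgs = list(spare = 1)
)
print(ribbons_with_spare)
# [1] 7 7 5

# For plain arithmetic like this, R's vectorized operators give the
# same pairing without mapply()
plain_totals <- boxes * ribbons_per_box
# 6 6 4
```

mapply() earns its keep when the function is not already vectorized, for example when each call does something more involved than a single multiplication.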


📊 aggregate(): The Summary Maker

The Report Card Function

aggregate() is perfect when you have data in a table format and want to:

  1. Group by one or more categories
  2. Get summaries (mean, sum, count, etc.)

The Report Spell

aggregate(value ~ group, data, FUN)

📝 Example: Student Scores

scores <- data.frame(
  name = c("Amy","Amy","Bob","Bob"),
  subject = c("Math","Eng","Math","Eng"),
  score = c(90, 85, 78, 92)
)

# Average score per student
aggregate(score ~ name, scores, mean)
#   name score
# 1  Amy  87.5
# 2  Bob  85.0

# Average score per subject
aggregate(score ~ subject, scores, mean)
#   subject score
# 1     Eng  88.5
# 2    Math  84.0
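
Grouping by more than one category works the same way: add another name after a + in the formula. The sketch below uses a hypothetical, slightly bigger gradebook (scores2) so that some student and subject pairs have two scores to average. The data values are invented for illustration.

```r
# Hypothetical extended gradebook: some student/subject pairs have two scores
scores2 <- data.frame(
  name    = c("Amy", "Amy", "Amy", "Bob", "Bob", "Bob"),
  subject = c("Math", "Math", "Eng", "Math", "Eng", "Eng"),
  score   = c(90, 80, 85, 78, 92, 88)
)

# Group by student AND subject: one row per combination that occurs
by_both <- aggregate(score ~ name + subject, data = scores2, FUN = mean)
print(by_both)
# Amy/Math averages 85, Amy/Eng 85, Bob/Math 78, Bob/Eng 90

# Any summary function works; length counts the scores per student
counts <- aggregate(score ~ name, data = scores2, FUN = length)
# Amy has 3 scores, Bob has 3 scores
```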

✂️ split(): The Divider

The Sorting Box

split() takes your data and divides it into groups based on a factor. It returns a list where each element is one group.

The Divide Spell

split(data, groups)

🎨 Example: Sorting Marbles

marbles <- c(5, 8, 3, 7, 2, 9)
colors <- c("red","blue","red",
            "blue","red","blue")

split(marbles, colors)
# $blue = c(8, 7, 9)
# $red = c(5, 3, 2)
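
split() is not limited to plain vectors. As a small sketch, here it is dividing a whole data frame, reusing the scores table from the aggregate() example. The name by_student is illustrative.

```r
scores <- data.frame(
  name = c("Amy", "Amy", "Bob", "Bob"),
  subject = c("Math", "Eng", "Math", "Eng"),
  score = c(90, 85, 78, 92)
)

# Splitting a data frame gives a list of smaller data frames, one per group
by_student <- split(scores, scores$name)

names(by_student)      # "Amy" "Bob"
by_student$Amy$score   # 90 85
```

Each piece keeps all of its columns, so any function that accepts a data frame can be run on the pieces afterwards.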

🔗 Power Combo: split + lapply

Often you’ll use split() with lapply() for advanced grouping:

groups <- split(marbles, colors)
lapply(groups, mean)
# $blue = 8
# $red = 3.33
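
If this combo looks familiar, that is because it is roughly what tapply() does in one step. The sketch below compares the two on the marble data. The names two_step and one_step are made up for the comparison.

```r
marbles <- c(5, 8, 3, 7, 2, 9)
colors <- c("red", "blue", "red", "blue", "red", "blue")

# Two steps: divide into groups, then summarize each group
two_step <- sapply(split(marbles, colors), mean)

# One step: tapply() groups and applies in a single call
one_step <- tapply(marbles, colors, mean)

# Both give blue = 8 and red = 3.33 (rounded)
all.equal(as.vector(two_step), as.vector(one_step))  # TRUE
```

The two-step version is handy when you want to keep the groups around and run several different functions on them.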

🗺️ Quick Reference Map

Your Data → What type?

  • Matrix → apply
  • List/Vector → Need safety?
      • Yes → vapply
      • No, want list → lapply
      • No, want simple → sapply
  • Grouped data → tapply
  • Multiple inputs → mapply
  • DataFrame summary → aggregate
  • Need to split first → split

🎯 The Family at a Glance

Function    | Works On         | Special Power
apply()     | Matrix           | Rows OR Columns
lapply()    | List             | Always returns list
sapply()    | List             | Simplifies output
vapply()    | List             | Type-safe output
tapply()    | Vector + Groups  | Groups then applies
mapply()    | Multiple lists   | Steps through several inputs side by side
aggregate() | DataFrame        | SQL-like grouping
split()     | Vector/DataFrame | Divides into groups

🚀 You’re Now an Apply Master!

Remember: These functions are your helpers. Instead of writing loops to do the same thing over and over, you tell your helper ONCE what to do, and it does it everywhere!
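
As a parting sketch, here is the same job done both ways. The lunchboxes list is hypothetical, echoing the lunchbox story from the start of the lesson.

```r
# Hypothetical lunchboxes: each holds a few candy counts
lunchboxes <- list(box1 = c(2, 3), box2 = c(4, 1, 1), box3 = c(5))

# Loop version: set up storage, then fill it in one box at a time
totals_loop <- numeric(length(lunchboxes))
for (i in seq_along(lunchboxes)) {
  totals_loop[i] <- sum(lunchboxes[[i]])
}

# Apply version: say what to do once
totals_apply <- vapply(lunchboxes, sum, FUN.VALUE = numeric(1))

# Same numbers either way: 5, 6, 5
```

The loop is not wrong. The apply version is simply shorter, and it keeps the box names on the result for free.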

Next time you think “I need a loop”… think “Which apply friend can help me?” 🎉
