Tidyverse Tools

🧰 The Tidyverse Toolbox: Your Swiss Army Knife for Data!

Imagine you have a magic toolbox. Each tool inside helps you do one special thing with your data. Today, we’re going to open this toolbox and learn about seven amazing tools!

Think of data like LEGO blocks. Sometimes you need to:

  • Cut and shape text (like cutting paper) → stringr
  • Do the same thing to many blocks at once → purrr map
  • Combine all blocks into one → purrr reduce
  • Read dates (like reading a calendar) → lubridate parsing
  • Pull out pieces of dates → lubridate extraction
  • Measure time between dates → lubridate duration
  • Read files into R → readr

Let’s explore each tool!


🧵 stringr: The Text Tailor

What is it? stringr helps you work with words and sentences—just like a tailor works with fabric!

The Magic Scissors: str_sub()

Want to cut out part of a word? Use str_sub()!

library(stringr)

word <- "RAINBOW"
str_sub(word, 1, 4)
# "RAIN" — first 4 letters!

str_sub(word, -3, -1)
# "BOW" — last 3 letters!

Think of it like cutting a piece of ribbon. You tell R where to start and where to stop!

The Word Detector: str_detect()

Does your word contain something? Ask str_detect()!

str_detect("I love pizza", "pizza")
# TRUE — yes, pizza is there!

str_detect("I love pizza", "taco")
# FALSE — no tacos here!

Find and Replace: str_replace()

Found something? Want to swap it? Easy!

str_replace("I like cats", "cats", "dogs")
# "I like dogs"

Split It Up: str_split()

Break a sentence into pieces!

str_split("apple-banana-cherry", "-")
# A list holding one character vector: "apple" "banana" "cherry"

Like cutting a string of beads!
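Because str_split() hands back a list (one entry per input string), you usually need one more step to reach the pieces themselves. Here is a small sketch of that step; the variable names are just for illustration.

```r
library(stringr)

fruit_string <- "apple-banana-cherry"  # illustrative input

# str_split() returns a list: one character vector per input string
pieces <- str_split(fruit_string, "-")

# Pull out the character vector for the first (and only) input
fruits <- pieces[[1]]
fruits
# "apple" "banana" "cherry"

length(fruits)
# 3
```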

Other Handy stringr Tools

| Function | What It Does | Example |
| --- | --- | --- |
| str_length() | Counts characters | str_length("hello") → 5 |
| str_to_upper() | MAKES ALL CAPS | str_to_upper("hi") → "HI" |
| str_to_lower() | makes all small | str_to_lower("HI") → "hi" |
| str_trim() | Removes extra spaces | str_trim(" hi ") → "hi" |
| str_c() | Glues strings together | str_c("a","b") → "ab" |
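These tools also work on a whole vector of strings at once, so you can chain them to tidy up messy text. The snippet below is a sketch using made-up snack names, not data from any real file.

```r
library(stringr)

snacks <- c("  Pizza ", "taco", "PIZZA roll")  # made-up example data

# Clean up: trim the extra spaces, then lower-case everything
clean <- str_to_lower(str_trim(snacks))
clean
# "pizza" "taco" "pizza roll"

# Each stringr function checks every string in the vector
str_detect(clean, "pizza")
# TRUE FALSE TRUE

str_length(clean)
# 5 4 10
```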

🗺️ purrr Map: The Copy Machine

What is it? Imagine you need to put a stamp on 100 letters. Would you do it one by one? No! You’d use a machine!

map() is that machine. It does the same thing to every item in a list!

Basic map()

library(purrr)

numbers <- list(1, 2, 3, 4)

# Add 10 to each number
map(numbers, ~ .x + 10)
# A list containing 11, 12, 13, 14

The ~ is shorthand for writing a small function, and .x stands for each item!
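To see what the shorthand stands for, you can write the same step as an ordinary function and compare. This is a minimal sketch; add_ten is a hypothetical helper name chosen for the example.

```r
library(purrr)

numbers <- list(1, 2, 3, 4)

# Formula shorthand: .x stands for the current item
short <- map(numbers, ~ .x + 10)

# The same step written as an ordinary function
add_ten <- function(x) x + 10   # hypothetical helper name
long <- map(numbers, add_ten)

identical(short, long)
# TRUE
```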

map() Variants: Choose Your Output

| Function | Returns | Example |
| --- | --- | --- |
| map() | List | map(1:3, ~ .x * 2) → list(2,4,6) |
| map_dbl() | Numbers | map_dbl(1:3, ~ .x * 2) → 2 4 6 |
| map_chr() | Text | map_chr(1:3, ~ paste0("item", .x)) |
| map_lgl() | TRUE/FALSE | map_lgl(1:3, ~ .x > 2) |
| map_int() | Integers | map_int(1:3, ~ as.integer(.x * 2)) |
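The typed variants are handy when you want a plain vector back instead of a list. Here is a sketch with a made-up list of scores; the names ana, bob, and cat are only illustrative.

```r
library(purrr)

# Made-up scores for three players
scores <- list(ana = c(8, 9, 10), bob = c(5, 7), cat = c(10, 10, 4))

map_dbl(scores, mean)
# ana bob cat
#   9   6   8

map_int(scores, length)
# ana bob cat
#   3   2   3

map_lgl(scores, ~ any(.x == 10))
#  ana   bob   cat
# TRUE FALSE  TRUE
```

Each variant raises an error if a result does not fit the promised type, which can catch mistakes early.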

map2(): Two Lists at Once!

What if you have two lists and want to combine them?

names <- list("Ana", "Bob", "Cat")
ages <- list(5, 6, 7)

map2(names, ages, ~ paste(.x, "is", .y))
# A list containing "Ana is 5", "Bob is 6", "Cat is 7"

Like a zipper joining two sides!
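If you would rather get a plain character vector than a list, the typed variant map2_chr() is one option. A small sketch follows; the variable names kid_names and kid_ages are chosen here to avoid hiding R's built-in names() function.

```r
library(purrr)

kid_names <- c("Ana", "Bob", "Cat")   # renamed so base::names is not masked
kid_ages <- c(5, 6, 7)

# .x comes from the first input, .y from the second
sentences <- map2_chr(kid_names, kid_ages, ~ paste(.x, "is", .y))
sentences
# "Ana is 5" "Bob is 6" "Cat is 7"
```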

pmap(): Many Lists Together

Have 3+ lists? Use pmap()!

first <- list("A", "B")
middle <- list("X", "Y")
last <- list("1", "2")

pmap(list(first, middle, last),
     ~ paste(..1, ..2, ..3))
# A list containing "A X 1" and "B Y 2"
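When the inputs have names, pmap() can match them to the arguments of a regular function, which some people find easier to read than ..1, ..2, ..3. The sketch below assumes a named list called parts, invented for this example, and uses pmap_chr() to get a character vector back.

```r
library(purrr)

# A named list: each name matches an argument of the function below
parts <- list(
  first = c("A", "B"),
  middle = c("X", "Y"),
  last = c("1", "2")
)

labels <- pmap_chr(parts, function(first, middle, last) {
  paste(first, middle, last)
})
labels
# "A X 1" "B Y 2"
```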

🎯 purrr Reduce: The Combiner

What is it? Imagine you have a pile of cards. You pick up two, combine them, then pick up the next, combine again… until you have ONE final card!

That’s reduce()!

Simple Example

numbers <- c(1, 2, 3, 4)

reduce(numbers, `+`)
# 10 (which is 1+2+3+4)

reduce(numbers, `*`)
# 24 (which is 1*2*3*4)

reduce() with Custom Function

words <- c("I", "love", "R")

reduce(words, ~ paste(.x, .y))
# "I love R"

Step by step:

  1. Start with “I”
  2. Combine with “love” → “I love”
  3. Combine with “R” → “I love R”
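The steps above can be written out by hand as a loop, which is roughly what reduce() does for you. This is only a sketch to show the idea, not purrr's actual internals.

```r
library(purrr)

words <- c("I", "love", "R")

# Start with the first item, then fold in the rest one at a time
result <- words[1]
for (word in words[-1]) {
  result <- paste(result, word)
}
result
# "I love R"

# reduce() gives the same answer in one line
identical(result, reduce(words, ~ paste(.x, .y)))
# TRUE
```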

accumulate(): See Every Step

Want to see the journey, not just the destination?

accumulate(1:4, `+`)
# 1  3  6  10

Shows: 1, then 1+2=3, then 3+3=6, then 6+4=10!


📅 lubridate: The Calendar Wizard

Date Parsing: Reading Dates

Computers are picky about dates. lubridate helps them understand!

The Magic Rule: Use the function that matches your date format!

library(lubridate)

# Year-Month-Day format
ymd("2024-03-15")
# 2024-03-15

# Month-Day-Year format
mdy("03-15-2024")
# 2024-03-15

# Day-Month-Year format
dmy("15-03-2024")
# 2024-03-15

All give the same result! The function name tells R how to read it!

With Times Too!

ymd_hms("2024-03-15 14:30:00")
# 2024-03-15 14:30:00 UTC

mdy_hm("03-15-2024 2:30 PM")
# Works too!

| Function | Format | Example Input |
| --- | --- | --- |
| ymd() | Year-Month-Day | "2024-03-15" |
| mdy() | Month-Day-Year | "03-15-2024" |
| dmy() | Day-Month-Year | "15-03-2024" |
| ymd_hms() | With time | "2024-03-15 14:30:00" |
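Picking the matching function matters most when a date could be read two ways. Here is a short sketch with an ambiguous input string, made up for this example.

```r
library(lubridate)

ambiguous <- "01-02-2024"  # illustrative input: January 2nd or February 1st?

as_mdy <- mdy(ambiguous)
as_mdy
# 2024-01-02 (January 2nd)

as_dmy <- dmy(ambiguous)
as_dmy
# 2024-02-01 (February 1st)
```

Same text, two different dates: only you know which format your data uses.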

🔍 Component Extraction: Taking Dates Apart

Once you have a date, you can pull out pieces like LEGO!

my_date <- ymd("2024-07-04")

year(my_date)    # 2024
month(my_date)   # 7
day(my_date)     # 4
wday(my_date)    # 5 (Thursday)

All the Pieces You Can Extract

| Function | Extracts | Example |
| --- | --- | --- |
| year() | Year | 2024 |
| month() | Month number | 7 |
| day() | Day of month | 4 |
| wday() | Day of week | 5 (1=Sunday) |
| hour() | Hour | 14 |
| minute() | Minute | 30 |
| second() | Second | 45 |
| yday() | Day of year | 186 |
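The time pieces need a date-time rather than a plain date. Here is a sketch using an illustrative timestamp (not my_date from above, which has no time part).

```r
library(lubridate)

moment <- ymd_hms("2024-03-15 14:30:45")  # illustrative date-time

hour(moment)    # 14
minute(moment)  # 30
second(moment)  # 45
yday(moment)    # 75 (31 days in Jan + 29 in Feb + 15 in Mar)
```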

Get Names Instead of Numbers

month(my_date, label = TRUE)
# Jul (returned as an ordered factor, not plain text)

wday(my_date, label = TRUE)
# Thu (also an ordered factor)

⏱️ Duration and Period: Measuring Time

What’s the difference?

  • Duration: Exact seconds (like a stopwatch)
  • Period: Calendar units (like a calendar)

Durations: Exact Time

dseconds(60)     # 60 seconds
dminutes(5)      # 300 seconds (5 × 60)
dhours(2)        # 7200 seconds
ddays(1)         # 86400 seconds
dweeks(1)        # 604800 seconds

Periods: Calendar Time

seconds(60)      # 60 seconds (prints as "60S")
minutes(5)       # 5 minutes
hours(2)         # 2 hours
days(1)          # 1 day
weeks(1)         # 1 week
months(1)        # 1 month
years(1)         # 1 year

Why Does It Matter?

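Imagine it’s January 31st. Add 1 month:

jan31 <- ymd("2024-01-31")

jan31 + months(1)  # Period
# NA (February 31st doesn't exist, so lubridate returns NA)

jan31 + ddays(30)  # Duration
# 2024-03-01 (exactly 30 days later)

A period follows the calendar, so asking for a date that doesn't exist gives NA. A duration just counts seconds, so it always lands somewhere.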
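For month arithmetic that should always land on a real day, lubridate also provides the %m+% operator, which rolls back to the last valid day of the target month. A small sketch:

```r
library(lubridate)

jan31 <- ymd("2024-01-31")

# %m+% adds months and rolls back to the last real day of the month
leap_result <- jan31 %m+% months(1)
leap_result
# 2024-02-29 (2024 is a leap year)

# Compare with a non-leap year
plain_result <- ymd("2023-01-31") %m+% months(1)
plain_result
# 2023-02-28
```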

Calculate Time Differences

start <- ymd("2024-01-01")
end <- ymd("2024-12-31")

end - start
# 365 days

interval(start, end) / months(1)
# about 11.97 (just under twelve months, because December 31st is one day short of a full year)
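If you want a count of whole months instead of a fraction, integer division with %/% is a common idiom. Here is a sketch using the same two dates.

```r
library(lubridate)

start <- ymd("2024-01-01")
end <- ymd("2024-12-31")

# Whole months completed between the two dates
whole_months <- interval(start, end) %/% months(1)
whole_months
# 11

# The day count as a plain number
day_count <- as.numeric(end - start, units = "days")
day_count
# 365
```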

📖 readr: The File Reader

What is it? readr helps you bring data files INTO R—like opening a book to read it!

read_csv(): Comma-Separated Files

library(readr)

data <- read_csv("my_data.csv")

That’s it! readr reads the header row and guesses each column’s type for you!

All the Readers

| Function | File Type | Separator |
| --- | --- | --- |
| read_csv() | CSV | Comma (,) |
| read_csv2() | European CSV | Semicolon (;) |
| read_tsv() | TSV | Tab |
| read_delim() | Any | You choose! |

read_delim(): For Special Files

# Pipe-separated file
read_delim("data.txt", delim = "|")

# Colon-separated file
read_delim("data.txt", delim = ":")

Helpful Options

read_csv("data.csv",
  skip = 2,              # Skip the first 2 lines of the file
  n_max = 100,           # Read at most 100 rows
  na = c("", "NA", "?")  # Treat these as missing
)
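You can try these options without creating a file by passing literal text. The sketch below assumes readr 2.0 or later, where wrapping a string in I() marks it as data rather than a file path; the CSV content is made up, and show_col_types = FALSE simply quiets the column-type message.

```r
library(readr)

# I() tells readr (version 2.0 or later) this is literal data, not a file path
csv_text <- I("junk line 1\njunk line 2\nname,score\nAna,10\nBob,?\nCat,7\n")

scores <- read_csv(
  csv_text,
  skip = 2,               # skip the two junk lines before the header
  na = c("", "NA", "?"),  # treat "?" as missing
  show_col_types = FALSE  # silence the column-type message
)

scores$score
# 10 NA  7
```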

Writing Files Too!

# data is the table read in earlier with read_csv()
write_csv(data, "output.csv")
write_tsv(data, "output.tsv")
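A quick way to check that writing worked is to read the file straight back. This sketch uses a made-up table called pets and a temporary file, so nothing in your folder gets overwritten.

```r
library(readr)

pets <- data.frame(name = c("Rex", "Tom"), legs = c(4, 4))  # made-up data
path <- tempfile(fileext = ".csv")  # temporary file path

write_csv(pets, path)
pets_again <- read_csv(path, show_col_types = FALSE)

pets_again$name
# "Rex" "Tom"
```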

🌟 Quick Reference Flow

graph TD
  A["Your Data"] --> B{What do you need?}
  B --> C["Work with TEXT"]
  B --> D["Apply function to MANY items"]
  B --> E["COMBINE items into one"]
  B --> F["Work with DATES"]
  B --> G["READ files"]
  C --> C1["stringr functions"]
  D --> D1["purrr map functions"]
  E --> E1["purrr reduce"]
  F --> F1{Which date task?}
  G --> G1["readr functions"]
  F1 --> F2["Parse: ymd, mdy, dmy"]
  F1 --> F3["Extract: year, month, day"]
  F1 --> F4["Duration: ddays, dmonths"]

💡 Remember!

  1. stringr = Text tools (str_*)
  2. map() = Do same thing to many items
  3. reduce() = Combine many into one
  4. lubridate parsing = Turn text into dates
  5. lubridate extraction = Pull parts from dates
  6. duration/period = Measure time differences
  7. readr = Read files into R

You now have the complete Tidyverse toolbox! Each tool has its purpose. Pick the right one, and data magic happens! 🪄
