🗄️ Data Frames in R: Your Magic Spreadsheet
The Big Picture: What’s a Data Frame?
Imagine you have a magic notebook where you can organize anything—names, ages, favorite colors, test scores—all in neat rows and columns. That’s exactly what a Data Frame is in R!
Think of it like a classroom roster:
- Each row = one student
- Each column = one piece of info about that student (name, age, grade)
Data Frames are the heart of data analysis in R. Almost everything you do with real data uses them!
🎨 Our Analogy: The Classroom Roster
Throughout this guide, we’ll use a classroom roster as our example. Picture a teacher organizing information about students—that’s you building and working with Data Frames!
📝 Creating Data Frames
The Basic Recipe
Creating a Data Frame is like filling out a class roster. You list each column of information:
students <- data.frame(
  name = c("Emma", "Liam", "Ava"),
  age = c(10, 11, 10),
  grade = c("A", "B", "A")
)
What just happened?
- name, age, grade = column names (like headers in your roster)
- c() = combines values into a vector (one column's worth of values)
- Each student gets one value from each column
See Your Creation
print(students)
Output:
name age grade
1 Emma 10 A
2 Liam 11 B
3 Ava 10 A
That’s it! You made a Data Frame with 3 students and 3 pieces of info each!
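One rule worth knowing: every column in the roster needs the same number of entries. Here is a small sketch of what happens when one column comes up short. The `broken` name and the `tryCatch()` wrapper are only there for illustration.

```r
# Every column supplies one value per student (here: 3 students).
students <- data.frame(
  name  = c("Emma", "Liam", "Ava"),
  age   = c(10, 11, 10),
  grade = c("A", "B", "A")
)

# Illustration only: 'age' is missing one entry.
# data.frame() stops with an error; tryCatch() captures the message as text.
broken <- tryCatch(
  data.frame(name = c("Emma", "Liam", "Ava"), age = c(10, 11)),
  error = function(e) conditionMessage(e)
)
print(broken)
```

If you see an error like this with your own data, counting the entries in each `c()` is usually the quickest check.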
🔍 Data Frame Column Access
Grabbing One Column
Want just the names? Use the $ sign—like pointing at a column!
students$name
# Output: "Emma" "Liam" "Ava"
Using Square Brackets
You can also use brackets with the column name or position:
students["age"] # A one-column data frame
students[, "age"] # Just the values, like students$age
students[, 2] # Column 2 = age (values only)
Quick tip: The $ method is fastest to type!
graph TD
  A[Data Frame] --> B["$ dollar sign"]
  A --> C["[ ] brackets"]
  B --> D["students$name"]
  C --> E["students['name']"]
  C --> F["students[, 2]"]
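One subtlety: the access styles do not all hand back the same kind of object. The sketch below rebuilds the roster and checks what each form returns. The variable names on the left are made up for this example.

```r
students <- data.frame(
  name  = c("Emma", "Liam", "Ava"),
  age   = c(10, 11, 10),
  grade = c("A", "B", "A")
)

as_vector     <- students$age       # plain numeric vector
also_vector   <- students[, "age"]  # same vector (a single column is simplified)
as_data_frame <- students["age"]    # still a data frame, with one column

class(as_vector)      # "numeric"
class(as_data_frame)  # "data.frame"
```

Reach for `$` or `[, "age"]` when you want to do math on the values, and `["age"]` when you want to keep the roster shape.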
👤 Data Frame Row Access
Getting One Row
Want all info about Emma (row 1)? Use brackets with row number:
students[1, ]
# Shows: Emma, 10, A
Getting Multiple Rows
Want Emma and Ava (rows 1 and 3)?
students[c(1, 3), ]
The Pattern
students[ROW, COLUMN]
- Leave ROW empty = all rows
- Leave COLUMN empty = all columns
- Fill both = specific cell!
students[2, 3] # Row 2, Column 3 = "B"
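The ROW, COLUMN pattern also works with several rows and several columns at once. A minimal sketch, where `picked` and `liam_grade` are names invented for this example:

```r
students <- data.frame(
  name  = c("Emma", "Liam", "Ava"),
  age   = c(10, 11, 10),
  grade = c("A", "B", "A")
)

# Rows 1 and 3, but only the name and grade columns
picked <- students[c(1, 3), c("name", "grade")]
print(picked)

# One cell, using a column name instead of a column number
liam_grade <- students[2, "grade"]
print(liam_grade)
```

Column names tend to be safer than column numbers, because they keep working if the column order changes.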
🔎 Data Frame Inspection
Before working with data, you need to peek inside! Here are your detective tools:
Quick Look Commands
| Command | What It Does |
|---|---|
| head(students) | Shows first 6 rows |
| tail(students) | Shows last 6 rows |
| nrow(students) | Counts rows (students) |
| ncol(students) | Counts columns (info types) |
| dim(students) | Shows rows × columns |
| names(students) | Lists column names |
| str(students) | Shows structure & types |
| summary(students) | Statistics overview |
Example Inspection
str(students)
Output:
'data.frame': 3 obs. of 3 variables:
$ name : chr "Emma" "Liam" "Ava"
$ age : num 10 11 10
$ grade: chr "A" "B" "A"
This tells you: 3 students, 3 variables, and their data types!
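Here is a quick sketch of a few more detective tools on the same roster. The names `n_students`, `n_columns`, and `first_two` are just for this example.

```r
students <- data.frame(
  name  = c("Emma", "Liam", "Ava"),
  age   = c(10, 11, 10),
  grade = c("A", "B", "A")
)

n_students <- nrow(students)   # 3
n_columns  <- ncol(students)   # 3
dim(students)                  # 3 3  (rows first, then columns)
names(students)                # "name" "age" "grade"

# head() takes an optional row count if 6 is too many
first_two <- head(students, 2)
print(first_two)
```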
🎯 Data Frame Filtering
Filtering = finding specific students that match what you’re looking for.
Find Students by Condition
Who is 10 years old?
students[students$age == 10, ]
Output:
name age grade
1 Emma 10 A
3 Ava 10 A
How It Works
- students$age == 10 creates TRUE/FALSE for each row
- TRUE rows get selected
- FALSE rows get ignored
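To see the TRUE/FALSE step on its own, you can store it before filtering. A small sketch, where `is_ten` and `ten_year_olds` are names chosen for this example:

```r
students <- data.frame(
  name  = c("Emma", "Liam", "Ava"),
  age   = c(10, 11, 10),
  grade = c("A", "B", "A")
)

# Step 1: one TRUE/FALSE answer per row
is_ten <- students$age == 10
print(is_ten)   # TRUE FALSE TRUE

# Step 2: keep only the rows where the answer is TRUE
ten_year_olds <- students[is_ten, ]
print(ten_year_olds)

# Bonus: TRUE counts as 1, so sum() tells you how many rows matched
sum(is_ten)     # 2
```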
More Filtering Examples
# Who got an A?
students[students$grade == "A", ]
# Who is older than 10?
students[students$age > 10, ]
# Who is 10 AND got an A?
students[students$age == 10 &
students$grade == "A", ]
The subset() Shortcut
subset(students, age == 10)
# Same result, cleaner to read!
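subset() can also pick columns while it filters, using its `select` argument. A minimal sketch comparing it with the bracket version (the result names are invented for this example):

```r
students <- data.frame(
  name  = c("Emma", "Liam", "Ava"),
  age   = c(10, 11, 10),
  grade = c("A", "B", "A")
)

# Students with an A, keeping only name and age
a_students <- subset(students, grade == "A", select = c(name, age))

# The same request written with brackets
a_students_brackets <- students[students$grade == "A", c("name", "age")]

print(a_students)
```

Both forms should give the same two students here; which one reads better is mostly a matter of taste.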
✏️ Data Frame Modification
Your roster isn’t set in stone! You can change, add, or remove things.
Change One Value
Emma turned 11:
students[1, "age"] <- 11
Add a New Column
Add a column for favorite subject:
students$subject <- c("Math", "Art", "Science")
Now your Data Frame has 4 columns!
Remove a Column
Changed your mind about that column?
students$subject <- NULL
Add a New Row
New student joined!
new_student <- data.frame(
  name = "Noah",
  age = 11,
  grade = "B"
)
students <- rbind(students, new_student)
rbind = row bind = glue rows together!
Change Column Names
names(students)[1] <- "student_name"
# Column 1 is now "student_name"
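Here is a sketch that chains several of these changes on a copy of the roster, so the original stays untouched. One thing to watch: a new row needs the same columns as the roster it joins, so if you keep the subject column, Noah needs a subject too. The copy name `roster` and Noah's subject "Music" are made up for this example.

```r
students <- data.frame(
  name  = c("Emma", "Liam", "Ava"),
  age   = c(10, 11, 10),
  grade = c("A", "B", "A")
)

roster <- students                               # work on a copy
roster[1, "age"] <- 11                           # Emma turned 11
roster$subject <- c("Math", "Art", "Science")    # one value per student

# The new row has the same four columns as the roster
# ("Music" is a placeholder value for illustration)
new_student <- data.frame(
  name = "Noah", age = 11, grade = "B", subject = "Music"
)
roster <- rbind(roster, new_student)

names(roster)[1] <- "student_name"               # rename column 1
dim(roster)                                      # 4 4
```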
🔗 Data Frame Combining
Sometimes you have multiple rosters and need to merge them!
Stack Rows: rbind()
Two classes? Stack them vertically:
class_a <- data.frame(
  name = c("Emma", "Liam"),
  age = c(10, 11)
)
class_b <- data.frame(
  name = c("Ava", "Noah"),
  age = c(10, 11)
)
all_students <- rbind(class_a, class_b)
Result: 4 students, 2 columns
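For data frames, rbind() lines columns up by name, not by position, so both rosters need the same column names. A short sketch, where `class_b_swapped` is a hypothetical roster with its columns in the opposite order:

```r
class_a <- data.frame(
  name = c("Emma", "Liam"),
  age  = c(10, 11)
)

# Same columns as class_a, but listed in a different order
class_b_swapped <- data.frame(
  age  = c(10, 11),
  name = c("Ava", "Noah")
)

# Columns are matched by name; the result follows class_a's column order
all_students <- rbind(class_a, class_b_swapped)
print(all_students)
```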
Add Columns: cbind()
Have extra info to add side-by-side?
grades <- data.frame(
  grade = c("A", "B", "A", "B")
)
full_roster <- cbind(all_students, grades)
Result: 4 students, 3 columns
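Unlike rbind(), cbind() simply places the data frames next to each other, so row 1 goes with row 1, row 2 with row 2, and so on. That only makes sense when both have the same number of rows in the same order. A minimal sketch with a safety check added (the `stopifnot()` line is an extra suggestion, not part of the recipe above):

```r
all_students <- data.frame(
  name = c("Emma", "Liam", "Ava", "Noah"),
  age  = c(10, 11, 10, 11)
)
grades <- data.frame(
  grade = c("A", "B", "A", "B")
)

# Check the row counts match before gluing side-by-side
stopifnot(nrow(all_students) == nrow(grades))

full_roster <- cbind(all_students, grades)

# Ava is row 3 in both, so she gets the third grade
print(full_roster[full_roster$name == "Ava", ])
```

If the rows might be in a different order, matching on a shared column with merge() is usually the safer route.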
Smart Merge: merge()
Combine based on matching values:
roster <- data.frame(
  id = c(1, 2, 3),
  name = c("Emma", "Liam", "Ava")
)
scores <- data.frame(
  id = c(1, 2, 3),
  score = c(95, 87, 92)
)
combined <- merge(roster, scores, by = "id")
Result:
id name score
1 1 Emma 95
2 2 Liam 87
3 3 Ava 92
The id column matched up the right scores with the right students!
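What if the two tables do not list exactly the same ids? By default, merge() keeps only the ids found in both. The sketch below uses a hypothetical `scores` table where id 1 has no score and id 4 (score 78, made up for this example) has no student, and shows the `all.x` argument for keeping everyone on the roster.

```r
roster <- data.frame(
  id   = c(1, 2, 3),
  name = c("Emma", "Liam", "Ava")
)

# Hypothetical scores: nothing for id 1, and an extra id 4
scores <- data.frame(
  id    = c(2, 3, 4),
  score = c(87, 92, 78)
)

# Default: only ids present in both tables (2 and 3)
matched_only <- merge(roster, scores, by = "id")

# all.x = TRUE: keep every roster row; missing scores become NA
keep_roster <- merge(roster, scores, by = "id", all.x = TRUE)

print(matched_only)
print(keep_roster)
```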
graph TD
  A[Combining Methods] --> B[rbind]
  A --> C[cbind]
  A --> D[merge]
  B --> E["Stack rows vertically"]
  C --> F["Add columns side-by-side"]
  D --> G["Join by matching column"]
🚀 Quick Reference Summary
| Task | Code |
|---|---|
| Create | data.frame(col1 = ..., col2 = ...) |
| Get column | df$column or df[, "column"] |
| Get row | df[row_num, ] |
| Get cell | df[row, column] |
| Inspect | str(df), summary(df), dim(df) |
| Filter | df[df$col == value, ] or subset() |
| Add column | df$new_col <- values |
| Add row | rbind(df, new_row) |
| Remove column | df$col <- NULL |
| Stack rows | rbind(df1, df2) |
| Add columns | cbind(df1, df2) |
| Smart join | merge(df1, df2, by = "key") |
🎉 You Did It!
You now understand the 7 core skills of Data Frames:
- ✅ Creating them from scratch
- ✅ Accessing columns with $ and []
- ✅ Accessing rows with bracket notation
- ✅ Inspecting with str(), summary(), and friends
- ✅ Filtering to find exactly what you need
- ✅ Modifying values, adding/removing columns and rows
- ✅ Combining Data Frames with rbind, cbind, and merge
Data Frames are your superpower for working with real data in R. Now go organize some data! 🗂️