🎯 Hypothesis Testing in R: The Detective’s Toolkit
The Big Picture: Becoming a Data Detective 🔍
Imagine you’re a detective. You have a hunch about something, but you need proof before you can say it’s true. That’s exactly what hypothesis testing is!
Think of it like this: Your friend says their new cookie recipe is better than the old one. How do you prove it? You need to test it fairly!
In statistics, we:
- Start with a guess (called a hypothesis)
- Collect evidence (data)
- Use math to decide if our guess is probably true or not
🎭 The Two Players: Null vs Alternative
Every hypothesis test has two characters:
| Character | Role | Example |
|---|---|---|
| Null (H₀) | “Nothing special is happening” | “Both cookie recipes taste the same” |
| Alternative (H₁) | “Something IS different!” | “The new recipe tastes different” |
The p-value is your “surprise meter”:
- p < 0.05 → “Wow, this would be surprising if nothing were going on! Probably not a coincidence!” ✅
- p ≥ 0.05 → “Meh, could just be luck, not enough evidence to say” ❌
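In R, every one of the tests below returns an object whose p-value you can read off directly. A quick sketch (the scoop weights are invented for illustration):

```r
# Every hypothesis test in R returns an "htest" object;
# its p-value is just a field you can extract by name.
scoops <- c(85, 90, 88, 92, 87)    # made-up sample weights (grams)
result <- t.test(scoops, mu = 80)  # "is the true mean really 80 g?"
result$p.value                     # the surprise meter as a number
result$p.value < 0.05              # TRUE means "surprising!"
```

This pattern works for all the tests in this toolkit, so you can feed p-values into your own code instead of reading them off the console.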
📊 1. The t-Test: Comparing Averages
What is it?
The t-test answers: “Are these two groups really different, or is it just random chance?”
🍦 Ice Cream Analogy
Two ice cream shops claim their scoops are bigger. You weigh 10 scoops from each shop. The t-test tells you if the difference is real or just luck!
Types of t-Tests
| Type | When to Use | R Function |
|---|---|---|
| One-sample | Compare group to a known value | t.test(x, mu = value) |
| Two-sample | Compare two independent groups | t.test(x, y) |
| Paired | Same people, two conditions | t.test(x, y, paired = TRUE) |
R Example

```r
# Shop A scoops (grams)
shop_a <- c(85, 90, 88, 92, 87)
# Shop B scoops (grams)
shop_b <- c(78, 82, 80, 79, 81)
# Are they different?
t.test(shop_a, shop_b)
```

The output shows a p-value well below 0.05 → Yes! Shop A really does give bigger scoops! 🎉
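The same `t.test()` function covers the other two rows of the table. A sketch with invented numbers:

```r
# One-sample: is Shop A's average scoop different from the advertised 90 g?
shop_a <- c(85, 90, 88, 92, 87)
t.test(shop_a, mu = 90)   # p comes out above 0.05: no strong evidence

# Paired: the SAME five customers weigh scoops before and after a staff change
before <- c(85, 90, 88, 92, 87)
after  <- c(88, 93, 90, 95, 90)
t.test(before, after, paired = TRUE)  # every scoop grew, p is tiny
```

The paired version is more sensitive because each customer acts as their own comparison, so person-to-person differences cancel out.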
📋 2. Chi-Square Test: Counting Categories
What is it?
When you’re counting things in categories (not measuring numbers), Chi-Square asks: “Is this pattern what we expected, or is something fishy?”
🎲 Dice Analogy
You roll a die 60 times. You expect each number to appear about 10 times. But 6 shows up 20 times! Is the die rigged, or just luck?
R Example

```r
# What we observed in 60 rolls
observed <- c(8, 9, 7, 8, 8, 20)
# What we expected (fair die: 10 of each face)
expected <- c(10, 10, 10, 10, 10, 10)
# Is the die fair? (expected/60 converts counts to probabilities)
chisq.test(observed, p = expected / 60)
```

If p < 0.05 → That die is probably loaded! 🎲
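The result object can also be inspected piece by piece, which is handy for seeing what a fair die would have produced. A sketch (counts invented so they sum to 60 rolls):

```r
observed <- c(8, 9, 7, 8, 8, 20)          # 60 rolls of a suspicious die
result <- chisq.test(observed, p = rep(1/6, 6))
result$expected   # a fair die predicts 10 of each face
result$statistic  # how far the observed counts strayed from those 10s
result$p.value    # the surprise meter
```

Comparing `observed` to `result$expected` shows exactly which face is driving the big chi-square statistic, here, the pile of sixes.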
🔗 3. Correlation Test: Finding Connections
What is it?
Correlation measures how two things move together. Do they go up together? One up, one down? Or no pattern at all?
🌡️ Weather Analogy
When it’s hot outside, ice cream sales go up. That’s a positive correlation!
Correlation Values
| Value | Meaning |
|---|---|
| +1 | Perfect together (both rise) |
| 0 | No relationship |
| -1 | Perfect opposites (one rises, other falls) |
R Example

```r
# Temperature (°F)
temp <- c(70, 75, 80, 85, 90)
# Ice cream sales
sales <- c(100, 120, 150, 180, 200)
# Are they connected?
cor.test(temp, sales)
```

If p < 0.05 and r ≈ 0.997 → Yes! Hot weather really does boost sales! ☀️🍦
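`cor.test()` defaults to Pearson's r, which measures straight-line association. If your data is ranked or not bell-shaped, the same function can run Spearman's rank correlation instead. A sketch reusing the same toy numbers:

```r
temp  <- c(70, 75, 80, 85, 90)
sales <- c(100, 120, 150, 180, 200)
# Pearson (the default): straight-line association
cor.test(temp, sales)
# Spearman: works on ranks, so it also catches curved
# but consistently increasing relationships
cor.test(temp, sales, method = "spearman")
```

Since sales rise with every step in temperature here, Spearman's rho comes out at exactly 1, a perfect monotonic relationship.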
🏆 4. ANOVA: Comparing Many Groups
What is it?
ANOVA is like a t-test, but for 3 or more groups. It asks: “Is at least one group different from the others?”
🏅 Sports Analogy
Three coaches claim their training methods are best. You test athletes from all three programs. ANOVA tells you if there’s any real difference!
R Example

```r
# Scores from three coaches
coach_a <- c(85, 88, 90, 87)
coach_b <- c(78, 80, 82, 79)
coach_c <- c(92, 95, 91, 93)
# Combine data
scores <- c(coach_a, coach_b, coach_c)
coach <- factor(rep(c("A", "B", "C"), each = 4))
# Run ANOVA
result <- aov(scores ~ coach)
summary(result)
```

If p < 0.05 → At least one coach’s method is different! (Coach C’s scores look highest, but ANOVA alone doesn’t say *which* group differs, that takes a follow-up comparison.)
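ANOVA only says that *some* group differs. A standard follow-up for finding out which pairs differ is Tukey's Honest Significant Difference test, sketched here with the same coach data:

```r
coach_a <- c(85, 88, 90, 87)
coach_b <- c(78, 80, 82, 79)
coach_c <- c(92, 95, 91, 93)
scores <- c(coach_a, coach_b, coach_c)
coach  <- factor(rep(c("A", "B", "C"), each = 4))
model  <- aov(scores ~ coach)
# All pairwise comparisons, with the error rate controlled
# across the whole family of comparisons
TukeyHSD(model)
```

Each row of the output gives one pair (B-A, C-A, C-B) with its difference, a confidence interval, and an adjusted p-value, so you can finally say which coach stands out.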
📏 5. Variance Test (F-Test): Comparing Spread
What is it?
While t-tests compare averages, variance tests compare how spread out the data is. Are scores more scattered in one group?
🎯 Archery Analogy
Two archers both hit near the bullseye on average. But one archer’s arrows are tightly clustered, while the other’s are all over the target. The variance test measures this!
R Example

```r
# Archer 1: consistent
archer1 <- c(49, 50, 51, 50, 49)
# Archer 2: all over the place
archer2 <- c(45, 55, 40, 60, 50)
# Compare their consistency
var.test(archer1, archer2)
```

If p < 0.05 → Yes! Their consistency levels are different!
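One caveat: `var.test()` assumes both groups are roughly normal, and it only handles two groups at a time. For three or more, base R offers `bartlett.test()`, sketched here with a made-up third archer:

```r
archer1 <- c(49, 50, 51, 50, 49)
archer2 <- c(45, 55, 40, 60, 50)
archer3 <- c(48, 52, 47, 53, 50)  # invented third archer, for illustration
hits   <- c(archer1, archer2, archer3)
archer <- factor(rep(c("1", "2", "3"), each = 5))
# Bartlett's test compares spread across all three groups at once
bartlett.test(hits ~ archer)
```

Like the F-test, Bartlett's test is sensitive to non-normal data, so it's worth checking the bell-curve assumption (see the Shapiro-Wilk section) first.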
📊 6. Wilcoxon Test: The Non-Parametric Hero
What is it?
When your data is weird (not bell-shaped) or you’re working with rankings instead of exact numbers, Wilcoxon comes to the rescue!
🥇 Race Analogy
Instead of exact race times, you only know who came 1st, 2nd, 3rd, etc. Wilcoxon works with these rankings!
Two Flavors
| Test | When to Use |
|---|---|
| Wilcoxon Signed-Rank | Paired data (like paired t-test) |
| Wilcoxon Rank-Sum (Mann-Whitney) | Two independent groups |
R Example

```r
# Pain scores before treatment
before <- c(8, 9, 7, 8, 9, 8)
# Pain scores after treatment
after <- c(5, 6, 4, 5, 6, 5)
# Did treatment help?
wilcox.test(before, after, paired = TRUE)
```

If p < 0.05 → The treatment really works! 💊 (R may warn about ties in data like this; the approximate p-value it reports is still usable.)
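The example above is the signed-rank (paired) flavor. The rank-sum (Mann-Whitney) flavor simply drops `paired = TRUE` and compares two *independent* groups. A sketch with invented clinic data:

```r
# Two INDEPENDENT groups: pain scores from two different clinics
clinic_a <- c(8, 9, 7, 8, 9, 8)
clinic_b <- c(5, 6, 4, 5, 6, 5)
# Rank-sum (Mann-Whitney) version: no paired = TRUE
wilcox.test(clinic_a, clinic_b)
```

Because whole-number pain scores produce many tied ranks, R falls back to an approximate p-value and warns about it; the conclusion is unaffected.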
📈 7. Proportion Test: Comparing Percentages
What is it?
When you’re comparing percentages or ratios, not averages, use the proportion test!
🗳️ Voting Analogy
60% of Town A voted yes, but only 45% of Town B voted yes. Is this a real difference, or just random variation?
R Example

```r
# Town A: 60 yes out of 100
# Town B: 45 yes out of 100
prop.test(
  x = c(60, 45),   # successes
  n = c(100, 100)  # totals
)
```

If p < 0.05 → Towns really do vote differently! 🗳️
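`prop.test()` also handles a single proportion against a fixed value. Testing whether a coin is fair after 60 heads in 100 flips (numbers invented) makes a nice cautionary example:

```r
# Is the coin fair? 60 heads out of 100 flips, tested against p = 0.5
prop.test(x = 60, n = 100, p = 0.5)
```

Perhaps surprisingly, with R's default continuity correction this p-value lands just above 0.05: 60 heads in 100 flips is *not quite* enough evidence to call the coin unfair. More flips would be needed to settle it.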
🔔 8. Shapiro-Wilk Test: Is Your Data Normal?
What is it?
Many tests assume your data follows a “bell curve” (normal distribution). Shapiro-Wilk checks if that’s true!
📊 Why It Matters
Before using a t-test, you should check if your data is bell-shaped. If not, use Wilcoxon instead!
R Example

```r
# Your data
my_data <- c(12, 15, 14, 13, 16,
             14, 15, 13, 14, 15)
# Is it normally distributed?
shapiro.test(my_data)
```

If p > 0.05 → No evidence against normality, the t-test is fine! ✅ If p < 0.05 → Data looks non-normal, reach for Wilcoxon instead! ⚠️
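Putting it together, a common workflow checks both groups first and then picks the test. A sketch reusing the ice cream data from section 1:

```r
shop_a <- c(85, 90, 88, 92, 87)
shop_b <- c(78, 82, 80, 79, 81)
# Check both groups before reaching for t.test()
p_a <- shapiro.test(shop_a)$p.value
p_b <- shapiro.test(shop_b)$p.value
if (p_a > 0.05 && p_b > 0.05) {
  t.test(shop_a, shop_b)       # bell-shaped enough: t-test
} else {
  wilcox.test(shop_a, shop_b)  # otherwise: Wilcoxon
}
```

One caution: with only five points per group, Shapiro-Wilk has little power to detect non-normality, so for tiny samples this check is more of a sanity check than a guarantee.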
🗺️ Quick Decision Flowchart
```mermaid
graph TD
    A["What do you want to test?"] --> B{Comparing averages?}
    B -->|Yes, 2 groups| C["t-test"]
    B -->|Yes, 3+ groups| D["ANOVA"]
    B -->|No| E{Counting categories?}
    E -->|Yes| F["Chi-Square"]
    E -->|No| G{Measuring correlation?}
    G -->|Yes| H["Correlation Test"]
    G -->|No| I{Comparing spread?}
    I -->|Yes| J["Variance Test"]
    I -->|No| K{Comparing percentages?}
    K -->|Yes| L["Proportion Test"]
    K -->|No| M{Non-normal data?}
    M -->|Yes| N["Wilcoxon Test"]
    M -->|Check first| O["Shapiro-Wilk"]
```
🎯 Summary: Your Testing Toolkit
| Test | Use When… | R Function |
|---|---|---|
| t-Test | Comparing 2 group averages | t.test() |
| Chi-Square | Counting categorical data | chisq.test() |
| Correlation | Finding relationships | cor.test() |
| ANOVA | Comparing 3+ group averages | aov() |
| Variance (F-Test) | Comparing data spread | var.test() |
| Wilcoxon | Non-normal data, rankings | wilcox.test() |
| Proportion | Comparing percentages | prop.test() |
| Shapiro-Wilk | Checking if data is normal | shapiro.test() |
🚀 You’ve Got This!
Remember: Every test is just asking a simple question with data. Start with what you want to know, pick the right tool, and let R do the math!
The golden rule:
- p < 0.05 → “The evidence points to something real!” ✅
- p ≥ 0.05 → “Not enough evidence, could just be random chance” ❌
Now go forth and test your hypotheses! 🔬✨
