Distribution Functions


🎲 Random Variables: Distribution Functions

The Magic Jar Story 🫙

Imagine you have a magic jar filled with numbered balls. Every time you pick a ball, you get a number. But here’s the twist: some numbers appear more often than others!

This magic jar is exactly what mathematicians call a Random Variable. The rules about which numbers you’re likely to pick? Those are Distribution Functions!


🎯 What We’ll Discover

A Random Variable leads to five ideas:

  • PMF: counting discrete outcomes
  • PDF: measuring continuous outcomes
  • CDF: cumulative probability
  • Expected Value: the average result
  • Variance: how spread out?

📊 Probability Mass Function (PMF)

The Counting Machine

Think of PMF as a counting machine for things you can count on your fingers!

PMF tells you: “What’s the chance of getting EXACTLY this number?”

Real Life Example: Rolling a Die 🎲

When you roll a fair die:

  • Chance of getting 1 = 1/6
  • Chance of getting 2 = 1/6
  • Chance of getting 3 = 1/6
  • …and so on!

The PMF is like a list showing the probability for each possible outcome.

Simple Formula

$P(X = x)$

This means: “The probability that our random variable X equals exactly x”

Visual Picture

| Outcome | Probability |
|---------|-------------|
| 1 | 1/6 ≈ 0.167 |
| 2 | 1/6 ≈ 0.167 |
| 3 | 1/6 ≈ 0.167 |
| 4 | 1/6 ≈ 0.167 |
| 5 | 1/6 ≈ 0.167 |
| 6 | 1/6 ≈ 0.167 |

🔑 Key Rule

All probabilities must add up to 1!

$\sum_{all\ x} P(X = x) = 1$

In our die example: 1/6 + 1/6 + 1/6 + 1/6 + 1/6 + 1/6 = 1 ✓
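
The die table and the key rule can be checked with a few lines of Python. This is only a minimal sketch: the name `die_pmf` is an illustrative choice, and exact fractions are used so that the total comes out to exactly 1 rather than a rounded decimal.

```python
from fractions import Fraction

# PMF of a fair six-sided die: each face 1..6 has probability 1/6.
# `die_pmf` is an illustrative name, not part of any library.
die_pmf = {face: Fraction(1, 6) for face in range(1, 7)}

# P(X = 3): look up the probability of one exact outcome.
p_three = die_pmf[3]

# Key rule: the probabilities of all outcomes must add up to 1.
total = sum(die_pmf.values())

print(p_three)  # 1/6
print(total)    # 1
```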


📈 Probability Density Function (PDF)

The Smooth Curve Machine

Now imagine instead of numbered balls, your jar has sand of different colors blending smoothly together. You can’t count grains—you measure regions!

PDF is for continuous things — like height, weight, or time.

The Key Difference

| PMF (Counting) | PDF (Measuring) |
|----------------|-----------------|
| Dice rolls | Heights |
| Number of kids | Temperatures |
| Coin flips | Time to finish race |

Important Secret 🤫

With a PDF, the probability of landing on any single exact value is actually zero!

Instead, we ask: “What’s the chance of being between two values?”

Example: Heights of Kids

If heights follow a normal distribution (bell curve):

  • Very few kids are extremely short
  • Very few kids are extremely tall
  • Most kids are somewhere in the middle!

PDF for heights:

  • Area under the curve = probability
  • Total area is always 1
  • Probability of one exact height = 0
  • So ask: “Between X and Y?”

The Formula

$P(a \leq X \leq b) = \int_a^b f(x)\,dx$

Translation: “Add up all the tiny slices between a and b”
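
To see the "tiny slices" idea in action, here is a rough sketch that adds up thin rectangles under a bell curve. The height model (mean 140 cm, standard deviation 7 cm) and the helper name `area_under_pdf` are assumptions made up for this illustration. `NormalDist` comes from Python's standard `statistics` module.

```python
from statistics import NormalDist

# Assumed model, for illustration only: heights in cm, mean 140, std dev 7.
heights = NormalDist(mu=140, sigma=7)

def area_under_pdf(dist, a, b, slices=10_000):
    """Approximate the integral of dist.pdf from a to b with the midpoint rule."""
    width = (b - a) / slices
    return sum(dist.pdf(a + (i + 0.5) * width) for i in range(slices)) * width

# P(133 <= X <= 147): add up the tiny slices between a and b.
approx = area_under_pdf(heights, 133, 147)

# Cross-check against the library's cumulative probabilities.
exact = heights.cdf(147) - heights.cdf(133)

print(round(approx, 4), round(exact, 4))  # 0.6827 0.6827
```

Both numbers agree: about 68% of the area sits within one standard deviation of the mean under this assumed model.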


📉 Cumulative Distribution Function (CDF)

The “So Far” Counter

CDF answers: “What’s the probability of getting this number OR LESS?”

Think of it like filling a bucket. CDF tells you how full your bucket is at each point!

Example: Test Scores

If 70% of students scored 80 or below:

  • CDF at 80 = 0.70

This means: “70% of all results are 80 or less”

Visual Relationship

Start with the PMF/PDF and add up everything so far to get the CDF. The CDF:

  • Always starts near 0
  • Always ends at 1
  • Never decreases!

The Formula

$F(x) = P(X \leq x)$

For discrete: Add all probabilities up to x

For continuous: Integrate from -∞ to x

Die Example Again 🎲

| Roll ≤ | CDF Value |
|--------|-----------|
| 1 | 1/6 = 0.167 |
| 2 | 2/6 = 0.333 |
| 3 | 3/6 = 0.500 |
| 4 | 4/6 = 0.667 |
| 5 | 5/6 = 0.833 |
| 6 | 6/6 = 1.000 |

Notice: The CDF never goes down, and it reaches 1 at the largest outcome!
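
The die CDF is just a running total of the PMF. Here is a small sketch of that idea using `itertools.accumulate` from the standard library. The variable names are illustrative, and the fractions print in reduced form (2/6 shows up as 1/3).

```python
from fractions import Fraction
from itertools import accumulate

faces = range(1, 7)
pmf = [Fraction(1, 6) for _ in faces]

# CDF: running total of the PMF, so cdf[x] = P(X <= x).
cdf = dict(zip(faces, accumulate(pmf)))

for face, value in cdf.items():
    print(face, value, round(float(value), 3))
```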


🎯 Expected Value (Mean)

The Balance Point

Expected Value is like asking: “If I played this game forever, what would be my average result?”

Think of a seesaw. The Expected Value is where you’d put the fulcrum to balance all outcomes!

Simple Example: Coin Toss Game

You win ₹10 for heads, lose ₹5 for tails.

Expected Value = (0.5 × ₹10) + (0.5 × (-₹5))

Expected Value = ₹5 - ₹2.50 = ₹2.50

On average, you’d win ₹2.50 per game!
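
"Playing forever" can be approximated by simulating many plays. The sketch below assumes a fair coin and uses a fixed seed only so that the run is repeatable. The number of plays is an arbitrary choice.

```python
import random

# Payoffs from the game above: +10 for heads, -5 for tails, each with chance 0.5.
outcomes = {10: 0.5, -5: 0.5}

# Exact expected value: sum of payoff * probability.
expected = sum(payoff * prob for payoff, prob in outcomes.items())

# Simulate many plays of the game with a fair coin.
rng = random.Random(42)
plays = 100_000
winnings = [rng.choice([10, -5]) for _ in range(plays)]
average = sum(winnings) / plays

print(expected)  # 2.5
print(average)   # close to 2.5; the exact value depends on the seed
```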

The Formula

Discrete (PMF): $E[X] = \sum_x x \cdot P(X = x)$

Continuous (PDF): $E[X] = \int_{-\infty}^{\infty} x \cdot f(x)\,dx$

Die Roll Expected Value 🎲

E[X] = 1×(1/6) + 2×(1/6) + 3×(1/6) + 4×(1/6) + 5×(1/6) + 6×(1/6)

E[X] = (1+2+3+4+5+6)/6 = 21/6 = 3.5

The “average” die roll is 3.5 (even though you can never roll 3.5!)
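
The discrete formula translates almost word for word into code. This is a minimal sketch; `expected_value` is a helper name chosen for this example, not a library function.

```python
from fractions import Fraction

def expected_value(pmf):
    """E[X] = sum of x * P(X = x) over every outcome x."""
    return sum(x * p for x, p in pmf.items())

die_pmf = {face: Fraction(1, 6) for face in range(1, 7)}
mean = expected_value(die_pmf)

print(mean, float(mean))  # 7/2 3.5
```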

🧠 Key Insight

Expected Value tells you the center of your distribution—the long-run average if you repeated the experiment infinitely.


📏 Variance of Random Variable

The Spread Meter

Variance measures: “How spread out are the results?”

Two games might have the same average, but very different excitement levels!

Story Time 🎭

Game A: Always win exactly ₹100

Game B: Win ₹0 or ₹200 (50-50 chance)

Both have Expected Value = ₹100. But Game B is way more unpredictable!

Variance captures this difference!
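
A quick sketch makes the contrast concrete. It computes variance as the average squared distance from the mean. The helper names `expected_value` and `variance` are illustrative, not from a library.

```python
def expected_value(pmf):
    """Average outcome: sum of x * P(X = x)."""
    return sum(x * p for x, p in pmf.items())

def variance(pmf):
    """Average squared distance from the mean: E[(X - mu)^2]."""
    mu = expected_value(pmf)
    return sum((x - mu) ** 2 * p for x, p in pmf.items())

game_a = {100: 1.0}          # always win 100
game_b = {0: 0.5, 200: 0.5}  # win 0 or 200, 50-50

print(expected_value(game_a), variance(game_a))  # 100.0 0.0
print(expected_value(game_b), variance(game_b))  # 100.0 10000.0
```

Same average, but Game A has zero variance while Game B's is 10,000 (in squared rupees).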

The Formula

$Var(X) = E[(X - \mu)^2] = E[X^2] - (E[X])^2$

Where μ = E[X] (the expected value)
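
Why are the two forms equal? A short derivation, using only the facts that expectation is linear and that μ is a constant:

$E[(X - \mu)^2] = E[X^2 - 2\mu X + \mu^2]$

$= E[X^2] - 2\mu E[X] + \mu^2$

$= E[X^2] - 2\mu^2 + \mu^2$

$= E[X^2] - \mu^2 = E[X^2] - (E[X])^2$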

Step-by-Step: Die Roll Variance 🎲

Step 1: We know E[X] = 3.5

Step 2: Find E[X²]

E[X²] = 1²×(1/6) + 2²×(1/6) + 3²×(1/6) + 4²×(1/6) + 5²×(1/6) + 6²×(1/6)

E[X²] = (1+4+9+16+25+36)/6 = 91/6 ≈ 15.17

Step 3: Calculate Variance

Var(X) = E[X²] - (E[X])²

Var(X) = 15.17 - (3.5)²

Var(X) = 15.17 - 12.25 = 2.92

Standard Deviation

Variance is measured in squared units. Want the spread back in the original units? Take the square root!

$\sigma = \sqrt{Var(X)}$

For our die: σ = √2.92 ≈ 1.71
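
The three steps and the square root can be reproduced as a short sketch. Exact fractions are used so the rounding only happens at the end. The variable names are illustrative.

```python
from fractions import Fraction
from math import sqrt

die_pmf = {face: Fraction(1, 6) for face in range(1, 7)}

# Step 1: E[X]
mean = sum(x * p for x, p in die_pmf.items())
# Step 2: E[X^2]
mean_of_squares = sum(x**2 * p for x, p in die_pmf.items())
# Step 3: Var(X) = E[X^2] - (E[X])^2
var = mean_of_squares - mean**2
# Standard deviation: square root of the variance.
std_dev = sqrt(float(var))

print(mean, mean_of_squares, var)               # 7/2 91/6 35/12
print(round(float(var), 2), round(std_dev, 2))  # 2.92 1.71
```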

What Does This Mean?

| Variance | Interpretation |
|----------|----------------|
| Small | Results cluster near average |
| Large | Results spread far from average |
| Zero | Same result every time! |

🔗 How They All Connect

  • Random Variable X → Distribution Function
  • Countable outcomes → PMF; continuous outcomes → PDF
  • PMF or PDF → CDF
  • Then: Expected Value E[X] → Variance Var(X) → Standard Deviation σ

The Big Picture

  1. PMF/PDF → Describes how probability is spread across the outcomes
  2. CDF → Cumulative “how much so far”
  3. Expected Value → The center/average
  4. Variance → The spread/uncertainty

🎓 Quick Summary

| Concept | Question It Answers | Formula |
|---------|---------------------|---------|
| PMF | P(exactly x)? | P(X = x) |
| PDF | P(between a and b)? | ∫f(x)dx |
| CDF | P(x or less)? | F(x) = P(X ≤ x) |
| E[X] | Average result? | Σx·P(x) |
| Var(X) | How spread out? | E[X²] - (E[X])² |

💡 Remember This!

🎲 PMF = For counting (dice, coins)

📏 PDF = For measuring (height, time)

📊 CDF = Adding it all up so far

🎯 Expected Value = The balance point

📐 Variance = The wiggle room

You’ve just learned the language that statisticians use to describe uncertainty. Now go forth and understand the randomness around you! 🚀
