Dimensionality Reduction


Dimensionality Reduction: Squeezing a Universe Into a Pocket

The Story of Too Many Crayons

Imagine you have a giant box of 1000 crayons. That’s amazing, right? But here’s the problem: when you want to draw a simple picture, you spend hours just looking at crayons! Most pictures only need maybe 5-10 colors anyway.

Dimensionality Reduction is like being a smart artist who says: “Let me keep only the most important crayons and put the rest away.”

In data science, each “crayon” is a dimension (a feature or measurement). When you have hundreds or thousands of dimensions, your computer gets confused and slow. We need to simplify!


Why Reduce Dimensions?

The Curse of Too Much

Think about finding your favorite toy in your room:

  • Easy: Your room has 5 toys
  • Hard: Your room has 5000 toys scattered everywhere!

This is the Curse of Dimensionality. More dimensions = more problems:

| Problem | What Happens |
|---|---|
| Slow computers | Like running through mud |
| Confused algorithms | Can't see patterns anymore |
| Need MORE data | Each dimension needs examples |
| Visualization? | Can't draw 100D on paper! |

The Magic Solution

Dimensionality reduction finds the most important directions in your data and keeps only those. It’s like compressing a photo - you lose some tiny details, but you keep what matters!

graph TD A["1000 Features"] --> B["Dimensionality Reduction"] B --> C["3-10 Important Features"] C --> D["Fast Analysis"] C --> E["Clear Visualization"] C --> F["Better Predictions"]

Principal Component Analysis (PCA)

The Photographer’s Trick

Imagine you’re photographing a 3D sculpture (like a horse statue). You can only take ONE photo. What angle shows the horse best?

  • Bad angle: Just sees the tail
  • Good angle: Sees the whole side profile!

PCA finds the BEST “angles” to look at your data. These angles are called Principal Components.

How PCA Works (Simple Version)

  1. Look at all your data points (imagine dots scattered in space)
  2. Find the direction where dots spread out the MOST (this is PC1)
  3. Find the next best direction (perpendicular to PC1) - this is PC2
  4. Keep going until you have enough directions
graph TD A["Original Data"] --> B["Find Direction of Maximum Spread"] B --> C["PC1: Most Important Direction"] C --> D["Find Next Best Direction"] D --> E["PC2: Second Most Important"] E --> F["Continue..."]

Real Example

You measure students with 5 features:

  • Height, Weight, Arm Length, Leg Length, Shoe Size

PCA might discover:

  • PC1 = “Overall Body Size” (captures 80% of differences)
  • PC2 = “Body Proportions” (captures 15% of differences)

Now you only need 2 numbers instead of 5!
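
In practice you would let a library do the work. Here's a hedged scikit-learn sketch using invented measurements driven by a hidden "body size" factor (the numbers are made up purely to mimic the example above):

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(42)
size = rng.normal(0, 1, 100)                      # hidden "overall body size" factor

# Hypothetical students: height, weight, arm length, leg length, shoe size
X = np.column_stack([
    170 + 10.0 * size + rng.normal(0, 2.0, 100),  # height (cm)
    70 + 12.0 * size + rng.normal(0, 4.0, 100),   # weight (kg)
    75 + 4.0 * size + rng.normal(0, 1.5, 100),    # arm length (cm)
    95 + 6.0 * size + rng.normal(0, 2.0, 100),    # leg length (cm)
    42 + 1.5 * size + rng.normal(0, 0.7, 100),    # shoe size (EU)
])

pca = PCA(n_components=2)
scores = pca.fit_transform(X)                     # 2 numbers per student instead of 5
print(pca.explained_variance_ratio_)              # PC1 captures most of the variation
```

One practical note: when features have very different units (kilograms vs shoe sizes), people usually standardize them first (for example with scikit-learn's StandardScaler) so the big-number features don't drown out the small ones.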


Variance Explained

The Importance Score

Variance means “how spread out things are.” When we do PCA, each component explains a chunk of the total variance.

Think of it like a pizza:

  • PC1 might eat 60% of the pizza (explains 60% of variance)
  • PC2 eats 25% of the pizza
  • PC3 eats 10%
  • The rest share the crumbs

Reading the Numbers

| Component | Variance Explained | Cumulative |
|---|---|---|
| PC1 | 60% | 60% |
| PC2 | 25% | 85% |
| PC3 | 10% | 95% |
| PC4 | 5% | 100% |

If you keep PC1, PC2, and PC3, you explain 95% of what’s happening in your data! That’s usually enough.
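
Here's a small sketch of how you would read those numbers off a fitted PCA (toy data with 3 real underlying factors hiding behind 10 measured features; the 95% target is just the rule of thumb from above):

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
base = rng.normal(size=(300, 3))
X = base @ rng.normal(size=(3, 10)) + 0.05 * rng.normal(size=(300, 10))
# toy data: 10 measured features that really come from 3 underlying ones

pca = PCA().fit(X)
cumulative = np.cumsum(pca.explained_variance_ratio_)
n_keep = int(np.argmax(cumulative >= 0.95)) + 1   # first component count reaching 95%
print(n_keep, np.round(cumulative, 3))

# scikit-learn can also pick the number for you:
pca_95 = PCA(n_components=0.95).fit(X)            # keeps just enough to explain 95%
print(pca_95.n_components_)
```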

The Elbow Rule

When plotting variance explained, look for an “elbow” - the point where adding more components doesn’t help much.

graph TD A["Plot Variance vs Components"] --> B["Look for the Elbow"] B --> C["Stop There!"] C --> D[You've Got Enough Information]

Singular Value Decomposition (SVD)

The Engine Behind PCA

SVD is like the engine inside a car. You don’t need to understand every part, but it’s what makes PCA work!

The Simple Idea

SVD breaks ANY data table into three simpler pieces:

Data = U × S × Vᵀ (Vᵀ is just V flipped on its side, the "transpose")

Think of it like a recipe:

  • U = The “how much of each flavor” for each person
  • S = How strong each flavor is (importance scores)
  • Vᵀ = The actual flavors (the patterns we found)
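
A tiny numpy sketch showing the three pieces, and that multiplying them back really does rebuild the original table exactly (the little 3×2 matrix is arbitrary):

```python
import numpy as np

A = np.array([[3.0, 1.0],
              [1.0, 3.0],
              [0.0, 2.0]])                   # any data table works

U, s, Vt = np.linalg.svd(A, full_matrices=False)
# U  : how much of each pattern is in each row (each "person")
# s  : importance score of each pattern, largest first
# Vt : the patterns themselves (V-transpose), one per row

print(np.allclose(A, U @ np.diag(s) @ Vt))   # True: U × S × Vᵀ rebuilds the data
```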

Why SVD is Cool

| Feature | Benefit |
|---|---|
| Works on ANY data | Even messy tables! |
| Finds hidden patterns | Like finding themes in stories |
| Powers recommendations | Netflix uses this! |
| Compresses images | Keep quality, reduce size |

Netflix Example

SVD on movie ratings might discover:

  • Pattern 1: People who like Action also like Sci-Fi
  • Pattern 2: People who like Romance also like Drama

Then Netflix can predict what YOU might like based on a few ratings!
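
Here's a hedged toy version of that idea: a tiny invented ratings table, squeezed down to its 2 strongest taste patterns with SVD. Real recommender systems are far more elaborate, but the core trick is the same low-rank rebuild:

```python
import numpy as np

# Rows = people, columns = (Action, Sci-Fi, Romance, Drama) -- made-up ratings
R = np.array([
    [5, 5, 1, 1],
    [4, 5, 2, 1],
    [1, 1, 5, 4],
    [2, 1, 4, 5],
], dtype=float)

U, s, Vt = np.linalg.svd(R, full_matrices=False)

k = 2                                           # keep only the 2 strongest patterns
R_approx = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]
print(np.round(R_approx, 1))                    # close to R, using just 2 hidden "tastes"
```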


t-SNE Visualization

The Neighborhood Detective

t-SNE (t-distributed Stochastic Neighbor Embedding) is different from PCA. It cares about one thing: keeping neighbors close.

Imagine moving from a 3D world to a 2D paper. t-SNE promises:

“If two points were close before, they’ll be close after!”

How t-SNE Thinks

  1. In the original space, measure how “close” each point is to others
  2. In 2D, try to keep those same closeness relationships
  3. Iterate until it looks good!
graph TD A["High-Dimensional Data"] --> B["Calculate Neighbor Distances"] B --> C["Create 2D Map"] C --> D["Adjust Until Neighbors Match"] D --> E["Beautiful Clusters Appear!"]

When to Use t-SNE

| Great For | Not Great For |
|---|---|
| Seeing clusters | Exact distances |
| Exploring data | Making predictions |
| Finding groups | Very large datasets |

Important t-SNE Rules

  • Perplexity parameter = roughly “how many neighbors to consider”
  • Different runs give different pictures (it’s stochastic!)
  • Don’t trust cluster sizes (they can be misleading)

UMAP Visualization

The Faster, Smarter Cousin

UMAP (Uniform Manifold Approximation and Projection) is like t-SNE’s athletic cousin. It does similar things but:

  • Runs much faster
  • Preserves global structure better
  • Scales to millions of points

The Simple Idea

UMAP assumes your data lives on a curved surface (manifold) in high dimensions. It tries to unfold that surface onto a flat 2D map.

Think of it like unfolding a crumpled paper ball - you want to see everything flat without tearing it!

graph TD A["Data on Curved Surface"] --> B["Build Local Connections"] B --> C["Optimize 2D Layout"] C --> D["Preserve Both Local and Global Structure"]

t-SNE vs UMAP

| Feature | t-SNE | UMAP |
|---|---|---|
| Speed | Slow | Fast |
| Global Structure | Limited | Good |
| Parameters | Fewer | More |
| Large Data | Struggles | Handles well |

Key UMAP Parameters

  • n_neighbors: How many nearby points to consider (like perplexity in t-SNE)
  • min_dist: How tightly packed points can be

  • Small n_neighbors = Focus on tiny local clusters
  • Large n_neighbors = See the bigger picture
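
A short sketch with the umap-learn package (assumed installed via pip install umap-learn; the data and parameter values here are illustrative, not recommendations):

```python
import numpy as np
import umap   # from the umap-learn package

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(loc=c, scale=0.5, size=(2000, 20))  # two made-up blobs
               for c in (0, 5)])

reducer = umap.UMAP(
    n_neighbors=15,    # small = zoom in on tiny clusters, large = see the bigger picture
    min_dist=0.1,      # how tightly packed points may sit in the 2D map
    random_state=42,
)
embedding = reducer.fit_transform(X)   # 4000 x 2 layout, ready to plot
print(embedding.shape)
```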


Putting It All Together

Your Dimensionality Reduction Toolkit

graph TD A["High-Dimensional Data"] --> B{What Do You Need?} B -->|Reduce for ML| C["PCA/SVD"] B -->|Visualize Clusters| D{Data Size?} D -->|Small < 10k| E["t-SNE"] D -->|Large > 10k| F["UMAP"] C --> G["Feed to Algorithm"] E --> H["Explore Patterns"] F --> H

The Journey Summary

  1. Why Reduce? - Too many dimensions = slow and confused
  2. PCA - Find the best “viewing angles” for your data
  3. Variance Explained - Know how much information you’re keeping
  4. SVD - The powerful math engine behind it all
  5. t-SNE - Make beautiful 2D pictures, keep neighbors close
  6. UMAP - Faster pictures that show the big picture too

You Did It!

You just learned how data scientists compress entire universes of data into something we can actually see and work with.

Next time you see a cool 2D scatter plot of millions of points, you’ll know: someone used these exact techniques to make it possible!

Remember: These tools are like different cameras. PCA is your reliable everyday camera. t-SNE is your artistic lens. UMAP is your high-speed sports camera. Pick the right one for the job!
