Generative Models


🎨 Generative Models: Teaching AI to Create Art!

Imagine This Story…

Picture a magic art school where robots learn to paint! 🤖🎨

There are two types of robot students:

  • Compressor Robots (Autoencoders) - They learn to make tiny copies of pictures
  • Artist Robots (GANs) - They learn to paint brand new pictures from imagination

Let's visit this magical school and see how they learn!


🗜️ Chapter 1: The Compressor Robot (Autoencoder)

What is an Autoencoder?

Imagine you have a big fluffy teddy bear and a tiny box. Can you fit the teddy in the box?

An autoencoder is like a robot that:

  1. Squishes the teddy bear really small (this is called encoding)
  2. Stretches it back to normal size (this is called decoding)

The goal? Make the teddy look EXACTLY the same after squishing and stretching!

How It Works

Big Picture → [Squish!] → Tiny Code → [Stretch!] → Big Picture Again

graph TD
    A["🖼️ Input Image"] --> B["📦 Encoder"]
    B --> C["💎 Latent Code"]
    C --> D["📤 Decoder"]
    D --> E["🖼️ Rebuilt Image"]
    style C fill:#FFD700

PyTorch Example

import torch
import torch.nn as nn
import torch.nn.functional as F  # used by the VAE example below

class Autoencoder(nn.Module):
    def __init__(self):
        super().__init__()
        # Squish: 784 → 32 numbers
        self.encoder = nn.Sequential(
            nn.Linear(784, 128),
            nn.ReLU(),
            nn.Linear(128, 32)
        )
        # Stretch: 32 → 784 numbers
        self.decoder = nn.Sequential(
            nn.Linear(32, 128),
            nn.ReLU(),
            nn.Linear(128, 784)
        )

    def forward(self, x):
        code = self.encoder(x)
        return self.decoder(code)

What's happening?

  • 784 = picture pixels (28×28 image)
  • 32 = tiny code (much smaller!)
  • The robot learns to keep only the IMPORTANT parts
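How does the robot learn? It compares the rebuilt image to the original and nudges its weights to make them match. Here is a minimal training sketch under that idea, using a compact autoencoder with the same shapes as above (`TinyAE` and the random batch are stand-ins, not part of the original):

```python
import torch
import torch.nn as nn

# Compact autoencoder matching the shapes above (784 -> 32 -> 784)
class TinyAE(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(784, 128), nn.ReLU(), nn.Linear(128, 32))
        self.decoder = nn.Sequential(nn.Linear(32, 128), nn.ReLU(), nn.Linear(128, 784))

    def forward(self, x):
        return self.decoder(self.encoder(x))

model = TinyAE()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()            # "does the rebuilt image match the original?"

batch = torch.rand(16, 784)       # stand-in for 16 flattened 28x28 images
rebuilt = model(batch)
loss = loss_fn(rebuilt, batch)    # the target is the SAME input - that's the trick
optimizer.zero_grad()
loss.backward()
optimizer.step()
```

The key detail: the input is also the target, so the only way to get a low loss is to keep the important parts through the tiny 32-number bottleneck.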

🎲 Chapter 2: The Lucky Dice Robot (Variational Autoencoder)

What Makes VAE Special?

Remember our squishing robot? The Variational Autoencoder (VAE) is its creative cousin!

Instead of making ONE tiny code, it makes a cloud of possibilities! ☁️

Think of it like this:

  • Regular Autoencoder: "This cat picture becomes code [5, 3, 2]"
  • VAE: "This cat picture is SOMEWHERE around [5, 3, 2] - let me roll dice to pick!"

Why Dice? (Probability Distributions)

Imagine you want to draw "a cat." There are MANY ways to draw cats:

  • Fat cats, thin cats
  • Orange cats, black cats
  • Sleeping cats, jumping cats

The VAE learns: "What does the AVERAGE cat look like? How much can cats vary?"

graph TD
    A["🐱 Cat Picture"] --> B["📊 Learn Average"]
    A --> C["📊 Learn Variation"]
    B --> D["🎲 Roll Dice"]
    C --> D
    D --> E["💎 Random Code"]
    E --> F["🐱 New Cat!"]
    style D fill:#FF6B6B

The Magic Numbers

VAE learns two things for each feature:

  • μ (mu) = The average (center of the cloud)
  • σ (sigma) = How spread out (size of the cloud)

PyTorch Example

class VAE(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Linear(784, 256)
        # Two outputs: mean and LOG-variance (the log keeps the math stable)
        self.fc_mu = nn.Linear(256, 32)
        self.fc_logvar = nn.Linear(256, 32)
        self.decoder = nn.Linear(32, 784)

    def encode(self, x):
        h = F.relu(self.encoder(x))
        return self.fc_mu(h), self.fc_logvar(h)

    def reparameterize(self, mu, logvar):
        # Roll the dice!
        std = torch.exp(0.5 * logvar)  # log-variance -> standard deviation
        eps = torch.randn_like(std)    # the random part
        return mu + eps * std

    def forward(self, x):
        mu, logvar = self.encode(x)
        z = self.reparameterize(mu, logvar)
        return self.decoder(z), mu, logvar

The Reparameterization Trick: We add randomness so the robot can CREATE new things, not just copy! Writing the sample as mu + eps * std (instead of sampling directly) keeps the operation differentiable, so the network can still learn through it.
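Training a VAE also needs a loss with two parts: rebuild quality plus a term that keeps the "cloud" close to a standard bell curve, so rolling fresh dice later still lands on valid codes. A minimal sketch of the standard VAE loss (the toy tensors below are stand-ins for a real batch and real encoder outputs):

```python
import torch
import torch.nn.functional as F

# Standard VAE loss: reconstruction + KL divergence.
# The KL term pulls each latent cloud toward a standard normal
# (center 0, spread 1), which is what makes sampling new codes work.
def vae_loss(recon_x, x, mu, logvar):
    recon = F.mse_loss(recon_x, x, reduction="sum")               # rebuild quality
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())  # cloud shaping
    return recon + kl

# Toy tensors standing in for a real batch
x = torch.rand(4, 784)
recon_x = torch.rand(4, 784)
mu = torch.zeros(4, 32)       # cloud centered at 0 ...
logvar = torch.zeros(4, 32)   # ... with spread 1, so the KL term is exactly 0
loss = vae_loss(recon_x, x, mu, logvar)
```

Notice the balance: the reconstruction term wants precise codes, the KL term wants fuzzy clouds, and training finds a compromise between the two.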


🎭 Chapter 3: The Art Competition (GANs)

What is a GAN?

GAN = Generative Adversarial Network

Imagine TWO robots having a competition:

  • 🎨 The Artist (Generator): Tries to paint fake pictures
  • 🔍 The Detective (Discriminator): Tries to catch the fakes

They compete and BOTH get better!

The Story

Day 1: Artist paints a blob. Detective: "FAKE! Obviously!"

Day 100: Artist paints almost-real face. Detective: "Hmm... 50% sure it's fake..."

Day 1000: Artist paints PERFECT face. Detective: "I... can't tell anymore!"
graph TD
    A["🎲 Random Noise"] --> B["🎨 Generator"]
    B --> C["🖼️ Fake Image"]
    D["📸 Real Image"] --> E["🔍 Discriminator"]
    C --> E
    E --> F{Real or Fake?}
    F --> |Wrong!| G["📚 Both Learn"]
    G --> B
    G --> E
    style B fill:#4ECDC4
    style E fill:#FF6B6B

🎨 Chapter 4: The Artist Robot (Generator)

What Does the Generator Do?

The Generator is like a dream painter:

  • Input: Random numbers (like rolling dice)
  • Output: A complete picture!

It starts making ugly blobs, but gets better every day!

PyTorch Example

class Generator(nn.Module):
    def __init__(self):
        super().__init__()
        self.model = nn.Sequential(
            # Start with 100 random numbers
            nn.Linear(100, 256),
            nn.LeakyReLU(0.2),
            nn.Linear(256, 512),
            nn.LeakyReLU(0.2),
            nn.Linear(512, 784),
            nn.Tanh()  # Output: -1 to 1
        )

    def forward(self, noise):
        return self.model(noise)

# Create a fake image!
generator = Generator()
noise = torch.randn(1, 100)
fake_image = generator(noise)

What's happening?

  • 100 random numbers → Generator → 784 pixel image
  • LeakyReLU: Helps the robot learn better
  • Tanh: Makes pixels between -1 and 1
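Because Tanh outputs live in [-1, 1] but image viewers usually expect pixels in [0, 1], the generator's output is typically rescaled and reshaped before display. A small sketch (here `fake` is a stand-in for `generator(noise)`):

```python
import torch

# Tanh output lives in [-1, 1]; rescale to [0, 1] and reshape to 28x28
# before viewing. `fake` stands in for generator(noise) from above.
fake = torch.tanh(torch.randn(1, 784))
image = (fake + 1) / 2        # [-1, 1] -> [0, 1]
image = image.view(28, 28)    # flat 784 pixels -> square picture
```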

๐Ÿ” Chapter 5: The Detective Robot (Discriminator)

What Does the Discriminator Do?

The Discriminator is like an art expert:

  • Input: Any picture (real OR fake)
  • Output: "I'm X% sure this is REAL"

PyTorch Example

class Discriminator(nn.Module):
    def __init__(self):
        super().__init__()
        self.model = nn.Sequential(
            nn.Linear(784, 512),
            nn.LeakyReLU(0.2),
            nn.Linear(512, 256),
            nn.LeakyReLU(0.2),
            nn.Linear(256, 1),
            nn.Sigmoid()  # Output: 0 to 1
        )

    def forward(self, image):
        return self.model(image)

# Check if an image is real
discriminator = Discriminator()
score = discriminator(image)  # image: any (1, 784) tensor
# score = 0.9 means "90% sure it's real!"

What's happening?

  • Takes 784 pixels
  • Outputs ONE number between 0 and 1
  • 0 = "Definitely FAKE!"
  • 1 = "Definitely REAL!"

๐Ÿ‹๏ธ Chapter 6: Training the Competition

The Training Dance

Training a GAN is like a dance between two partners:

Step 1: Train the Detective

  • Show it REAL pictures → Should say "REAL!" (score = 1)
  • Show it FAKE pictures → Should say "FAKE!" (score = 0)

Step 2: Train the Artist

  • Make fake pictures
  • Try to FOOL the detective (want score = 1 for fakes!)

The Loss Functions

# For Discriminator
real_loss = -torch.log(D(real_image))
fake_loss = -torch.log(1 - D(G(noise)))
d_loss = real_loss + fake_loss

# For Generator
g_loss = -torch.log(D(G(noise)))

In simple words:

  • Detective wants: High score for real, low for fake
  • Artist wants: Detective to give HIGH score to fakes!
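A practical note: the raw `-log` formulas above blow up numerically if a score ever hits exactly 0 or 1, so real PyTorch code usually expresses the same math through `nn.BCELoss` (binary cross-entropy), with targets of 1 for "real" and 0 for "fake". A hedged sketch with stand-in scores (the tensors below replace actual `D(...)` outputs):

```python
import torch
import torch.nn as nn

bce = nn.BCELoss()  # same -log math as above, but clamped for stability

real_scores = torch.tensor([0.9, 0.8])   # stand-ins for D(real_image)
fake_scores = torch.tensor([0.2, 0.1])   # stand-ins for D(G(noise))

# Detective: real images -> target 1, fakes -> target 0
d_loss = (bce(real_scores, torch.ones_like(real_scores))
          + bce(fake_scores, torch.zeros_like(fake_scores)))

# Artist: wants the detective to output 1 for its fakes
g_loss = bce(fake_scores, torch.ones_like(fake_scores))
```

With these stand-in scores the detective is doing well (small d_loss) while the artist is still being caught (larger g_loss) - exactly the "Day 1" situation in the story.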

Full Training Loop

for epoch in range(num_epochs):
    for real_images in dataloader:

        # === Train Discriminator ===
        noise = torch.randn(batch_size, 100)
        fake_images = generator(noise)

        # Real images should score high
        real_scores = discriminator(real_images)
        # Fake images should score low
        # (.detach() stops this step from also updating the generator)
        fake_scores = discriminator(fake_images.detach())

        d_loss = -torch.mean(
            torch.log(real_scores) +
            torch.log(1 - fake_scores)
        )

        optimizer_d.zero_grad()
        d_loss.backward()
        optimizer_d.step()

        # === Train Generator ===
        noise = torch.randn(batch_size, 100)
        fake_images = generator(noise)

        # Want discriminator to think
        # fakes are real!
        scores = discriminator(fake_images)
        g_loss = -torch.mean(torch.log(scores))

        optimizer_g.zero_grad()
        g_loss.backward()
        optimizer_g.step()
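Once training is done, the detective retires: only the generator is needed to create new images. A minimal sampling sketch (the compact `nn.Sequential` generator below is a stand-in with the same 100 → 784 shapes as the Generator class above):

```python
import torch
import torch.nn as nn

# Compact stand-in with the same shapes as the Generator above (100 -> 784)
generator = nn.Sequential(
    nn.Linear(100, 256), nn.LeakyReLU(0.2),
    nn.Linear(256, 784), nn.Tanh(),
)
generator.eval()

with torch.no_grad():             # no gradients needed just to sample
    noise = torch.randn(8, 100)   # 8 different dice rolls ...
    fakes = generator(noise)      # ... give 8 brand-new 784-pixel images
```

Every fresh noise vector gives a different image - that is the whole point of learning to paint from imagination.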

🎯 The Big Picture

graph TD
    subgraph "Autoencoders"
        A1["Regular AE"] --> A2["Compress & Rebuild"]
        A3["VAE"] --> A4["Add Randomness<br>to Create New!"]
    end
    subgraph "GANs"
        G1["Generator"] --> G2["Creates from Noise"]
        G3["Discriminator"] --> G4["Judges Real vs Fake"]
        G2 -.->|compete| G4
    end
    style A1 fill:#4ECDC4
    style A3 fill:#FFD700
    style G1 fill:#FF6B6B
    style G3 fill:#667EEA

🌟 Key Takeaways

| Model       | Superpower                  | Best For                    |
|-------------|-----------------------------|-----------------------------|
| Autoencoder | Compress & rebuild          | Removing noise, compression |
| VAE         | Create variations           | Generating similar images   |
| GAN         | Create realistic new images | Faces, art, deepfakes       |

🎉 You Did It!

Now you know how AI learns to be creative! These robots can:

  • 🎨 Paint faces that don't exist
  • 🖼️ Create art in any style
  • 📸 Fill in missing parts of photos
  • 🎬 Even create deepfake videos!

Remember: The magic is in the competition (GANs) and randomness (VAE)!


"The best artists steal… and the best AI learns to create!" 🤖✨
