GAN Architecture: The Art Forger and the Detective
The Story of Two Rivals Who Make Each Other Better
Imagine a world where a clever Art Forger and a sharp-eyed Detective are locked in an endless game. The Forger keeps making fake paintings, and the Detective keeps trying to catch them. Over time, something magical happens: the Forger gets SO good that even experts can’t tell fake from real!
This is exactly how Generative Adversarial Networks (GANs) work. Two neural networks compete, and their rivalry creates something amazing: machines that can generate realistic images, music, and more!
GAN Architecture Overview
What is a GAN?
A GAN is like a game between two players:
| Player | Role | Goal |
|---|---|---|
| 🎨 Generator | Art Forger | Make fakes so good they fool everyone |
| 🔍 Discriminator | Detective | Spot which art is real vs fake |
Simple Example:
- The Forger (Generator) draws a cat picture
- The Detective (Discriminator) looks at it and says “FAKE!”
- The Forger learns from this and draws a better cat
- They keep playing until the Detective can’t tell the difference
Real-Life Uses:
- Creating realistic human faces that don’t exist
- Turning sketches into photos
- Making old movies look new (upscaling)
```mermaid
graph TD
    A["Random Noise"] --> B["Generator"]
    B --> C["Fake Image"]
    D["Real Image"] --> E["Discriminator"]
    C --> E
    E --> F{"Real or Fake?"}
    F --> G["Feedback to Generator"]
```
The Generator Network
The Art Forger’s Studio
The Generator is our Art Forger. It starts with random noise (like TV static) and transforms it into something meaningful!
How It Works:
- Takes random numbers as input (the “inspiration”)
- Passes them through layers of neurons
- Each layer adds more details
- Final output: a complete image!
Think of it like this:
- Layer 1: “This will be a face”
- Layer 2: “Add two eyes here”
- Layer 3: “Shape the nose”
- Layer 4: “Add skin texture”
- Final: A realistic face!
Generator Architecture
```
Random Noise (100 numbers)
        ↓
Dense Layer (reshape to 4x4)
        ↓
Upsample + Conv (8x8)
        ↓
Upsample + Conv (16x16)
        ↓
Upsample + Conv (32x32)
        ↓
Upsample + Conv (64x64)
        ↓
Final Image (64x64)
```
Key Parts:
- Dense layers: Expand the random seed
- Upsampling: Make the image bigger
- Convolution: Add details and patterns
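The dense-then-upsample pipeline above can be sketched in a few lines of NumPy. This is a shape-only illustration: the weights are random placeholders, not a trained generator, and the learned convolution that would follow each upsample is omitted.

```python
import numpy as np

rng = np.random.default_rng(0)

z = rng.standard_normal(100)                   # random noise: the "inspiration"
W = rng.standard_normal((100, 16)) * 0.1       # dense layer (untrained, shapes only)
x = (z @ W).reshape(4, 4)                      # reshape into a tiny 4x4 "image"

for _ in range(4):                             # 4x4 -> 8x8 -> 16x16 -> 32x32 -> 64x64
    x = x.repeat(2, axis=0).repeat(2, axis=1)  # nearest-neighbor upsampling

print(x.shape)  # (64, 64)
```

Each `repeat` doubles the resolution; a real generator would use learned transposed convolutions (or upsample-then-conv) so the added pixels carry detail instead of copies.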
The Discriminator Network
The Detective’s Magnifying Glass
The Discriminator is our Detective. It looks at images and decides: “Is this REAL or FAKE?”
How It Works:
- Takes an image as input
- Looks for patterns and details
- Compares against what “real” looks like
- Outputs a probability: 0% to 100% real
Think of it like this:
- “Hmm, the eyes look slightly off… 30% real”
- “The skin texture is perfect… 85% real”
- “Wait, ears are missing… definitely FAKE!”
Discriminator Architecture
```
Input Image (64x64)
        ↓
Conv + Downsample
        ↓
Conv + Downsample
        ↓
Conv + Downsample
        ↓
Flatten + Dense
        ↓
Output: Real or Fake (0-1)
```
Key Parts:
- Convolution: Detect patterns
- Downsampling: Compress information
- Final dense layer: Make the decision
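The mirror image of the generator sketch: downsample, flatten, and squash to a probability. Again the weights are random stand-ins, and the learned convolutions before each downsample are omitted.

```python
import numpy as np

rng = np.random.default_rng(1)

def downsample(img):
    """2x2 average pooling: halves the width and height."""
    h, w = img.shape
    return img.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

img = rng.standard_normal((64, 64))   # a stand-in "image"
x = img
for _ in range(3):                    # 64x64 -> 32x32 -> 16x16 -> 8x8
    x = downsample(x)

w = rng.standard_normal(64) * 0.1     # dense layer (untrained, shapes only)
score = x.ravel() @ w                 # flatten + dense
p_real = 1 / (1 + np.exp(-score))     # sigmoid -> probability "real"
print(x.shape)
print(p_real)
```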
Adversarial Training Process
The Epic Battle Begins!
This is where the magic happens! The Generator and Discriminator train together in a never-ending competition.
The Training Loop:
```mermaid
graph TD
    A["Step 1: Generator creates fake images"] --> B["Step 2: Mix fake with real images"]
    B --> C["Step 3: Discriminator tries to classify"]
    C --> D["Step 4: Calculate losses"]
    D --> E["Step 5: Update both networks"]
    E --> A
```
How Each Player Learns
| Step | Generator’s Goal | Discriminator’s Goal |
|---|---|---|
| 1 | Make fakes that fool | Don’t be fooled |
| 2 | Minimize “caught” rate | Maximize accuracy |
| 3 | Learn from failures | Learn from mistakes |
The Beautiful Balance:
- If Generator is too weak → Discriminator always wins → Generator improves
- If Discriminator is too weak → Generator fools it easily → Discriminator improves
- Over time → Both become AMAZING!
Loss Functions Explained Simply
Generator Loss:
“How often did I get caught?” Lower = Better at fooling!
Discriminator Loss:
“How many mistakes did I make?” Lower = Better at detecting!
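Both losses are just binary cross-entropy terms, and the whole adversarial loop fits in a few lines. Below is a minimal sketch on a toy 1-D problem with hand-derived gradients. Everything here is an illustrative assumption: a linear generator, a logistic-regression discriminator, and a made-up target distribution N(4, 1.5) standing in for "real images".

```python
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda t: 1.0 / (1.0 + np.exp(-np.clip(t, -30.0, 30.0)))

# Generator g(z) = a*z + b turns N(0,1) noise into samples; the "real" data
# is N(4, 1.5). Discriminator D(x) = sigmoid(w*x + c) guesses real vs fake.
a, b = 1.0, 0.0   # generator parameters
w, c = 0.1, 0.0   # discriminator parameters
lr = 0.05

for step in range(3000):
    z = rng.standard_normal(64)
    real = rng.normal(4.0, 1.5, 64)
    fake = a * z + b

    # Discriminator step: minimize -[log D(real) + log(1 - D(fake))]
    d_real, d_fake = sigmoid(w * real + c), sigmoid(w * fake + c)
    w -= lr * np.mean(-(1 - d_real) * real + d_fake * fake)
    c -= lr * np.mean(-(1 - d_real) + d_fake)

    # Generator step (non-saturating loss): minimize -log D(fake)
    d_fake = sigmoid(w * fake + c)
    upstream = -(1 - d_fake) * w   # dLoss/dfake
    a -= lr * np.mean(upstream * z)
    b -= lr * np.mean(upstream)

print(f"fake samples now centered near {b:.2f} (real mean is 4.0)")
```

The generator's offset `b` drifts toward the real mean purely because the discriminator keeps catching it, which is the whole adversarial idea in miniature.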
Mode Collapse Problem
When the Forger Gets Lazy
Imagine our Art Forger discovers that drawing cats with spots always fools the Detective. So they ONLY draw spotted cats. Forever. Nothing else.
This is Mode Collapse — the Generator finds ONE trick that works and refuses to learn anything new!
Signs of Mode Collapse:
- Generator produces very similar outputs
- Little variety in generated images
- Same faces, same poses, same style
Real Example:
- GAN trained to generate faces
- Starts making ONLY blonde women
- Ignores all other face types
Why Does This Happen?
```mermaid
graph TD
    A["Generator finds winning pattern"] --> B["Discriminator can't reject it"]
    B --> C["Generator exploits this pattern"]
    C --> D["All outputs look the same"]
    D --> E["Mode Collapse!"]
```
Solutions to Mode Collapse
| Solution | How It Helps |
|---|---|
| Minibatch discrimination | Forces variety in batches |
| Feature matching | Focus on statistics, not tricks |
| Unrolled GANs | Look ahead in training |
| WGAN | Wasserstein loss gives smoother, more stable gradients |
GAN Variants
The GAN Family Tree
Scientists improved the original GAN in many ways. Here are the famous children:
DCGAN (Deep Convolutional GAN)
The First Big Upgrade!
- Uses convolutional layers
- More stable training
- Better image quality
Key Rules:
- No pooling layers (strided convolutions instead)
- Batch normalization in most layers (skipping the Generator's output and the Discriminator's input)
- ReLU in the Generator, LeakyReLU in the Discriminator
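LeakyReLU is a one-liner: like ReLU, but negative inputs keep a small slope (0.2 in DCGAN) instead of being zeroed, which keeps gradients flowing through the Discriminator.

```python
import numpy as np

def leaky_relu(x, alpha=0.2):
    # Positive values pass through; negative values are scaled by alpha
    return np.where(x > 0, x, alpha * x)

print(leaky_relu(np.array([-1.0, 0.0, 2.0])))  # -> [-0.2, 0.0, 2.0]
```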
Conditional GAN (cGAN)
“I want a specific thing!”
- You can tell it WHAT to generate
- Example: “Make a cat” vs “Make a dog”
```mermaid
graph LR
    A["Noise + Label"] --> B["Generator"]
    B --> C["Generated Image"]
    C --> D["Discriminator"]
    E["Label"] --> D
```
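The conditioning itself is often just concatenation: the label, as a one-hot vector, is glued onto the Generator's noise input (and likewise onto the Discriminator's input). A sketch with illustrative sizes:

```python
import numpy as np

rng = np.random.default_rng(0)

num_classes = 10
z = rng.standard_normal(100)   # the usual random noise
label = 3                      # e.g. "make a cat" (class 3, say)
onehot = np.zeros(num_classes)
onehot[label] = 1.0

g_input = np.concatenate([z, onehot])  # generator now sees noise + label
print(g_input.shape)                   # (110,)
```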
Pix2Pix
Image-to-Image Translation
- Sketch → Photo
- Day → Night
- Map → Satellite view
CycleGAN
No Paired Data Needed!
- Horse → Zebra
- Summer → Winter
- Photo → Painting
StyleGAN
The Masterpiece!
- Ultra-realistic faces
- Control specific features
- Mix styles from different images
ProgressiveGAN
Start Small, Grow Big!
- Begin with tiny images
- Gradually increase resolution
- Very stable training
GAN Evaluation Metrics
How Do We Know If Our GAN Is Good?
Judging art is hard! We need special metrics to measure GAN quality.
Inception Score (IS)
“Are the images clear and diverse?”
| Score | Meaning |
|---|---|
| Higher IS | Better quality, more variety |
| Lower IS | Blurry images or mode collapse |
How it works:
- Feed generated images to Inception network
- Check if predictions are confident (quality)
- Check if predictions are varied (diversity)
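Concretely, the score is the exponential of the average KL divergence between each image's class prediction p(y|x) and the marginal p(y). A minimal sketch, where small hand-made probability tables stand in for real Inception outputs:

```python
import numpy as np

def inception_score(probs):
    """probs: (num_images, num_classes) softmax outputs from a classifier."""
    p_y = probs.mean(axis=0)                                  # marginal p(y)
    kl = (probs * (np.log(probs) - np.log(p_y))).sum(axis=1)  # KL(p(y|x) || p(y))
    return float(np.exp(kl.mean()))

# Confident AND diverse predictions -> high score
sharp = np.full((10, 10), 0.01) + np.eye(10) * 0.90
# Uniform (blurry / confused) predictions -> score of exactly 1
flat = np.full((10, 10), 0.1)

print(inception_score(sharp))  # well above 1
print(inception_score(flat))   # 1.0
```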
Fréchet Inception Distance (FID)
“How similar are fake images to real ones?”
| Score | Meaning |
|---|---|
| Lower FID | Closer to real images |
| Higher FID | Obviously fake |
The gold standard for GAN evaluation!
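FID treats the Inception features of real and fake images as two Gaussians and measures the distance between them: FID = ||μ_r − μ_g||² + Tr(C_r + C_g − 2(C_r C_g)^½). A NumPy sketch, with random vectors standing in for real Inception features:

```python
import numpy as np

def fid(mu1, cov1, mu2, cov2):
    diff = mu1 - mu2
    # Tr((C1 C2)^(1/2)) via eigenvalues; C1 C2 has a real, non-negative spectrum
    eig = np.linalg.eigvals(cov1 @ cov2)
    tr_sqrt = np.sqrt(np.clip(eig.real, 0, None)).sum()
    return diff @ diff + np.trace(cov1) + np.trace(cov2) - 2 * tr_sqrt

rng = np.random.default_rng(0)
feats_real = rng.standard_normal((500, 8))  # stand-in "Inception features"
feats_fake = feats_real + 2.0               # fakes shifted away from real

def stats(f):
    return f.mean(axis=0), np.cov(f, rowvar=False)

mu_r, c_r = stats(feats_real)
mu_f, c_f = stats(feats_fake)
print(fid(mu_r, c_r, mu_r, c_r))  # ~0: identical distributions
print(fid(mu_r, c_r, mu_f, c_f))  # ~32: the squared mean shift dominates
```

Real FID implementations run tens of thousands of images through the actual Inception network and use a stable matrix square root; the distance formula, though, is exactly this.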
Visual Comparison
| Metric | Measures | Good Value |
|---|---|---|
| IS | Quality + Diversity | Higher is better |
| FID | Similarity to real | Lower is better |
| LPIPS | Perceptual similarity | Depends on task |
Human Evaluation
Sometimes the best judge is… a human!
- Show real and fake images
- Ask: “Which is real?”
- If people can’t tell → SUCCESS!
Summary: The Complete Picture
```mermaid
graph TD
    A["GAN Architecture"] --> B["Generator"]
    A --> C["Discriminator"]
    A --> D["Adversarial Training"]
    B --> E["Creates fakes from noise"]
    C --> F["Judges real vs fake"]
    D --> G["Both improve together"]
    G --> H["Challenges"]
    H --> I["Mode Collapse"]
    G --> J["Improvements"]
    J --> K["DCGAN, cGAN, StyleGAN..."]
    G --> L["Evaluation"]
    L --> M["IS, FID, Human tests"]
```
You Did It! 🎉
Now you understand GANs! Remember:
- Generator = Creative artist making fakes
- Discriminator = Detective spotting fakes
- Training = They battle and both improve
- Mode Collapse = When Generator gets lazy
- Variants = Different GAN flavors for different tasks
- Metrics = How we measure GAN quality
The next time you see an AI-generated face, you’ll know: somewhere, a Generator and Discriminator had an epic battle to create it!
