🪟 Browser Windows & Screenshots in Selenium
Your Browser is a House with Many Rooms!
🏠 The Big Picture
Imagine your browser is like a big house. Each window or tab is like a room in that house. Selenium is like a smart robot helper that can walk around the house, go into different rooms, and even take pictures!
Today, we’ll learn how to:
- 🏷️ Window Handles – Give each room a name tag
- 🚶 Multi-Window Navigation – Walk between rooms
- 📸 Full Page Screenshots – Take a photo of the whole room
- 🎯 Element Screenshots – Take a photo of just one thing in the room
🏷️ Window Handles – Every Room Has a Name Tag
What is a Window Handle?
Think of it like this: You’re at a party with many people. How do you know who is who? Name tags!
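In Selenium, every browser window or tab gets a special name tag called a “window handle”. It’s a unique string like CDwindow-ABC123. The exact format depends on the browser, so treat it as an opaque ID: store it and pass it back to Selenium, but don’t try to read meaning into it.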
graph TD
    A[🌐 Browser] --> B[🪟 Window 1<br/>Handle: ABC123]
    A --> C[🪟 Window 2<br/>Handle: XYZ789]
    A --> D[🪟 Window 3<br/>Handle: DEF456]
How to Get the Current Window Handle
# Get the name tag of the room
# you're currently standing in
current_handle = driver.current_window_handle
print(current_handle)
# Output: CDwindow-ABC123
How to Get ALL Window Handles
# Get name tags of ALL rooms in the house
all_handles = driver.window_handles
print(all_handles)
# Output: ['CDwindow-ABC123', 'CDwindow-XYZ789', 'CDwindow-DEF456']
💡 Key Points
- current_window_handle = ONE name tag (the room you’re in)
- window_handles = ALL name tags (list of all rooms)
- Each handle is unique – no two windows share the same handle
🚶 Multi-Window Navigation – Walking Between Rooms
The Problem
You clicked a link and a new window opened. But wait – your robot (Selenium) is still standing in the old room! It can’t see what’s in the new room.
The Solution
Tell Selenium to walk to the new room using switch_to.window().
graph LR
    A[🤖 Robot in Window 1] -->|switch_to.window| B[🤖 Robot in Window 2]
Step-by-Step Example
Scenario: Click a button that opens a new window, then work with the new window.
from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
driver = webdriver.Chrome()
driver.get("https://example.com")
# Step 1: Save the "home room" name tag
original_window = driver.current_window_handle
# Step 2: Click something that opens a new window
driver.find_element("id", "open-new").click()
# Step 3: Wait for the new window to appear
WebDriverWait(driver, 10).until(
    EC.number_of_windows_to_be(2)
)
# Step 4: Find the new window and go there
for handle in driver.window_handles:
    if handle != original_window:
        driver.switch_to.window(handle)
        break
# Now you're in the new window!
print(driver.title)  # Prints new window's title
Going Back to the Original Window
# Walk back to the home room
driver.switch_to.window(original_window)
print(driver.title) # Back to original!
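The switch-and-return steps above can be wrapped so the robot always walks back home, even if something goes wrong in the new room. This is a minimal sketch, assuming a hypothetical helper named `in_window` (it is not part of Selenium). `FakeDriver` is only a stand-in so the snippet runs without a browser; in a real test you would pass your `webdriver.Chrome()` instance instead.

```python
from contextlib import contextmanager


@contextmanager
def in_window(driver, handle):
    """Switch to `handle`, then switch back to the starting window on exit.

    `in_window` is a hypothetical helper, not a Selenium API.
    """
    original = driver.current_window_handle
    driver.switch_to.window(handle)
    try:
        yield driver
    finally:
        # Runs even if the body raises, so later steps start from a known window.
        driver.switch_to.window(original)


class FakeDriver:
    """Stand-in that mimics only the attributes used above (assumption)."""

    def __init__(self, handles):
        self.window_handles = list(handles)
        self.current_window_handle = self.window_handles[0]
        self.switch_to = self  # so driver.switch_to.window(...) works

    def window(self, handle):
        if handle not in self.window_handles:
            raise ValueError(f"no such window: {handle}")
        self.current_window_handle = handle


driver = FakeDriver(["home", "popup"])
with in_window(driver, "popup"):
    inside = driver.current_window_handle  # we are in the popup here
after = driver.current_window_handle       # and back home here
print(inside, after)
```

One caveat: if the original window gets closed while you are away, the switch back will fail, so this pattern fits best when the home room stays open.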
🎯 Real-Life Analogy
Imagine you’re playing a video game with multiple save files:
- current_window_handle = Which save file is active now
- window_handles = List of all your save files
- switch_to.window() = Load a different save file
📸 Full Page Screenshots – Snap the Whole Room!
What is a Full Page Screenshot?
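It’s like taking a photo of the room from where you’re standing. In Selenium, save_screenshot() captures everything currently visible in the browser window (the viewport) in one image. Parts of the page you would have to scroll to see are not included.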
graph TD
    A[🌐 Webpage] -->|Screenshot| B[📸 Image File]
    B --> C[🖼️ header.png<br/>or screenshot.png]
Simple Example
# Take a picture and save it
driver.save_screenshot("my_screenshot.png")
That’s it! One line of code and you have a picture of what the browser window is showing.
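Keep in mind that `save_screenshot()` photographs the visible viewport. If you need the entire scrollable page, Firefox's driver in Selenium 4 offers `save_full_page_screenshot()`, while other drivers may not have it. Here is a hedged sketch, assuming a hypothetical helper `capture_page` that uses the full-page method when the driver has one and falls back otherwise. The two small classes are stand-ins so the snippet runs without a browser.

```python
def capture_page(driver, path):
    """Save the largest screenshot the driver supports; return the method used.

    `capture_page` is a hypothetical helper, not a Selenium API.
    """
    full_page = getattr(driver, "save_full_page_screenshot", None)
    if callable(full_page):
        full_page(path)           # Firefox driver: entire scrollable page
        return "full_page"
    driver.save_screenshot(path)  # any driver: visible viewport only
    return "viewport"


class ViewportOnlyDriver:
    """Stand-in for a driver without full-page support (assumption)."""

    def save_screenshot(self, path):
        return True


class FullPageDriver(ViewportOnlyDriver):
    """Stand-in for a driver that also has the Firefox-style method."""

    def save_full_page_screenshot(self, path):
        return True


print(capture_page(ViewportOnlyDriver(), "page.png"))
print(capture_page(FullPageDriver(), "page.png"))
```

Returning the method name is just one design choice; it makes it easy to note in a test report which kind of picture was taken.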
Get Screenshot as Raw Data
Sometimes you want the image data directly (not saved to a file):
# Get as base64 string
screenshot_base64 = driver.get_screenshot_as_base64()
# Get as raw bytes
screenshot_bytes = driver.get_screenshot_as_png()
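Both forms hold the same PNG image: the base64 string is simply a text-safe encoding of the raw bytes, which is handy for embedding in an HTML report or a JSON payload. The sketch below shows the round trip and a cheap sanity check. The helper names and the placeholder bytes are assumptions for illustration; in a real test the string would come from `driver.get_screenshot_as_base64()`.

```python
import base64

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"  # first 8 bytes of every PNG file


def decode_screenshot(screenshot_base64):
    """Turn the base64 text into raw image bytes."""
    return base64.b64decode(screenshot_base64)


def looks_like_png(data):
    """Cheap sanity check: does the data start with the PNG signature?"""
    return data.startswith(PNG_SIGNATURE)


# Placeholder bytes standing in for a real screenshot (assumption):
sample_bytes = PNG_SIGNATURE + b"placeholder image data"
sample_base64 = base64.b64encode(sample_bytes).decode("ascii")

decoded = decode_screenshot(sample_base64)
print(decoded == sample_bytes, looks_like_png(decoded))
```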
💡 When to Use Full Page Screenshots?
| Situation | Why Screenshot Helps |
|---|---|
| Test failed | See what went wrong |
| Visual comparison | Compare designs |
| Bug reports | Show the problem |
| Documentation | Record app states |
Pro Tip: Add Timestamps
import datetime
timestamp = datetime.datetime.now()
filename = timestamp.strftime("%Y%m%d_%H%M%S")
driver.save_screenshot(f"screenshot_{filename}.png")
# Creates: screenshot_20241215_143022.png
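If you take many screenshots, the timestamp trick can live in a small helper that also keeps the files in one folder. This is a sketch under assumptions: `screenshot_path`, the `label` argument, and the `screenshots` folder name are all made up for illustration.

```python
import datetime
from pathlib import Path


def screenshot_path(label, when=None, directory="screenshots"):
    """Build a path like screenshots/<label>_YYYYMMDD_HHMMSS.png."""
    when = when or datetime.datetime.now()
    stamp = when.strftime("%Y%m%d_%H%M%S")
    return Path(directory) / f"{label}_{stamp}.png"


# A fixed time makes the result predictable for this demo.
fixed_time = datetime.datetime(2024, 12, 15, 14, 30, 22)
path = screenshot_path("login_failed", fixed_time)
print(path.name)  # login_failed_20241215_143022.png
```

In a real test you would create the folder first with `path.parent.mkdir(parents=True, exist_ok=True)` and then call `driver.save_screenshot(str(path))`, since saving into a folder that does not exist will not work.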
🎯 Element Screenshots – Zoom In on One Thing!
What is an Element Screenshot?
Instead of photographing the whole room, you photograph just one toy on the shelf – a specific button, image, or text area.
graph LR
    A[🌐 Full Page] -->|Find Element| B[🔲 Button]
    B -->|Screenshot| C[📸 button.png]
Simple Example
# Step 1: Find the specific element
logo = driver.find_element("id", "site-logo")
# Step 2: Take a screenshot of JUST that element
logo.screenshot("logo_only.png")
Real-World Scenario
You want to screenshot just the login form on a page:
from selenium import webdriver
from selenium.webdriver.common.by import By
driver = webdriver.Chrome()
driver.get("https://example.com/login")
# Find the login form
login_form = driver.find_element(By.ID, "login-form")
# Screenshot just the form
login_form.screenshot("login_form.png")
💡 Why Use Element Screenshots?
| Full Page Screenshot | Element Screenshot |
|---|---|
| Captures the whole visible window | Captures one element |
| Bigger file size | Smaller file size |
| More context | More focused |
| Good for full layout | Good for specific checks |
Pro Tip: Combine with Assertions
# Take screenshot before and after an action
button = driver.find_element(By.ID, "submit-btn")
button.screenshot("button_before.png")
button.click()
# After the state changed (assumes the button is still on the page after the click)
button.screenshot("button_after.png")
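To turn the before and after photos into an actual assertion, you can compare the image data. The sketch below is a deliberately simple approach, assuming you grabbed the bytes with the element's `screenshot_as_png` property before and after the click. The helper names are hypothetical, and the placeholder bytes stand in for real captures.

```python
import hashlib


def image_fingerprint(png_bytes):
    """Return a short, comparable digest of the image bytes."""
    return hashlib.sha256(png_bytes).hexdigest()


def screenshots_differ(before, after):
    """True when the two captures are not byte-for-byte identical."""
    return image_fingerprint(before) != image_fingerprint(after)


# Placeholder bytes standing in for button.screenshot_as_png (assumption):
before = b"button looks idle"
after = b"button looks pressed"

assert screenshots_differ(before, after)       # the click changed something
assert not screenshots_differ(before, before)  # identical captures match
```

A byte-for-byte check is strict: even a tiny rendering difference counts as a change. If you need a tolerance for small pixel differences, an image comparison library is a better fit.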
🎮 Putting It All Together
Here’s a complete example that uses ALL the concepts:
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
driver = webdriver.Chrome()
# 1. Open main page
driver.get("https://example.com")
main_window = driver.current_window_handle
# 2. Take full page screenshot
driver.save_screenshot("main_page.png")
# 3. Screenshot a specific element
header = driver.find_element(By.TAG_NAME, "header")
header.screenshot("header_only.png")
# 4. Click link that opens new window
driver.find_element(By.LINK_TEXT, "Open Popup").click()
# 5. Wait for new window
WebDriverWait(driver, 10).until(
    EC.number_of_windows_to_be(2)
)
# 6. Switch to new window
for handle in driver.window_handles:
    if handle != main_window:
        driver.switch_to.window(handle)
        break
# 7. Screenshot the popup
driver.save_screenshot("popup_page.png")
# 8. Go back to main window
driver.switch_to.window(main_window)
# 9. Verify we're back
print(f"Back to: {driver.title}")
driver.quit()
🧠 Quick Memory Tricks
| Concept | Think of it as… |
|---|---|
| Window Handle | Room name tag |
| current_window_handle | “Which room am I in?” |
| window_handles | “List all rooms” |
| switch_to.window() | “Walk to that room” |
| save_screenshot() | “Photo of whole room” |
| element.screenshot() | “Photo of one toy” |
⚠️ Common Mistakes to Avoid
Mistake 1: Forgetting to Switch
# ❌ Wrong: Robot is still in old window!
driver.find_element("id", "popup-btn").click()
print(driver.title) # Still shows OLD title
# ✅ Right: Switch first!
driver.switch_to.window(new_handle)
print(driver.title) # Shows NEW title
Mistake 2: Using Wrong Handle
# ❌ Wrong: Hard-coding handle names
driver.switch_to.window("window1")
# ✅ Right: Get handles dynamically
handles = driver.window_handles
driver.switch_to.window(handles[-1])  # Often the newest, but the order is not guaranteed
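Because the WebDriver specification does not promise any particular order for window handles, a safer way to find the new room is to compare the name tags from before and after the click. Here is a minimal sketch; `newly_opened` is a hypothetical helper, and the handle strings are placeholders for what `driver.window_handles` would return.

```python
def newly_opened(handles_before, handles_after):
    """Return handles present after the click but not before it."""
    known = set(handles_before)
    return [h for h in handles_after if h not in known]


# Placeholder handle strings (assumption); real ones come from
# driver.window_handles before and after the click.
before = ["ABC123"]
after = ["XYZ789", "ABC123"]  # the new window is NOT last here

new_handles = newly_opened(before, after)
print(new_handles)  # ['XYZ789']
```

With a real driver you would then call `driver.switch_to.window(new_handles[0])` once the list is not empty.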
Mistake 3: Not Waiting for Window
# ❌ Wrong: May fail if window not ready
for h in driver.window_handles:
    ...
# ✅ Right: Wait for window to appear
WebDriverWait(driver, 10).until(
    EC.number_of_windows_to_be(2)
)
🚀 You Did It!
Now you know how to:
- ✅ Get and use window handles
- ✅ Navigate between multiple windows
- ✅ Take full page screenshots
- ✅ Take element-specific screenshots
You’re now a Window & Screenshot Master in Selenium! 🎉
Remember: Your browser is a house, windows are rooms, handles are name tags, and screenshots are photos. Simple!