What is a regular expression in Python?

A regular expression (regex) is a pattern you write to find text. Python's re module lets you search, match, and replace text patterns.

What are the main regex functions in Python?

Python has four main regex functions: search() finds the first match anywhere in the text, match() checks only the start of the text, findall() returns a list of all matches, and sub() replaces matches with new text.

What is the difference between greedy and non-greedy regex?

A greedy quantifier takes as much text as possible. A non-greedy (lazy) quantifier takes the minimum needed. Add ? after a quantifier (for example *? or +?) to make it non-greedy.

Regular Expressions in Python | Regex Guide

🔍 Regular Expressions: The Secret Code Finder

Imagine you have a magical magnifying glass that can find ANY pattern in a mountain of text. That’s regex!

🎭 The Story: Meet Detective Regex

You’re a detective. Your job? Finding specific patterns in huge piles of letters, emails, and documents.

Without regex, you’d read every single word. Boring!

With regex, you write a magic search spell and—BOOM—every match lights up instantly.

Let’s learn this superpower!

📚 What We’ll Learn

Regex Basics
Pattern Matching Functions
Match Objects and Groups
Metacharacters
Quantifiers and Anchors
Greedy vs Non-Greedy
Regex Flags

1️⃣ Regex Basics

What is Regex?

Regex = Regular Expression = A pattern you write to find text.

Think of it like a treasure map. The pattern is your map. The text is the jungle. Regex finds the treasure!

Your First Regex in Python

import re

text = "I love cats and dogs"
pattern = "cats"

result = re.search(pattern, text)
print(result)  # <re.Match object; span=(7, 11), match='cats'>

What happened?

We imported the re module (Python’s regex tool)
We wrote a simple pattern: "cats"
re.search() found “cats” in our text

The `r` Prefix (Raw Strings)

Always use r before your pattern:

pattern = r"\d+"  # Good!
pattern = "\d+"   # Risky! "\d" is an invalid string escape, so Python may warn

Why? The r tells Python: “Don’t mess with my backslashes!”
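
To see why the r matters, here is a small sketch comparing a plain string and a raw string that look the same on screen. The variable names (plain, raw, plain_hits, raw_hits) are made up for this illustration.

```python
import re

plain = "\bcat\b"    # in a normal string, "\b" becomes one backspace character
raw = r"\bcat\b"     # in a raw string, the backslash and the "b" both survive

print(len(plain))  # 5
print(len(raw))    # 7

# Only the raw version reaches the regex engine as the word-boundary anchor \b.
plain_hits = re.findall(plain, "cat category")
raw_hits = re.findall(raw, "cat category")

print(plain_hits)  # []
print(raw_hits)    # ['cat']
```

The plain string silently searches for backspace characters, which is almost never what you meant.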

2️⃣ Pattern Matching Functions

Python gives us 4 main tools:

`re.search()` - Find First Match

import re

text = "Call me at 555-1234"
match = re.search(r"\d+", text)

if match:
    print(match.group())  # 555

Finds the first run of digits in the text. The match stops at the hyphen, so we get 555 and not the whole phone number.

`re.match()` - Check the Beginning

text = "Hello World"

# This works (starts with Hello)
re.match(r"Hello", text)  # ✓

# This fails (World is not at start)
re.match(r"World", text)  # ✗

match() only looks at the beginning!
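
A quick side-by-side may help here. This sketch runs match() and search() with the same pattern on the same text; the names at_start and anywhere are just illustrative.

```python
import re

text = "Hello World"

at_start = re.match(r"World", text)    # None: "World" does not begin at index 0
anywhere = re.search(r"World", text)   # a Match object: search() scans the whole text

print(at_start)         # None
print(anywhere.span())  # (6, 11)
```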

`re.findall()` - Find ALL Matches

text = "I have 2 cats and 3 dogs"
numbers = re.findall(r"\d", text)

print(numbers)  # ['2', '3']

Returns a list of all matches!
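
One thing worth knowing: when the pattern contains capture groups (covered in the next section), findall() returns the groups instead of the whole match. A minimal sketch, with illustrative variable names:

```python
import re

text = "I have 2 cats and 3 dogs"

# No groups: each list item is the full matched text.
whole = re.findall(r"\d+ \w+", text)
print(whole)  # ['2 cats', '3 dogs']

# Two groups: each list item is a tuple with one entry per group.
pairs = re.findall(r"(\d+) (\w+)", text)
print(pairs)  # [('2', 'cats'), ('3', 'dogs')]
```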

`re.sub()` - Find and Replace

text = "I hate Mondays"
new_text = re.sub(r"hate", "love", text)

print(new_text)  # I love Mondays

Like Find-Replace in your text editor!
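
re.sub() can do a bit more than a text editor. The sketch below shows two extras: reusing captured groups in the replacement with \1, \2, \3, and limiting the number of replacements with the count argument. The sample strings are invented for the example.

```python
import re

# Reorder a date: \1, \2, \3 refer to the first, second, and third group.
swapped = re.sub(r"(\d{4})-(\d{2})-(\d{2})", r"\3/\2/\1", "2005-03-15")
print(swapped)  # 15/03/2005

# count=1 replaces only the first match and leaves the rest alone.
limited = re.sub(r"a", "o", "banana", count=1)
print(limited)  # bonana
```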

3️⃣ Match Objects and Groups

What’s a Match Object?

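When a regex function such as re.search() finds something, it returns a Match object: a little package of info about the match. When it finds nothing, it returns None, which is why the examples check `if match:` first.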

text = "My email is bob@mail.com"
match = re.search(r"\w+@\w+\.\w+", text)

if match:
    print(match.group())  # bob@mail.com
    print(match.start())  # 12 (where it starts)
    print(match.end())    # 24 (where it ends)
    print(match.span())   # (12, 24)
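
Because search() returns None when nothing matches, calling .group() without a check raises an AttributeError. One way to guard against that is a tiny helper like the one below; first_email is a hypothetical name, not part of the re module.

```python
import re

def first_email(text):
    """Return the first email-like string in text, or None if there is none."""
    match = re.search(r"\w+@\w+\.\w+", text)
    return match.group() if match else None

print(first_email("My email is bob@mail.com"))  # bob@mail.com
print(first_email("no email here"))             # None
```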

Groups: Capture Parts

Use parentheses () to capture pieces:

text = "Born on 2005-03-15"
pattern = r"(\d{4})-(\d{2})-(\d{2})"

match = re.search(pattern, text)

if match:
    print(match.group(0))  # 2005-03-15 (full)
    print(match.group(1))  # 2005 (year)
    print(match.group(2))  # 03 (month)
    print(match.group(3))  # 15 (day)

Think of it like boxes inside boxes!

┌────────────────────────┐
│   Full Match (group 0) │
│  ┌─────┐ ┌────┐ ┌────┐ │
│  │2005 │ │ 03 │ │ 15 │ │
│  │ (1) │ │(2) │ │(3) │ │
│  └─────┘ └────┘ └────┘ │
└────────────────────────┘
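
If you want all the boxes at once, match.groups() returns them as a tuple, which unpacks nicely. A short sketch using the same date pattern (the names year, month, day are just illustrative):

```python
import re

match = re.search(r"(\d{4})-(\d{2})-(\d{2})", "Born on 2005-03-15")

# groups() returns groups 1..n as a tuple; group 0 (the full match) is not included.
year, month, day = match.groups()

print(match.groups())    # ('2005', '03', '15')
print(year, month, day)  # 2005 03 15
```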

Named Groups

Give your groups names for clarity:

pattern = r"(?P<year>\d{4})-(?P<month>\d{2})"
match = re.search(pattern, "Date: 2024-08")

print(match.group('year'))   # 2024
print(match.group('month'))  # 08
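
Named groups also come back as a dictionary through match.groupdict(), which can be handy when passing results around. A minimal sketch (the variable name parts is an assumption for illustration):

```python
import re

pattern = r"(?P<year>\d{4})-(?P<month>\d{2})"
match = re.search(pattern, "Date: 2024-08")

parts = match.groupdict()
print(parts)          # {'year': '2024', 'month': '08'}
print(parts["year"])  # 2024
```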

4️⃣ Metacharacters

Metacharacters are magic symbols with special powers:

The Dot `.` - Match Any Character

re.findall(r"c.t", "cat cot cut")
# ['cat', 'cot', 'cut']

The . matches any single character!
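
Because the dot matches almost anything, it is easy to match too much when you really meant a literal period. Escaping it as \. fixes that. A small sketch with invented sample text:

```python
import re

text = "3.14 3x14 3-14"

# Unescaped dot: any character is accepted between 3 and 14.
loose = re.findall(r"3.14", text)
print(loose)   # ['3.14', '3x14', '3-14']

# Escaped dot: only a real period is accepted.
strict = re.findall(r"3\.14", text)
print(strict)  # ['3.14']
```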

Character Classes `[]`

# Match a, e, i, o, or u
re.findall(r"[aeiou]", "hello")
# ['e', 'o']

# Match any digit
re.findall(r"[0-9]", "abc123")
# ['1', '2', '3']

Negation `[^]`

# Match anything EXCEPT vowels
re.findall(r"[^aeiou]", "hello")
# ['h', 'l', 'l']
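
Ranges can be combined inside one pair of brackets. As a rough sketch, the class below accepts lowercase hexadecimal characters, and the negated version picks up everything else except spaces. The sample text is made up.

```python
import re

text = "ff 12 zz 7e"

# Digits 0-9 plus letters a-f, one or more in a row.
hex_runs = re.findall(r"[0-9a-f]+", text)
print(hex_runs)  # ['ff', '12', '7e']

# Anything that is NOT a hex character and NOT a space.
others = re.findall(r"[^0-9a-f ]+", text)
print(others)    # ['zz']
```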

Shorthand Classes

Symbol	Meaning	Same As (for ASCII text)
`\d`	Any digit	`[0-9]`
`\D`	Not a digit	`[^0-9]`
`\w`	Word character	`[a-zA-Z0-9_]`
`\W`	Not word char	`[^a-zA-Z0-9_]`
`\s`	Whitespace	`[ \t\n\r\f\v]`
`\S`	Not whitespace	`[^ \t\n\r\f\v]`

text = "Call 555-1234 now!"

re.findall(r"\d", text)  # ['5','5','5','1'...]
re.findall(r"\w+", text) # ['Call','555','1234','now']
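
A detail that can surprise people: on normal Python strings, \d and \w also match digits and letters from other scripts, not only ASCII. The re.ASCII flag restricts them to the bracket forms in the table. This sketch uses the Arabic-Indic digit three (written as the escape \u0663) as sample data.

```python
import re

text = "4\u0663"  # an ASCII 4 followed by the Arabic-Indic digit three

# Default behaviour on str patterns: any Unicode decimal digit counts.
all_digits = re.findall(r"\d", text)
print(len(all_digits))  # 2

# With re.ASCII, \d behaves exactly like [0-9].
ascii_digits = re.findall(r"\d", text, re.ASCII)
print(ascii_digits)     # ['4']
```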

The Pipe `|` - OR

re.findall(r"cat|dog", "I have a cat and dog")
# ['cat', 'dog']
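
The pipe applies to everything on each side of it, so parentheses are useful when only part of the pattern should vary. Here is a sketch; note that a plain (...) group changes what findall() returns, while (?:...) groups without capturing.

```python
import re

text = "gray grey groy"

spelled_out = re.findall(r"gray|grey", text)
print(spelled_out)  # ['gray', 'grey']

# (?:a|e) limits the OR to a single letter and does not capture it.
grouped = re.findall(r"gr(?:a|e)y", text)
print(grouped)      # ['gray', 'grey']

# A capturing group makes findall() return only the captured letter.
captured = re.findall(r"gr(a|e)y", text)
print(captured)     # ['a', 'e']
```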

5️⃣ Quantifiers and Anchors

Quantifiers: How Many?

Symbol	Meaning	Example
`*`	0 or more	`a*` → “”, “a”, “aaa”
`+`	1 or more	`a+` → “a”, “aaa”
`?`	0 or 1	`a?` → “”, “a”
`{n}`	Exactly n	`a{3}` → “aaa”
`{n,}`	n or more	`a{2,}` → “aa”, “aaa”
`{n,m}`	n to m	`a{2,4}` → “aa”, “aaa”, “aaaa”

text = "goood morning gooooood day"

re.findall(r"go+d", text)
# ['goood', 'gooooood']

re.findall(r"go{2,4}d", text)
# ['goood'] (only 2-4 o's)
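
Two more quantifiers from the table in action, as a quick sketch with invented sample text: ? for an optional letter and {n,} for a minimum count.

```python
import re

# u? means the "u" may appear zero or one time.
spellings = re.findall(r"colou?r", "color colour colouur")
print(spellings)    # ['color', 'colour']

# {2,} means two or more digits in a row, so the lone 7 is skipped.
multi_digit = re.findall(r"\d{2,}", "7 42 555 1234")
print(multi_digit)  # ['42', '555', '1234']
```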

Anchors: Where to Look?

Symbol	Meaning
`^`	Start of string
`$`	End of string
`\b`	Word boundary

text = "hello world"

re.search(r"^hello", text)  # ✓ Matches
re.search(r"^world", text)  # ✗ No match

re.search(r"world$", text)  # ✓ Matches
re.search(r"hello$", text)  # ✗ No match
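
One subtlety with $: it also matches just before a newline at the very end of the string. If that matters, \Z or re.fullmatch() are stricter. A small sketch, with illustrative variable names:

```python
import re

text = "hello world\n"  # note the trailing newline

end_anchor = re.search(r"world$", text)     # matches: $ tolerates one final newline
strict_end = re.search(r"world\Z", text)    # None: \Z means the true end of the string
whole = re.fullmatch(r"hello world", text)  # None: the newline is left unmatched

print(end_anchor is not None)  # True
print(strict_end)              # None
print(whole)                   # None
```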

Word Boundaries:

text = "cat category caterpillar"

re.findall(r"\bcat\b", text)
# ['cat'] - only the standalone word!

re.findall(r"cat", text)
# ['cat', 'cat', 'cat'] - all occurrences
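
Word boundaries are especially useful with re.sub(), where a missing \b can mangle longer words. A sketch using the same text:

```python
import re

text = "cat category caterpillar"

# With \b on both sides, only the standalone word is replaced.
careful = re.sub(r"\bcat\b", "dog", text)
print(careful)   # dog category caterpillar

# Without boundaries, every "cat" inside longer words is replaced too.
careless = re.sub(r"cat", "dog", text)
print(careless)  # dog dogegory dogerpillar
```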

6️⃣ Greedy vs Non-Greedy

The Hungry Monster (Greedy)

By default, regex is GREEDY. It wants as much as possible!

text = "<h1>Title</h1><p>Text</p>"

# Greedy (default)
re.findall(r"<.*>", text)
# ['<h1>Title</h1><p>Text</p>']
# Ate EVERYTHING between first < and last >

The Polite Monster (Non-Greedy)

Add ? after a quantifier to make it lazy:

# Non-greedy
re.findall(r"<.*?>", text)
# ['<h1>', '</h1>', '<p>', '</p>']
# Takes minimum needed!
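
A common alternative to the lazy quantifier is a negated character class: "anything that is not >". For this text it gives the same tags. The sketch also shows a lazy quantifier inside a group to pull out the content between two tags. Regex is fine for small snippets like this, though it is not a full HTML parser.

```python
import re

text = "<h1>Title</h1><p>Text</p>"

# [^>]* can never run past a closing >, so no lazy ? is needed.
tags = re.findall(r"<[^>]*>", text)
print(tags)       # ['<h1>', '</h1>', '<p>', '</p>']

# Lazy match inside a group: capture only what sits between <p> and </p>.
paragraph = re.findall(r"<p>(.*?)</p>", text)
print(paragraph)  # ['Text']
```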

Visual Comparison

Text: <b>bold</b>

Greedy  <.*>  : <────────────>
               <b>bold</b>

Lazy    <.*?> : <──>   <───>
               <b>   </b>

All Non-Greedy Versions

Greedy	Non-Greedy
`*`	`*?`
`+`	`+?`
`?`	`??`
`{n,m}`	`{n,m}?`
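
The less common rows of the table behave the same way. A minimal sketch comparing {n,m} with {n,m}? and ? with ??, using made-up inputs:

```python
import re

# {2,4} grabs up to four digits; {2,4}? stops at the minimum of two.
greedy_range = re.search(r"\d{2,4}", "123456").group()
lazy_range = re.search(r"\d{2,4}?", "123456").group()
print(greedy_range, lazy_range)        # 1234 12

# b? takes the optional "b" when it can; b?? prefers to skip it.
greedy_optional = re.search(r"ab?", "ab").group()
lazy_optional = re.search(r"ab??", "ab").group()
print(greedy_optional, lazy_optional)  # ab a
```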

7️⃣ Regex Flags

Flags change how your pattern works:

`re.IGNORECASE` (or `re.I`)

text = "Hello HELLO hello"

re.findall(r"hello", text)
# ['hello']

re.findall(r"hello", text, re.I)
# ['Hello', 'HELLO', 'hello']
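
The same flag can also be written inside the pattern as (?i) at the very start, or passed by keyword as flags=re.I. Both forms in this sketch should give the same result as the call above.

```python
import re

text = "Hello HELLO hello"

# Inline flag: (?i) at the start of the pattern switches on case-insensitivity.
inline = re.findall(r"(?i)hello", text)
print(inline)   # ['Hello', 'HELLO', 'hello']

# Keyword form: the same flag passed explicitly by name.
keyword = re.findall(r"hello", text, flags=re.I)
print(keyword)  # ['Hello', 'HELLO', 'hello']
```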

`re.MULTILINE` (or `re.M`)

Makes ^ and $ work on each line:

text = """Line 1
Line 2
Line 3"""

re.findall(r"^Line", text)
# ['Line'] - only first line

re.findall(r"^Line", text, re.M)
# ['Line', 'Line', 'Line'] - all lines!
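
The flag affects $ in the same way. This sketch grabs the last digit of each line; the variable names are only for illustration.

```python
import re

text = "Line 1\nLine 2\nLine 3"

# Without re.M, $ only matches at the end of the whole string.
last_only = re.findall(r"\d$", text)
print(last_only)  # ['3']

# With re.M, $ matches at the end of every line.
per_line = re.findall(r"\d$", text, re.M)
print(per_line)   # ['1', '2', '3']
```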

`re.DOTALL` (or `re.S`)

Makes . match newlines too:

text = "Hello\nWorld"

re.search(r"Hello.World", text)    # ✗ No match
re.search(r"Hello.World", text, re.S)  # ✓ Match!
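
Where this tends to matter in practice is capturing a block that spans several lines. A sketch with an invented two-line paragraph:

```python
import re

text = "<p>line one\nline two</p>"

# Without re.S the dot stops at the newline, so the closing tag is never reached.
without_flag = re.search(r"<p>(.*)</p>", text)
print(without_flag)        # None

# With re.S the dot crosses the newline and the whole block is captured.
with_flag = re.search(r"<p>(.*)</p>", text, re.S)
print(with_flag.group(1))  # prints both lines
```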

`re.VERBOSE` (or `re.X`)

Write readable patterns with comments:

pattern = r"""
    \d{3}    # Area code
    -        # Separator
    \d{4}    # Phone number
"""

re.search(pattern, "555-1234", re.X)
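
Verbose patterns pair well with re.compile(), which builds the pattern once so it can be reused. One catch: in verbose mode plain spaces in the pattern are ignored, so a literal space has to go inside brackets (or be escaped). The name phone is an assumption for this sketch.

```python
import re

phone = re.compile(r"""
    (\d{3})   # first three digits
    [ -]      # separator: a space or a hyphen (kept because it is inside [])
    (\d{4})   # last four digits
""", re.VERBOSE)

match = phone.search("Call 555 1234 today")
print(match.group())   # 555 1234
print(match.groups())  # ('555', '1234')
```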

Combining Flags

Use the | operator:

re.findall(r"hello", text, re.I | re.M)
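
Here is a self-contained sketch of two flags working together: re.M lets ^ match at each line start, and re.I ignores case. The sample text is made up.

```python
import re

text = "Hello\nhello\nHELLO"

# ^ alone matches only at the very start, and the case must match exactly.
plain = re.findall(r"^hello", text)
print(plain)     # []

# Both flags combined with | : every line start, any letter case.
combined = re.findall(r"^hello", text, re.I | re.M)
print(combined)  # ['Hello', 'hello', 'HELLO']
```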

🏁 Quick Reference Flow

graph TD
    A["Start"] --> B{What do you need?}
    B --> C["Find first match"]
    C --> D["re.search"]
    B --> E["Check start only"]
    E --> F["re.match"]
    B --> G["Find all matches"]
    G --> H["re.findall"]
    B --> I["Replace text"]
    I --> J["re.sub"]

🎯 Real-World Examples

Validate an Email

pattern = r"^[\w.-]+@[\w.-]+\.\w+$"

re.match(pattern, "user@email.com")  # ✓
re.match(pattern, "bad-email")       # ✗
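
Wrapped in a function, the check might look like the sketch below. is_valid_email is a hypothetical helper, and re.fullmatch() is used so the ^ and $ anchors can be left out. Keep in mind this pattern is a simple shape check and will not catch every invalid address.

```python
import re

EMAIL_PATTERN = re.compile(r"[\w.-]+@[\w.-]+\.\w+")

def is_valid_email(address):
    """Return True if the whole string has a simple name@domain.tld shape."""
    return EMAIL_PATTERN.fullmatch(address) is not None

print(is_valid_email("user@email.com"))  # True
print(is_valid_email("bad-email"))       # False
```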

Extract Phone Numbers

text = "Call 555-123-4567 or 999-876-5432"
pattern = r"\d{3}-\d{3}-\d{4}"

re.findall(pattern, text)
# ['555-123-4567', '999-876-5432']
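
Real text is rarely so tidy. As a rough variation, a character class can accept a hyphen, a dot, or a space as the separator. The sample numbers here are invented.

```python
import re

text = "Call 555.123.4567 or 999 876 5432"

# [-. ] accepts a hyphen, a period, or a space between the digit groups.
pattern = r"\d{3}[-. ]\d{3}[-. ]\d{4}"

numbers = re.findall(pattern, text)
print(numbers)  # ['555.123.4567', '999 876 5432']
```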

Clean Extra Spaces

text = "Too   many    spaces"
clean = re.sub(r"\s+", " ", text)

print(clean)  # "Too many spaces"
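
Since \s also covers tabs and newlines, the same call flattens line breaks too. Adding .strip() removes the single space that would otherwise remain at each end. A short sketch with made-up input:

```python
import re

text = "  Too   many\n spaces  "

# Collapse every run of whitespace to one space, then trim both ends.
tidy = re.sub(r"\s+", " ", text).strip()

print(tidy)  # Too many spaces
```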

🌟 You Did It!

You now have regex superpowers!

Remember:

🔍 search() finds first
📋 findall() finds all
🔄 sub() replaces
📦 Groups () capture parts
⚡ Flags change behavior

Practice makes perfect. Try building patterns for:

URLs
Dates
Usernames
Hashtags

Happy pattern hunting! 🎉
