Text Processing

C++ Text Processing: Your Magic Toolbox for Words! 🧰

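Imagine you have a toy box with special tools. Each tool helps you play with words and sentences in different ways. Today, we’ll explore four amazing tools from the C++ standard library. Regular expressions arrived in C++11 and string_view in C++17, while span and std::format need C++20: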

  1. Regular Expressions – The Pattern Detective 🔍
  2. span – The Window Viewer 🪟
  3. string_view – The Photo Frame 🖼️
  4. std::format – The Art Painter 🎨

🔍 Regular Expressions: The Pattern Detective

What Is It?

Think of regular expressions (or “regex”) like a super detective who can find anything you describe!

If you tell the detective: “Find all words that start with ‘cat’”—the detective will find “cat”, “cats”, “caterpillar”, and “catalog”!

The Simple Idea

Regular expressions help you find patterns in text.

  • Want to find all phone numbers? ✅
  • Want to find all email addresses? ✅
  • Want to check if a password is strong? ✅

How Does It Work?

#include <regex>
#include <string>
#include <iostream>

int main() {
    std::string text = "My email is bob@mail.com";
    std::regex pattern(R"(\w+@\w+\.\w+)");

    if (std::regex_search(text, pattern)) {
        std::cout << "Found an email!";
    }
    return 0;
}
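The example above only answers "is there an email?". Here is a minimal sketch of how the detective could also hand back what was found. The helper names `find_emails` and `check_find_emails` are made up for this illustration, and the pattern is the same simple one from above, not a full email validator.

```cpp
#include <cassert>
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper (not part of the standard library): collects every
// piece of `text` that matches the simple email-like pattern used above.
std::vector<std::string> find_emails(const std::string& text) {
    // static: compile the pattern once and reuse it on later calls
    static const std::regex pattern(R"(\w+@\w+\.\w+)");

    std::vector<std::string> found;
    // sregex_iterator walks through the matches one by one
    for (std::sregex_iterator it(text.begin(), text.end(), pattern), end;
         it != end; ++it) {
        found.push_back(it->str());  // str() is the matched text
    }
    return found;
}

// Small self-check for the helper above.
void check_find_emails() {
    auto found = find_emails("Write to bob@mail.com or amy@site.org today");
    assert(found.size() == 2);
    assert(found[0] == "bob@mail.com");
    assert(found[1] == "amy@site.org");
    assert(find_emails("no address here").empty());
}
```

Call `check_find_emails()` from `main` to try it. std::regex_search stops at the first match, so the sketch uses std::sregex_iterator when it wants all of them.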

Pattern Building Blocks

| Symbol | Meaning | Example |
| --- | --- | --- |
| \d | Any digit (0-9) | \d\d\d finds “123” |
| \w | Any letter, digit, or _ | \w+ finds words |
| . | Any single character | c.t finds “cat”, “cut” |
| + | One or more | a+ finds “a”, “aaa” |
| * | Zero or more | ab* finds “a”, “ab”, “abb” |
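To see the building blocks in action, here is a small sketch that tries each row of the table. The helper `matches` is a made-up name for this illustration. It uses std::regex_match, so the whole text has to fit the pattern.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical helper: true when the WHOLE of `text` fits `pattern`.
bool matches(const std::string& text, const std::string& pattern) {
    return std::regex_match(text, std::regex(pattern));
}

// One check per row of the table above.
void check_building_blocks() {
    assert(matches("123", R"(\d\d\d)"));     // \d = any digit
    assert(!matches("12a", R"(\d\d\d)"));    // 'a' is not a digit
    assert(matches("hello_42", R"(\w+)"));   // \w = letter, digit, or _
    assert(matches("cat", "c.t"));           // .  = any single character
    assert(matches("cut", "c.t"));
    assert(matches("aaa", "a+"));            // +  = one or more
    assert(!matches("", "a+"));              // zero a's is not enough for +
    assert(matches("a", "ab*"));             // *  = zero or more
    assert(matches("abb", "ab*"));
}
```

The raw string literal R"(...)" keeps the backslashes readable. Without it, \d would have to be written as "\\d".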

Real-Life Story

Imagine you’re checking 1000 usernames. You need to make sure each one:

  • Starts with a letter
  • Has only letters and numbers
  • Is 3-15 characters long

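Without regex, you’d write a loop full of character checks. With regex? One line! Check it with std::regex_match, which only says yes when the whole name fits the pattern: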

std::regex valid_username(R"([a-zA-Z][a-zA-Z0-9]{2,14})");
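Here is a minimal sketch of how that one-line pattern could be used. The function names `is_valid_username` and `check_usernames` are assumptions for this example. It also shows why std::regex_match is the safer choice here: std::regex_search is happy as soon as any part of the text fits.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical helper built on the pattern above.
bool is_valid_username(const std::string& name) {
    static const std::regex valid_username(R"([a-zA-Z][a-zA-Z0-9]{2,14})");
    // regex_match: the WHOLE name must fit the pattern
    return std::regex_match(name, valid_username);
}

void check_usernames() {
    assert(is_valid_username("Alex123"));
    assert(!is_valid_username("9lives"));            // starts with a digit
    assert(!is_valid_username("ab"));                // only 2 characters
    assert(!is_valid_username("bad_name!"));         // '_' and '!' not allowed
    assert(!is_valid_username("abcdefghijklmnop"));  // 16 characters

    // regex_search would accept "bad_name!" because "bad" fits on its own
    const std::string sneaky = "bad_name!";
    const std::regex pattern(R"([a-zA-Z][a-zA-Z0-9]{2,14})");
    assert(std::regex_search(sneaky, pattern));
}
```

The first character class uses up one character and {2,14} allows 2 to 14 more, which is where the 3-15 length rule comes from.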

🪟 span: The Window Viewer

What Is It?

Imagine looking through a window into a toy store. You can see all the toys, but you can’t take them home through the window!

span is like that window. It lets you look at a piece of an array without copying it.

The Simple Idea

A span is a view into a collection. It sees the data but doesn’t own it.

Why Is This Cool?

Think of copying a book vs. looking at a book in the library:

  • Copying = Slow, uses memory
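  • Looking = Fast, no extra memory!

#include <span>
#include <vector>
#include <iostream>

// span<const int>: the function can look at the items but not change them
void print_items(std::span<const int> items) {
    for (int item : items) {
        std::cout << item << " ";
    }
    std::cout << "\n";
}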

int main() {
    std::vector<int> numbers = {1, 2, 3, 4, 5};

    // View the whole vector
    print_items(numbers);

    // View just the first 3
    print_items(std::span(numbers).first(3));

    return 0;
}

The Magic Powers of span

graph TD
    A["Original Array"] --> B["span - Full View"]
    B --> C["first 3 - See beginning"]
    B --> D["last 2 - See ending"]
    B --> E["subspan - See middle"]

Key Features

| Method | What It Does |
| --- | --- |
| first(n) | View first n elements |
| last(n) | View last n elements |
| subspan(start, count) | View middle portion |
| size() | How many items visible |
| empty() | Is the view empty? |

🖼️ string_view: The Photo Frame

What Is It?

Imagine you have a photo frame that can show any part of a long picture. You don’t cut the picture—you just frame the part you want to see!

string_view is a photo frame for text. It shows you part of a string without making a copy.

The Simple Idea

string_view lets you peek at text super fast without creating new strings.

Before and After

The OLD way (slow, copies text):

void greet(std::string name) {
    // Creates a COPY of the name
    std::cout << "Hello, " << name;
}

The NEW way (fast, no copy):

void greet(std::string_view name) {
    // Just LOOKS at the name
    std::cout << "Hello, " << name;
}

Why Does This Matter?

Imagine you have a book with 1 million words. Someone asks: “What’s on page 500?”

  • Old way: Photocopy page 500, hand them the copy
  • New way: Point at page 500, let them read it

Which is faster? Pointing! That’s string_view.
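Pointing only works while the book is still on the shelf. A string_view does not own its text, so the original string has to outlive the view. Here is a minimal sketch of the safe pattern, using a made-up helper called `first_word`:

```cpp
#include <cassert>
#include <string>
#include <string_view>

// Hypothetical helper: returns a view of the text before the first space
// (or of the whole text when there is no space).
std::string_view first_word(std::string_view text) {
    return text.substr(0, text.find(' '));
}

void check_first_word() {
    std::string sentence = "Hello World";

    // Fine: `sentence` lives longer than `word`
    std::string_view word = first_word(sentence);
    assert(word == "Hello");
    assert(word.data() == sentence.data());  // same memory, nothing copied

    // Need to keep the text around? Copy it into a std::string
    std::string owned(first_word(sentence));
    assert(owned == "Hello");

    // Risky (not done here): first_word(std::string("temp text")) would
    // return a view into a temporary string that is already destroyed.
}
```

A handy rule of thumb: string_view is great for function parameters, and needs extra care when it is stored or returned.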

Full Example

#include <string_view>
#include <string>
#include <iostream>

void analyze(std::string_view text) {
    std::cout << "Length: " << text.size() << "\n";
    std::cout << "First: " << text.front() << "\n";
    std::cout << "Last: " << text.back() << "\n";
}

int main() {
    // Works with string literals!
    analyze("Hello World");

    // Works with std::string!
    std::string message = "C++ is fun";
    analyze(message);

    return 0;
}

Helpful Methods

| Method | What It Does |
| --- | --- |
| substr(pos, len) | View a portion |
| remove_prefix(n) | Skip first n chars |
| remove_suffix(n) | Hide last n chars |
| starts_with("x") | Does it start with “x”? (C++20) |
| ends_with("x") | Does it end with “x”? (C++20) |

🎨 std::format: The Art Painter

What Is It?

Imagine you’re a painter who wants to write a birthday card. You have:

  • A template: “Happy Birthday, ___! You are ___ years old!”
  • The details: “Emma” and “7”

You paint the details into the blanks!

std::format is your painting tool. It fills in blanks beautifully!

The Simple Idea

std::format takes a template with placeholders {} and fills them with values.

The Old vs. New Way

The OLD confusing way:

std::cout << "Name: " << name << ", Age: "
          << age << ", Score: " << score;

The NEW clean way:

auto text = std::format(
    "Name: {}, Age: {}, Score: {}",
    name, age, score
);

So much cleaner! 🎉

Complete Example

#include <format>
#include <iostream>
#include <string>

int main() {
    std::string name = "Alex";
    int age = 10;
    double score = 95.5;

    // Basic formatting
    auto msg = std::format(
        "Hi {}! You're {} years old!",
        name, age
    );
    std::cout << msg << "\n";

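    // Number formatting: one decimal place
    std::cout << std::format("Score: {:.1f}", score) << "\n";

    // Number formatting: two decimal places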
    auto price = std::format(
        "Price: ${:.2f}",
        19.99
    );
    std::cout << price << "\n";

    return 0;
}

Format Tricks

graph TD
    A["std::format"] --> B["{} = Fill in order"]
    A --> C["{0} = First value"]
    A --> D["{:.2f} = 2 decimals"]
    A --> E["{:>10} = Right align"]
    A --> F["{:0>5} = Pad with zeros"]

Formatting Cheat Codes

| Format | What It Does | Example |
| --- | --- | --- |
| {} | Default display | 42 → “42” |
| {:.2f} | 2 decimal places | 3.14159 → “3.14” |
| {:>10} | Right-align, width 10 | "hi" → "        hi" |
| {:0>5} | Pad with zeros | 42 → “00042” |
| {:#x} | Hex with 0x prefix | 255 → “0xff” |

Real-Life Example: Report Card

#include <format>

auto report = std::format(
    "Student: {}\n"
    "Math: {:>3}/100\n"
    "Science: {:>3}/100\n"
    "Average: {:.1f}%",
    "Emma", 95, 88, 91.5
);

Output:

Student: Emma
Math:  95/100
Science:  88/100
Average: 91.5%

🌟 Putting It All Together

Here’s when to use each tool:

| Tool | Use When… |
| --- | --- |
| regex | Finding patterns in text |
| span | Viewing array sections |
| string_view | Reading strings fast |
| std::format | Building formatted text |

A Complete Story

You’re building a username validator:

#include <regex>
#include <string_view>
#include <format>
#include <iostream>

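bool check_username(std::string_view name) {
    // string_view = Fast reading!
    // Quick length check before the regex runs
    if (name.size() < 3 || name.size() > 15) {
        return false;
    }

    // regex = Pattern matching!
    // static: build the pattern once, not on every call
    static const std::regex pattern(R"([a-zA-Z][a-zA-Z0-9]{2,14})");
    return std::regex_match(
        name.begin(), name.end(), pattern
    );
}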

int main() {
    std::string_view user = "Alex123";

    // std::format = Beautiful output!
    if (check_username(user)) {
        std::cout << std::format(
            "✅ '{}' is valid!", user
        );
    } else {
        std::cout << std::format(
            "❌ '{}' is invalid!", user
        );
    }
    return 0;
}

🎯 Key Takeaways

  1. regex = Find patterns like a detective 🔍
  2. span = View arrays through a window 🪟
  3. string_view = Frame text without copying 🖼️
  4. std::format = Paint beautiful output 🎨

You now have four super tools in your C++ toolbox. Each one makes working with text faster, cleaner, and more fun!

Go build something amazing! 🚀
