🔍 String Matching: Finding Needles in a Haystack
Imagine you lost your favorite toy in a huge pile of similar-looking toys. How would you find it? Would you check every single toy one by one? That works, but it’s slow! What if you had a magic trick to skip the toys you already know don’t match?
That’s exactly what string matching algorithms do! They help computers find a small word (called a pattern) inside a big text (called the haystack) — and they do it super fast.
🎯 What You’ll Learn
- KMP Algorithm — The “Memory Helper” that never forgets
- Rabin-Karp Algorithm — The “Fingerprint Detective”
- Longest Common Prefix (LCP) — Finding what strings have in common
📚 The Story of the Lost Word
Let’s say you have a long story:
TEXT: "ABABDABACDABABCABAB"
And you’re looking for:
PATTERN: "ABABCABAB"
The slow way: Compare letter by letter. If something doesn’t match, slide forward one position and start comparing all over again. This is like re-checking toys you have already looked at!
The smart way: Remember what you already checked. If something doesn’t match, use your memory to skip ahead!
🧠 1. KMP Algorithm (Knuth-Morris-Pratt)
The “Memory Helper” Trick
Imagine this: You’re reading a book looking for the word “ABABC”. You’re at “ABAB” and the next letter is “D” instead of “C”.
❌ Dumb approach: Go back to the beginning and start over.
✅ KMP approach: Wait! “AB” at the end of “ABAB” looks just like “AB” at the beginning! I can continue from there!
How KMP Works
KMP builds a “failure function” (also called LPS — Longest Prefix Suffix) that tells us: “If we fail here, where can we safely continue?”
graph TD
    A[Start Matching] --> B{Characters Match?}
    B -->|Yes| C[Move Both Pointers Forward]
    C --> B
    B -->|No| D{Pattern at Start?}
    D -->|Yes| E[Move Text Pointer Only]
    D -->|No| F[Use LPS to Skip Ahead]
    F --> B
    E --> B
Building the LPS Array
For pattern “ABABCABAB”:
| Index | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 |
|---|---|---|---|---|---|---|---|---|---|
| Char | A | B | A | B | C | A | B | A | B |
| LPS | 0 | 0 | 1 | 2 | 0 | 1 | 2 | 3 | 4 |
What does this mean?
- LPS[3] = 2 means: At position 3, the longest prefix that’s also a suffix is length 2 (“AB”)
- If we fail after position 3, we can skip back to position 2 instead of starting over!
Java Code Example
public int[] buildLPS(String pattern) {
    int[] lps = new int[pattern.length()];
    int len = 0;   // length of the current longest prefix that is also a suffix
    int i = 1;     // lps[0] is always 0, so start at index 1
    while (i < pattern.length()) {
        if (pattern.charAt(i) == pattern.charAt(len)) {
            len++;
            lps[i] = len;
            i++;
        } else if (len > 0) {
            // Fall back to the previous, shorter prefix-suffix instead of restarting
            len = lps[len - 1];
        } else {
            lps[i] = 0;
            i++;
        }
    }
    return lps;
}
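Building the LPS array is only half of KMP; the other half is the search loop that uses it. Here is a minimal sketch of that loop, assuming the buildLPS method above (the name kmpSearch and the printed output are choices made for this illustration):

public void kmpSearch(String text, String pattern) {
    if (pattern.isEmpty()) return;   // nothing to search for
    int[] lps = buildLPS(pattern);
    int i = 0;  // position in the text
    int j = 0;  // position in the pattern
    while (i < text.length()) {
        if (text.charAt(i) == pattern.charAt(j)) {
            i++;
            j++;
            if (j == pattern.length()) {
                System.out.println("Found at: " + (i - j));
                j = lps[j - 1];  // keep going to find overlapping matches
            }
        } else if (j > 0) {
            j = lps[j - 1];      // reuse what we already matched; i stays put
        } else {
            i++;                 // mismatch at the pattern start: just advance in the text
        }
    }
}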
Simple Example
Finding “ABA” in “ABABA”:
Text:    A B A B A
Pattern: A B A
         ✓ ✓ ✓        → Found at index 0!

Shift the pattern forward using the LPS array...

Text:    A B A B A
Pattern:     A B A
             ✓ ✓ ✓    → Found at index 2!
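Running a search like the kmpSearch sketch above on the story from the start of this section would report a single match:

// Using the text and pattern from "The Story of the Lost Word"
kmpSearch("ABABDABACDABABCABAB", "ABABCABAB");
// Prints: Found at: 10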
Why KMP is Fast: Time complexity is O(n + m), where n = text length and m = pattern length. The text pointer never moves backward!
🔐 2. Rabin-Karp Algorithm
The “Fingerprint Detective” Trick
Imagine: Instead of comparing every letter, what if each word had a fingerprint (a number)? You could just compare fingerprints!
- If fingerprints don’t match → Words definitely don’t match ❌
- If fingerprints match → Check the actual letters (might be a false alarm!) ✓
The Magic: Rolling Hash
The secret sauce is the rolling hash. When you slide your window by one character:
Window 1: [A B C] D E → Hash = 123
Window 2: A [B C D] E → Hash = 234 (calculated from 123!)
You don’t recalculate everything! You just:
- Remove the old first character’s contribution
- Multiply by the base (this “shifts” every remaining character one place)
- Add the new last character
Java Code Example
public void rabinKarp(String text, String pattern) {
    int d = 256; // alphabet size
    int q = 101; // a prime number
    int m = pattern.length();
    int n = text.length();
    int h = 1;
    // h = d^(m-1) % q
    for (int i = 0; i < m - 1; i++)
        h = (h * d) % q;
    // Calculate initial hashes
    int p = 0, t = 0;
    for (int i = 0; i < m; i++) {
        p = (d * p + pattern.charAt(i)) % q;
        t = (d * t + text.charAt(i)) % q;
    }
    // Slide and compare
    for (int i = 0; i <= n - m; i++) {
        if (p == t) {
            // Verify actual characters
            if (text.substring(i, i + m).equals(pattern))
                System.out.println("Found at: " + i);
        }
        // Calculate next hash
        if (i < n - m) {
            t = (d * (t - text.charAt(i) * h) + text.charAt(i + m)) % q;
            if (t < 0) t += q;
        }
    }
}
Simple Example
Finding “AB” in “AABAB”:
Let’s use a toy base-10 hash: A = 1, B = 2
Pattern "AB" → Hash = 1×10 + 2 = 12
Text windows:
Index 0: [AA]BAB → 1×10 + 1 = 11 ≠ 12 ❌
Index 1: A[AB]AB → 1×10 + 2 = 12 ✓ verify letters → Match at 1!
Index 2: AA[BA]B → 2×10 + 1 = 21 ≠ 12 ❌
Index 3: AAB[AB] → 1×10 + 2 = 12 ✓ verify letters → Match at 3!
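Here is the same rolling update written out as a tiny Java sketch, sticking to the toy base-10 hash from this example (the variable names are purely illustrative):

int base = 10;                                 // toy base from the example (A=1, B=2)
int hashAA = 1 * base + 1;                     // window "AA" → 11
int hashAB = (hashAA - 1 * base) * base + 2;   // drop the leading 'A', shift, add 'B' → 12
System.out.println(hashAA + " -> " + hashAB);  // prints 11 -> 12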
Why Rabin-Karp is Cool:
- Great for multiple pattern matching (search for many words at once; see the sketch below!)
- Average time: O(n + m)
- Worst case (many hash collisions): O(nm)
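The multiple-pattern point in the first bullet deserves a sketch. One common approach, assumed here rather than taken from the code above, is to hash every pattern once, keep the hashes in a set, and check each text window against that set; for simplicity, all patterns are assumed to have the same length m:

import java.util.HashSet;
import java.util.Set;

// Sketch: search a text for any of several equal-length patterns in one pass
public void rabinKarpMulti(String text, String[] patterns, int m) {
    if (text.length() < m) return;
    int d = 256, q = 101;
    int h = 1;
    for (int i = 0; i < m - 1; i++) h = (h * d) % q;   // h = d^(m-1) % q

    // Hash every pattern once and remember the hashes
    Set<Integer> patternHashes = new HashSet<>();
    for (String p : patterns) {
        int hp = 0;
        for (int i = 0; i < m; i++) hp = (d * hp + p.charAt(i)) % q;
        patternHashes.add(hp);
    }

    // Roll a single hash across the text and check it against the set
    int t = 0;
    for (int i = 0; i < m; i++) t = (d * t + text.charAt(i)) % q;
    for (int i = 0; i + m <= text.length(); i++) {
        if (patternHashes.contains(t)) {
            String window = text.substring(i, i + m);  // verify to rule out collisions
            for (String p : patterns)
                if (window.equals(p)) System.out.println(p + " found at: " + i);
        }
        if (i + m < text.length()) {
            t = (d * (t - text.charAt(i) * h) + text.charAt(i + m)) % q;
            if (t < 0) t += q;
        }
    }
}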
🔗 3. Longest Common Prefix (LCP)
Finding What Strings Share
Question: What do these words have in common at the start?
"flower"
"flow"
"flight"
Answer: “fl” — That’s the Longest Common Prefix!
Two Approaches
Approach 1: Horizontal Scanning
Compare strings two at a time:
graph TD
    A["flower, flow, flight"] --> B["LCP(flower, flow) = flow"]
    B --> C["LCP(flow, flight) = fl"]
    C --> D["Result: fl"]
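A minimal Java sketch of this horizontal approach might look like the following (the method name is illustrative; only the vertical-scanning version below is the one this section walks through):

public String longestCommonPrefixHorizontal(String[] strs) {
    if (strs == null || strs.length == 0) return "";
    String prefix = strs[0];
    for (int i = 1; i < strs.length; i++) {
        // Shrink the running prefix until the next string starts with it
        while (!strs[i].startsWith(prefix)) {
            prefix = prefix.substring(0, prefix.length() - 1);
            if (prefix.isEmpty()) return "";
        }
    }
    return prefix;
}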
Approach 2: Vertical Scanning
Compare column by column:
f l o w e r
f l o w
f l i g h t
↓ ↓ ↓
f l ✗ → Stop! LCP = "fl"
Java Code: Vertical Scanning
public String longestCommonPrefix(String[] strs) {
    if (strs == null || strs.length == 0)
        return "";
    // Walk the first string column by column
    for (int i = 0; i < strs[0].length(); i++) {
        char c = strs[0].charAt(i);
        // Compare this column against every other string
        for (int j = 1; j < strs.length; j++) {
            if (i >= strs[j].length() || strs[j].charAt(i) != c) {
                // Mismatch (or a shorter string ran out): the prefix ends here
                return strs[0].substring(0, i);
            }
        }
    }
    // The entire first string is the common prefix
    return strs[0];
}
Simple Example
String[] words = {"interstellar", "internet", "internal"};
// Column check:
// i: ✓  n: ✓  t: ✓  e: ✓  r: ✓  → "inter" so far
// Next column (index 5):
//   "interstellar" has 's'
//   "internet"     has 'n' ❌ STOP!
// Result: "inter"
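As a quick sanity check, calling the vertical-scanning method above on both example inputs gives:

System.out.println(longestCommonPrefix(
        new String[]{"flower", "flow", "flight"}));              // prints "fl"
System.out.println(longestCommonPrefix(
        new String[]{"interstellar", "internet", "internal"}));  // prints "inter"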
🎯 When to Use What?
| Algorithm | Best For | Time | Space |
|---|---|---|---|
| KMP | Single pattern, guaranteed linear-time search | O(n+m) | O(m) |
| Rabin-Karp | Multiple patterns, plagiarism detection | O(n+m) avg, O(nm) worst | O(1) |
| LCP | Finding common beginnings in word lists | O(S) | O(1) |
S = total number of characters across all strings
🌟 Real-World Magic
These algorithms power things you use every day:
- 🔍 Search engines — Finding your query in billions of pages
- 📝 Text editors — Ctrl+F (Find and Replace)
- 🧬 DNA matching — Finding gene sequences
- 🔒 Plagiarism checkers — Finding copied text
- 🌐 Autocomplete — Suggesting words as you type (uses LCP!)
💡 Key Takeaways
- KMP = Build a “memory map” (LPS array) so you never check the same thing twice
- Rabin-Karp = Use “fingerprints” (hashes) for quick comparisons
- LCP = Scan character-by-character to find shared beginnings
Remember the metaphors:
- 🧠 KMP is your smart friend with perfect memory
- 🔍 Rabin-Karp is a detective with fingerprint files
- 🔗 LCP is finding the family resemblance between words
🚀 You Did It!
You now understand how computers find words at lightning speed! These algorithms might seem tricky at first, but remember:
- KMP: “If I’ve seen this before, I can skip ahead!”
- Rabin-Karp: “Let me check the fingerprint first!”
- LCP: “What do we all have in common?”
Go forth and search strings like a pro! 🎉