Sorting and Searching

Back

Loading concept...

🎯 NumPy Sorting & Searching: Finding Things Fast!

The Messy Drawer Problem

Imagine you have a drawer full of toys. They’re all mixed up—cars, dolls, blocks, everything jumbled together. Now your friend asks: “Do you have a red car?”

Two ways to find it:

  1. The slow way: Pick up every single toy, check if it’s a red car. This takes forever!
  2. The fast way: First, organize the toys. Put all cars together. Now finding the red car is super quick!

This is exactly why sorting and searching matter in NumPy. When your data is organized, finding things becomes lightning fast! ⚡


🔢 Sorting Arrays: Putting Things in Order

Sorting means arranging things from smallest to biggest (or biggest to smallest).

The Magic Spell: np.sort()

import numpy as np

messy = np.array([5, 2, 8, 1, 9])
tidy = np.sort(messy)

print(tidy)  # [1 2 5 8 9]

What happened? NumPy looked at all the numbers and lined them up from smallest to biggest!

Sorting Doesn’t Change the Original

Here’s something cool: np.sort() gives you a NEW sorted array. Your original stays messy!

messy = np.array([5, 2, 8, 1, 9])
tidy = np.sort(messy)

print("Original:", messy)  # [5 2 8 1 9]
print("Sorted:", tidy)     # [1 2 5 8 9]

Sorting Backwards (Biggest First)

What if you want the biggest numbers first? Easy trick!

scores = np.array([85, 92, 78, 95, 88])
top_first = np.sort(scores)[::-1]

print(top_first)  # [95 92 88 85 78]

The [::-1] part flips everything around!

Sorting 2D Arrays: Rows vs Columns

When you have a grid of numbers, you can sort by rows OR columns.

grid = np.array([
    [3, 1, 4],
    [1, 5, 9],
    [2, 6, 5]
])

# Sort each row (left to right)
by_row = np.sort(grid, axis=1)
print(by_row)
# [[1 3 4]
#  [1 5 9]
#  [2 5 6]]

# Sort each column (top to bottom)
by_col = np.sort(grid, axis=0)
print(by_col)
# [[1 1 4]
#  [2 5 5]
#  [3 6 9]]

Think of axis=0 as “up and down” and axis=1 as “left and right.”


🏷️ Indirect Sorting with argsort: “Where Should This Go?”

Sometimes you don’t want the sorted values—you want to know WHERE each value should go!

The Story of the Race

Imagine 4 friends finished a race with these times:

Friend A: 12 seconds
Friend B: 8 seconds
Friend C: 15 seconds
Friend D: 10 seconds

Who came 1st, 2nd, 3rd, 4th?

times = np.array([12, 8, 15, 10])
positions = np.argsort(times)

print(positions)  # [1 3 0 2]

What does [1 3 0 2] mean?

  • Position 0 (1st place): Index 1 → Friend B (8 sec) 🥇
  • Position 1 (2nd place): Index 3 → Friend D (10 sec) 🥈
  • Position 2 (3rd place): Index 0 → Friend A (12 sec) 🥉
  • Position 3 (4th place): Index 2 → Friend C (15 sec)

Using argsort to Get Sorted Values

Here’s a cool trick: use those positions to sort!

times = np.array([12, 8, 15, 10])
order = np.argsort(times)

sorted_times = times[order]
print(sorted_times)  # [8 10 12 15]

Why Is argsort Useful?

Real example: You have student names and their scores:

names = np.array(['Ali', 'Bob', 'Cat', 'Dan'])
scores = np.array([75, 92, 88, 65])

# Sort by scores but keep names matched!
order = np.argsort(scores)[::-1]  # Best first

for i in order:
    print(f"{names[i]}: {scores[i]}")

# Bob: 92
# Cat: 88
# Ali: 75
# Dan: 65

Without argsort, keeping names matched to scores would be tricky!


⚡ Partial Sorting with partition: “I Just Need the Top Few!”

Imagine you have 1000 test scores but only need the top 10. Why sort ALL 1000?

The Smart Shortcut: np.partition()

Partition does just enough work to find what you need!

scores = np.array([45, 92, 78, 88, 55, 99, 72, 85])

# Get the 3 smallest scores
partial = np.partition(scores, 3)
print(partial)
# [45 55 72 78 88 99 92 85]
#  ↑  ↑  ↑
# These 3 are the smallest (not sorted among themselves)

What partition does:

  • Puts the 3 smallest values at the front
  • Everything else goes after
  • But it DOESN’T fully sort them!

Why Not Just Sort?

Sorting 1 million numbers takes time. But partition is MUCH faster when you only need a few values!

big_array = np.random.rand(1000000)

# Get just the 5 smallest
top_5_area = np.partition(big_array, 5)[:5]

This is way faster than sorting all million numbers!

Finding the Top K Values

Want the biggest instead of smallest? Use negative!

scores = np.array([45, 92, 78, 88, 55, 99, 72, 85])

# Get the 3 BIGGEST scores
partial = np.partition(scores, -3)
print(partial[-3:])  # [88 92 99]

argpartition: Positions of Top Values

Just like argsort, there’s np.argpartition()!

scores = np.array([45, 92, 78, 88, 55, 99, 72, 85])

# Which students are in top 3?
top_3_indices = np.argpartition(scores, -3)[-3:]
print(top_3_indices)  # [3 1 5] → positions of 88, 92, 99

🔍 Searching in Sorted Arrays: Finding Things Super Fast

Once your array is sorted, finding things becomes magical!

The Phone Book Trick: Binary Search

Remember phone books? They’re sorted by name. To find “Smith”:

  1. Open to the middle: “M”—too early!
  2. Open to 3/4: “R”—still early!
  3. Open to 7/8: “T”—too far!
  4. Open between: Found “Smith”!

This is called binary search—cutting the search in half each time!

np.searchsorted(): Where Would This Fit?

sorted_arr = np.array([10, 20, 30, 40, 50])

# Where would 25 fit?
pos = np.searchsorted(sorted_arr, 25)
print(pos)  # 2

# Insert 25 at position 2 → [10, 20, 25, 30, 40, 50]

NumPy tells you: “Put 25 at position 2 to keep it sorted!”

Finding Multiple Values at Once

sorted_arr = np.array([10, 20, 30, 40, 50])
values = np.array([15, 35, 55])

positions = np.searchsorted(sorted_arr, values)
print(positions)  # [1 3 5]

# 15 → position 1 (between 10 and 20)
# 35 → position 3 (between 30 and 40)
# 55 → position 5 (after everything)

left vs right: Handling Duplicates

What if the value already exists?

arr = np.array([10, 20, 20, 20, 30])

left = np.searchsorted(arr, 20, side='left')
right = np.searchsorted(arr, 20, side='right')

print(f"Left: {left}")   # 1 (before all 20s)
print(f"Right: {right}") # 4 (after all 20s)

Checking If a Value Exists

sorted_arr = np.array([10, 20, 30, 40, 50])

def is_in_sorted(arr, value):
    pos = np.searchsorted(arr, value)
    return pos < len(arr) and arr[pos] == value

print(is_in_sorted(sorted_arr, 30))  # True
print(is_in_sorted(sorted_arr, 35))  # False

🎯 Summary: When to Use What

graph TD A["Need to organize data?"] --> B{Full sort needed?} B -->|Yes| C["np.sort"] B -->|Only top/bottom K| D["np.partition"] E["Need positions, not values?"] --> F{All positions?} F -->|Yes| G["np.argsort"] F -->|Only top K positions| H["np.argpartition"] I["Array already sorted?"] --> J["np.searchsorted"]
Task Function Speed
Sort everything np.sort() Normal
Get top/bottom K np.partition() Fast!
Find insert position np.searchsorted() Super fast!
Get sort order np.argsort() Normal
Get top K indices np.argpartition() Fast!

🚀 You Did It!

Now you know how to:

  • ✅ Sort arrays with np.sort()
  • ✅ Get positions with np.argsort()
  • ✅ Find top values quickly with np.partition()
  • ✅ Search sorted arrays with np.searchsorted()

Think of your data as toys in a drawer. Keep them organized, and you’ll always find what you need—fast! 🎉

Loading story...

Story - Premium Content

Please sign in to view this story and start learning.

Upgrade to Premium to unlock full access to all stories.

Stay Tuned!

Story is coming soon.

Story Preview

Story - Premium Content

Please sign in to view this concept and start learning.

Upgrade to Premium to unlock full access to all content.