Pandas

🐼 Pandas: Your Data’s Best Friend

The Story of the Magic Spreadsheet

Imagine you have a giant box of LEGO bricks scattered all over your room. Finding the red ones? Nightmare! Sorting by size? Hours of work! Now imagine a magic helper that can instantly find, sort, combine, and organize ALL your bricks in seconds.

That’s Pandas! 🎉

Pandas is like having a super-smart assistant for your data. It takes messy information and makes it neat, organized, and easy to understand.


🎯 What We’ll Learn

🐼 Pandas Basics → 📊 DataFrames → 🔧 Data Manipulation → 📦 GroupBy Magic → 🔗 Merge & Join → 📅 DateTime Handling

📚 Chapter 1: Meet Pandas

What is Pandas?

Think of Pandas as a super-powered spreadsheet that lives inside Python. Just like how Excel has rows and columns, Pandas has them too—but with superpowers!

Real Life Examples:

  • Streaming services like Netflix can use it to organize viewer data
  • Banks use it to track transactions
  • Scientists use it to analyze experiments

Your First Pandas Code

import pandas as pd

# Create a simple table
data = {
    'Name': ['Anna', 'Bob', 'Cara'],
    'Age': [10, 12, 11],
    'Score': [95, 88, 92]
}

df = pd.DataFrame(data)
print(df)

Output:

   Name  Age  Score
0  Anna   10     95
1   Bob   12     88
2  Cara   11     92

🎈 Think of it: Each row is a person, each column is information about them.
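
Once a table exists, a good first habit is to peek at its size and column types. The snippet below is a small sketch that rebuilds the same table and inspects it; which checks you run first is a matter of taste, not a fixed recipe.

```python
import pandas as pd

# Same table as in the first example
df = pd.DataFrame({
    'Name': ['Anna', 'Bob', 'Cara'],
    'Age': [10, 12, 11],
    'Score': [95, 88, 92],
})

print(df.shape)           # (number of rows, number of columns)
print(list(df.columns))   # column names
print(df.dtypes)          # data type of each column
print(df.head(2))         # first two rows

average_score = df['Score'].mean()  # average of one column
print(average_score)
```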


📊 Chapter 2: The DataFrame - Your Data Table

What’s a DataFrame?

A DataFrame is like a table in a notebook. It has:

  • Rows = Each item (like each student)
  • Columns = Information about items (like name, age, score)
  • Index = Row labels (by default the numbers 0, 1, 2, and so on, like seat numbers)

Creating DataFrames

Method 1: From a Dictionary

students = {
    'Name': ['Emma', 'Liam'],
    'Grade': ['A', 'B']
}
df = pd.DataFrame(students)

Method 2: From a List of Rows

data = [
    ['Emma', 'A'],
    ['Liam', 'B']
]
df = pd.DataFrame(data,
    columns=['Name', 'Grade'])
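
Both methods should end up with the same table. Here is a quick sketch to check that; the variable names `from_dict` and `from_rows` are made up for this example.

```python
import pandas as pd

# Method 1: each dictionary key becomes a column
from_dict = pd.DataFrame({
    'Name': ['Emma', 'Liam'],
    'Grade': ['A', 'B'],
})

# Method 2: each inner list becomes a row, so column names are passed separately
from_rows = pd.DataFrame(
    [['Emma', 'A'], ['Liam', 'B']],
    columns=['Name', 'Grade'],
)

# equals() compares values, column names, and data types
same = from_dict.equals(from_rows)
print(same)
```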

Selecting Data

# Get one column
df['Name']

# Get multiple columns
df[['Name', 'Grade']]

# Get one row by position
df.iloc[0]  # First row

# Get rows by condition
df[df['Grade'] == 'A']

🧙‍♂️ Magic Tip: iloc = position (like “item 0”), loc = label (like “row named X”)
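
The difference between iloc and loc is easiest to see when the row labels are not plain numbers. The sketch below assumes made-up seat labels ('s1', 's2', 's3') as the index.

```python
import pandas as pd

df = pd.DataFrame(
    {'Name': ['Emma', 'Liam', 'Noah'], 'Grade': ['A', 'B', 'A']},
    index=['s1', 's2', 's3'],  # hypothetical seat labels
)

first_by_position = df.iloc[0]       # first row, whatever its label is
first_by_label = df.loc['s1']        # the row labeled 's1'
grade_cell = df.loc['s2', 'Grade']   # one value: row label + column name

top_two = df.iloc[0:2]               # positions 0 and 1 (end is excluded)
label_range = df.loc['s1':'s2']      # labels 's1' through 's2' (end is included)
```

One thing to watch: position slices with iloc leave out the end, while label slices with loc include it.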


🔧 Chapter 3: Data Manipulation

The Art of Shaping Data

Imagine you’re a chef preparing ingredients. Sometimes you need to:

  • Add new ingredients (new columns)
  • Remove bad parts (drop columns/rows)
  • Change how things look (transform data)
  • Filter out what you don’t need

Adding New Columns

# Assumes df has a numeric 'Score' column, like the Chapter 1 table
df['Bonus'] = df['Score'] * 0.1

# Or with a condition
df['Pass'] = df['Score'] >= 60

Removing Data

# Drop a column
df = df.drop(columns='Bonus')

# Drop a row by its index label (here, the row labeled 0)
df = df.drop(index=0)

# Drop rows with missing values
df = df.dropna()
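
Dropping rows is not the only way to deal with missing values. This sketch, built on a made-up table where one score is missing, compares counting, dropping, and filling. Whether 0 is a sensible fill value depends on your data; it is just an assumption here.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Anna', 'Bob', 'Cara'],
    'Score': [95.0, None, 92.0],  # Bob's score is missing
})

missing_per_column = df.isna().sum()   # how many values are missing in each column
dropped = df.dropna()                  # removes Bob's row entirely
filled = df.fillna({'Score': 0})       # keeps Bob, missing score becomes 0

print(missing_per_column)
print(dropped)
print(filled)
```

None of these calls change `df` itself. Each one returns a new table, which is why the results are stored in new variables.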

Changing Values

# Replace values
df['Grade'] = df['Grade'].replace(
    'F', 'Fail')

# Transform text with a built-in string method
df['Name'] = df['Name'].str.upper()
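
When no built-in method fits, you can run your own function on every value with apply. A small sketch, where `to_grade` and its pass mark of 60 are invented for illustration:

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Anna', 'Bob'], 'Score': [95, 58]})

def to_grade(score):
    """Hypothetical grading rule used only for this example."""
    return 'Pass' if score >= 60 else 'Fail'

# apply() calls to_grade once for each value in the column
df['Result'] = df['Score'].apply(to_grade)

# Built-in string methods live under .str
df['Name'] = df['Name'].str.upper()

print(df)
```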

Filtering Data

# Students who passed
passed = df[df['Score'] >= 60]

# Multiple conditions
stars = df[
    (df['Score'] >= 90) &
    (df['Age'] <= 12)
]

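🎯 Remember: & means AND, | means OR, and ~ means NOT. Wrap each condition in parentheses, because & and | bind more tightly than comparisons like >=.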
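
Here is a short sketch of the other filter operators in action, using the Chapter 1 table. The thresholds are arbitrary picks for the example.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Anna', 'Bob', 'Cara'],
    'Age': [10, 12, 11],
    'Score': [95, 88, 92],
})

# OR: top score, or at least 12 years old
either = df[(df['Score'] >= 95) | (df['Age'] >= 12)]

# NOT: everyone whose score is below 90
not_top = df[~(df['Score'] >= 90)]

# isin: rows whose value appears in a list
chosen = df[df['Name'].isin(['Anna', 'Cara'])]
```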


📦 Chapter 4: GroupBy - Sorting into Buckets

The Bucket Story

Imagine you have a basket of mixed fruits and you want to:

  1. Group them by type (apples together, oranges together)
  2. Count how many of each
  3. Find the biggest one in each group

That’s exactly what GroupBy does!

Basic GroupBy

# Sample data
sales = pd.DataFrame({
    'Store': ['A', 'B', 'A', 'B'],
    'Product': ['Apple', 'Apple',
                'Banana', 'Banana'],
    'Amount': [10, 15, 20, 25]
})

# Group by Store and sum
by_store = sales.groupby('Store')
print(by_store['Amount'].sum())

Output:

Store
A    30
B    40
Name: Amount, dtype: int64

Multiple Aggregations

# Get multiple stats at once
stats = sales.groupby('Store').agg({
    'Amount': ['sum', 'mean', 'count']
})
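
The dictionary form gives you two-level column names, which can be awkward to work with. One alternative is named aggregation, sketched below; the output names `total`, `average`, and `n_sales` are my own choices, not anything pandas requires.

```python
import pandas as pd

sales = pd.DataFrame({
    'Store': ['A', 'B', 'A', 'B'],
    'Product': ['Apple', 'Apple', 'Banana', 'Banana'],
    'Amount': [10, 15, 20, 25],
})

# Each keyword is a new column name: (source column, function)
stats = sales.groupby('Store').agg(
    total=('Amount', 'sum'),
    average=('Amount', 'mean'),
    n_sales=('Amount', 'count'),
)
print(stats)
```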

Group by Multiple Columns

detailed = sales.groupby(
    ['Store', 'Product']
)['Amount'].sum()

🪣 Think of it: GroupBy = Put similar things in buckets, then do math on each bucket!
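
Grouping by two columns gives a result with a two-level index. If you would rather have a regular table back, here are two possible reshapes, sketched on the same sales data:

```python
import pandas as pd

sales = pd.DataFrame({
    'Store': ['A', 'B', 'A', 'B'],
    'Product': ['Apple', 'Apple', 'Banana', 'Banana'],
    'Amount': [10, 15, 20, 25],
})

detailed = sales.groupby(['Store', 'Product'])['Amount'].sum()

# Option 1: turn the index levels back into ordinary columns
flat = detailed.reset_index()

# Option 2: one row per store, one column per product
wide = detailed.unstack('Product')

print(flat)
print(wide)
```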


🔗 Chapter 5: Merge and Join

The Puzzle Piece Story

Imagine you have two puzzle pieces that belong together:

  • Piece 1: Student names and their IDs
  • Piece 2: IDs and their test scores

To see “which student got which score,” you need to connect the pieces using the ID!

Types of Joins

  • Inner Join = Only matching rows
  • Left Join = All left + matching right
  • Right Join = All right + matching left
  • Outer Join = All rows from both

Merge Example

# Table 1: Students
students = pd.DataFrame({
    'ID': [1, 2, 3],
    'Name': ['Anna', 'Bob', 'Cara']
})

# Table 2: Scores
scores = pd.DataFrame({
    'ID': [1, 2, 4],
    'Score': [95, 88, 75]
})

# Inner join (only matching IDs)
result = pd.merge(
    students, scores,
    on='ID', how='inner'
)

Result:

   ID  Name  Score
0   1  Anna     95
1   2   Bob     88

Different Join Types

# Left join - keep all students
left = pd.merge(
    students, scores,
    on='ID', how='left'
)

# Outer join - keep everyone
outer = pd.merge(
    students, scores,
    on='ID', how='outer'
)

🧩 Remember:

  • inner = Only matches
  • left = All from left table
  • right = All from right table
  • outer = Everything from both
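
To see where each row of a join came from, merge accepts indicator=True, which adds a `_merge` column. The sketch below reuses the students and scores tables; rows without a partner get a missing value (NaN) in the columns from the other table.

```python
import pandas as pd

students = pd.DataFrame({
    'ID': [1, 2, 3],
    'Name': ['Anna', 'Bob', 'Cara'],
})
scores = pd.DataFrame({
    'ID': [1, 2, 4],
    'Score': [95, 88, 75],
})

# Outer join, with a _merge column saying 'both', 'left_only', or 'right_only'
outer = pd.merge(students, scores, on='ID', how='outer', indicator=True)

# Right join: keep every score, even if the student is unknown
right = pd.merge(students, scores, on='ID', how='right')

print(outer)
print(right)
```

Cara (ID 3) has no score and ID 4 has no name, so those cells come back empty rather than being dropped.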

📅 Chapter 6: DateTime Handling

Time is Data Too!

Dates and times are special. They’re not just numbers or text—they have meaning! Pandas understands this.

Examples of date questions:

  • “How many sales in January?”
  • “What day had the most visitors?”
  • “How many hours between these events?”

Converting to DateTime

# Create dates from strings
df = pd.DataFrame({
    'date': ['2024-01-15',
             '2024-02-20',
             '2024-03-25']
})

# Convert to datetime
df['date'] = pd.to_datetime(df['date'])
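
Real data often contains dates that are broken or written in another order. Here is a sketch of two to_datetime options that help with that; the sample strings are invented.

```python
import pandas as pd

raw = pd.Series(['2024-01-15', 'not a date', '2024-03-25'])

# errors='coerce' turns anything unparseable into NaT ("not a time")
parsed = pd.to_datetime(raw, errors='coerce')

# format= spells out the layout when it is not year-month-day
day_first = pd.to_datetime('15/01/2024', format='%d/%m/%Y')

print(parsed)
print(day_first)
```

Without errors='coerce', the bad value would raise an error instead, which can be the safer default when you want to notice problems early.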

Extracting Date Parts

# Get year, month, day
df['year'] = df['date'].dt.year
df['month'] = df['date'].dt.month
df['day'] = df['date'].dt.day

# Get day of week (0=Monday)
df['weekday'] = df['date'].dt.dayofweek

# Get day name
df['day_name'] = df['date'].dt.day_name()

Date Math

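# Add days
df['next_week'] = df['date'] + pd.Timedelta(days=7)

# Difference between each row and the one before it
df['diff'] = df['date'].diff()

# Resample by month: needs a datetime index, and only
# numeric columns can be summed.
# On pandas 2.2 or newer, write 'ME' (month end) instead of 'M'.
monthly = df.set_index('date').resample('M').sum(numeric_only=True)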
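
Resampling makes more sense with a number to add up. The sketch below assumes a made-up `amount` column, and uses the 'MS' alias (month start) so each bucket is labeled with the first day of its month.

```python
import pandas as pd

sales = pd.DataFrame({
    'date': pd.to_datetime(['2024-01-05', '2024-01-20', '2024-02-10']),
    'amount': [10, 15, 20],  # hypothetical sales figures
})

# Total amount per calendar month, labeled by the month's first day
monthly = sales.set_index('date')['amount'].resample('MS').sum()

# Time between each sale and the previous one (first row has no previous sale)
gaps = sales['date'].diff()

print(monthly)
print(gaps)
```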

Filtering by Date

# Sales in 2024
sales_2024 = df[
    df['date'].dt.year == 2024
]

# Between two dates
jan_sales = df[
    (df['date'] >= '2024-01-01') &
    (df['date'] <= '2024-01-31')
]

📆 Pro Tip: Always convert date strings to datetime FIRST, then do operations!
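
Here are two more ways to write a date filter, sketched on a small made-up sales table with a hypothetical `amount` column. Both should pick the same January rows.

```python
import pandas as pd

sales = pd.DataFrame({
    'date': pd.to_datetime(['2024-01-15', '2024-01-31', '2024-02-20']),
    'amount': [10, 15, 20],  # hypothetical sales figures
})

# between() includes both end dates
jan = sales[sales['date'].between('2024-01-01', '2024-01-31')]

# Same idea, using the date parts instead
jan_by_part = sales[
    (sales['date'].dt.year == 2024) &
    (sales['date'].dt.month == 1)
]

print(jan)
```

A string like '2024-01-31' means midnight at the start of that day. If your data has times of day, a sale at 3 p.m. on January 31 would be missed by the between version, so filtering on year and month may be the safer choice.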


🎉 You Did It!

You’ve learned the six superpowers of Pandas:

Power              What It Does
🐼 Pandas Basics   Import and create data
📊 DataFrames      Organize in rows/columns
🔧 Manipulation    Add, remove, change data
📦 GroupBy         Sort into buckets & summarize
🔗 Merge/Join      Connect two tables
📅 DateTime        Work with dates & times

🚀 Quick Reference

import pandas as pd

# Create DataFrame
df = pd.DataFrame(data)

# Select
df['column']      # One column
df[['a', 'b']]    # Multiple columns
df.iloc[0]        # Row by position
df.loc[0]         # Row by label

# Filter
df[df['col'] > 5]

# Group
df.groupby('col')['value'].sum()

# Merge
pd.merge(df1, df2, on='key')

# DateTime
df['date'] = pd.to_datetime(df['date'])
df['date'].dt.year

Remember: Data is like LEGO bricks. Pandas helps you build anything you imagine! 🧱✨
