NumPy Structured Arrays: Your Data Filing Cabinet
The Story of Messy Data
Imagine you have a magic filing cabinet. But instead of just putting papers in random folders, this cabinet is SUPER organized. Each drawer has labeled compartments: one for names, one for ages, one for scores. That's what structured arrays are in NumPy!
Regular arrays are like a box of identical marbles, all the same type. But what if you need to store a student's name (text), age (number), AND grade (decimal)? You need a structured array: a box with different-shaped compartments!
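To see the "identical marbles" rule in action, here is a small sketch of what a regular array does with mixed values. The variable name `row` is just an illustration, not something from the original example.

```python
import numpy as np

# A plain array has ONE dtype, so mixed values get coerced to a common type.
row = np.array(['Mia', 10, 95.5])

print(row.dtype.kind)   # U  (every element became a Unicode string)
print(row[1] == '10')   # True  (the age is now text, not a number)
```

The age and score silently turned into text, which is exactly the mess a structured array avoids.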
Structured Arrays Basics
What's the Big Deal?
Think of a spreadsheet row. Each column has different data:
| Name | Age | Score |
|---|---|---|
| Mia | 10 | 95.5 |
A structured array lets you store this whole row as one item!
Your First Structured Array
import numpy as np

# Create a simple structured array
student = np.array(
    [('Mia', 10, 95.5)],
    dtype=[('name', 'U10'),
           ('age', 'i4'),
           ('score', 'f4')]
)
print(student)
# Output: [('Mia', 10, 95.5)]
What just happened?
- 'U10' = Unicode string, max 10 characters (for name)
- 'i4' = 4-byte integer (for age)
- 'f4' = 4-byte float (for score)
It's like telling the filing cabinet: "This drawer needs a text slot, a number slot, and a decimal slot!"
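Those sizes are real bytes in memory, and you can check them yourself. The sketch below reuses the same three fields; the name `long_name` and the sample value 'Maximilianus' are made up for illustration.

```python
import numpy as np

dt = np.dtype([('name', 'U10'), ('age', 'i4'), ('score', 'f4')])

# NumPy stores each Unicode character in 4 bytes, so 'U10' takes 40 bytes.
print(dt['name'].itemsize)   # 40
print(dt.itemsize)           # 48  (40 + 4 + 4 bytes per record)

# Text longer than the declared width is silently cut off.
long_name = np.array([('Maximilianus', 10, 95.5)], dtype=dt)
print(long_name['name'][0])  # Maximilian
```

So "max 10 characters" is a hard limit: NumPy will not warn you when a name gets trimmed.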
Defining Compound dtypes
The Blueprint of Your Data
A dtype (data type) is like a blueprint. A compound dtype is a blueprint with multiple rooms!
Three Ways to Build Your Blueprint
Way 1: List of Tuples (Easiest!)
# Each tuple = (field_name, data_type)
dt = np.dtype([
    ('name', 'U20'),
    ('age', 'i4'),
    ('height', 'f8')
])
people = np.array([
    ('Leo', 8, 4.2),
    ('Zoe', 9, 4.5)
], dtype=dt)
Way 2: Dictionary Style
dt = np.dtype({
    'names': ['x', 'y', 'z'],
    'formats': ['f4', 'f4', 'f4']
})
points = np.array([
    (1.0, 2.0, 3.0),
    (4.0, 5.0, 6.0)
], dtype=dt)
Way 3: String Shortcut
# Quick format: 'type,type,type'
dt = np.dtype('U10,i4,f8')
data = np.array([
    ('Apple', 5, 1.99),
    ('Banana', 8, 0.79)
], dtype=dt)
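One thing worth knowing about the string shortcut: you never gave the fields names, so NumPy invents them. The sketch below also checks that the list and dictionary styles describe the same blueprint. The variable names `dt_string`, `dt_list`, and `dt_dict` are only for this illustration.

```python
import numpy as np

# The string shortcut has no field names, so NumPy uses f0, f1, f2, ...
dt_string = np.dtype('U10,i4,f8')
print(dt_string.names)     # ('f0', 'f1', 'f2')

# The list-of-tuples and dictionary styles build equal dtypes.
dt_list = np.dtype([('x', 'f4'), ('y', 'f4'), ('z', 'f4')])
dt_dict = np.dtype({'names': ['x', 'y', 'z'],
                    'formats': ['f4', 'f4', 'f4']})
print(dt_list == dt_dict)  # True
```

With the shortcut you would read the first column as data['f0'], which is quick to write but harder to read later than a real name.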
Common Type Codes
'U10' → String (10 chars max)
'i4' → Integer (4 bytes)
'i8' → Big integer (8 bytes)
'f4' → Float (4 bytes)
'f8' → Double precision float (8 bytes)
'?' → Boolean ('b' is a 1-byte signed integer, not a Boolean)
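If you ever forget what a code means, you can ask NumPy directly. This is a small sketch; the `codes` list and `sizes` dictionary are illustration names. Note that '?' is the Boolean code, while 'b' means a signed one-byte integer.

```python
import numpy as np

codes = ['U10', 'i4', 'i8', 'f4', 'f8', '?', 'b']

# Map each code to the number of bytes one value occupies.
sizes = {code: np.dtype(code).itemsize for code in codes}
for code in codes:
    print(code, '->', np.dtype(code), f'({sizes[code]} bytes)')

print(np.dtype('?') == np.dtype(bool))     # True
print(np.dtype('b') == np.dtype(np.int8))  # True
```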
Structured Array Fields
Accessing Your Compartments
Fields are like labeled drawers. You can open any drawer by its name!
students = np.array([
    ('Emma', 11, 92.0),
    ('Noah', 10, 88.5),
    ('Lily', 12, 95.0)
], dtype=[
    ('name', 'U10'),
    ('age', 'i4'),
    ('grade', 'f4')
])
# Get ALL names (one field)
print(students['name'])
# Output: ['Emma' 'Noah' 'Lily']
# Get ALL ages
print(students['age'])
# Output: [11 10 12]
Accessing One Student (One Record)
# First student (index 0)
print(students[0])
# Output: ('Emma', 11, 92.)
# Third student's name
print(students[2]['name'])
# Output: Lily
Modifying Fields
# Give everyone a birthday!
students['age'] = students['age'] + 1
print(students['age'])
# Output: [12 11 13]
# Update one grade
students[1]['grade'] = 90.0
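Assignments like these work because opening a drawer gives you a view of the cabinet, not a photocopy. Here is a minimal sketch with a smaller two-student array; the names `ages` and `safe` are illustration only.

```python
import numpy as np

students = np.array(
    [('Emma', 11, 92.0), ('Noah', 10, 88.5)],
    dtype=[('name', 'U10'), ('age', 'i4'), ('grade', 'f4')],
)

# Selecting one field returns a view onto the same memory.
ages = students['age']
ages[0] = 99
print(students['age'])   # [99 10]  (the original array changed too)

# Call .copy() when you want an independent array instead.
safe = students['age'].copy()
safe[1] = -1
print(students['age'])   # [99 10]  (unchanged this time)
```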
Getting Field Names
# What fields do we have?
print(students.dtype.names)
# Output: ('name', 'age', 'grade')
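The dtype can tell you more than just the names. As a sketch, the loop below also prints each field's type and where it starts inside a record (its byte offset), using the `dtype.fields` mapping. The array here is filled with zeros just to have something to inspect.

```python
import numpy as np

students = np.zeros(3, dtype=[('name', 'U10'), ('age', 'i4'), ('grade', 'f4')])

# dtype.fields maps each name to a (dtype, byte offset) pair.
for field in students.dtype.names:
    field_dtype, offset = students.dtype.fields[field]
    print(field, field_dtype, offset)
# name <U10 0
# age int32 40
# grade float32 44
```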
Multiple Fields at Once
You can grab several drawers together!
# Get name and grade only
subset = students[['name', 'grade']]
print(subset)
# Output: [('Emma', 92.) ('Noah', 90.) ('Lily', 95.)]
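A detail that can surprise you: in recent NumPy versions (1.16 and later, as far as this sketch assumes), grabbing several fields returns a view that still carries the skipped drawers as hidden padding. The helper `repack_fields` from `numpy.lib.recfunctions` makes a compact copy. The names `subset` and `packed` are illustration only.

```python
import numpy as np
from numpy.lib import recfunctions as rfn

students = np.array(
    [('Emma', 11, 92.0), ('Noah', 10, 90.0), ('Lily', 12, 95.0)],
    dtype=[('name', 'U10'), ('age', 'i4'), ('grade', 'f4')],
)

subset = students[['name', 'grade']]
print(np.shares_memory(subset, students))  # True  (a view, not a copy)
print(subset.dtype.itemsize)               # 48  (the 'age' bytes remain as padding)

# repack_fields returns a copy with the padding removed.
packed = rfn.repack_fields(subset)
print(packed.dtype.itemsize)               # 44
```

This mostly matters when you save the subset to disk or compare its dtype with another one.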
Nested Structured Arrays
A field can hold another structured dtype! Like a filing cabinet with mini-cabinets inside.
# Define a nested dtype
address_type = np.dtype([
    ('city', 'U20'),
    ('zip', 'i4')
])
person_type = np.dtype([
    ('name', 'U20'),
    ('address', address_type)
])
# Create nested data
people = np.array([
    ('Alice', ('Boston', 2115)),
    ('Bob', ('Miami', 33101))
], dtype=person_type)
# Access nested field
print(people[0]['address']['city'])
# Output: Boston
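Nested fields chain the same way for whole columns, and you can write through them too. A short sketch using the same two people (the new ZIP value 2116 is made up):

```python
import numpy as np

address_type = np.dtype([('city', 'U20'), ('zip', 'i4')])
person_type = np.dtype([('name', 'U20'), ('address', address_type)])
people = np.array([
    ('Alice', ('Boston', 2115)),
    ('Bob', ('Miami', 33101)),
], dtype=person_type)

# Every city at once: chain the field names.
print(people['address']['city'])   # ['Boston' 'Miami']

# Writing through the chain updates the original array.
people['address']['zip'][0] = 2116
print(people[0])                   # ('Alice', ('Boston', 2116))
```

One design note: storing ZIP codes as integers drops leading zeros (02115 becomes 2115), so a short string field such as 'U5' may be a better fit when the exact digits matter.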
Why Use Structured Arrays?
graph TD
    A["Your Data"] --> B{Same type?}
    B -->|Yes| C["Regular Array"]
    B -->|No| D["Structured Array"]
    D --> E["Mixed types together"]
    D --> F["Named fields"]
    D --> G["Easy access"]
Perfect For:
- CSV-like data (rows with different columns)
- Game objects (name, health, position)
- Scientific records (timestamp, sensor1, sensor2)
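As a taste of the CSV case, here is a hedged sketch that reads comma-separated text straight into a structured array with `np.genfromtxt`. The sample text, the `csv_text` name, and the field layout are made up for illustration; a real file path could be passed in place of the `io.StringIO` object.

```python
import io
import numpy as np

# Pretend this text came from a small CSV file (hypothetical data).
csv_text = "Mia,10,95.5\nLeo,8,88.0\n"

dt = [('name', 'U10'), ('age', 'i4'), ('score', 'f4')]
records = np.genfromtxt(
    io.StringIO(csv_text),
    delimiter=',',
    dtype=dt,
    encoding='utf-8',
)

print(records['name'])          # ['Mia' 'Leo']
print(records['score'].mean())  # 91.75
```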
Quick Tips
- Use descriptive field names: 'temperature' not 't'
- Pick the right size: 'U10' for short names, 'U100' for long text
- Access by field name for clarity: arr['age'] not arr[:, 1]
You Did It!
You now know:
- What structured arrays ARE (filing cabinets with compartments!)
- How to DEFINE compound dtypes (blueprints with multiple rooms)
- How to ACCESS fields (open any drawer by name)
Structured arrays turn messy, mixed data into organized, fast, NumPy-powered data!
