Batch Processing

🏭 Jakarta Batch Processing: The Factory Assembly Line

Imagine you run a giant chocolate factory. Every day, thousands of chocolate bars need to be made. You can’t make them one by one; that would take forever! Instead, you set up an assembly line where chocolates move through stations automatically. That’s exactly what Jakarta Batch Processing does for your data!


🎯 What is Jakarta Batch?

Think of Jakarta Batch as your automatic factory manager. When you have millions of records to process (sending emails to all customers, calculating everyone’s monthly bills, or updating inventory), you can’t do it manually. Jakarta Batch sets up a smart assembly line that works tirelessly, even while you sleep!

Real Life Examples:

  • 📧 Sending 1 million newsletter emails overnight
  • 💰 Calculating paychecks for all employees at month-end
  • 📊 Processing daily sales reports from 500 stores
  • 🔄 Migrating old database records to a new system

📋 Job Specification Language (JSL)

The Factory Blueprint

Before building any factory, you need a blueprint. In Jakarta Batch, this blueprint is written in XML and tells the system:

  • What work needs to be done
  • In what order
  • What to do if something goes wrong

<job id="chocolateFactory"
     xmlns="https://jakarta.ee/xml/ns/jakartaee"
     version="2.0">
    <step id="makeChocolate">
        <!-- Step details here -->
    </step>
</job>

Simple Explanation:

  • <job> = The entire factory plan
  • id = The factory’s name
  • <step> = Each workstation in the factory

🎬 Batch Jobs

The Master Plan

A Job is like the complete recipe for making chocolate bars from start to finish. It contains everything needed:

graph TD
    A["📦 Start Job"] --> B["Step 1: Get Ingredients"]
    B --> C["Step 2: Mix & Cook"]
    C --> D["Step 3: Shape & Cool"]
    D --> E["Step 4: Package"]
    E --> F["✅ Job Complete!"]

What Makes Up a Job?

Part | What It Does | Factory Example
Job ID | Unique name | “DailyChocolateRun”
Steps | Individual tasks | Mix, Cook, Package
Properties | Settings | Temperature, Speed
Listeners | Monitors | Quality checker

Example Job Definition:

<job id="processOrders"
     restartable="true">
    <step id="validateOrders"
          next="fulfillOrders"/>
    <step id="fulfillOrders"
          next="sendConfirmations"/>
    <step id="sendConfirmations"/>
</job>

What this means:

  1. First, check if orders are valid ✓
  2. Then, fulfill the orders 📦
  3. Finally, send confirmation emails ✉️

🪜 Job Steps

Workstations in Your Factory

Each Step is like one workstation on the assembly line. Workers at each station have ONE specific job to do.

Two Types of Steps:

1️⃣ Chunk Steps (Most Common)

Process items in groups, like packaging 100 chocolates at a time

graph TD
    A["Read 100 items"] --> B["Process each item"]
    B --> C["Write all 100 to database"]
    C --> D{More items?}
    D -->|Yes| A
    D -->|No| E["Step Complete!"]

2️⃣ Batchlet Steps

One big task, like cleaning the entire factory

<step id="cleanupStep">
    <batchlet ref="factoryCleaner"/>
</step>

Step Flow Control:

<step id="step1" next="step2"/>

<step id="step2">
    <next on="COMPLETED" to="step3"/>
    <fail on="FAILED"/>
</step>

Translation:

  • When step1 finishes → go to step2
  • If step2 completes → go to step3
  • If step2 fails → stop everything!
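
To make the routing rules concrete, here is a small toy model in plain Java. It is only a sketch: StepFlowSketch, run, and the FAIL marker are made-up names, the next="step2" attribute is modelled as a rule for the COMPLETED status, and the real batch runtime does far more than this (restarts, listeners, exit-status patterns).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.Supplier;

// Toy model of step transitions; it does not use the Jakarta Batch runtime.
final class StepFlowSketch {

    /** Made-up marker meaning "stop the job as failed" (the role of a fail rule). */
    static final String FAIL = "#fail";

    /**
     * Runs steps starting at firstStep. Each step is a Supplier that returns its exit status.
     * transitions maps stepId -> (exitStatus -> next stepId, or FAIL).
     * Returns the ids of the steps that ran, followed by the final job status.
     */
    static List<String> run(String firstStep,
                            Map<String, Supplier<String>> steps,
                            Map<String, Map<String, String>> transitions) {
        List<String> trace = new ArrayList<>();
        String current = firstStep;
        while (true) {
            String exitStatus = steps.get(current).get();
            trace.add(current);
            Map<String, String> rules = transitions.getOrDefault(current, Map.of());
            String next = rules.get(exitStatus);
            if (next == null) {
                // No rule matched: the flow ends here.
                trace.add("COMPLETED".equals(exitStatus) ? "JOB COMPLETED" : "JOB FAILED");
                return trace;
            }
            if (FAIL.equals(next)) {
                trace.add("JOB FAILED");
                return trace;
            }
            current = next;
        }
    }

    public static void main(String[] args) {
        Supplier<String> ok = () -> "COMPLETED";
        Supplier<String> broken = () -> "FAILED";
        Map<String, Map<String, String>> transitions = Map.of(
            "step1", Map.of("COMPLETED", "step2"),
            "step2", Map.of("COMPLETED", "step3", "FAILED", FAIL));

        System.out.println(run("step1",
            Map.of("step1", ok, "step2", ok, "step3", ok), transitions));
        // prints [step1, step2, step3, JOB COMPLETED]

        System.out.println(run("step1",
            Map.of("step1", ok, "step2", broken, "step3", ok), transitions));
        // prints [step1, step2, JOB FAILED]
    }
}
```

Notice that step3 never runs in the second case: the failure rule on step2 ends the job before the flow can reach it.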

🍫 Chunk-Oriented Processing

The Conveyor Belt System

This is the HEART of batch processing! Imagine:

  1. A box arrives with 100 ingredients (READ)
  2. Each ingredient is checked and prepared (PROCESS)
  3. All 100 go into the mixer together (WRITE)
  4. Repeat until no more boxes!

graph LR
    A["📥 ItemReader"] --> B["⚙️ ItemProcessor"]
    B --> C["📤 ItemWriter"]
    C --> D{More?}
    D -->|Yes| A
    D -->|No| E["✅ Done!"]

Chunk Configuration:

<step id="processChocolates">
    <chunk item-count="100">
        <reader ref="ingredientReader"/>
        <processor ref="chocolateMaker"/>
        <writer ref="packageWriter"/>
    </chunk>
</step>

What item-count="100" means:

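  • Items are read and processed one at a time until 100 have been read
  • The processed items are handed to the writer together, in a single call
  • The runtime then commits the chunk’s transaction and records a checkpoint
  • If something fails, only the current chunk is rolled back; chunks already committed stay safe!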
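
In a little more detail, a chunk step handles items one at a time and only the write happens per group. The loop below is a plain-Java imitation of that rhythm, not the runtime’s actual code: ChunkLoopSketch and run are made-up names, and the transaction commit is reduced to a comment.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Function;

// Plain-Java imitation of a chunk step; no Jakarta Batch classes are used.
final class ChunkLoopSketch {

    /**
     * Reads and processes items one by one, and calls the writer each time
     * itemCount items are buffered. Returns the number of writer calls.
     */
    static <I, O> int run(Iterator<I> reader,
                          Function<I, O> processor,
                          Consumer<List<O>> writer,
                          int itemCount) {
        int writes = 0;
        List<O> buffer = new ArrayList<>();
        while (reader.hasNext()) {
            I item = reader.next();                 // READ one item
            buffer.add(processor.apply(item));      // PROCESS it right away
            if (buffer.size() == itemCount) {       // the chunk is full
                writer.accept(List.copyOf(buffer)); // WRITE the whole chunk
                buffer.clear();                     // a real runtime would commit here
                writes++;
            }
        }
        if (!buffer.isEmpty()) {                    // final, partial chunk
            writer.accept(List.copyOf(buffer));
            writes++;
        }
        return writes;
    }

    public static void main(String[] args) {
        List<Integer> ids = new ArrayList<>();
        for (int i = 1; i <= 250; i++) {
            ids.add(i);
        }
        List<Integer> chunkSizes = new ArrayList<>();
        Function<Integer, String> processor = id -> "bar-" + id;
        Consumer<List<String>> writer = chunk -> chunkSizes.add(chunk.size());

        int writes = run(ids.iterator(), processor, writer, 100);
        System.out.println(writes + " writes, sizes " + chunkSizes);
        // prints 3 writes, sizes [100, 100, 50]
    }
}
```

With 250 items and a chunk size of 100, the writer is called three times, and the last call receives the 50 leftover items.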

Why Chunks Are Smart:

Benefit | Explanation
Safety | A failed chunk is rolled back; chunks already committed are unaffected
Memory | Don’t load 1 million items at once
Checkpoints | Can restart from last good chunk
Speed | Batch database writes are faster
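
The checkpoint idea can be sketched in plain Java as well. In this made-up example (CheckpointRestartSketch, PositionReader, and runOnce are illustrative names, and the failure is simulated by simply discarding a chunk), the reader remembers its position, the position is saved after every committed chunk, and a second run resumes from the saved value instead of starting over.

```java
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

// Plain-Java sketch of checkpoint and restart; no Jakarta Batch classes are used.
final class CheckpointRestartSketch {

    /** List-backed reader whose method names imitate open/readItem/checkpointInfo. */
    static final class PositionReader {
        private final List<String> source;
        private int nextIndex;

        PositionReader(List<String> source) { this.source = source; }

        /** checkpoint is null on a first run, or the saved position on a restart. */
        void open(Serializable checkpoint) {
            nextIndex = (checkpoint == null) ? 0 : (Integer) checkpoint;
        }

        /** Returns one item per call, then null once the source is exhausted. */
        String readItem() {
            return nextIndex < source.size() ? source.get(nextIndex++) : null;
        }

        /** The position to resume from. */
        Serializable checkpointInfo() { return nextIndex; }
    }

    /**
     * Runs chunks of chunkSize starting at the given checkpoint and appends
     * written items to committed. failOnChunk (1-based, 0 = never) simulates
     * a failed write. Returns the last committed checkpoint.
     */
    static Serializable runOnce(List<String> source, Serializable checkpoint,
                                int chunkSize, int failOnChunk, List<String> committed) {
        PositionReader reader = new PositionReader(source);
        reader.open(checkpoint);
        int chunkNumber = 0;
        while (true) {
            List<String> chunk = new ArrayList<>();
            while (chunk.size() < chunkSize) {
                String item = reader.readItem();
                if (item == null) break;
                chunk.add(item);
            }
            if (chunk.isEmpty()) return checkpoint;            // nothing left to do
            chunkNumber++;
            if (chunkNumber == failOnChunk) return checkpoint; // chunk is discarded
            committed.addAll(chunk);                           // "commit" the chunk
            checkpoint = reader.checkpointInfo();              // save the position
        }
    }

    public static void main(String[] args) {
        List<String> source = List.of("a", "b", "c", "d", "e");
        List<String> committed = new ArrayList<>();

        Serializable cp = runOnce(source, null, 2, 2, committed); // second chunk fails
        System.out.println(cp + " " + committed);                 // prints 2 [a, b]

        cp = runOnce(source, cp, 2, 0, committed);                // restart
        System.out.println(cp + " " + committed);                 // prints 5 [a, b, c, d, e]
    }
}
```

After the restart, items a and b are not written a second time, because the second run opens the reader at position 2.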

📖 ItemReader

The Ingredient Collector

The ItemReader is like the worker who grabs ingredients from the warehouse. One item at a time, until there’s nothing left.

How It Works:

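import jakarta.batch.api.chunk.AbstractItemReader;
import jakarta.inject.Named;

@Named("orderReader")
public class OrderReader
    extends AbstractItemReader {

    @Override
    public Object readItem() throws Exception {
        // Get the next order from the database;
        // getNextOrder() (not shown) returns null when none are left
        Order order = getNextOrder();
        return order; // null tells the runtime the input is exhausted
    }
}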

Key Rules:

  • ✅ Returns ONE item at a time
  • ✅ Returns null when no more items
  • ✅ Keeps track of its position so it can report a checkpoint and resume after a restart

Common Reader Types:

graph TD
    A["ItemReader"] --> B["📄 File Reader"]
    A --> C["🗃️ Database Reader"]
    A --> D["🌐 API Reader"]
    A --> E["📨 Queue Reader"]

Real Example (Reading from CSV):

@Override
public Object readItem() throws Exception {
    // csvReader is assumed to be a BufferedReader opened in open()
    String line = csvReader.readLine();
    if (line == null) return null; // end of file

    String[] parts = line.split(",");
    return new Customer(
        parts[0],  // name
        parts[1]   // email
    );
}
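
The snippet above leaves out where csvReader comes from and what happens to blank or broken lines. The following plain-Java sketch fills in one possible answer. Everything in it is an assumption made for illustration: the class name CsvReaderSketch, reading from an in-memory string, and the choice to skip unusable lines silently rather than fail.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

// Plain-Java sketch; the method names only imitate a reader's open/read/close lifecycle.
final class CsvReaderSketch {
    private final Reader input;
    private BufferedReader lines;

    CsvReaderSketch(Reader input) { this.input = input; }

    /** Acquire the resource once, before the first read. */
    void open() { lines = new BufferedReader(input); }

    /** Returns {name, email} for the next usable line, or null at end of input. */
    String[] readItem() throws IOException {
        String line;
        while ((line = lines.readLine()) != null) {
            String[] parts = line.split(",", -1);
            if (parts.length == 2 && !parts[0].isBlank() && !parts[1].isBlank()) {
                return new String[] { parts[0].trim(), parts[1].trim() };
            }
            // Assumption of this sketch: unusable lines are skipped silently.
        }
        return null;
    }

    /** Release the resource when reading is finished. */
    void close() throws IOException { lines.close(); }

    public static void main(String[] args) throws IOException {
        String csv = "Ann,ann@example.com\n\nbroken line\nBob,bob@example.org\n";
        CsvReaderSketch reader = new CsvReaderSketch(new StringReader(csv));
        reader.open();
        List<String> names = new ArrayList<>();
        for (String[] row = reader.readItem(); row != null; row = reader.readItem()) {
            names.add(row[0]);
        }
        reader.close();
        System.out.println(names); // prints [Ann, Bob]
    }
}
```

Whether a broken line should be skipped, logged, or treated as an error depends on the job; skipping is just the simplest thing to show here.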

⚙️ ItemProcessor

The Quality Controller

The ItemProcessor is like the worker who inspects and transforms each item. They might:

  • Clean the data
  • Convert formats
  • Skip bad items
  • Add extra information

How It Works:

import jakarta.batch.api.chunk.ItemProcessor;
import jakarta.inject.Named;

@Named("orderProcessor")
public class OrderProcessor
    implements ItemProcessor {

    @Override
    public Object processItem(
        Object item) {

        Order order = (Order) item;

        // Skip cancelled orders
        if (order.isCancelled()) {
            return null; // Skip!
        }

        // Calculate total
        order.calculateTotal();

        return order; // Pass along
    }
}

Key Rules:

  • ✅ Receives ONE item
  • ✅ Returns transformed item OR
  • ✅ Returns null to skip item
  • ✅ Should be pure (same input = same output)

Processing Flow:

graph LR
    A["Raw Order"] --> B{Valid?}
    B -->|Yes| C["Calculate Total"]
    C --> D["Add Tax"]
    D --> E["Processed Order"]
    B -->|No| F["null/Skip"]

Real Example (Email Validation):

@Override
public Object processItem(
    Object item) {

    Customer c = (Customer) item;

    // Skip invalid emails
    if (!isValidEmail(c.getEmail())) {
        return null;
    }

    // Normalize email to lowercase
    c.setEmail(
        c.getEmail().toLowerCase()
    );

    return c;
}
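
What happens to an item once the processor returns null? The plain-Java sketch below imitates that part of the chunk loop: results that are null are simply left out of the list handed to the writer. The names ProcessorFilterSketch, looksLikeEmail, normalize, and processChunk are made up, and the email check is deliberately naive.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.function.Function;

// Plain-Java sketch of processor filtering; no Jakarta Batch classes are used.
final class ProcessorFilterSketch {

    /** Deliberately naive check, for illustration only. */
    static boolean looksLikeEmail(String s) {
        int at = s.indexOf('@');
        return at > 0 && at < s.length() - 1;
    }

    /** Imitates the processor contract: a transformed item, or null to filter it out. */
    static String normalize(String email) {
        return looksLikeEmail(email) ? email.toLowerCase(Locale.ROOT) : null;
    }

    /** Collects processor results for one chunk; null results never reach the writer's list. */
    static <I, O> List<O> processChunk(List<I> items, Function<I, O> processor) {
        List<O> toWrite = new ArrayList<>();
        for (I item : items) {
            O result = processor.apply(item);
            if (result != null) {
                toWrite.add(result);
            }
        }
        return toWrite;
    }

    public static void main(String[] args) {
        List<String> raw = List.of("Ann@Example.com", "not-an-email", "BOB@example.org");
        System.out.println(processChunk(raw, ProcessorFilterSketch::normalize));
        // prints [ann@example.com, bob@example.org]
    }
}
```

Three items go in and two come out, so the writer for this chunk would receive a list of two.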

📤 ItemWriter

The Packaging Team

The ItemWriter is like the team that packages finished products and sends them out. They work on batches, not individual items, which is much more efficient!

How It Works:

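import java.util.List;

import jakarta.batch.api.chunk.AbstractItemWriter;
import jakarta.inject.Named;

@Named("orderWriter")
public class OrderWriter
    extends AbstractItemWriter {

    @Override
    public void writeItems(
        List<Object> items) throws Exception {

        // The whole chunk arrives in one call
        for (Object item : items) {
            Order order = (Order) item;
            database.save(order);
        }
    }
}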

Key Rules:

  • ✅ Receives a LIST of items (the chunk)
  • ✅ Runs inside the transaction that the batch runtime opens around each chunk
  • ✅ All-or-nothing (all succeed or all fail)

Why Batch Writing Rocks:

One-by-One | Batch (100 items)
100 database calls | 1 database call
Slow | Fast!
100 transactions | 1 transaction

Real Example (Bulk Insert):

@Override
public void writeItems(
    List<Object> items) {

    // Convert to proper type
    List<Customer> customers =
        items.stream()
        .map(i -> (Customer) i)
        .toList();

    // Single bulk insert!
    repository.saveAll(customers);
}

🔄 The Complete Picture

Let’s see how everything works together:

graph LR
    subgraph "Job: ProcessMonthlyBills"
        A["Start"] --> B["Step 1: Read Customers"]
        B --> C["Step 2: Calculate Bills"]
        C --> D["Step 3: Send Emails"]
        D --> E["End"]
    end
    subgraph "Inside Step 2"
        F["ItemReader"] --> G["ItemProcessor"]
        G --> H["ItemWriter"]
    end

Complete Job Example:

<job id="monthlyBilling">
    <step id="calculateBills">
        <chunk item-count="500">
            <reader
                ref="customerReader"/>
            <processor
                ref="billCalculator"/>
            <writer
                ref="billWriter"/>
        </chunk>
    </step>
</job>

What Happens:

  1. 📖 Read customers one at a time, up to 500 per chunk
  2. ⚙️ Calculate the bill for each customer as it is read
  3. 📤 Save the chunk’s bills to the database in one write
  4. 🔄 Repeat until all customers are done!

🎉 Summary: Your Batch Processing Factory

Component | Role | Factory Analogy
Job | Master plan | Factory blueprint
Step | One task | Workstation
Chunk | Group of items | Box of 100 items
ItemReader | Get items | Warehouse worker
ItemProcessor | Transform items | Quality checker
ItemWriter | Save results | Packaging team

Remember This Flow:

📦 READ → ⚙️ PROCESS → 📤 WRITE → 🔄 REPEAT

You now understand Jakarta Batch Processing! 🎊

Think of it as building an efficient factory assembly line where:

  • Jobs are your production plans
  • Steps are your workstations
  • Chunks keep work manageable
  • Reader, Processor, Writer are your specialized workers

Happy batch processing! 🏭✨
