🧠 Conversation Memory in LangChain
Teaching Your AI to Remember Like a Friend
🎭 The Story: Meet Maya, the Forgetful Robot
Imagine you have a robot friend named Maya. Every time you talk to her, she forgets everything you said before!
- You: “Hi Maya, I’m Alex and I love pizza!”
- Maya: “Nice to meet you!”
- You: “What’s my name?”
- Maya: “I don’t know. Who are you?”
Frustrating, right? That’s what happens when AI has no memory.
Now imagine Maya has a magic notebook 📓 where she writes down everything you say. Next time you chat, she reads her notebook first!
- You: “What’s my name?”
- Maya: (checks notebook) “You’re Alex, and you love pizza!”
That magic notebook? That’s Conversation Memory in LangChain!
🗂️ What is Chat History Management?
Think of chat history as Maya’s notebook. Every message you send and every reply she gives gets written down in order.
The Simple Idea
chat_history = [
{"role": "user", "message": "I'm Alex"},
{"role": "ai", "message": "Hi Alex!"},
{"role": "user", "message": "I like pizza"},
{"role": "ai", "message": "Pizza is yummy!"}
]
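In LangChain itself, each notebook entry is a message object rather than a plain dict. Here is roughly the same history written with the built-in message classes:
from langchain_core.messages import HumanMessage, AIMessage

chat_history = [
    HumanMessage(content="I'm Alex"),
    AIMessage(content="Hi Alex!"),
    HumanMessage(content="I like pizza"),
    AIMessage(content="Pizza is yummy!"),
]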
Why It Matters
- AI reads old messages before replying
- Conversations feel natural and connected
- No need to repeat yourself every time
graph TD A["You Say Something"] --> B["Add to History"] B --> C["AI Reads History"] C --> D["AI Replies"] D --> B
✂️ Message Trimming: Keeping the Notebook Tidy
Maya’s notebook can get too full! If she wrote down 1000 pages of conversation, she’d take forever to read it all.
The Problem
AI has a token limit (like a word budget). Too many messages = AI can’t process them.
The Solution: Trimming
We keep only the most recent messages and throw away old ones.
from langchain_core.messages import trim_messages

# Keep only as many of the most recent messages as fit in ~1000 tokens
trimmed = trim_messages(
    messages,
    max_tokens=1000,
    strategy="last",    # keep the newest messages
    token_counter=llm,  # a chat model (or any counting function) used to count tokens
)
Trimming Strategies
| Strategy | What It Does | Best For |
|---|---|---|
| `last` | Keeps newest messages | Most chats |
| `first` | Keeps oldest messages | Context setup |
Real Example: Your chat has 50 messages. AI can only handle 20. → Trimming keeps messages 31-50, removes 1-30.
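If you would rather think in whole messages than tokens, trim_messages also accepts a custom counter. A minimal sketch that counts each message as one unit, so max_tokens effectively becomes "max messages":
from langchain_core.messages import trim_messages

# Treat every message as one "token", so this keeps only the last 20 messages
last_twenty = trim_messages(
    messages,
    max_tokens=20,
    strategy="last",
    token_counter=len,  # len(list_of_messages) = number of messages
)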
🔀 Message Filtering and Merging
Sometimes Maya’s notebook has messy notes. We need to clean it up!
Filtering: Picking What Matters
Maybe we only want to keep certain types of messages:
# Keep only human and AI messages
# Remove system messages from history
filtered = [
msg for msg in messages
if msg.type in ["human", "ai"]
]
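Recent versions of langchain_core also include a filter_messages helper that does the same job, if you prefer a built-in over a list comprehension:
from langchain_core.messages import filter_messages

# Keep only human and AI messages; system and tool messages are dropped
filtered = filter_messages(messages, include_types=["human", "ai"])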
Merging: Combining Similar Messages
If you sent 3 messages in a row, we can combine them:
Before Merging:
- User: “Hi”
- User: “I need help”
- User: “With Python”
After Merging:
- User: “Hi. I need help with Python.”
from langchain_core.messages import merge_message_runs

merged = merge_message_runs(messages)
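Concretely, here is a small sketch of the example above. Consecutive messages of the same type get combined; by default their contents are joined with newlines:
from langchain_core.messages import HumanMessage, merge_message_runs

messages = [
    HumanMessage(content="Hi"),
    HumanMessage(content="I need help"),
    HumanMessage(content="With Python"),
]

merged = merge_message_runs(messages)
# merged == [one HumanMessage whose content combines all three lines]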
Why Filter and Merge?
- 📉 Saves tokens (your word budget)
- 🎯 Focuses on important stuff
- 🧹 Cleaner conversations
🔗 RunnableWithMessageHistory
This is like giving Maya a super-powered notebook that automatically:
- Saves every conversation
- Loads the right history for each person
- Works with any chat model
The Magic Wrapper
from langchain_core.runnables.history import (
RunnableWithMessageHistory
)
chain_with_memory = RunnableWithMessageHistory(
    runnable=my_chat_chain,
    get_session_history=get_history_func,
    input_messages_key="input",      # which key in the input dict holds the new message
    history_messages_key="history",  # where past messages get injected into the prompt
)
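The get_history_func part is something you write yourself: given a session id, return that session's history object. A minimal in-memory sketch (the store dict here is just an example, not a LangChain API):
from langchain_community.chat_message_histories import ChatMessageHistory

store = {}  # session_id -> ChatMessageHistory

def get_history_func(session_id: str) -> ChatMessageHistory:
    # Create a fresh notebook the first time we see this session
    if session_id not in store:
        store[session_id] = ChatMessageHistory()
    return store[session_id]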
How It Works
graph TD A["User Sends Message"] --> B["Load Session History"] B --> C["Add History to Prompt"] C --> D["AI Generates Reply"] D --> E["Save New Messages"] E --> F["Return Reply"]
Real Example
# Each user gets their own history!
response = chain_with_memory.invoke(
{"input": "Remember my name is Alex"},
config={"configurable": {
"session_id": "alex_chat_001"
}}
)
Alex’s conversation stays separate from Bob’s! 🎉
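Because history is keyed by session_id, a follow-up call in the same session remembers the name, while a different session starts from a blank notebook. A quick sketch:
# Same session: earlier messages are loaded automatically
chain_with_memory.invoke(
    {"input": "What's my name?"},
    config={"configurable": {"session_id": "alex_chat_001"}},
)  # the model can answer "Alex"

# Different session (e.g. Bob's): no shared history, so it has no idea
chain_with_memory.invoke(
    {"input": "What's my name?"},
    config={"configurable": {"session_id": "bob_chat_001"}},
)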
💾 Message History Storage
Where does Maya keep her notebook? She needs a safe place to store it!
In-Memory Storage (Temporary)
Like writing on a whiteboard - gone when you restart!
from langchain_community.chat_message_histories import (
ChatMessageHistory
)
# Simple in-memory storage
memory = ChatMessageHistory()
memory.add_user_message("Hello!")
memory.add_ai_message("Hi there!")
Persistent Storage (Permanent)
Like saving to a real notebook - stays forever!
Popular Options:
| Storage | Best For |
|---|---|
| Redis | Fast, temporary |
| PostgreSQL | Structured data |
| MongoDB | Flexible data |
| File System | Simple projects |
from langchain_community.chat_message_histories import (
RedisChatMessageHistory
)
history = RedisChatMessageHistory(
session_id="user_123",
url="redis://localhost:6379"
)
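For the "File System" row, langchain_community also ships a simple file-backed history. A sketch (the file name is just an example):
from langchain_community.chat_message_histories import FileChatMessageHistory

# Messages are saved to a JSON file on disk and reloaded on the next run
history = FileChatMessageHistory(file_path="alex_chat_history.json")
history.add_user_message("Hello again!")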
🧵 Thread-Based Conversations
Imagine Maya has multiple notebooks - one for each topic!
What Are Threads?
A thread is a separate conversation stream. Like different chat rooms!
# Thread 1: Talking about cooking
thread_cooking = "user_alex_cooking"
# Thread 2: Talking about coding
thread_coding = "user_alex_coding"
Why Use Threads?
- 🍳 Ask about recipes in one thread
- 💻 Ask about Python in another
- No confusion between topics!
Example Setup
store = {}  # session_id -> history object (one per thread)

def get_session_history(session_id: str):
    # Each thread gets its own history
    return store.setdefault(session_id, ChatMessageHistory())
# Cooking conversation
chain.invoke(
{"input": "How do I make pasta?"},
config={"configurable": {
"session_id": "alex_cooking_thread"
}}
)
# Coding conversation (separate!)
chain.invoke(
{"input": "How do I write a loop?"},
config={"configurable": {
"session_id": "alex_coding_thread"
}}
)
🗃️ Long-Term Memory Stores
What if Maya could remember things forever? Not just this conversation, but facts about you across all chats!
Short-Term vs Long-Term Memory
| Type | Duration | Example |
|---|---|---|
| Short-term | One conversation | “You asked about pizza” |
| Long-term | Forever | “Alex loves pizza” |
How Long-Term Memory Works
graph TD A["User Shares Info"] --> B["AI Extracts Facts"] B --> C["Store in Long-Term DB"] D["New Conversation Starts"] --> E[Load User's Facts] E --> F["AI Knows User Already!"]
Example: User Profile Memory
# Long-term facts stored separately
user_profile = {
"name": "Alex",
"likes": ["pizza", "coding", "cats"],
"timezone": "EST",
"skill_level": "beginner"
}
# AI uses these across ALL conversations
prompt = f"""
User Profile: {user_profile}
Chat History: {recent_messages}
User Question: {question}
"""
Building Long-Term Memory
from langchain.memory import VectorStoreRetrieverMemory
# Store important facts in vector database
long_term_memory = VectorStoreRetrieverMemory(
retriever=vectorstore.as_retriever(),
memory_key="relevant_history"
)
# AI searches for relevant memories
# when answering questions!
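Here is a sketch of wiring that up end to end, assuming the legacy langchain.memory module plus a FAISS vector store with OpenAI embeddings (any vector store with as_retriever() should work the same way):
from langchain.memory import VectorStoreRetrieverMemory
from langchain_community.vectorstores import FAISS
from langchain_openai import OpenAIEmbeddings

# Seed a small vector store to hold long-term facts
vectorstore = FAISS.from_texts(
    ["Alex is a beginner programmer"], OpenAIEmbeddings()
)

long_term_memory = VectorStoreRetrieverMemory(
    retriever=vectorstore.as_retriever(search_kwargs={"k": 2}),
    memory_key="relevant_history",
)

# Save a new fact from the conversation...
long_term_memory.save_context({"input": "I love pizza"}, {"output": "Noted!"})

# ...and later pull back whatever is most relevant to the question
long_term_memory.load_memory_variables({"prompt": "What food does Alex like?"})
# -> {"relevant_history": "...the stored facts closest to the question..."}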
🎯 Putting It All Together
Here’s how Maya’s complete memory system works:
graph TD A["User Message Arrives"] --> B{Which Thread?} B --> C["Load Thread History"] C --> D["Trim if Too Long"] D --> E["Filter & Merge"] E --> F["Add Long-Term Facts"] F --> G["Send to AI"] G --> H["Get Response"] H --> I["Save to History"] I --> J["Extract New Facts"] J --> K["Update Long-Term Memory"]
Complete Code Example
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables.history import (
RunnableWithMessageHistory
)
# 1. Create your chat model
llm = ChatOpenAI(model="gpt-4")
# 2. Create prompt with history
prompt = ChatPromptTemplate.from_messages([
("system", "You are a helpful assistant."),
("placeholder", "{history}"),
("human", "{input}")
])
# 3. Build the chain
chain = prompt | llm
# 4. Add memory wrapper
chain_with_memory = RunnableWithMessageHistory(
chain,
get_session_history, # Your function
input_messages_key="input",
history_messages_key="history"
)
# 5. Use it!
response = chain_with_memory.invoke(
{"input": "My name is Alex"},
config={"configurable": {
"session_id": "alex_main"
}}
)
🌟 Key Takeaways
| Concept | What It Does | Analogy |
|---|---|---|
| Chat History | Stores all messages | Maya’s notebook |
| Trimming | Removes old messages | Tearing out old pages |
| Filtering | Keeps only useful messages | Highlighting important notes |
| Merging | Combines similar messages | Summarizing pages |
| RunnableWithMessageHistory | Auto-manages history | Self-organizing notebook |
| Message Storage | Saves history permanently | Saving notebook to cloud |
| Threads | Separate conversations | Different notebooks |
| Long-Term Memory | Remembers facts forever | Encyclopedia about user |
🚀 You Did It!
Now you understand how to give AI a memory! Your chatbots can:
- ✅ Remember what users said
- ✅ Keep conversations organized by thread
- ✅ Know users across multiple chats
- ✅ Stay within token limits
- ✅ Store history permanently
Maya the robot is no longer forgetful. She’s become a great friend who remembers everything that matters! 🤖❤️
