🧭 Vector Search and RAG: The Magic of Embeddings
The Story of the Librarian Who Understood Meaning
Imagine you’re a librarian in a magical library. But this isn’t an ordinary library — books here don’t have titles or authors on the cover. Instead, each book is stored as a special “location code” that tells you what the book is about, not just what it’s called.
When someone asks for “a story about brave heroes fighting dragons,” you don’t search for those exact words. Instead, you find books whose location codes are close together in your magical map — because similar stories live near each other!
That’s exactly what embeddings do for AI! 🎯
🎪 What Are Embeddings? (Overview)
The Simple Idea
An embedding is like a secret address for words, sentences, or documents.
Instead of storing text as letters (“Hello”), we convert it into a list of numbers:
"Hello" → [0.23, -0.45, 0.89, 0.12, ...]
These numbers capture the meaning — not just the spelling.
Why Numbers?
Think of it like GPS coordinates! 🗺️
| Your Address | GPS Coordinates |
|---|---|
| “My House” | (40.7128, -74.0060) |
| “Neighbor’s House” | (40.7130, -74.0058) |
Close coordinates = close locations!
With embeddings:
- “Happy” and “Joyful” → nearby numbers
- “Happy” and “Refrigerator” → far apart numbers
Real Example
from langchain_openai import OpenAIEmbeddings
embeddings = OpenAIEmbeddings()
# Turn text into numbers!
result = embeddings.embed_query("I love pizza")
print(len(result)) # 1536 numbers!
The computer now “understands” that “I love pizza” is similar to “Pizza is my favorite food” — because their number lists look alike!
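A minimal sketch to check that for yourself (it assumes an OpenAI API key is configured; the comparison uses cosine similarity, which we cover properly later in this section):
from langchain_openai import OpenAIEmbeddings
from numpy import dot
from numpy.linalg import norm

embeddings = OpenAIEmbeddings()

# Embed both sentences into number lists
v1 = embeddings.embed_query("I love pizza")
v2 = embeddings.embed_query("Pizza is my favorite food")

# Cosine similarity: closer to 1.0 = closer in meaning
print(round(dot(v1, v2) / (norm(v1) * norm(v2)), 2))
# Expect a high score (roughly 0.8+), since both sentences are about loving pizza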
⚙️ Embedding Model Configuration
Picking Your Translator
Different embedding models are like different translators. Some are fast, some are accurate, some are cheap!
graph TD A["Choose Model"] --> B{What matters most?} B -->|Speed| C["text-embedding-3-small"] B -->|Accuracy| D["text-embedding-3-large"] B -->|Free/Local| E["HuggingFace Models"] B -->|Privacy| F["Ollama Local"]
OpenAI Configuration
from langchain_openai import OpenAIEmbeddings
# Basic setup
embeddings = OpenAIEmbeddings(
    model="text-embedding-3-small"
)
# With more options
embeddings = OpenAIEmbeddings(
    model="text-embedding-3-large",
    dimensions=1024  # Fewer dimensions: smaller vectors, faster search, slight quality trade-off
)
HuggingFace (Free!)
from langchain_huggingface import (
HuggingFaceEmbeddings
)
embeddings = HuggingFaceEmbeddings(
model_name="all-MiniLM-L6-v2"
)
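The diagram above also lists Ollama for fully private, local embeddings. A rough sketch, assuming you have Ollama running locally, an embedding model such as nomic-embed-text already pulled, and the langchain-ollama package installed:
from langchain_ollama import OllamaEmbeddings

# Runs entirely on your machine: no API key, no data leaves your computer
embeddings = OllamaEmbeddings(
    model="nomic-embed-text"  # any embedding model you have pulled with `ollama pull`
)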
Configuration Tips
| Setting | What It Does |
|---|---|
| `model` | Which brain (embedding model) to use |
| `dimensions` | Size of the number list (vector length) |
| `chunk_size` | How many texts to embed per API call |
Remember: Bigger isn’t always better! A smaller, faster model often works great.
🛠️ Creating Embeddings
Two Types of Creation
Think of embeddings like making IDs:
- Query Embeddings — for questions you ask
- Document Embeddings — for information you store
graph TD A["Your Text"] --> B{What is it?} B -->|A Question| C["embed_query"] B -->|Info to Store| D["embed_documents"] C --> E["One Vector"] D --> F["List of Vectors"]
Single Query
When someone asks a question:
# User asks: "What's the weather?"
question = "What's the weather?"
# Turn it into numbers
query_vector = embeddings.embed_query(
question
)
print(type(query_vector)) # list
print(len(query_vector)) # 1536
Multiple Documents
When storing information:
# Your knowledge base
docs = [
"The sun is a star",
"Water boils at 100°C",
"Python is a programming language"
]
# Turn ALL into numbers
doc_vectors = embeddings.embed_documents(
docs
)
print(len(doc_vectors)) # 3 vectors
print(len(doc_vectors[0])) # 1536 each
Quick Example: Finding Similar Text
from langchain_openai import OpenAIEmbeddings
embeddings = OpenAIEmbeddings()
# Store some facts
facts = [
"Dogs are loyal pets",
"Cats are independent",
"Fish live in water"
]
fact_vectors = embeddings.embed_documents(facts)
# Ask a question
question = "Which pet loves its owner?"
q_vector = embeddings.embed_query(question)
# Now compare! (we'll learn how soon)
💾 CacheBackedEmbeddings
The Problem
Creating embeddings costs:
- ⏱️ Time — every API call adds network latency
- 💰 Money — Each call = more charges
- 🌐 Bandwidth — Network requests
What if you embed the same text twice? Wasteful!
The Solution: Caching!
Imagine a notebook 📓 where you write down every translation you’ve ever done. Next time someone asks for the same translation — just look it up!
graph TD A["Text to Embed"] --> B{In Cache?} B -->|Yes!| C["Return Saved Result"] B -->|No| D["Call API"] D --> E["Save to Cache"] E --> C
How to Use It
from langchain.embeddings import (
CacheBackedEmbeddings
)
from langchain.storage import (
LocalFileStore
)
from langchain_openai import OpenAIEmbeddings
# 1. Create your base embeddings
base = OpenAIEmbeddings()
# 2. Create a place to store cache
store = LocalFileStore("./cache/")
# 3. Wrap with caching!
cached_embeddings = CacheBackedEmbeddings.from_bytes_store(
base,
store,
namespace="openai"
)
Why Namespace Matters
# Different models = different caches
cache_openai = CacheBackedEmbeddings.from_bytes_store(
OpenAIEmbeddings(),
store,
namespace="openai" # Separate!
)
cache_huggingface = CacheBackedEmbeddings.from_bytes_store(
HuggingFaceEmbeddings(),
store,
namespace="huggingface" # Separate!
)
Cache Storage Options
| Storage | Best For |
|---|---|
| `LocalFileStore` | Simple projects |
| `RedisStore` | Production apps |
| `InMemoryStore` | Testing only |
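Swapping backends is a one-line change. For example, a small sketch using an in-memory store (the InMemoryByteStore variant, which stores raw bytes as from_bytes_store expects; assumed importable from langchain.storage like LocalFileStore). Handy for tests, but the cache is lost when the program exits:
from langchain.embeddings import CacheBackedEmbeddings
from langchain.storage import InMemoryByteStore
from langchain_openai import OpenAIEmbeddings

# Cache lives only in RAM: perfect for quick tests, gone on restart
test_embeddings = CacheBackedEmbeddings.from_bytes_store(
    OpenAIEmbeddings(),
    InMemoryByteStore(),
    namespace="test"
)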
Real Savings
import time
# First call - hits API
start = time.time()
cached_embeddings.embed_documents(texts)
print(f"First: {time.time()-start:.2f}s")
# Second call - from cache!
start = time.time()
cached_embeddings.embed_documents(texts)
print(f"Cached: {time.time()-start:.4f}s")
# Output:
# First: 0.45s
# Cached: 0.0012s ← 375x faster!
📏 Embedding Similarity Metrics
How Do We Compare Embeddings?
Remember our magical library? We need to know: How close are two books?
There are different ways to measure “closeness”:
1. Cosine Similarity (Most Popular!)
Think of two arrows pointing from the center of a circle:
- Same direction = 1.0 (identical!)
- Opposite direction = -1.0 (opposites)
- Perpendicular = 0.0 (unrelated)
graph LR
    A((Center)) --> B["Happy 😊"]
    A --> C["Joyful 🎉"]
    A --> D["Sad 😢"]
    B -.->|0.95| C
    B -.->|-0.3| D
from numpy import dot
from numpy.linalg import norm
def cosine_similarity(a, b):
    return dot(a, b) / (norm(a) * norm(b))

# Compare two embeddings (reuses the `embeddings` object from earlier)
score = cosine_similarity(
    embeddings.embed_query("happy"),
    embeddings.embed_query("joyful")
)
print(score)  # ~0.92, very similar!
2. Euclidean Distance
Like measuring with a ruler on a map:
- Smaller = more similar
- Bigger = less similar
import numpy as np
from numpy.linalg import norm

def euclidean_distance(a, b):
    # Convert to arrays so subtraction also works on plain Python lists
    return norm(np.array(a) - np.array(b))

dist = euclidean_distance(vec1, vec2)
# 0.0 = identical
# Higher = more different
3. Dot Product
Simple multiplication and sum:
- Higher = more similar
- Works best with normalized vectors
from numpy import dot
similarity = dot(vec1, vec2)
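Why do normalized vectors matter? If every vector is scaled to length 1, the dot product equals cosine similarity exactly, so you get the same ranking with less arithmetic. A small sketch of the idea, reusing the embeddings object from earlier (OpenAI embeddings already come back normalized to length 1, which is why a plain dot product often works for them):
import numpy as np

vec1 = np.array(embeddings.embed_query("happy"))
vec2 = np.array(embeddings.embed_query("joyful"))

# Scale each vector to length 1
unit1 = vec1 / np.linalg.norm(vec1)
unit2 = vec2 / np.linalg.norm(vec2)

# For unit-length vectors, dot product == cosine similarity
print(np.dot(unit1, unit2))
print(np.dot(vec1, vec2) / (np.linalg.norm(vec1) * np.linalg.norm(vec2)))  # same number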
Which Should You Use?
| Metric | When to Use |
|---|---|
| Cosine | Most cases! Default choice |
| Euclidean | When magnitude matters |
| Dot Product | Normalized vectors, speed |
Quick Comparison Tool
from langchain_openai import OpenAIEmbeddings
from numpy import dot
from numpy.linalg import norm
embeddings = OpenAIEmbeddings()
def compare(text1, text2):
    v1 = embeddings.embed_query(text1)
    v2 = embeddings.embed_query(text2)
    cos = dot(v1, v2) / (norm(v1) * norm(v2))
    return round(cos, 3)
# Try it!
print(compare("I love cats", "Cats are great"))
# ~0.89
print(compare("I love cats", "The stock market"))
# ~0.23
🎯 Putting It All Together
Here’s a complete mini-example combining everything:
from langchain_openai import OpenAIEmbeddings
from langchain.embeddings import (
CacheBackedEmbeddings
)
from langchain.storage import LocalFileStore
from numpy import dot
from numpy.linalg import norm
# 1. Configure model
base_embeddings = OpenAIEmbeddings(
model="text-embedding-3-small"
)
# 2. Add caching
store = LocalFileStore("./my_cache/")
embeddings = CacheBackedEmbeddings.from_bytes_store(
base_embeddings,
store,
namespace="demo"
)
# 3. Create embeddings
docs = [
"Python is great for AI",
"JavaScript runs in browsers",
"Machine learning needs data"
]
doc_vectors = embeddings.embed_documents(docs)
# 4. Search with similarity
query = "Which language for AI?"
q_vector = embeddings.embed_query(query)
# 5. Find best match
for i, doc_vec in enumerate(doc_vectors):
    score = dot(q_vector, doc_vec) / (
        norm(q_vector) * norm(doc_vec)
    )
    print(f"{docs[i]}: {score:.3f}")
# Output:
# Python is great for AI: 0.891 ← Winner!
# JavaScript runs in browsers: 0.654
# Machine learning needs data: 0.823
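And if you just want the single best match instead of eyeballing the printed scores, a tiny follow-up sketch (reusing docs, doc_vectors, and q_vector from above):
import numpy as np

# Score every document against the query, then pick the highest
scores = [
    np.dot(q_vector, d) / (np.linalg.norm(q_vector) * np.linalg.norm(d))
    for d in doc_vectors
]
best = int(np.argmax(scores))
print(f"Best match: {docs[best]} ({scores[best]:.3f})")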
🚀 Key Takeaways
| Concept | Remember This |
|---|---|
| Embeddings | Turn text into meaning-numbers |
| Configuration | Pick model by speed/cost/accuracy |
| Creating | embed_query for questions, embed_documents for data |
| Caching | Save time and money — don’t repeat! |
| Similarity | Cosine similarity = your best friend |
You did it! 🎉 You now understand how AI “reads” text by converting it into numbers that capture meaning. This is the foundation of every modern search engine, chatbot, and recommendation system!
