# 🚚 LangServe: Your AI's Delivery Service

Imagine you baked the world's most amazing cake. Now you need to deliver it to everyone who wants a slice. LangServe is your delivery truck!
## 🎯 The Big Picture

You've built an awesome AI chain with LangChain. It works perfectly on your computer. But how do you share it with the world?

LangServe is like turning your local bakery into a delivery service. Anyone can order your AI "cakes" through the internet!
```mermaid
graph TD
    A["🧠 Your LangChain App"] --> B["📦 LangServe Wraps It"]
    B --> C["🌐 REST API Created"]
    C --> D["📱 Anyone Can Use It!"]
```
## 🚀 LangServe Deployment

### What is Deployment?

Think of it like this:

- Development = Cooking in your kitchen
- Deployment = Opening a restaurant for customers

LangServe helps you open that restaurant!

### The Simple Setup
```python
# Install LangServe with its server dependencies first:
# pip install "langserve[all]"

from fastapi import FastAPI
from langserve import add_routes
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate

# Create your AI chain
prompt = ChatPromptTemplate.from_template(
    "Tell me a joke about {topic}"
)
model = ChatOpenAI()
chain = prompt | model

# Create the restaurant (FastAPI app)
app = FastAPI(
    title="Joke Server",
    description="Get AI jokes!"
)

# Add the menu item (your chain)
add_routes(app, chain, path="/jokes")
```
### What `add_routes` Does

It's like adding a dish to your menu. This one line gives you:
| Endpoint | What It Does |
|---|---|
| `/jokes/invoke` | Get one response |
| `/jokes/batch` | Get many responses |
| `/jokes/stream` | Get response piece by piece |
| `/jokes/playground` | Test it in a browser! |
The Playground is like a tasting booth where anyone can try your AI before using it in their app!
## 🌐 REST API Creation

### What's a REST API?
Imagine a waiter at a restaurant:
- You ask for something (request)
- The kitchen makes it (processing)
- The waiter brings it back (response)
A REST API is your digital waiter!
### How LangServe Creates APIs
```python
# Your chain becomes these endpoints:

# POST /jokes/invoke
#   Send: {"input": {"topic": "cats"}}
#   Get:  {"output": "Why did the cat sit on the computer?
#          To keep an eye on the mouse!"}
# (With a chat model, the real "output" is a message object;
#  its "content" field holds the joke text.)

# POST /jokes/batch
#   Send: {"inputs": [{"topic": "cats"}, {"topic": "dogs"}]}
#   Get:  {"output": ["cat joke", "dog joke"]}
```
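Want to see this from the client's side without any LangChain code at all? Here's a quick sketch using the `requests` library, assuming the Joke Server above is running on localhost:8000:

```python
import requests

# Call the /jokes/invoke endpoint exposed by add_routes above.
# With a chat model, "output" is a serialized message, so we
# read its "content" field for the joke text.
response = requests.post(
    "http://localhost:8000/jokes/invoke",
    json={"input": {"topic": "cats"}},
)
response.raise_for_status()
print(response.json()["output"]["content"])
```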
### Running Your Server

```python
# At the end of your file:
if __name__ == "__main__":
    import uvicorn

    uvicorn.run(app, host="0.0.0.0", port=8000)
```
Now visit http://localhost:8000/docs to see your API documentation!
```mermaid
graph TD
    A["📱 User's App"] -->|POST request| B["🚪 /jokes/invoke"]
    B --> C["⚙️ Your Chain Runs"]
    C --> D["📤 JSON Response"]
    D --> A
```
## 📞 RemoteRunnable Client

### What's `RemoteRunnable`?
Remember when we talked about the restaurant?
- LangServe = The restaurant
- RemoteRunnable = The phone for delivery orders!
It lets any Python app call your LangServe API like it's a local chain.
### Magic Simplicity
```python
from langserve import RemoteRunnable

# Connect to the restaurant
joke_chain = RemoteRunnable("http://localhost:8000/jokes")

# Use it like a normal chain!
result = joke_chain.invoke({"topic": "pizza"})
print(result.content)  # the chain returns a chat message
```
### Why This is Amazing
| Without RemoteRunnable | With RemoteRunnable |
|---|---|
| Write HTTP code | Just .invoke() |
| Handle JSON yourself | Automatic! |
| Parse responses | Already done |
| Complex! | Simple! |
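And because RemoteRunnable implements the same Runnable interface as a local chain, the other methods come along for free. A small sketch, assuming the joke server is still running:

```python
from langserve import RemoteRunnable

joke_chain = RemoteRunnable("http://localhost:8000/jokes")

# batch() fans out to the /jokes/batch endpoint
results = joke_chain.batch([{"topic": "cats"}, {"topic": "dogs"}])
for message in results:
    print(message.content)
```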
### The Full Picture
```python
# On the SERVER (Restaurant)
from langserve import add_routes

add_routes(app, my_chain, path="/ai")

# On the CLIENT (Customer's house)
from langserve import RemoteRunnable

my_chain = RemoteRunnable("http://server:8000/ai")

# Both use the SAME interface!
result = my_chain.invoke({"input": "hello"})
```

It's like magic! Your remote AI feels like it's running locally.
## 🌊 Streaming in Production

### Why Streaming Matters
Imagine ordering pizza:
- Without streaming: Wait 30 minutes. Get whole pizza at once.
- With streaming: Watch them make it. Get slices as ready!
For AI, streaming means seeing words appear one by one, like someone typing!
### Server-Side Streaming
```python
# LangServe automatically supports streaming!
# Just use the /stream endpoint.

# POST /jokes/stream
# The response arrives as server-sent events, chunk by chunk:
#   {"content": "Why"}
#   {"content": " did"}
#   {"content": " the"}
#   {"content": " chicken..."}
```
### Client-Side Streaming
```python
from langserve import RemoteRunnable

chain = RemoteRunnable("http://localhost:8000/jokes")

# Stream the response!
for chunk in chain.stream({"topic": "robots"}):
    # Each chunk is a message chunk; .content is the text
    print(chunk.content, end="", flush=True)

# Words appear one by one!
```
### Async Streaming (For Web Apps)
```python
# Inside an async function (non-blocking streaming!):
async for chunk in chain.astream({"topic": "space"}):
    print(chunk.content, end="", flush=True)
```
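For a complete, runnable version, wrap the loop in a coroutine and hand it to asyncio. A minimal sketch reusing the same client:

```python
import asyncio

from langserve import RemoteRunnable

chain = RemoteRunnable("http://localhost:8000/jokes")

async def main() -> None:
    # astream() yields message chunks without blocking the event loop
    async for chunk in chain.astream({"topic": "space"}):
        print(chunk.content, end="", flush=True)
    print()

asyncio.run(main())
```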
```mermaid
graph LR
    A["🤖 AI Generates"] -->|chunk 1| B["📱 User Sees"]
    A -->|chunk 2| B
    A -->|chunk 3| B
    A -->|chunk 4| B
    style A fill:#e1f5fe
    style B fill:#c8e6c9
```
### Why Users Love Streaming

- Feels faster (even if same total time)
- More engaging (watch AI "think")
- Better UX (no staring at blank screen)
## 🔒 Production Best Practices

### 1. Always Add Authentication

Don't let strangers eat your cake for free!
```python
from fastapi import Depends, HTTPException
from fastapi.security import APIKeyHeader

API_KEY = "your-secret-key"  # in real code, read this from an env variable (see below)
api_key_header = APIKeyHeader(name="X-API-Key")

def verify_key(api_key: str = Depends(api_key_header)):
    if api_key != API_KEY:
        raise HTTPException(status_code=403)
    return api_key

# Protected route
add_routes(
    app,
    chain,
    path="/jokes",
    dependencies=[Depends(verify_key)],
)
```
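On the client side, RemoteRunnable can forward the matching header. A minimal sketch, where the header name and key mirror the server above:

```python
from langserve import RemoteRunnable

# Send the API key with every request via the headers argument
chain = RemoteRunnable(
    "http://localhost:8000/jokes",
    headers={"X-API-Key": "your-secret-key"},
)
print(chain.invoke({"topic": "security"}).content)
```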
### 2. Set Rate Limits

Don't let one customer order 1000 pizzas!
```python
from fastapi import Request
from slowapi import Limiter, _rate_limit_exceeded_handler
from slowapi.errors import RateLimitExceeded
from slowapi.util import get_remote_address

limiter = Limiter(key_func=get_remote_address)
app.state.limiter = limiter
app.add_exception_handler(RateLimitExceeded, _rate_limit_exceeded_handler)

# 10 requests per minute (the route decorator goes on top)
@app.post("/custom-endpoint")
@limiter.limit("10/minute")
async def custom(request: Request):
    # Your code here
    pass
```
### 3. Add Health Checks
Like checking if the restaurant is open:
```python
@app.get("/health")
def health():
    return {"status": "healthy"}
```
### 4. Use Environment Variables
Never put secrets in code!
```python
import os

# Bad! Don't do this!
# api_key = "sk-secret123"

# Good! Do this!
api_key = os.getenv("OPENAI_API_KEY")
```
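For local development, a `.env` file keeps secrets out of your shell history too. A small sketch using the python-dotenv package (keep the `.env` file itself out of version control):

```python
import os

from dotenv import load_dotenv  # pip install python-dotenv

load_dotenv()  # reads KEY=value pairs from a local .env file
api_key = os.getenv("OPENAI_API_KEY")
```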
### 5. Enable CORS (for web apps)
```python
from fastapi.middleware.cors import CORSMiddleware

app.add_middleware(
    CORSMiddleware,
    allow_origins=["https://yourapp.com"],
    allow_methods=["POST"],
    allow_headers=["*"],
)
```
### Production Checklist
| Item | Why It Matters |
|---|---|
| ✅ Authentication | Protect your API |
| ✅ Rate Limiting | Prevent abuse |
| ✅ Health Checks | Monitor uptime |
| ✅ Env Variables | Keep secrets safe |
| ✅ CORS Setup | Enable web access |
| ✅ Logging | Debug problems |
| ✅ Error Handling | Graceful failures |
## 🎬 The Complete Example

Here's a production-ready LangServe app:
```python
import os

from fastapi import FastAPI, Depends, HTTPException
from fastapi.security import APIKeyHeader
from fastapi.middleware.cors import CORSMiddleware
from langserve import add_routes
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate

# Setup
app = FastAPI(title="My AI API")

# CORS (lock allow_origins down to your real domains in production!)
app.add_middleware(
    CORSMiddleware,
    allow_origins=["*"],
    allow_methods=["*"],
    allow_headers=["*"],
)

# Auth
api_key = APIKeyHeader(name="X-API-Key")

def check_key(key: str = Depends(api_key)):
    if key != os.getenv("MY_API_KEY"):
        raise HTTPException(status_code=403)
    return key

# Chain
prompt = ChatPromptTemplate.from_template(
    "You are helpful. Answer: {question}"
)
model = ChatOpenAI()
chain = prompt | model

# Routes
add_routes(
    app, chain,
    path="/chat",
    dependencies=[Depends(check_key)],
)

# Health
@app.get("/health")
def health():
    return {"status": "ok"}

# Run
if __name__ == "__main__":
    import uvicorn

    uvicorn.run(app, host="0.0.0.0", port=8000)
```
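And here's a matching client sketch that puts the earlier pieces together (the key must match whatever MY_API_KEY is set to on the server):

```python
import os

from langserve import RemoteRunnable

# Connect to the production server, sending the API key header
chain = RemoteRunnable(
    "http://localhost:8000/chat",
    headers={"X-API-Key": os.getenv("MY_API_KEY", "")},
)

# Stream the answer piece by piece
for chunk in chain.stream({"question": "What is LangServe?"}):
    print(chunk.content, end="", flush=True)
print()
```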
## 🎓 Key Takeaways
- LangServe turns your chain into a web service
- add_routes gives you invoke, batch, stream, and playground
- RemoteRunnable lets clients use your API like a local chain
- Streaming makes AI feel responsive and engaging
- Production needs auth, rate limits, and proper config
graph TD A["Build Chain"] --> B["Wrap with LangServe"] B --> C["Deploy to Server"] C --> D["Clients Use RemoteRunnable"] D --> E["Stream Responses"] E --> F["Happy Users! ๐"]
## 🎉 You're Ready!
You now know how to:
- Deploy LangChain apps with LangServe
- Create REST APIs automatically
- Connect clients with RemoteRunnable
- Stream responses for better UX
- Follow production best practices
Your AI is ready to serve the world! 🌍
