A real production agent rarely uses one memory primitive. It uses three. LangGraph owns the loop and survives crashes. Mem0 owns user-scoped facts that persist across sessions. CLAUDE.md (or any rules file) owns procedural behavior: how this agent should act. When this finishes you have a multi-turn assistant that remembers your preferences across a process restart and still respects its rules file on the way in.
One pip line covers everything.
pip install langgraph langgraph-checkpoint-sqlite langchain-openai mem0ai
Set the OpenAI key. Mem0 calls OpenAI for fact extraction; LangGraph's LLM node uses the same key.
# macOS / Linux export OPENAI_API_KEY="sk-..." # Windows PowerShell $env:OPENAI_API_KEY = "sk-..."
Procedural memory is just a markdown file the agent reads. No vector store, no API. The CLAUDE.md pattern, scoped to this lab.
# Agent rules - Always greet by the user's first name if you know it. - If the user has stated a package-manager preference, mention it unprompted when the topic is JavaScript tooling. - Never recommend npm if the user has switched to pnpm. - Be terse. One paragraph max per reply.
One graph, two nodes: load_context (reads Mem0 + rules.md, builds a system prompt) and respond (calls the LLM). The checkpointer persists the conversation state. Mem0 persists user facts. The rules file is read fresh every turn.
import os
import sys
from pathlib import Path
from typing import Annotated, TypedDict
from operator import add
sys.stdout.reconfigure(encoding="utf-8")
from langgraph.graph import StateGraph, START, END
from langgraph.checkpoint.sqlite import SqliteSaver
from langchain_openai import ChatOpenAI
from langchain_core.messages import SystemMessage, HumanMessage, AIMessage
from mem0 import Memory
USER_ID = "ray"
RULES_PATH = Path(__file__).parent / "rules.md"
llm = ChatOpenAI(model="gpt-4o-mini", temperature=0.2)
mem = Memory()
class State(TypedDict):
user_input: str
messages: Annotated[list, add]
reply: str
def load_context(state: State) -> dict:
rules = RULES_PATH.read_text(encoding="utf-8")
hits = mem.search(state["user_input"], user_id=USER_ID, limit=5)
facts = [h["memory"] for h in hits.get("results", [])]
fact_block = "\n".join(f"- {f}" for f in facts) if facts else "(none yet)"
system_text = (
"You are a coding assistant.\n\n"
"## Rules\n"
f"{rules}\n\n"
"## Known facts about this user\n"
f"{fact_block}\n"
)
return {"messages": [SystemMessage(content=system_text)]}
def respond(state: State) -> dict:
convo = state["messages"] + [HumanMessage(content=state["user_input"])]
answer = llm.invoke(convo)
# store the exchange as a memory candidate; Mem0 dedups / extracts facts.
mem.add(
f"User said: {state['user_input']}\nAssistant replied: {answer.content}",
user_id=USER_ID,
)
return {"messages": [HumanMessage(content=state["user_input"]), answer], "reply": answer.content}
builder = StateGraph(State)
builder.add_node("load_context", load_context)
builder.add_node("respond", respond)
builder.add_edge(START, "load_context")
builder.add_edge("load_context", "respond")
builder.add_edge("respond", END)
saver = SqliteSaver.from_conn_string("hybrid.sqlite")
graph = builder.compile(checkpointer=saver)
def turn(user_input: str, thread_id: str = "main") -> str:
config = {"configurable": {"thread_id": thread_id}}
final = graph.invoke({"user_input": user_input, "messages": [], "reply": ""}, config=config)
return final["reply"]
if __name__ == "__main__":
import argparse
p = argparse.ArgumentParser()
p.add_argument("message", nargs="+")
p.add_argument("--thread", default="main")
args = p.parse_args()
text = " ".join(args.message)
print(f"> {text}")
print(turn(text, thread_id=args.thread))
First turn introduces yourself and states a preference. Second turn asks a question. Then exit the process, start a new one, and ask a follow-up. The rules file is re-read; Mem0 still has the preference; LangGraph's checkpointer still has the message history.
python agent.py "Hey, I am Rayyan. I switched from npm to pnpm last week." # (separate process, fresh Python) python agent.py "What's the fastest way to add a new dep?" # (separate process again, fresh Python) python agent.py "Remind me what package manager I use."
> Hey, I am Rayyan. I switched from npm to pnpm last week. Hey Rayyan -- noted, pnpm it is. > What's the fastest way to add a new dep? Use `pnpm add <pkg>` (you mentioned you switched to pnpm). It is the fastest path... > Remind me what package manager I use. You use pnpm.
The third process is brand new. Python booted from scratch. The agent still knows your name and your package manager because Mem0 persisted those facts to disk and the rules file told the agent to surface them.
LangGraph + SqliteSaver persisted the message thread for thread_id="main". If the third turn was on the same thread, it picked up the prior messages list. (Note: this lab passes messages: [] on every invoke() because we want each turn to be self-contained; the saver still tracks every step for replay.)
Mem0 extracted "user is Rayyan" and "user prefers pnpm" from turn 1, stored them under user_id="ray", and turn 3's load_context pulled them back in. This is the layer that survives a deleted SQLite file.
rules.md was re-read on every turn. Change a rule, the next turn obeys the new rule. No restart, no re-deploy, no embedding refresh.
Most production agent failures are confusion about which layer should hold what. "Why did my agent forget my preferences?" usually means the preferences were stored in graph state (gone on restart) instead of in a Mem0-style layer. "Why is my agent ignoring the rule I just added?" usually means the rule is in graph state instead of a rules file that gets re-read.
The rule of thumb: if it survives a deleted SQLite, it is Mem0/Letta/Zep. If it survives a deleted Mem0 store, it is your rules file in git. If it does not survive Ctrl-C, it never made it past graph state.
Agent reintroduces itself every turn. Mem0 takes a beat on first add() to download embedding models. Run turn 1 once and wait a few seconds before turn 2. Subsequent runs are fast.
Rules.md not found. The path is computed relative to agent.py. Run from the same directory or change RULES_PATH to an absolute path.
Mem0 stored the entire transcript instead of facts. Mem0 extracts opportunistically; not every add() yields a fact. Use m.get_all(user_id="ray") to inspect what stuck. If you want stricter control, pass infer=False to add() and provide the canonical fact text yourself.