The 7 Types of Agent Memory: A Technical Guide for AI Engineers

Large language models are stateless by default. Each API call starts fresh. The model forgets your last message once the response returns. That is fine for a single question. It breaks the moment you build an agent.

Agents plan, call tools, and run across many steps. They need to remember. Memory is the infrastructure that fixes this. It turns a stateless model into a system that retains context. That system can learn from experience and act over time.

What Is Agent Memory?

Memory is any mechanism that carries information across a model’s reasoning steps, calls, or sessions. Some of it lives inside the context window. Some of it lives outside, in databases or model weights. Each type stores a different class of information for a different duration.

Memory varies along two axes: form and time. Form is either parametric (stored in the model’s weights) or non-parametric (stored as text outside the weights). Time is either short-term or long-term. The seven types below map onto those two axes.

The Seven Types of Agent Memory

1. In-Context / Working Memory (Short-Term): This is everything the model can currently see inside its context window. It includes the system prompt, recent messages, tool outputs, and reasoning steps. Think of it as RAM. It is fast and essential, but temporary and size-limited. Every other memory type competes for space here.

2. Semantic Memory (Long-Term): This is a persistent store of facts, preferences, and domain knowledge. It holds entries like “the user prefers Python over JavaScript.” The knowledge is decoupled from when it was learned. It is the agent’s organized encyclopedia about a user or topic.

3. Episodic Memory (Long-Term): This logs specific past events, full conversations, and task runs. It records what worked and what failed. The agent uses it to learn from experience. Systems like Reflexion and ExpeL write verbal post-mortems and store conclusions for future runs.

4. Procedural Memory (Long-Term): This is the agent’s knowledge of how to do things. It covers skills, tool usage patterns, workflows, and behavioral rules. A support agent handling its hundredth password reset does not re-reason the workflow. It executes a learned procedure instead.

5. External / Retrieval Memory (Short-Term + Long-Term): This is knowledge stored outside the model, typically in a vector database. It is pulled into context at inference time using similarity search. This is RAG applied to agent history or documents. Retrieval quality quickly becomes the bottleneck: if the search returns the wrong chunks, the model reasons over the wrong context.

6. Parametric Memory (Long-Term): This is knowledge baked directly into the model’s weights during training. It holds language, reasoning patterns, and general world knowledge. The model does not look anything up. It generates from learned associations. The tradeoff is that this memory is frozen at training time.

7. Prospective Memory (Short-Term + Long-Term): This is the agent’s ability to remember future intentions and scheduled goals. It tracks things the agent planned but has not yet executed. It is critical for long-horizon and multi-step planning agents. Without it, an agent forgets its own commitments.
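How retrieval memory feeds working memory can be sketched in a few lines of Python. This is a toy under stated assumptions, not a reference implementation: `embed`, `ToyVectorStore`, and `build_context` are hypothetical names, the bag-of-words vector stands in for a real embedding model, and the budget counts words rather than tokens.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToyVectorStore:
    """Hypothetical external memory: keeps each text next to its vector."""

    def __init__(self) -> None:
        self.items = []  # list of (text, vector) pairs

    def add(self, text: str) -> None:
        self.items.append((text, embed(text)))

    def search(self, query: str, k: int = 2) -> list[str]:
        """Return the k stored texts most similar to the query."""
        q = embed(query)
        ranked = sorted(self.items, key=lambda it: cosine(q, it[1]), reverse=True)
        return [text for text, _ in ranked[:k]]


def build_context(system: str, retrieved: list[str], recent: list[str],
                  budget_words: int) -> list[str]:
    """Assemble working memory under a size budget (assumed policy).

    The system prompt always goes in. Retrieved memories are added while
    they fit, then the newest messages fill whatever space remains.
    """
    used = len(system.split())
    kept_memories = []
    for text in retrieved:
        cost = len(text.split())
        if used + cost <= budget_words:
            kept_memories.append(text)
            used += cost
    kept_recent = []
    for message in reversed(recent):  # walk from newest to oldest
        cost = len(message.split())
        if used + cost > budget_words:
            break  # older messages are dropped once the budget is spent
        kept_recent.append(message)
        used += cost
    # Restore chronological order for the messages that survived.
    return [system, *kept_memories, *reversed(kept_recent)]


store = ToyVectorStore()
store.add("the user prefers Python over JavaScript")
store.add("password reset requires email verification")

question = "which language does the user prefer"
memories = store.search(question, k=1)
context = build_context(
    "You are a helpful agent.", memories, ["hi there", question], budget_words=17
)
print(context)
```

With a 17-word budget, the older "hi there" message no longer fits and is dropped, which is the competition for context space described under working memory. The priority order used here (retrieved memories before recent messages) is a design choice for the sketch. A real agent might weight recency more heavily or summarize instead of dropping.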

Side-by-Side: How the Seven Compare

The table below maps each type to its timescale, location, and typical implementation.

| Memory type | Timescale | Where it lives | What it stores | Common implementation |
|---|---|---|---|---|
| Working / In-context | Short-term | Context window | Prompt, messages, tool outputs | Native to the LLM |
| Semantic | Long-term | External store | Facts, preferences, domain knowledge | Vector DB or profile schema |
| Episodic | Long-term | External store | Past events, task runs, outcomes | Vector DB plus event logs |
| Procedural | Long-term | Prompt or weights | Skills, workflows, behavioral rules | System prompt or fine-tune |
| Retrieval / External | Both | Vector database | Documents, history chunks | RAG pipeline |
| Parametric | Long-term | Model weights | World knowledge, language, reasoning | Pre-training or fine-tuning |
| Prospective | Both | State store | Future intentions, scheduled goals | Task queue or scheduler |
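As one illustration of the "Common implementation" column, the task-queue approach to prospective memory can be sketched as a small priority queue of pending intentions. This is a minimal sketch, assuming in-process state and integer timestamps. `Intention` and `ProspectiveMemory` are hypothetical names, and a production agent would more likely keep this in a durable state store or scheduler.

```python
import heapq
from dataclasses import dataclass, field


@dataclass(order=True)
class Intention:
    """A commitment the agent has made but not yet executed."""
    due_at: int                        # when to act, on any monotonic clock
    goal: str = field(compare=False)   # what the agent planned to do


class ProspectiveMemory:
    """Hypothetical store of future intentions, ordered by due time."""

    def __init__(self) -> None:
        self._heap = []  # min-heap of Intention, earliest due_at first

    def remember(self, goal: str, due_at: int) -> None:
        heapq.heappush(self._heap, Intention(due_at, goal))

    def due(self, now: int) -> list[str]:
        """Pop and return every goal whose due time has arrived."""
        ready = []
        while self._heap and self._heap[0].due_at <= now:
            ready.append(heapq.heappop(self._heap).goal)
        return ready

    def pending(self) -> int:
        """Number of commitments still waiting to be executed."""
        return len(self._heap)


memory = ProspectiveMemory()
memory.remember("send weekly report", due_at=100)
memory.remember("retry failed upload", due_at=20)

print(memory.due(now=50))   # ['retry failed upload']
print(memory.pending())     # 1
```

An agent loop would call `due(now)` at the start of each step and place the returned goals into its working context, so that earlier commitments resurface instead of being forgotten.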
