Memory Orchestration
The heart of Cortex: automatic orchestration across all memory layers with batteries-included defaults.
Overview
Cortex is an AI memory orchestration platform that manages the complete lifecycle of agent memory. Instead of manually coordinating conversations, vector stores, facts, and graphs, you call a single method and Cortex handles everything.
Building AI agents with persistent memory typically requires:
// Without Cortex: Manual coordination
await storeConversation(conversationId, userMessage, agentResponse);
await generateEmbedding(userMessage + agentResponse);
await storeInVectorDB(embedding, content, metadata);
await extractFacts(userMessage, agentResponse);
await storeFacts(facts, conversationId);
await syncToGraph(entities, relationships);
await linkEverythingTogether(ids);
// ...and handle failures, retries, consistency
// With Cortex: One call, full orchestration
await cortex.memory.remember({
memorySpaceId: "user-123-personal",
conversationId: "conv-456",
userMessage: "I prefer TypeScript for backend work",
agentResponse: "I'll remember that preference!",
userId: "user-123",
userName: "Alex",
});
// Conversations, vectors, facts, and graph all updated automatically
Core Concept: The 4-Layer Architecture
Cortex organizes memory into four layers, plus an optional graph, each serving a specific purpose:
- Layer 4 (Orchestration): cortex.memory.remember() • cortex.memory.recall() • Start here
- Layer 3 (Facts): LLM-extracted knowledge • Belief revision • 60-90% token savings
- Layer 2 (Vector): Semantic search • Embedded memories • Fast retrieval
- Layer 1 (ACID stores): Conversations (L1a) • Immutable KB (L1b) • Mutable Config (L1c)
- Graph (optional): Neo4j/Memgraph • Multi-hop traversal • Entity relationships
The Four Layers
Layer 4: Orchestration API
Purpose: Developer experience - orchestrates all layers
cortex.memory.* handles everything automatically - recommended starting point for all use cases
// Does L1a + L2 + L3 + graph sync
await cortex.memory.remember({...});
// Searches L2 + L3 + graph expansion
await cortex.memory.recall({...});
Layer 3: Facts
Purpose: Extracted knowledge with 60-90% token savings
- LLM-extracted structured facts
- Versioned and searchable
- Belief revision for conflicting facts
await cortex.facts.store(memorySpaceId, {
fact: "User prefers dark mode",
factType: "preference",
confidence: 95,
});
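Belief revision deserves a closer look. Below is a minimal sketch of how conflicting facts could be reconciled: a new fact about the same subject and predicate supersedes the old one instead of duplicating it. The Fact shape, version field, and reviseBeliefs helper are illustrative assumptions, not the SDK's internals.

```typescript
// Hypothetical belief-revision sketch: a new fact about the same
// subject + predicate supersedes the old one, preserving history.
interface Fact {
  fact: string;
  subject: string;
  predicate: string;
  object: string;
  confidence: number;
  version: number;
  supersededBy?: number; // version of the fact that replaced this one
}

function reviseBeliefs(existing: Fact[], incoming: Omit<Fact, "version">): Fact[] {
  // Find the current (non-superseded) fact on the same subject + predicate
  const conflict = existing.find(
    (f) =>
      f.subject === incoming.subject &&
      f.predicate === incoming.predicate &&
      f.supersededBy === undefined,
  );
  const nextVersion = (conflict?.version ?? 0) + 1;
  // Mark the old belief as superseded rather than deleting it
  const revised = existing.map((f) =>
    f === conflict ? { ...f, supersededBy: nextVersion } : f,
  );
  return [...revised, { ...incoming, version: nextVersion }];
}

// Usage: the user first preferred light mode, then switched to dark mode
let facts: Fact[] = [];
facts = reviseBeliefs(facts, {
  fact: "User prefers light mode", subject: "user-123",
  predicate: "theme", object: "light", confidence: 90,
});
facts = reviseBeliefs(facts, {
  fact: "User prefers dark mode", subject: "user-123",
  predicate: "theme", object: "dark", confidence: 95,
});
```

Keeping superseded versions around (rather than overwriting) is what makes facts "versioned and searchable" while still answering queries with the current belief.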
Layer 2: Vector Memory
Purpose: Semantic search
- Embedded memories for fast retrieval
- References Layer 1 stores
- Scoped per memory space
await cortex.vector.store(memorySpaceId, {
content: "...",
embedding: [...],
conversationRef: {...}, // Links to L1a
});
await cortex.vector.search(memorySpaceId, query);
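For intuition, semantic search over Layer 2 boils down to ranking stored embeddings by similarity to a query vector. A toy sketch with 3-dimensional vectors follows; Cortex performs this server-side, and the cosine and search helpers here are illustrative, not SDK functions.

```typescript
// Toy sketch of vector search: rank stored embeddings by cosine
// similarity to the query vector, highest first.
interface VectorMemory {
  content: string;
  embedding: number[];
}

function cosine(a: number[], b: number[]): number {
  const dot = a.reduce((sum, x, i) => sum + x * b[i], 0);
  const norm = (v: number[]) => Math.sqrt(v.reduce((s, x) => s + x * x, 0));
  return dot / (norm(a) * norm(b));
}

function search(store: VectorMemory[], query: number[], limit = 2): VectorMemory[] {
  return [...store]
    .sort((a, b) => cosine(b.embedding, query) - cosine(a.embedding, query))
    .slice(0, limit);
}

const store: VectorMemory[] = [
  { content: "User prefers TypeScript", embedding: [0.9, 0.1, 0.0] },
  { content: "User works at Acme Corp", embedding: [0.0, 0.2, 0.9] },
];
const top = search(store, [1, 0, 0], 1);
```

Real embeddings have hundreds or thousands of dimensions, but the ranking principle is the same.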
Layer 1: ACID Stores
Purpose: Immutable source of truth
Three sub-stores:
- L1a: Conversations - Raw message history (per memory space)
- L1b: Immutable - KB articles, policies (shared)
- L1c: Mutable - Config, counters (shared, live)
// Layer 1a
await cortex.conversations.addMessage(conversationId, {...});
// Layer 1b
await cortex.immutable.store({...});
// Layer 1c
await cortex.mutable.set(namespace, key, value);
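The Layer 1c idea, namespaced live key-value state, can be pictured with an in-memory Map. The set and get helpers below are illustrative stand-ins for cortex.mutable.*, not the SDK's implementation.

```typescript
// Toy sketch of a namespaced mutable store (config, counters, flags).
const mutableStore = new Map<string, unknown>();

function set(namespace: string, key: string, value: unknown): void {
  // Namespacing keeps unrelated state (config vs. counters) separated
  mutableStore.set(`${namespace}:${key}`, value);
}

function get(namespace: string, key: string): unknown {
  return mutableStore.get(`${namespace}:${key}`);
}

set("config", "theme", "dark");
set("counters", "logins", 42);
```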
Start with Layer 4 (cortex.memory.*). It handles all orchestration for you.
Use lower layers only when you need:
- Direct vector control (Layer 2)
- Custom fact management (Layer 3)
- Raw conversation access (Layer 1)
The Two Core Methods
remember() - Store with Full Orchestration
remember() is the primary method for storing conversations. It orchestrates all layers automatically:
const result = await cortex.memory.remember({
// Required
memorySpaceId: "user-123-personal",
conversationId: "conv-456",
userMessage: "My favorite color is blue and I work at Acme Corp",
agentResponse: "Nice! I'll remember your preference and workplace.",
userId: "user-123",
userName: "Alex",
// Optional: Custom fact extraction
extractFacts: async (userMsg, agentResp) => [
{
fact: "User's favorite color is blue",
factType: "preference",
subject: "user-123",
predicate: "favoriteColor",
object: "blue",
confidence: 95,
},
],
});
What remember() does automatically:
1. Validates inputs - checks memorySpaceId and userId/agentId ownership
2. Auto-registers entities - creates the memory space, user profile, and agent if needed
3. Stores in Layer 1a - adds messages to the ACID conversation store
4. Indexes in Layer 2 - creates vector memories with a conversationRef
5. Extracts facts (Layer 3) - if an LLM is configured, extracts facts with belief revision
6. Syncs to the graph - if a graph adapter is configured, syncs entities and relationships
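The sequence above can be sketched as a plain pipeline, with each layer stubbed as an in-memory array. The wiring is illustrative only: the real SDK is asynchronous and adds validation, retries, and failure handling.

```typescript
// Stub stores, one per layer touched by remember()
const layers = {
  conversations: [] as string[], // L1a: raw messages
  vectors: [] as string[],       // L2: searchable entries
  facts: [] as string[],         // L3: extracted knowledge
  graph: [] as string[],         // optional graph sync
};

function remember(input: {
  userMessage: string;
  agentResponse: string;
  extractFacts?: (userMsg: string, agentResp: string) => string[];
}): void {
  // Steps 1-3: validate, auto-register, store raw messages in L1a
  layers.conversations.push(input.userMessage, input.agentResponse);
  // Step 4: index in L2 (embedding elided; entry keeps a conversation ref)
  layers.vectors.push(`${input.userMessage} ${input.agentResponse}`);
  // Step 5: extract facts into L3 only when an extractor (LLM) is configured
  if (input.extractFacts) {
    layers.facts.push(...input.extractFacts(input.userMessage, input.agentResponse));
  }
  // Step 6: sync entities to the graph when an adapter is configured
  layers.graph.push("entity:user-123");
}

remember({
  userMessage: "I prefer TypeScript",
  agentResponse: "Noted!",
  extractFacts: () => ["User prefers TypeScript"],
});
```

The point of the sketch: one call fans out to every layer, so the caller never coordinates them by hand.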
recall() - Retrieve with Full Orchestration
recall() searches across all layers, merges results, and returns ready-to-use context:
const result = await cortex.memory.recall({
memorySpaceId: "user-123-personal",
query: "What are user's preferences?",
// Optional: Pre-computed embedding for better semantic search
embedding: await embed("What are user's preferences?"),
// Optional: Filter options
userId: "user-123",
limit: 10,
});
// Result includes unified context from all layers
console.log(result.items); // Merged, deduped, ranked results
console.log(result.sources.vector); // Vector memories breakdown
console.log(result.sources.facts); // Structured facts breakdown
console.log(result.sources.graph); // Graph-expanded entities
console.log(result.context); // Ready-to-inject LLM context string
What recall() does automatically:
1. Searches vector memories (Layer 2) - semantic search with an optional embedding
2. Searches facts directly (Layer 3) - facts are a primary source, not just enrichment
3. Queries graph relationships - if configured, discovers related context via entity connections
4. Merges and ranks results - multi-signal scoring algorithm with deduplication
5. Formats for LLM injection - returns a ready-to-use context string
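The merge-and-rank step can be illustrated in a few lines: pool results from all sources, keep the best-scoring copy of each duplicate, and sort by score. The item shape and scoring here are assumptions for illustration, not Cortex's actual algorithm.

```typescript
// Sketch of merging recall results from multiple layers
interface RecallItem {
  content: string;
  source: "vector" | "facts" | "graph";
  score: number; // per-source relevance in [0, 1]
}

function mergeAndRank(items: RecallItem[]): RecallItem[] {
  // Deduplicate by content, keeping the highest-scoring occurrence
  const best = new Map<string, RecallItem>();
  for (const item of items) {
    const seen = best.get(item.content);
    if (!seen || item.score > seen.score) best.set(item.content, item);
  }
  // Rank the surviving items by score, highest first
  return [...best.values()].sort((a, b) => b.score - a.score);
}

const ranked = mergeAndRank([
  { content: "User prefers TypeScript", source: "vector", score: 0.82 },
  { content: "User prefers TypeScript", source: "facts", score: 0.95 }, // duplicate
  { content: "Works at Acme Corp", source: "graph", score: 0.6 },
]);

// Formatting for LLM injection is then a simple join over ranked items
const context = ranked.map((r) => `- ${r.content}`).join("\n");
```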
Batteries-Included Defaults
Both remember() and recall() work out-of-the-box with sensible defaults:
remember() Defaults
| Feature | Default | What It Does |
|---|---|---|
| Memory Space | Auto-register | Creates space if it doesn't exist |
| User Profile | Auto-create | Creates user profile for new users |
| Conversation | Always stored | Messages saved to ACID layer |
| Vector Memory | Always created | Searchable memory entry |
| Facts | Extracted if configured | LLM extracts structured facts |
| Graph Sync | Enabled if configured | Syncs entities and relationships |
| Belief Revision | Enabled if LLM configured | Handles conflicting facts |
recall() Defaults
| Feature | Default | What It Does |
|---|---|---|
| Vector Search | Enabled | Searches Layer 2 |
| Facts Search | Enabled | Searches Layer 3 directly |
| Graph Expansion | Enabled if configured | Discovers related context |
| LLM Context | Generated | Ready-to-inject string |
| Deduplication | Enabled | Removes duplicates |
| Ranking | Enabled | Multi-signal scoring |
Quick Start
Basic Usage
import { Cortex } from "@cortexmemory/sdk";
const cortex = new Cortex({
convexUrl: process.env.CONVEX_URL!,
});
// Store a conversation
await cortex.memory.remember({
memorySpaceId: "my-agent",
conversationId: "conv-123",
userMessage: "Hello! I'm building a Next.js app",
agentResponse: "Great! I can help with Next.js development.",
userId: "user-1",
userName: "Developer",
});
// Retrieve relevant context
const result = await cortex.memory.recall({
memorySpaceId: "my-agent",
query: "What is the user working on?",
});
console.log(result.context);
// "User is building a Next.js app..."
With Embeddings
import { embed } from "ai";
import { openai } from "@ai-sdk/openai";
await cortex.memory.remember({
memorySpaceId: "my-agent",
conversationId: "conv-123",
userMessage: "I prefer TypeScript over JavaScript",
agentResponse: "TypeScript is great for type safety!",
userId: "user-1",
userName: "Developer",
// Add embedding for semantic search
generateEmbedding: async (content) => {
const { embedding } = await embed({
model: openai.embedding("text-embedding-3-small"),
value: content,
});
return embedding;
},
});
Streaming Support
For real-time AI responses, use rememberStream():
import { streamText } from "ai";
import { openai } from "@ai-sdk/openai";
const response = await streamText({
model: openai("gpt-4"),
messages: [{ role: "user", content: "Tell me about AI" }],
});
// Store streaming response automatically
const result = await cortex.memory.rememberStream({
memorySpaceId: "my-agent",
conversationId: "conv-123",
userMessage: "Tell me about AI",
responseStream: response.textStream,
userId: "user-1",
userName: "User",
});
console.log(result.fullResponse); // Complete response text
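Under the hood, storing a stream reduces to consuming chunks as they arrive and persisting the accumulated text once the stream completes. A self-contained sketch follows; the toy async generator stands in for response.textStream, and the collectStream helper is illustrative, not an SDK function.

```typescript
// Sketch of stream accumulation: consume chunks, keep the full text
async function collectStream(stream: AsyncIterable<string>): Promise<string> {
  let full = "";
  for await (const chunk of stream) {
    full += chunk; // chunks could also be forwarded to the client here
  }
  return full;
}

// Toy stream standing in for an AI provider's text stream
async function* toyStream(): AsyncGenerator<string> {
  yield "AI is ";
  yield "a broad field.";
}

const fullResponse = await collectStream(toyStream());
```

This is why rememberStream() can both serve the response in real time and still persist the complete text (result.fullResponse) afterward.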
The Power of Orchestration
Automatic Cross-Layer Linking
When you call remember(), all layers are automatically linked:
const result = await cortex.memory.remember({
memorySpaceId: "my-agent",
conversationId: "conv-123",
userMessage: "I like Python",
agentResponse: "Python is great!",
userId: "user-1",
userName: "Dev",
});
// Result contains IDs from all layers
console.log(result.conversation); // ACID message IDs
console.log(result.memories); // Vector memory entries
console.log(result.facts); // Extracted facts
// Each vector memory links back to its ACID source
// Each fact links to its source conversation
// Graph nodes link to their Convex counterparts
GDPR Cascade Deletion
One call deletes from all layers:
// Delete user and all their data across ALL layers
await cortex.users.delete("user-123", { cascade: true });
// Automatically deletes from:
// - User profile
// - All conversations
// - All vector memories
// - All facts
// - All graph nodes
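Cascade deletion can be pictured as a single pass over every store, removing records keyed to the user. In this stub sketch, maps stand in for the real stores; the actual SDK also covers graph nodes and partial-failure handling.

```typescript
// Stub stores keyed by record id, with records tagged by owning user
const stores = {
  users: new Map([["user-123", { name: "Alex" }]]),
  conversations: new Map([["conv-456", { userId: "user-123" }]]),
  vectors: new Map([["vec-1", { userId: "user-123" }]]),
  facts: new Map([["fact-1", { userId: "user-123" }]]),
};

function deleteUser(userId: string, opts: { cascade: boolean }): void {
  stores.users.delete(userId);
  if (!opts.cascade) return;
  // Sweep every layer for records owned by this user
  for (const store of [stores.conversations, stores.vectors, stores.facts]) {
    for (const [id, record] of store) {
      if (record.userId === userId) store.delete(id);
    }
  }
}

deleteUser("user-123", { cascade: true });
```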
Multi-Tenancy
All operations are automatically scoped to the tenant:
const cortex = new Cortex({
convexUrl: process.env.CONVEX_URL!,
auth: {
userId: "user-123",
tenantId: "tenant-acme", // All operations scoped to this tenant
},
});
// Everything is automatically filtered by tenant
await cortex.memory.remember({...}); // Stored in tenant-acme
await cortex.memory.recall({...}); // Only searches tenant-acme
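The scoping mechanism can be sketched as writing a tenantId onto every record and filtering on it at read time. The ScopedClient class and shared array below are illustrative, not the SDK's storage model.

```typescript
// Shared backing store holding records from every tenant
interface Row {
  tenantId: string;
  content: string;
}
const allRows: Row[] = [];

class ScopedClient {
  constructor(private tenantId: string) {}
  // Writes stamp the client's tenantId onto each record
  remember(content: string): void {
    allRows.push({ tenantId: this.tenantId, content });
  }
  // Reads only ever see this tenant's records
  recall(): string[] {
    return allRows
      .filter((r) => r.tenantId === this.tenantId)
      .map((r) => r.content);
  }
}

const acme = new ScopedClient("tenant-acme");
const globex = new ScopedClient("tenant-globex");
acme.remember("Acme user prefers TS");
globex.remember("Globex user prefers Go");
```

Because the filter is applied inside every read path, tenant isolation holds without callers passing the tenantId on each call.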
API Namespaces
cortex.memory.*
Layer 4: Convenience API (start here)
cortex.vector.*
Layer 2: Direct vector operations
cortex.facts.*
Layer 3: Fact operations
cortex.conversations.*
Layer 1a: ACID conversations
cortex.immutable.*
Layer 1b: Shared knowledge
cortex.mutable.*
Layer 1c: Live state
Advanced: Skipping Layers
For specific use cases, you can skip certain layers:
await cortex.memory.remember({
memorySpaceId: "my-agent",
conversationId: "conv-123",
userMessage: "Quick note",
agentResponse: "Got it!",
userId: "user-1",
userName: "User",
// Skip specific layers
skipLayers: ["facts", "graph"],
});
| Skip | Effect |
|---|---|
| users | Don't auto-create user profile |
| agents | Don't auto-register agent |
| conversations | Don't store in ACID layer |
| vector | Don't create vector memory |
| facts | Don't extract facts |
| graph | Don't sync to graph |
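One way to picture skipLayers is as a filter over the ordered pipeline steps: each layer runs only if its name is absent from the skip list. The plan helper below is hypothetical, shown only to make the gating concrete.

```typescript
// Hypothetical sketch of how skipLayers gates the remember() pipeline
type Layer = "users" | "agents" | "conversations" | "vector" | "facts" | "graph";

function plan(skipLayers: Layer[] = []): Layer[] {
  const all: Layer[] = ["users", "agents", "conversations", "vector", "facts", "graph"];
  // Only the layers not listed in skipLayers are executed
  return all.filter((layer) => !skipLayers.includes(layer));
}

const steps = plan(["facts", "graph"]);
// steps: ["users", "agents", "conversations", "vector"]
```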
Summary
Memory Orchestration is the core value of Cortex:
- One method to store (remember()) - handles all layers automatically
- One method to retrieve (recall()) - unified search across all sources
- Batteries included - sensible defaults, minimal configuration
- Full control when needed - drop to lower layers for specific use cases
- Enterprise ready - GDPR, multi-tenancy, resilience built-in