Memory Orchestration

The heart of Cortex: automatic orchestration across all memory layers with batteries-included defaults.


Overview

Cortex is an AI memory orchestration platform that manages the complete lifecycle of agent memory. Instead of manually coordinating conversations, vector stores, facts, and graphs, you call a single method and Cortex handles everything.

Building AI agents with persistent memory typically requires coordinating every store by hand:

// Without Cortex: Manual coordination
await storeConversation(conversationId, userMessage, agentResponse);
await generateEmbedding(userMessage + agentResponse);
await storeInVectorDB(embedding, content, metadata);
await extractFacts(userMessage, agentResponse);
await storeFacts(facts, conversationId);
await syncToGraph(entities, relationships);
await linkEverythingTogether(ids);
// ...and handle failures, retries, and consistency yourself

// With Cortex: a single remember() call replaces all of the above

Core Concept: The 4-Layer Architecture

Cortex organizes memory into four layers, each serving a specific purpose:

Memory Orchestration Layers:

  • Layer 4: Convenience API - cortex.memory.remember() • cortex.memory.recall() • Start here
  • Layer 3: Facts Store - LLM-extracted knowledge • Belief revision • 60-90% token savings
  • Layer 2: Vector Index - Semantic search • Embedded memories • Fast retrieval
  • Layer 1: ACID Stores - Conversations (L1a) • Immutable KB (L1b) • Mutable Config (L1c)
  • Graph Database (Optional) - Neo4j/Memgraph • Multi-hop traversal • Entity relationships

Data flows from the convenience API through extraction to storage.

The Four Layers

Layer 4: Convenience API

Purpose: Developer experience - orchestrates all layers

  • cortex.memory.* handles everything automatically
  • Recommended starting point for all use cases

// Does L1a + L2 + L3 + graph sync
await cortex.memory.remember({...});

// Searches L2 + L3 + graph expansion
await cortex.memory.recall({...});

Which Layer Should I Use?

Start with Layer 4 (cortex.memory.*). It handles all orchestration for you.

Use lower layers only when you need:

  • Direct vector control (Layer 2)
  • Custom fact management (Layer 3)
  • Raw conversation access (Layer 1)

The Two Core Methods

remember() - Store with Full Orchestration

remember() is the primary method for storing conversations. It orchestrates all layers automatically:

const result = await cortex.memory.remember({
  // Required
  memorySpaceId: "user-123-personal",
  conversationId: "conv-456",
  userMessage: "My favorite color is blue and I work at Acme Corp",
  agentResponse: "Nice! I'll remember your preference and workplace.",
  userId: "user-123",
  userName: "Alex",

  // Optional: Custom fact extraction
  extractFacts: async (userMsg, agentResp) => [
    {
      fact: "User's favorite color is blue",
      factType: "preference",
      subject: "user-123",
      predicate: "favoriteColor",
      object: "blue",
      confidence: 95,
    },
  ],
});

What remember() does automatically:

1. Validates inputs - checks memorySpaceId and userId/agentId ownership
2. Auto-registers entities - creates the memory space, user profile, and agent if needed
3. Stores in Layer 1a - adds messages to the ACID conversation store
4. Indexes in Layer 2 - creates vector memories with a conversationRef
5. Extracts facts (Layer 3) - if an LLM is configured, extracts facts with belief revision
6. Syncs to graph - if a graph adapter is configured, syncs entities and relationships
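The sequence above can be pictured as a pipeline. The sketch below is an illustrative model of the orchestration order only, not Cortex's actual implementation; the step functions and their signatures are hypothetical stand-ins:

```typescript
// Illustrative orchestration order for remember() (hypothetical step functions).
type RememberInput = {
  memorySpaceId: string;
  userMessage: string;
  agentResponse: string;
};

async function rememberPipeline(
  input: RememberInput,
  steps: {
    validate: (i: RememberInput) => Promise<void>;
    storeConversation: (i: RememberInput) => Promise<string>; // Layer 1a, returns a message ref
    indexVector: (i: RememberInput, conversationRef: string) => Promise<void>; // Layer 2
    extractFacts?: (i: RememberInput) => Promise<string[]>; // Layer 3, only if an LLM is configured
    syncGraph?: (i: RememberInput) => Promise<void>; // only if a graph adapter is configured
  }
): Promise<{ conversationRef: string; facts: string[] }> {
  await steps.validate(input); // 1-2: validate and register
  const conversationRef = await steps.storeConversation(input); // 3: ACID store
  await steps.indexVector(input, conversationRef); // 4: vector index, linked back to L1a
  const facts = steps.extractFacts ? await steps.extractFacts(input) : []; // 5: optional
  if (steps.syncGraph) await steps.syncGraph(input); // 6: optional
  return { conversationRef, facts };
}
```

Note how the optional steps (facts, graph) simply no-op when not configured, which is what makes the defaults "batteries-included".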

recall() - Retrieve with Full Orchestration

recall() searches across all layers, merges results, and returns ready-to-use context:

const result = await cortex.memory.recall({
  memorySpaceId: "user-123-personal",
  query: "What are user's preferences?",

  // Optional: Pre-computed embedding for better semantic search
  embedding: await embed("What are user's preferences?"),

  // Optional: Filter options
  userId: "user-123",
  limit: 10,
});

// Result includes unified context from all layers
console.log(result.items); // Merged, deduped, ranked results
console.log(result.sources.vector); // Vector memories breakdown
console.log(result.sources.facts); // Structured facts breakdown
console.log(result.sources.graph); // Graph-expanded entities
console.log(result.context); // Ready-to-inject LLM context string

What recall() does automatically:

1. Searches vector memories (Layer 2) - semantic search with optional embedding
2. Searches facts directly (Layer 3) - facts as a primary source, not just enrichment
3. Queries graph relationships - if configured, discovers related context via entity connections
4. Merges and ranks results - multi-signal scoring algorithm with deduplication
5. Formats for LLM injection - returns a ready-to-use context string
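The merge-and-rank step can be illustrated with a simplified multi-signal scorer. This is a hypothetical sketch of the idea; the weights, fields, and the facts boost are made up and do not reflect Cortex's actual ranking algorithm:

```typescript
// Simplified multi-signal ranking with deduplication (illustrative only).
type Candidate = {
  id: string;
  source: "vector" | "facts" | "graph";
  similarity: number; // 0..1 semantic similarity to the query
  confidence: number; // 0..1 confidence of the source record
};

function mergeAndRank(candidates: Candidate[]): Candidate[] {
  // Deduplicate by id, keeping the higher-similarity copy
  const byId = new Map<string, Candidate>();
  for (const c of candidates) {
    const prev = byId.get(c.id);
    if (!prev || c.similarity > prev.similarity) byId.set(c.id, c);
  }
  // Combine signals into one score (arbitrary example weights)
  const score = (c: Candidate) =>
    0.6 * c.similarity + 0.3 * c.confidence + (c.source === "facts" ? 0.1 : 0);
  return [...byId.values()].sort((a, b) => score(b) - score(a));
}
```

The real scorer considers more signals, but the shape is the same: normalize per-layer results into one candidate type, dedupe, then sort by a combined score.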


Batteries-Included Defaults

Both remember() and recall() work out-of-the-box with sensible defaults:

remember() Defaults

| Feature | Default | What It Does |
| --- | --- | --- |
| Memory Space | Auto-register | Creates the space if it doesn't exist |
| User Profile | Auto-create | Creates a user profile for new users |
| Conversation | Always stored | Messages saved to the ACID layer |
| Vector Memory | Always created | Searchable memory entry |
| Facts | Extracted if configured | LLM extracts structured facts |
| Graph Sync | Enabled if configured (CORTEX_GRAPH_SYNC=true, v0.29.0+) | Syncs entities and relationships |
| Belief Revision | Enabled if LLM configured | Handles conflicting facts |

recall() Defaults

| Feature | Default | What It Does |
| --- | --- | --- |
| Vector Search | Enabled | Searches Layer 2 |
| Facts Search | Enabled | Searches Layer 3 directly |
| Graph Expansion | Enabled if configured | Discovers related context |
| LLM Context | Generated | Ready-to-inject string |
| Deduplication | Enabled | Removes duplicates |
| Ranking | Enabled | Multi-signal scoring |

Quick Start

Basic Usage

import { Cortex } from "@cortexmemory/sdk";

const cortex = new Cortex({
  convexUrl: process.env.CONVEX_URL!,
});

// Store a conversation
await cortex.memory.remember({
  memorySpaceId: "my-agent",
  conversationId: "conv-123",
  userMessage: "Hello! I'm building a Next.js app",
  agentResponse: "Great! I can help with Next.js development.",
  userId: "user-1",
  userName: "Developer",
});

// Retrieve relevant context
const result = await cortex.memory.recall({
  memorySpaceId: "my-agent",
  query: "What is the user working on?",
});

console.log(result.context);
// "User is building a Next.js app..."

With Embeddings

import { embed } from "ai";
import { openai } from "@ai-sdk/openai";

await cortex.memory.remember({
  memorySpaceId: "my-agent",
  conversationId: "conv-123",
  userMessage: "I prefer TypeScript over JavaScript",
  agentResponse: "TypeScript is great for type safety!",
  userId: "user-1",
  userName: "Developer",

  // Add embedding for semantic search
  generateEmbedding: async (content) => {
    const { embedding } = await embed({
      model: openai.embedding("text-embedding-3-small"),
      value: content,
    });
    return embedding;
  },
});

Streaming Support

For real-time AI responses, use rememberStream():

import { streamText } from "ai";
import { openai } from "@ai-sdk/openai";

const response = await streamText({
  model: openai("gpt-4"),
  messages: [{ role: "user", content: "Tell me about AI" }],
});

// Store streaming response automatically
const result = await cortex.memory.rememberStream({
  memorySpaceId: "my-agent",
  conversationId: "conv-123",
  userMessage: "Tell me about AI",
  responseStream: response.textStream,
  userId: "user-1",
  userName: "User",
});

console.log(result.fullResponse); // Complete response text

The Power of Orchestration

Automatic Cross-Layer Linking

When you call remember(), all layers are automatically linked:

const result = await cortex.memory.remember({
  memorySpaceId: "my-agent",
  conversationId: "conv-123",
  userMessage: "I like Python",
  agentResponse: "Python is great!",
  userId: "user-1",
  userName: "Dev",
});

// Result contains IDs from all layers
console.log(result.conversation); // ACID message IDs
console.log(result.memories); // Vector memory entries
console.log(result.facts); // Extracted facts

// Each vector memory links back to its ACID source
// Each fact links to its source conversation
// Graph nodes link to their Convex counterparts

GDPR Cascade Deletion

One call deletes from all layers:

// Delete user and all their data across ALL layers
await cortex.users.delete("user-123", { cascade: true });

// Automatically deletes from:
// - User profile
// - All conversations
// - All vector memories
// - All facts
// - All graph nodes
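Cascade deletion can be pictured as fanning a single delete call out across per-layer stores. The sketch below is a hypothetical model of that pattern, not Cortex's implementation; the `LayerStore` interface is invented for illustration:

```typescript
// Illustrative GDPR cascade: one call fans out to every layer's store.
interface LayerStore {
  name: string;
  deleteByUser(userId: string): Promise<number>; // returns number of records removed
}

async function cascadeDelete(
  userId: string,
  stores: LayerStore[]
): Promise<Record<string, number>> {
  const deleted: Record<string, number> = {};
  for (const store of stores) {
    // Each layer is responsible for finding and removing its own user data
    deleted[store.name] = await store.deleteByUser(userId);
  }
  return deleted;
}
```

The value of doing this inside the platform is that no caller can forget a layer: the list of stores lives in one place.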

Multi-Tenancy

All operations are automatically scoped to the tenant:

const cortex = new Cortex({
  convexUrl: process.env.CONVEX_URL!,
  auth: {
    userId: "user-123",
    tenantId: "tenant-acme", // All operations scoped to this tenant
  },
});

// Everything is automatically filtered by tenant
await cortex.memory.remember({...}); // Stored in tenant-acme
await cortex.memory.recall({...}); // Only searches tenant-acme

Advanced: Skipping Layers

For specific use cases, you can skip certain layers:

await cortex.memory.remember({
  memorySpaceId: "my-agent",
  conversationId: "conv-123",
  userMessage: "Quick note",
  agentResponse: "Got it!",
  userId: "user-1",
  userName: "User",

  // Skip specific layers
  skipLayers: ["facts", "graph"],
});

| Skip | Effect |
| --- | --- |
| users | Don't auto-create user profile |
| agents | Don't auto-register agent |
| conversations | Don't store in ACID layer |
| vector | Don't create vector memory |
| facts | Don't extract facts |
| graph | Don't sync to graph |
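The effect of skipLayers can be modeled as a simple gate over the orchestration steps. A minimal sketch, using the layer names from the table above (the function itself is illustrative, not part of the SDK):

```typescript
// Illustrative: which orchestration steps run for a given skipLayers value.
type Layer = "users" | "agents" | "conversations" | "vector" | "facts" | "graph";

const ALL_LAYERS: Layer[] = ["users", "agents", "conversations", "vector", "facts", "graph"];

function activeLayers(skipLayers: Layer[] = []): Layer[] {
  const skip = new Set(skipLayers);
  return ALL_LAYERS.filter((layer) => !skip.has(layer));
}
```

For example, `activeLayers(["facts", "graph"])` leaves the users, agents, conversations, and vector steps active, which matches the remember() call shown above.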

Summary

Memory Orchestration is the core value of Cortex:

  • One method to store (remember()) - Handles all layers automatically
  • One method to retrieve (recall()) - Unified search across all sources
  • Batteries included - Sensible defaults, minimal configuration
  • Full control when needed - Drop to lower layers for specific use cases
  • Enterprise ready - GDPR, multi-tenancy, resilience built-in

Next Steps