Core Concepts

Understanding these core concepts will help you make the most of Cortex. This guide covers the fundamental building blocks of the system.

Cortex Cloud Coming Soon

Cortex SDK is available now for self-hosted deployments. Cortex Cloud—a managed service with enhanced analytics, automatic scaling, and zero-config setup—is coming soon.

Memory Spaces

What is a Memory Space?

A memory space is the fundamental isolation boundary in Cortex. Think of it as a private namespace where memories, facts, and conversations are stored.

Think of Memory Spaces Like...

A memory space is like a personal hard drive or team workspace. Everything inside is isolated from other spaces, but authorized agents can read and write freely within the space.

Previous Terminology

We used to call these "agents," but that was confusing: multiple agents (or tools) can share one memory space!

types.ts
interface MemorySpace {
  id: string; // e.g., "user-123-personal" or "team-engineering"
  name?: string; // Human-readable name
  type: "personal" | "team" | "project"; // Organization type
  agents: string[]; // Agents/tools operating in this space
  createdAt: Date;
}

Key Concept: Memory Space = Isolation Boundary

Key Distinction

memorySpaceId is the isolation boundary, NOT agentId. Multiple agents can share a memory space (Hive Mode) or have separate spaces (Collaboration Mode).

example.ts
// Every memory operation requires a memorySpaceId
await cortex.memory.remember({
  memorySpaceId: "user-123-personal", // ← Isolation boundary
  agentId: "cursor", // ← Optional for H2A, required for A2A
  conversationId: "conv-123",
  userMessage: "I prefer TypeScript",
  agentResponse: "Noted!",
  userId: "user-123",
  userName: "User",
});

What's Isolated vs Shared

Stored within each memory space (isolated):

  • Layer 1a: Conversations (raw message history)
  • Layer 2: Vector memories (embeddings + search)
  • Layer 3: Facts (LLM-extracted knowledge)
  • Layer 4: Convenience API results
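
User profiles, by contrast, are shared across every space (see User Profiles below). All four layers are keyed by the same memorySpaceId; here is a rough sketch using the remember() and vector.store() calls shown elsewhere on this page (embed() stands in for your own embedding provider):

// One convenience call fans out across the layers (conversation, vectors, facts)
await cortex.memory.remember({
  memorySpaceId: "user-123-personal", // Same isolation boundary for every layer
  conversationId: "conv-123",
  userMessage: "I prefer TypeScript",
  agentResponse: "Noted!",
  userId: "user-123",
  userName: "Alice",
});

// Layer 2 can also be written directly (bring-your-own embedding; see Embeddings below)
await cortex.vector.store("user-123-personal", {
  content: "User prefers TypeScript",
  contentType: "raw",
  embedding: await embed("User prefers TypeScript"), // embed() = your provider
  source: { type: "system", timestamp: new Date() },
  metadata: { importance: 60, tags: ["preferences"] },
});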

Why Memory Spaces?

| Aspect | Before (Agent-Centric) | After (Memory-Space-Centric) |
|---|---|---|
| Architecture | Each agent had separate memories | Tools share a memory space |
| Example | Cursor stores "User prefers TypeScript" | Cursor stores in user-123-personal |
| Problem/Solution | Claude can't see it (different agent) | Claude reads from user-123-personal |
| Result | User repeats preferences to every tool | Memory follows user across tools |

Creating Memory Spaces

implicit-creation.ts
// Just use memorySpaceId - space created automatically
await cortex.memory.remember({
  memorySpaceId: "user-123-personal", // Created on first use
  conversationId: "conv-123",
  userMessage: "Hello",
  agentResponse: "Hi!",
  userId: "user-123",
  userName: "Alice",
});



Hive Mode vs Collaboration Mode

Cortex supports two architectural patterns for multi-agent/multi-tool systems:

Hive Mode

Multiple tools share ONE memory space. Single write, all benefit. Best for personal AI: Cursor + Claude Desktop + custom tools all sharing your context.

Collaboration Mode

Each agent has a SEPARATE memory space; agents communicate via A2A messaging. Best for agent swarms: autonomous agents with isolated memory and explicit coordination.

Hive Mode: Shared Memory Space

Multiple agents share ONE memory space.

hive-mode.ts
// Cursor stores memory
await cortex.memory.remember({
  memorySpaceId: "user-123-personal", // Shared space
  agentId: "cursor", // Which agent stored it
  conversationId: "conv-123",
  userMessage: "I prefer dark mode",
  agentResponse: "Noted!",
  userId: "user-123",
  userName: "Alice",
});

// Claude reads from the SAME space
const memories = await cortex.memory.search("user-123-personal", "preferences");
// Returns: [{ content: "User prefers dark mode", agentId: "cursor", ... }]

  • Single Write: one tool stores, all tools benefit
  • Zero Duplication: one copy of each memory
  • Consistent State: everyone sees the same data
  • Agent Tracking: agentId shows which agent stored what

Perfect for: MCP integrations, personal AI assistants, tool ecosystems, cross-application memory

Collaboration Mode: Separate Memory Spaces

Each agent has a SEPARATE memory space and communicates via A2A.

collaboration-mode.ts
// Finance agent stores in its own space
await cortex.memory.remember({
  memorySpaceId: "finance-agent-space", // Finance's space
  conversationId: "conv-123",
  userMessage: "Approve $50k budget",
  agentResponse: "Approved",
  userId: "user-123",
  userName: "CFO",
});

// Send message to HR agent (dual-write to BOTH spaces)
await cortex.a2a.send({
  from: "finance-agent",
  to: "hr-agent",
  message: "Budget approved for hiring",
  importance: 85,
  metadata: { tags: ["approval", "hiring"] },
});
// Automatically stored in BOTH finance-agent-space AND hr-agent-space
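
Because the message is dual-written, each agent can retrieve it from its own space without any cross-space read. A quick sketch reusing the search() call from Hive Mode:

// HR agent finds the message in ITS OWN space
const hrView = await cortex.memory.search("hr-agent-space", "budget approval");
// Returns the dual-written A2A message from finance-agent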

  • Dual-Write: A2A messages stored in both spaces
  • Complete Isolation: each space is independent
  • No Conflicts: separate memories can't conflict
  • GDPR Compliant: delete one space without affecting others

Perfect for: autonomous agent swarms, enterprise workflows, multi-tenant systems, regulated industries

Comparison Table

| Feature | Hive Mode | Collaboration Mode |
|---|---|---|
| Memory Spaces | 1 shared space | N separate spaces |
| Storage | Single write | Dual-write (A2A) |
| Consistency | Always consistent | Eventually consistent |
| Isolation | None (by design) | Complete |
| Use Case | Personal AI tools | Autonomous agents |
| Agent Tracking | Via agentId | Via fromAgent/toAgent |
| Example | Cursor + Claude | Finance agent + HR agent |

Cross-MemorySpace Access (Context Chains)

Even in Collaboration Mode, spaces can grant limited access via context chains:

cross-space-access.ts
// Supervisor creates context and delegates
const context = await cortex.contexts.create({
  purpose: "Process refund request",
  memorySpaceId: "supervisor-space",
  userId: "user-123",
});

// Specialist can access supervisor's context (read-only)
const fullContext = await cortex.contexts.get(context.id, {
  includeChain: true,
  requestingSpace: "specialist-space", // Cross-space access
});

// specialist-space can read:
// ✅ The context chain (hierarchy)
// ✅ Referenced conversations (only those in context)
// ❌ Supervisor's other memories (isolated)

Security Model

Context chains grant limited read access—only context-referenced data is accessible, preventing memory poisoning. All cross-space reads are logged for audit trails.


Memory

What is Memory?

In Cortex, a memory is a piece of information stored in a memory space for later retrieval.

memory-types.ts
interface MemoryEntry {
  id: string; // Unique identifier
  memorySpaceId: string; // Which space owns this
  agentId?: string; // Which agent stored this
  content: string; // The actual information
  embedding?: number[]; // Vector for semantic search
  metadata: {
    importance: number; // 0-100 scale
    tags: string[]; // Categorization
    [key: string]: any; // Custom metadata
  };
  createdAt: Date; // When stored
  lastAccessed?: Date; // Last retrieval
  accessCount: number; // Usage tracking
}

Types of Memories

Information from user interactions:

await cortex.memory.remember({
  memorySpaceId: "user-123-personal",
  conversationId: "conv-123",
  userMessage: "I work in San Francisco",
  agentResponse: "That's great to know!",
  userId: "user-123",
  userName: "Alice",
  importance: 60,
  tags: ["location", "personal", "user-info"],
});

Facts and system-generated information:

await cortex.vector.store("user-123-personal", {
content: "Product X costs $49.99 with a 20% discount for annual billing",
contentType: "raw",
embedding: await embed("Product X pricing"),
source: { type: "system", timestamp: new Date() },
metadata: {
importance: 85,
tags: ["pricing", "product-x", "business"],
},
});

What was done (tool results, actions):

await cortex.vector.store("support-bot-space", {
content: "Sent password reset email to user@example.com at 2025-10-28 10:30",
contentType: "raw",
embedding: await embed("password reset action"),
source: { type: "tool", timestamp: new Date() },
metadata: {
importance: 90,
tags: ["action", "security", "completed"],
},
});
Agent-to-Agent Communication

For A2A patterns, see Hive Mode vs Collaboration Mode above.

Memory Versioning (Automatic)

Automatic Versioning

When you update a memory, the old version is automatically preserved. No data loss, temporal conflict resolution, and complete audit trails—all automatic.

versioning.ts
// Store user's address
const result = await cortex.memory.remember({
  memorySpaceId: "user-123-personal",
  conversationId: "conv-456",
  userMessage: "My address is 123 Main St, San Francisco",
  agentResponse: "I've noted your address",
  userId: "user-123",
  userName: "Alice",
  importance: 80,
});
const memoryId = result.memories[0].id;

// User moves (creates version 2)
await cortex.memory.update("user-123-personal", memoryId, {
  content: "User's address is 456 Oak Ave, Seattle",
});

// Both versions are preserved!
const memory = await cortex.memory.get("user-123-personal", memoryId);
console.log(memory.content); // "User's address is 456 Oak Ave, Seattle" (current)
console.log(memory.version); // 2

console.log(memory.previousVersions[0]);
// { version: 1, content: "User's address is 123 Main St, San Francisco", timestamp: ... }

Memory Importance Scale

Cortex uses a granular 0-100 importance scale for precise prioritization:

| Range | Level | Examples |
|---|---|---|
| 90-100 | Critical | Passwords (100), Hard deadlines (95), Security alerts (95) |
| 70-89 | High | User requirements (80), Important decisions (85), Key preferences (75) |
| 40-69 | Medium | General preferences (60), Conversation context (50), Background info (45) |
| 10-39 | Low | Casual observations (30), Minor details (20), Exploratory conversation (25) |
| 0-9 | Trivial | Debug information (5), Temporary data (0) |
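
In practice you pass a value from this scale when storing a memory; a short sketch (the specific values here are illustrative):

// A hard deadline lands in the Critical band (90-100)
await cortex.memory.remember({
  memorySpaceId: "user-123-personal",
  conversationId: "conv-789",
  userMessage: "My project deadline is this Friday",
  agentResponse: "Deadline noted!",
  userId: "user-123",
  userName: "Alice",
  importance: 95, // Critical: hard deadline
  tags: ["deadline", "work"],
});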

Embeddings

What are Embeddings?

Embeddings are numerical vectors that represent the semantic meaning of text. They enable semantic search: finding related content by meaning, not just keywords.

"The cat sat on the mat" → [0.234, -0.891, 0.445, ..., 0.123] // 768, 1536, or 3072 dimensions
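
Semantic search ranks stored vectors by a distance metric such as cosine similarity. A minimal illustration of the math (not the SDK's internal code):

// Cosine similarity between two embeddings (closer to 1.0 = more similar meaning)
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}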

Embedding-Agnostic Design

Bring Your Own Embeddings

The Cortex SDK does not generate embeddings—you bring your own provider (OpenAI, Cohere, local models, etc.).

embedding-example.ts
// Choose your provider
const embedding = await yourEmbeddingProvider.embed(text);

// Cortex SDK stores and searches
await cortex.vector.store(memorySpaceId, {
  content: text,
  contentType: "raw",
  embedding: embedding, // Your vectors
  source: { type: "system", timestamp: new Date() },
  metadata: { importance: 50 },
});

For example, generating embeddings with OpenAI:

import OpenAI from "openai";

const openai = new OpenAI();

const result = await openai.embeddings.create({
  model: "text-embedding-3-large", // 3072 dimensions
  input: text,
});

const embedding = result.data[0].embedding;

Dimension Tradeoffs

| Dimensions | Speed | Accuracy | Cost | Use Case |
|---|---|---|---|---|
| 384-768 | Fast | Good | Low | High-volume, real-time |
| 1536 | Medium | Better | Medium | General purpose |
| 3072 | Slower | Best | High | When accuracy is critical |

Recommendation

Default to 3072 dimensions (OpenAI text-embedding-3-large) for best accuracy. Scale down if you need faster search.
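
If you do need to scale down, OpenAI's text-embedding-3 models accept a dimensions parameter that returns a shorter vector from the same model (reusing the openai client from the example above):

// Request a smaller vector (trades some accuracy for speed and cost)
const smaller = await openai.embeddings.create({
  model: "text-embedding-3-large",
  input: text,
  dimensions: 1536, // instead of the default 3072
});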


User Profiles

User profiles store information about users across all memory spaces and conversations.

user-profile.ts
interface UserProfile {
  id: string; // Unique user ID
  displayName: string; // How to address them
  email?: string; // Contact info
  preferences: {
    theme?: "light" | "dark";
    language?: string;
    timezone?: string;
    [key: string]: any; // Custom preferences
  };
  metadata: {
    tier?: "free" | "pro" | "enterprise";
    signupDate?: Date;
    lastSeen?: Date;
    [key: string]: any; // Custom metadata
  };
}

await cortex.users.update("user-123", {
  displayName: "Alice Johnson",
  email: "alice@example.com",
  preferences: {
    theme: "dark",
    language: "en",
    timezone: "America/Los_Angeles",
  },
  metadata: {
    tier: "pro",
    signupDate: new Date(),
    company: "Acme Corp",
  },
});

Cross-Space Sharing

User profiles are shared across all memory spaces. Update preferences in one space, and they're available everywhere.
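
A sketch of reading the shared profile from any space, assuming a cortex.users.get() counterpart to the update() call above (hypothetical; check the User Profiles Guide for the exact API):

// Hypothetical read call: any memory space sees the same profile
const profile = await cortex.users.get("user-123");
console.log(profile.preferences.theme); // "dark" (set anywhere, visible everywhere)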

Learn more: User Profiles Guide


Infinite Context

The Breakthrough: Never run out of context again.

The Problem

Traditional Approach Limitation

Traditional AI chatbots accumulate conversation history until they hit token limits, causing earlier context to be truncated and forgotten.

traditional-problem.ts
// Traditional approach (accumulation)
const conversation = {
  messages: [
    { role: "user", content: "Hi, I prefer TypeScript" },
    { role: "assistant", content: "Noted!" },
    // ... 500 more exchanges ...
    { role: "user", content: "What languages do I prefer?" },
    { role: "assistant", content: "???" }, // Message #1 was truncated!
  ],
};

// Token cost: 500 messages × 50 tokens = 25,000 tokens per request
// Eventually: Exceeds model's context window (128K, 200K, etc.)

The Solution: Retrieval-Based Context

Instead of sending all history, retrieve only relevant memories:

infinite-context.ts
async function respondToUser(userMessage: string, memorySpaceId: string) {
  // 1. Retrieve relevant context from ALL past conversations
  const relevantContext = await cortex.memory.search(
    memorySpaceId,
    userMessage,
    {
      embedding: await embed(userMessage),
      limit: 10, // Top 10 most relevant facts/memories
    },
  );

  // 2. LLM call with ONLY relevant context
  const response = await llm.complete({
    messages: [
      {
        role: "system",
        content: `Relevant Context:\n${relevantContext.map((m) => m.content).join("\n")}`,
      },
      { role: "user", content: userMessage }, // Current message only
    ],
  });

  // 3. Store exchange (adds to knowledge pool)
  await cortex.memory.remember({
    memorySpaceId,
    conversationId: `ephemeral-${Date.now()}`,
    userMessage,
    agentResponse: response,
    userId: "user-123",
    userName: "User",
    extractFacts: true, // Auto-extract for future retrieval
  });

  return response;
}

Key Benefits

  • Unlimited History: recall from 1,000,000+ past messages; token cost stays constant
  • 99% Token Reduction: from 50,000 tokens to 400 tokens ($1.50 → $0.012 per request)
  • Works with Any Model: smaller, cheaper models can have "infinite" memory
  • Perfect Recall: semantic search finds relevant info from years ago

Unified Retrieval with recall()

The recall() method orchestrates retrieval across all memory layers:

recall() Pipeline

1. Vector Search (Layer 2): semantic search across embedded memories, using pre-computed or on-the-fly embeddings
2. Facts Search (Layer 3): direct query of structured facts (60-90% token savings vs raw content)
3. Graph Expansion (optional): discovers related context via entity connections and multi-hop traversal
4. Merge, Dedupe & Rank: multi-signal scoring produces a ready-to-inject LLM context string

const result = await cortex.memory.recall({
  memorySpaceId: "user-123-personal",
  query: "What are user's preferences?",
  limit: 10,
});

// Unified results from all layers
console.log(result.context); // Ready for LLM injection
console.log(result.sources.vector); // Vector memories
console.log(result.sources.facts); // Structured facts
console.log(result.sources.graph); // Graph-expanded entities

Token Efficiency

Facts-first retrieval: ~400 tokens vs 20,000+ accumulated. 99% reduction with perfect recall.

Learn more: Infinite Context Architecture


Context Chains

Context chains enable hierarchical context sharing in multi-agent systems, including cross-memorySpace access with security controls.

Think of Context Chains Like...

A management hierarchy where supervisors see their team's work, teams share knowledge within their context, and specialists can access supervisor context (limited). Everyone can access relevant historical context.

Context Chain Visualization

Root Context (Supervisor Space): Handle customer refund request
├── Child Context (Finance Space): Process $500 refund
└── Child Context (Customer Relations): Send apology email

Hierarchical delegation across memory spaces
context-chains.ts
// Create parent context
const context = await cortex.contexts.create({
  purpose: "Handle customer refund request",
  memorySpaceId: "supervisor-space",
  userId: "user-123",
  metadata: { ticketId: "TICKET-456", priority: "high" },
});

// Supervisor delegates to finance agent (different memory space)
const financeContext = await cortex.contexts.create({
  purpose: "Process $500 refund",
  memorySpaceId: "finance-agent-space", // Different space!
  parentId: context.id, // Link to parent
  metadata: { amount: 500, reason: "defective product" },
});

// Finance agent accesses supervisor context (different space)
const fullContext = await cortex.contexts.get(financeContext.id, {
  includeChain: true,
  requestingSpace: "finance-agent-space",
});

Use Cases

  • Hierarchical Multi-Agent: supervisor agents delegate to workers with shared context
  • Task Decomposition: break complex tasks into subtasks while maintaining context
  • Audit Trails: track the full history of how tasks were handled across spaces
  • Secure Knowledge Sharing: teams share context without exposing unrelated information


Analytics

Enhanced Analytics Coming Soon

Advanced analytics dashboards with cortex.analytics.* APIs are planned for Cortex Cloud. The SDK provides basic memory space statistics today via cortex.memorySpaces.getStats().

analytics.ts
// Get memory space statistics
const stats = await cortex.memorySpaces.getStats("user-123-personal");

console.log(stats);
// {
//   totalMemories: 15432,
//   agents: ['cursor', 'claude', 'custom-bot'],
//   ...
// }

// Per-memory access metadata is also tracked
const memory = await cortex.memory.get("user-123-personal", "mem_123");

console.log({
  accessCount: memory.accessCount,
  lastAccessed: memory.lastAccessed,
  createdAt: memory.createdAt,
});

Learn more: Access Analytics Guide (Planned)


Data Flow

Complete Memory Lifecycle

Memory Lifecycle

1. Your Application: call cortex.memory.remember() or cortex.memory.recall()
2. Cortex SDK: memory orchestration • fact extraction • multi-layer coordination • real-time sync
3. Cortex Storage: ACID transactions • vector indexing • memory space isolation • powered by Convex
4. Memory Available: instant semantic search • cross-layer retrieval • real-time updates

How Cortex orchestrates memory from input to retrieval
Cortex Does the Heavy Lifting

You call remember() or recall()—Cortex handles orchestration, fact extraction, vector indexing, and multi-layer retrieval automatically.
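
A minimal end-to-end sketch of that lifecycle, reusing the remember() and recall() calls from earlier sections:

// 1. Store an exchange (Cortex extracts facts and indexes vectors behind the scenes)
await cortex.memory.remember({
  memorySpaceId: "user-123-personal",
  conversationId: "conv-001",
  userMessage: "I'm allergic to peanuts",
  agentResponse: "I'll remember that.",
  userId: "user-123",
  userName: "Alice",
});

// 2. Later: pull ready-to-inject context back out of all layers
const { context } = await cortex.memory.recall({
  memorySpaceId: "user-123-personal",
  query: "dietary restrictions",
  limit: 5,
});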


Graph Database Integration

Cortex supports optional graph database integration for advanced knowledge discovery, multi-hop queries, and relationship-based reasoning.

Graph-Enhanced Architecture

  • Convex (Source of Truth): ACID transactions • vector search • real-time sync
  • Graph Database: Neo4j or Memgraph • multi-hop queries • relationship traversal
  • Knowledge Discovery: entity networks • provenance tracking • context enrichment

Convex + Graph for complete knowledge management

When to Use Graph

  • Recommended for: deep context chains (5+ levels), knowledge graphs, multi-hop reasoning, provenance needs
  • Not needed for: simple conversational memory, shallow context (1-2 levels), pure vector search

Quick Start

graph-setup.ts
import { Cortex } from "@cortexmemory/sdk";
import { CypherGraphAdapter, initializeGraphSchema } from "@cortexmemory/sdk/graph";

// 1. Setup graph adapter
const graphAdapter = new CypherGraphAdapter();
await graphAdapter.connect({
  uri: "bolt://localhost:7687",
  username: "neo4j",
  password: process.env.NEO4J_PASSWORD!,
});

// 2. Initialize schema (one-time)
await initializeGraphSchema(graphAdapter);

// 3. Initialize Cortex with graph
const cortex = new Cortex({
  convexUrl: process.env.CONVEX_URL!,
  graph: {
    adapter: graphAdapter,
    autoSync: true, // Auto-sync to graph
  },
});

// 4. Use Cortex normally - graph syncs automatically!

Graph-Powered Queries

// Who works at the same company as Alice?
const coworkers = await graphAdapter.query(`
  MATCH (alice:Entity {name: 'Alice'})-[:WORKS_AT]->(company:Entity)
  MATCH (company)<-[:WORKS_AT]-(coworker:Entity)
  WHERE coworker.name <> 'Alice'
  RETURN DISTINCT coworker.name AS name
`);
// ["Bob", "Carol", "Dave"]

Performance Characteristics

| Query Type | Without Graph | With Graph |
|---|---|---|
| 1-hop lookup | 3-10ms | 10-25ms |
| 3-hop traversal | 10-50ms (limited) | 4-10ms |
| 7-hop traversal | Not feasible | 4-15ms |
| Pattern matching | Not feasible | 10-100ms |

Convex as Source of Truth

Graph is for discovery and traversal. Always write to Convex first, then sync to graph. Cortex works perfectly without graph—add it when you need multi-hop reasoning or knowledge graphs.

Learn more: Graph Integration Guide • Graph Operations API



Questions? Ask in Discussions.