# Fact Extraction

Instead of feeding entire conversation histories to your LLM, extract salient facts and store them efficiently. This enables:

- **Infinite context**: recall from millions of messages
- **Cost reduction**: 60-90% fewer tokens per request
- **Better retrieval**: structured facts are more searchable
**v0.30.0:** Facts now support semantic search via embeddings. Configure embedding once at SDK init (or set `CORTEX_EMBEDDING=true`) and embeddings are auto-generated for all `remember()` and `recall()` calls - no manual embedding code needed!

**v0.30.1:** LLM extraction now returns enriched entities with semantic types (person, organization, place, product, concept) and relation triples that sync to the graph as typed edges (e.g., `WORKS_AT`, `LOCATED_IN`). A new `EXTRACTED_WITH` edge links Facts to their source Memory for bidirectional traceability.
## Quick Example

**Raw Conversation (402 tokens):**

```text
User: "I moved from Paris to London last week. I'm working at Acme Corp
as a senior engineer. My commute is 30 minutes on the tube."
```

**Extracted Facts (45 tokens):**

1. User moved from Paris to London (last week)
2. User works at Acme Corp as Senior Engineer
3. User's commute: 30 minutes via tube

**Storage: 89% reduction**
## Extraction Modes

Configure an LLM once and facts are extracted automatically:
```typescript
import { Cortex } from '@cortex-platform/sdk';
import { openai } from '@ai-sdk/openai';

// Configure Cortex with an LLM client
const cortex = new Cortex({
  convexUrl: process.env.CONVEX_URL!,
  llm: openai('gpt-4'), // Enables automatic fact extraction
});

// Facts extracted automatically - no callback needed!
await cortex.memory.remember({
  memorySpaceId: 'user-123-space',
  conversationId: 'conv-123',
  userMessage: 'My favorite color is blue',
  agentResponse: 'Got it!',
  userId: 'user-123',
  userName: 'Alice',
  // No extractFacts needed - LLM extracts automatically!
});
// Result: facts extracted + belief revision applied + stored
```
Configure the LLM once and you get automatic extraction, deduplication, and belief revision for all `remember()` calls - the batteries-included approach.
Skip extraction for specific calls:
```typescript
// Skip fact extraction for this conversation
await cortex.memory.remember({
  // ... params
  skipLayers: ['facts'], // Don't extract facts this time
});
```
Override automatic extraction with your own logic:
```typescript
await cortex.memory.remember({
  memorySpaceId: 'user-123-space',
  conversationId: 'conv-123',
  userMessage: 'My favorite color is blue',
  agentResponse: 'Got it!',
  userId: 'user-123',
  userName: 'Alice',
  // Your extraction logic - overrides LLM config
  extractFacts: async (userMsg, agentResp) => [{
    fact: 'User prefers blue color',
    factType: 'preference',
    subject: 'user-123',
    predicate: 'favorite_color',
    object: 'blue',
    confidence: 95,
  }],
});
```
Provide an `extractFacts` callback to override automatic LLM extraction. Use your own LLM, prompts, regex rules, or any combination.
Store facts directly without conversations:
```typescript
await cortex.facts.store({
  memorySpaceId: 'user-123-space',
  userId: 'user-123',
  fact: 'User prefers blue color',
  factType: 'preference',
  subject: 'user-123',
  predicate: 'favorite_color',
  object: 'blue',
  confidence: 95,
  sourceType: 'manual',
});
```
**Batteries-included:** Configure embedding once at SDK init and all `remember()` and `recall()` calls automatically get embeddings:
```typescript
import { Cortex } from '@cortexmemory/sdk';

// Configure embedding once - batteries included!
const cortex = new Cortex({
  convexUrl: process.env.CONVEX_URL!,
  llm: { provider: 'openai', apiKey: process.env.OPENAI_API_KEY },
  embedding: { provider: 'openai', apiKey: process.env.OPENAI_API_KEY },
});

// Embeddings auto-generated for facts - no callback needed!
await cortex.memory.remember({
  memorySpaceId: 'user-123-space',
  conversationId: 'conv-123',
  userMessage: 'My favorite color is blue',
  agentResponse: 'Got it!',
  userId: 'user-123',
  userName: 'Alice',
  // Facts extracted + embeddings generated automatically!
});

// Embeddings auto-generated for queries too!
const result = await cortex.memory.recall({
  memorySpaceId: 'user-123-space',
  query: 'What colors does the user like?',
  // No manual embedding needed - auto-generated from query!
});
// Finds "User prefers blue color" even without keyword match
```
Or use environment variables for zero-config:
```bash
export OPENAI_API_KEY=sk-...
export CORTEX_EMBEDDING=true
export CORTEX_FACT_EXTRACTION=true
```
When embedding is configured at SDK init, embeddings are automatically generated for each extracted fact during remember() and for queries during recall(). This enables semantic search with zero per-call embedding code.
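Semantic search compares embedding vectors rather than keywords. As a rough illustration of the underlying math (this is a toy sketch, not the SDK's internal code, and the vectors below are made up for demonstration), cosine similarity scores a "blue" preference fact close to a "colors" query even with no shared keywords:

```typescript
// Toy illustration of the similarity math behind semantic fact search.
// Real embeddings come from the configured provider; these vectors are fake.
function cosineSimilarity(a: number[], b: number[]): number {
  const dot = a.reduce((sum, v, i) => sum + v * b[i], 0);
  const norm = (v: number[]) => Math.sqrt(v.reduce((s, x) => s + x * x, 0));
  return dot / (norm(a) * norm(b));
}

const factVec = [0.9, 0.1, 0.3];   // pretend: "User prefers blue color"
const queryVec = [0.8, 0.2, 0.25]; // pretend: "What colors does the user like?"
const offTopic = [0.05, 0.9, 0.1]; // pretend: "Commute is 30 minutes"

// The fact ranks higher for the related query than for the unrelated one
console.log(cosineSimilarity(factVec, queryVec) > cosineSimilarity(factVec, offTopic)); // true
```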
## Fact Schema

| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| `fact` | string | Yes | — | The extracted fact statement |
| `factType` | string | Yes | — | `'preference' \| 'identity' \| 'knowledge' \| 'relationship' \| 'event'` |
| `subject` | string | No | — | Entity the fact is about (e.g., `'user-123'`) |
| `predicate` | string | No | — | Relationship type (e.g., `'favorite_color'`) |
| `object` | string | No | — | Related value (e.g., `'blue'`) |
| `confidence` | number | Yes | — | Confidence score (0-100) |
| `tags` | string[] | No | — | Categorization tags |
| `aliases` | string[] | No | — | Alternative phrasings for search |
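The schema above can be sketched as a TypeScript shape. The interface name `Fact` is illustrative, not necessarily the SDK's exported type; the fields and optionality follow the table:

```typescript
// Illustrative shape for the fact schema table; the SDK's exported
// types may differ in naming.
type FactType = 'preference' | 'identity' | 'knowledge' | 'relationship' | 'event';

interface Fact {
  fact: string;          // required: the extracted statement
  factType: FactType;    // required
  confidence: number;    // required: 0-100
  subject?: string;      // optional: entity the fact is about
  predicate?: string;    // optional: relationship type
  object?: string;       // optional: related value
  tags?: string[];       // optional: categorization tags
  aliases?: string[];    // optional: alternative phrasings for search
}

const example: Fact = {
  fact: 'User prefers blue color',
  factType: 'preference',
  subject: 'user-123',
  predicate: 'favorite_color',
  object: 'blue',
  confidence: 95,
};
```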
## Belief Revision
When you extract a fact that conflicts with an existing fact, Cortex handles it automatically:
- **ADD**: New fact (no conflict)
- **SUPERSEDE**: Old fact marked as superseded, new fact is current
- **MERGE**: Compatible facts combined
- **IGNORE**: Duplicate fact skipped
```typescript
// Old fact: "User prefers blue"
// New conversation: "Actually, I like purple now"
await cortex.memory.remember({
  memorySpaceId: 'user-123-space',
  conversationId: 'conv-456',
  userMessage: 'Actually, I like purple now',
  agentResponse: 'Updated!',
  userId: 'user-123',
  userName: 'Alice',
  extractFacts: async () => [{
    fact: 'User prefers purple',
    factType: 'preference',
    subject: 'user-123',
    predicate: 'favorite_color',
    object: 'purple',
    confidence: 95,
  }],
  beliefRevision: true, // Automatically supersedes old "blue" fact
});
```
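Conceptually, the revision decision compares the incoming fact against existing facts with the same subject and predicate. A simplified sketch of the outcomes (an illustration only, not Cortex's internal implementation - the real logic also uses semantic similarity and confidence, and MERGE is omitted here for brevity):

```typescript
// Simplified sketch of belief-revision outcomes.
type Revision = 'ADD' | 'SUPERSEDE' | 'MERGE' | 'IGNORE';

interface StoredFact { subject: string; predicate: string; object: string; }

function revise(existing: StoredFact[], incoming: StoredFact): Revision {
  const sameTopic = existing.find(
    f => f.subject === incoming.subject && f.predicate === incoming.predicate,
  );
  if (!sameTopic) return 'ADD';                              // no conflict
  if (sameTopic.object === incoming.object) return 'IGNORE'; // duplicate
  return 'SUPERSEDE';                                        // same topic, new value wins
}

const existing = [{ subject: 'user-123', predicate: 'favorite_color', object: 'blue' }];
console.log(revise(existing, { subject: 'user-123', predicate: 'favorite_color', object: 'purple' })); // SUPERSEDE
```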
### Preview Conflicts
```typescript
const conflicts = await cortex.facts.checkConflicts({
  memorySpaceId: 'user-123-space',
  fact: 'User prefers purple',
  subject: 'user-123',
  predicate: 'favorite_color',
});

if (conflicts.hasConflicts) {
  console.log('Would supersede:', conflicts.conflictingFacts);
}
```
## Querying Facts
```typescript
// Search facts directly
const facts = await cortex.facts.search(spaceId, 'color preference');

// Or semantic search via memory API (embedding auto-generated!)
const memories = await cortex.memory.search(spaceId, 'what colors?', {
  // No manual embedding needed - auto-generated from query!
  contentType: 'fact',
  limit: 5,
});

// recall() automatically includes facts (embedding auto-generated!)
const result = await cortex.memory.recall({
  memorySpaceId: spaceId,
  query: 'What colors does the user like?',
  // No manual embedding needed - auto-generated from query!
  sources: { facts: true },
});

result.items
  .filter(i => i.source === 'facts')
  .forEach(f => console.log(f.fact.fact));

// Get only preferences
const prefs = await cortex.facts.list({
  memorySpaceId: spaceId,
  factType: 'preference',
  isSuperseded: false, // Current beliefs only
});

// Get only identity facts
const identity = await cortex.facts.list({
  memorySpaceId: spaceId,
  factType: 'identity',
});
```
## Deduplication

`memory.remember()` automatically deduplicates facts using semantic matching. The same fact stated differently won't create duplicates.

| Strategy | How it Works | Speed | Accuracy |
|---|---|---|---|
| `semantic` | Embedding similarity (default) | Slower | Highest |
| `structural` | Subject + predicate + object match | Fast | Medium |
| `exact` | Normalized text match | Fastest | Low |
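A sketch of the difference between `exact` and `structural` matching, using hypothetical key functions (not the SDK's implementation): exact keys miss rephrasings, while structural keys catch them via the subject-predicate-object triple.

```typescript
// Hypothetical dedup keys illustrating the strategies above.
interface Fact { fact: string; subject?: string; predicate?: string; object?: string; }

// exact: normalized text match - misses rephrasings
const exactKey = (f: Fact) => f.fact.toLowerCase().replace(/[^a-z0-9 ]/g, '').trim();

// structural: subject + predicate + object - catches rephrasings
const structuralKey = (f: Fact) => `${f.subject}|${f.predicate}|${f.object}`;

const a: Fact = { fact: 'User prefers dark mode', subject: 'user-123', predicate: 'prefers', object: 'dark-mode' };
const b: Fact = { fact: 'The user likes dark mode', subject: 'user-123', predicate: 'prefers', object: 'dark-mode' };

console.log(exactKey(a) === exactKey(b));           // false - different wording
console.log(structuralKey(a) === structuralKey(b)); // true  - same triple
```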
```typescript
// Use structural deduplication (faster)
await cortex.memory.remember({
  // ... params
  factDeduplication: 'structural',
});

// Disable deduplication
await cortex.memory.remember({
  // ... params
  factDeduplication: false,
});
```
## Fact History

```typescript
// Get change history for a specific fact
const changes = await cortex.facts.history('fact-123');

changes.forEach(event => {
  console.log(`${event.action} at ${new Date(event.timestamp).toISOString()}`);
  console.log(`  Reason: ${event.reason}`);
});

// Get the supersession chain (evolution over time)
const chain = await cortex.facts.getSupersessionChain('fact-123');
// Returns: [oldest] -> [older] -> [current]

// Get all versions of a fact
const versions = await cortex.facts.getHistory(spaceId, 'fact-123');
versions.forEach(v => console.log(`v${v.version}: ${v.fact}`));
```
## LLM Model Benchmarks

Cortex supports multiple OpenAI models for fact extraction. All models were tested under identical conditions using Cortex's standard extraction prompt with structured JSON output. Metrics cover extracted facts, enriched entities, relation triples, and end-to-end latency; results may vary with conversation complexity and model updates.
| Model | Facts | Entities | Relations | Avg Latency |
|---|---|---|---|---|
| gpt-4o-2024-11-20 | 13 | 15 | 6 | 11.6s |
| gpt-5-mini | 13 | 23 | 17 | 56.9s |
| gpt-5.1 | 12 | 15 | 13 | 18.6s |
| gpt-5.2 | 12 | 18 | 14 | 20.3s |
| gpt-5-mini-2025-08-07 | 12 | 26 | 19 | 50.1s |
| gpt-5-nano | 12 | 22 | 14 | 100.5s |
| gpt-4o | 11 | 10 | 1 | 13.0s |
| gpt-4o-mini-2024-07-18 | 11 | 13 | 0 | 15.8s |
| gpt-4o-mini | 11 | 15 | 0 | 16.0s |
| gpt-5 | 9 | 18 | 11 | 82.7s |
*OpenAI model comparison for fact extraction (January 2026)*
### Recommendations

- **Best balance**: `gpt-4o-2024-11-20` offers top-tier quality (13 facts) with the fastest latency (11.6s)
- **Best quality**: `gpt-5-mini` extracts more entities/relations, but at roughly 5x the latency
- **Budget option**: `gpt-4o-mini` provides good extraction at lower cost, but extracts fewer relations
- **Production tip**: use faster models for high-volume workloads and premium models for complex extraction
- **GPT-5 series**: `gpt-5.1` and `gpt-5.2` offer a good balance of quality and speed for mid-tier workloads
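The recommendations above can be captured in a small lookup. The `pickModel` helper and its workload categories are hypothetical (not part of the SDK); the model names come from the benchmark table:

```typescript
// Hypothetical helper applying the benchmark recommendations above.
type Workload = 'high-volume' | 'complex' | 'budget';

const modelFor: Record<Workload, string> = {
  'high-volume': 'gpt-4o-2024-11-20', // fastest at top-tier quality
  'complex': 'gpt-5-mini',            // most entities/relations, slower
  'budget': 'gpt-4o-mini',            // cheaper, fewer relations
};

function pickModel(workload: Workload): string {
  return modelFor[workload];
}

console.log(pickModel('high-volume')); // gpt-4o-2024-11-20
```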
## Best Practices

Discard low-confidence facts before storing them:

```typescript
extractFacts: async (user, agent) => {
  const facts = await extractWithLLM(user, agent); // your extraction helper
  return facts.filter(f => f.confidence >= 70);
},
```

Prefer structured facts so structural deduplication can work:

```typescript
// Good: enables structural deduplication
{ fact: 'User prefers dark mode', subject: 'user-123', predicate: 'prefers', object: 'dark-mode' }

// Less good: no structure, relies on semantic dedup only
{ fact: 'User prefers dark mode' }
```

Facts automatically get a `conversationRef` when created via `remember()`, which enables tracing facts back to their source conversation for audit.
## Token Savings Analysis

```typescript
// estimateTokens is your own helper; a rough heuristic is ~4 characters per token
const estimateTokens = (text: string) => Math.ceil(text.length / 4);

async function analyzeTokenSavings(conversationId: string) {
  const conv = await cortex.conversations.get(conversationId);
  const rawTokens = estimateTokens(conv.messages.map(m => m.text).join('\n'));

  const facts = await cortex.facts.list({
    'conversationRef.conversationId': conversationId,
  });
  const factTokens = estimateTokens(facts.map(f => f.fact).join('\n'));

  const savings = ((rawTokens - factTokens) / rawTokens) * 100;
  return { rawTokens, factTokens, savingsPercent: savings.toFixed(1) };
}

// Example: { rawTokens: 1250, factTokens: 125, savingsPercent: '90.0' }
```