
Search Strategy

Last Updated: 2026-01-08

Multi-strategy search implementation: semantic, keyword, facts, temporal, and hybrid approaches.

Overview

Cortex provides four primary search strategies that can be used independently or combined:

  1. Semantic Search - Vector similarity (meaning-based)
  2. Facts Search - Structured knowledge from Layer 3 (most efficient)
  3. Keyword Search - Full-text matching (exact words)
  4. Temporal Search - Recent/time-based retrieval

The recall() method (v0.24.0+) automatically combines strategies for optimal results.

Search Strategy Selection

A recall() call runs three searches in parallel:

  • Facts Search - Layer 3, most token-efficient
  • Vector Search - Layer 2, semantic similarity
  • Recent Search - Layer 2, time-based retrieval

The results are then merged, deduplicated, and ranked with a multi-signal scoring algorithm: recall() automatically combines multiple search strategies for optimal results.

Strategy 1: Semantic Search (Layer 2)

How It Works

Uses vector embeddings and cosine similarity:

// User query
const query = "What is the user's preferred communication method?";

// Generate query embedding
const queryEmbedding = await embed(query);

// SDK vector search
const results = await cortex.vector.search(memorySpaceId, query, {
  embedding: queryEmbedding,
  limit: 20,
});

// Returns:
// 1. "User prefers email over phone" (score: 0.92)
// 2. "Email is best way to reach user" (score: 0.88)
// 3. "User doesn't like phone calls" (score: 0.85)
// Even though exact words don't match!

Strengths:

  • Finds meaning, not just words
  • Handles synonyms ("car" matches "vehicle")
  • Understands context
  • Works across languages (with multilingual models)

Weaknesses:

  • Requires embeddings (cost + latency)
  • Can miss exact matches
  • Less predictable than keywords

Best for:

  • Natural language queries
  • Conceptual matching
  • Cross-language search
  • Fuzzy matching

SDK Usage

// Using the Cortex SDK
const query = "What is the user's preferred communication method?";
const embedding = await embed(query);

const results = await cortex.vector.search(memorySpaceId, query, {
  embedding,
  userId: "user-123",
  limit: 20,
});

// Results include score field (0-1 similarity)
results.forEach((result) => {
  console.log(`${result.content} (score: ${result.score})`);
});

Tenant Isolation

All searches are automatically filtered by tenantId for multi-tenant isolation. This is handled at the database level via filterFields configuration.
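
For reference, pre-filtered vector search is enabled by declaring filterFields on the vector index in the Convex schema. The following is a minimal sketch assuming a simplified memories table; the actual Cortex schema has more fields:

import { defineSchema, defineTable } from "convex/server";
import { v } from "convex/values";

// Sketch only: a trimmed-down memories table for illustration
export default defineSchema({
  memories: defineTable({
    content: v.string(),
    embedding: v.array(v.float64()),
    memorySpaceId: v.string(),
    tenantId: v.string(),
    userId: v.optional(v.string()),
  }).vectorIndex("by_embedding", {
    vectorField: "embedding",
    dimensions: 1536, // Assumes an OpenAI-style embedding dimension
    filterFields: ["memorySpaceId", "tenantId", "userId"],
  }),
});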

Conceptual Example: Internal Implementation

Conceptual Example

The following shows how semantic search works internally. Use cortex.vector.search() or cortex.memory.recall() in your application code.

// Internal Convex query implementation (conceptual)
export const semanticSearch = query({
  args: {
    memorySpaceId: v.string(),
    embedding: v.array(v.float64()),
    filters: v.any(),
  },
  handler: async (ctx, args) => {
    let q = ctx.db.query("memories").withIndex("by_embedding", (q) => {
      let search = q
        .similar("embedding", args.embedding, args.filters.limit || 20)
        .eq("memorySpaceId", args.memorySpaceId);

      // Tenant isolation (pre-filtered via filterFields)
      if (args.filters.tenantId) {
        search = search.eq("tenantId", args.filters.tenantId);
      }

      return search;
    });

    // Apply additional filters
    if (args.filters.userId) {
      q = q.filter((q) => q.eq(q.field("userId"), args.filters.userId));
    }

    if (args.filters.participantId) {
      q = q.filter((q) =>
        q.eq(q.field("participantId"), args.filters.participantId),
      );
    }

    const results = await q.collect();

    // Calculate similarity scores
    return results.map((memory) => ({
      ...memory,
      score: cosineSimilarity(memory.embedding, args.embedding),
      strategy: "semantic",
    }));
  },
});

Strategy 2: Facts Search (Layer 3)

How It Works

Searches structured facts from Layer 3 - the most token-efficient strategy:

// User query
const query = "What is the user's communication preference?";

// SDK facts search
const facts = await cortex.facts.search(memorySpaceId, query, {
  limit: 20,
});

// Returns:
// 1. "User prefers email for communication" (confidence: 95%)
// 2. "User dislikes phone calls" (confidence: 88%)
// 3. "User checks email twice daily" (confidence: 82%)
// Each fact is ~8 tokens vs 50+ tokens for raw conversation

Strengths:

  • Most token-efficient (60-90% savings)
  • Structured knowledge (subject-predicate-object)
  • High confidence (LLM-extracted)
  • Deduplication via Belief Revision System
  • Enables infinite context capability

Weaknesses:

  • Requires LLM for extraction (initial cost)
  • May miss nuance from raw conversation
  • Confidence scoring needed

Best for:

  • Long-running conversations (infinite context)
  • Entity-centric queries (subject-based)
  • Knowledge retrieval across sessions
  • Token-constrained environments

SDK Usage

// Using the Cortex SDK
const facts = await cortex.facts.search(
  memorySpaceId,
  "What is the user's communication preference?",
  {
    userId: "user-123",
    factType: "preference",
    minConfidence: 70,
    limit: 20,
  },
);

// Results include confidence scores
facts.forEach((fact) => {
  console.log(`${fact.fact} (confidence: ${fact.confidence}%)`);
});

Conceptual Example: Internal Implementation

Conceptual Example

The following shows how facts search works internally. Use cortex.facts.search() or cortex.memory.recall() in your application code.

// Internal Convex query implementation (conceptual)
export const factsSearch = query({
  args: {
    memorySpaceId: v.string(),
    query: v.string(),
    filters: v.any(),
  },
  handler: async (ctx, args) => {
    let results = await ctx.db
      .query("facts")
      .withSearchIndex("by_content", (q) =>
        q.search("fact", args.query).eq("memorySpaceId", args.memorySpaceId),
      )
      .take(args.filters.limit || 20);

    // Apply filters
    if (args.filters.userId) {
      results = results.filter((f) => f.userId === args.filters.userId);
    }

    if (args.filters.factType) {
      results = results.filter((f) => f.factType === args.filters.factType);
    }

    // Filter by confidence
    if (args.filters.minConfidence) {
      results = results.filter(
        (f) => f.confidence >= args.filters.minConfidence,
      );
    }

    return results.map((fact) => ({
      ...fact,
      score: fact.confidence / 100, // Confidence as score
      strategy: "facts",
    }));
  },
});

Facts by Subject (Entity-Centric)

Conceptual Example

The following shows entity-centric fact retrieval. Use cortex.facts.search() with subject filters or cortex.memory.recall() for orchestrated search.

// Conceptual: Query facts about a specific entity
// In practice, use cortex.facts.search() with subject filters
const userFacts = await cortex.facts.search(memorySpaceId, "*", {
  subject: "user-123",
  limit: 100,
});

// Returns: All preferences, identities, relationships for user-123

Strategy 3: Keyword Search

How It Works

Uses Convex full-text search indexes:

// User query
const query = "password Blue";

// SDK vector search (uses keyword matching when no embedding provided)
const results = await cortex.vector.search(memorySpaceId, query, {
  limit: 20,
});

// Returns:
// 1. "The password is Blue" (exact match)
// 2. "User's password: Blue123" (contains words)
// 3. "Blue is the password color" (contains both)
// Only returns if words actually appear

Strengths:

  • Exact word matching
  • Fast (no embedding needed)
  • Predictable results
  • No API costs

Weaknesses:

  • Misses synonyms
  • Order matters somewhat
  • Doesn't understand meaning

Best for:

  • Exact phrase lookup
  • Technical terms
  • IDs, codes, names
  • When embeddings unavailable

SDK Usage

// Using the Cortex SDK
const results = await cortex.vector.search(memorySpaceId, "password Blue", {
  userId: "user-123",
  limit: 20,
});

// Results are relevance-ranked by Convex search index
results.forEach((result) => {
  console.log(`${result.content} (score: ${result.score})`);
});

Conceptual Example: Internal Implementation

Conceptual Example

The following shows how keyword search works internally. Use cortex.vector.search() or cortex.memory.recall() in your application code.

// Internal Convex query implementation (conceptual)
export const keywordSearch = query({
  args: {
    memorySpaceId: v.string(),
    keywords: v.string(),
    filters: v.any(),
  },
  handler: async (ctx, args) => {
    const results = await ctx.db
      .query("memories")
      .withSearchIndex("by_content", (q) =>
        q
          .search("content", args.keywords)
          .eq("memorySpaceId", args.memorySpaceId),
      )
      .take(args.filters.limit || 20);

    // Convex returns relevance-ranked results
    return results.map((memory, index) => ({
      ...memory,
      score: 1 - index / results.length, // Approximate relevance
      strategy: "keyword",
    }));
  },
});
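
For context, withSearchIndex("by_content", ...) relies on a full-text search index declared in the Convex schema. A minimal sketch, assuming a simplified memories table (the real schema will differ):

import { defineSchema, defineTable } from "convex/server";
import { v } from "convex/values";

export default defineSchema({
  memories: defineTable({
    content: v.string(),
    memorySpaceId: v.string(),
  }).searchIndex("by_content", {
    searchField: "content", // Field that gets full-text indexed
    filterFields: ["memorySpaceId"], // Equality filters usable inside withSearchIndex
  }),
});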

Strategy 4: Temporal Search (Recent)

How It Works

Prioritizes recent or time-relevant memories:

// Get recent memories using recall()
const result = await cortex.memory.recall({
  memorySpaceId,
  query: "*", // Wildcard query for recency-only retrieval
  sources: {
    vector: { limit: 0 }, // Disable vector search
    facts: { limit: 0 }, // Disable facts search
  },
  limit: 20,
});

// Returns most recent memories, regardless of content
// Results are automatically sorted by recency

Strengths:

  • Always relevant (recent = relevant)
  • Very fast (simple index)
  • No embeddings needed
  • Great for conversations

Weaknesses:

  • Ignores content relevance
  • May miss important old information

Best for:

  • Conversation context (last N messages)
  • Recent user interactions
  • Time-sensitive information
  • When recency matters more than content

SDK Usage

// Using cortex.memory.recall() for recent memories
const result = await cortex.memory.recall({
  memorySpaceId,
  query: "*", // Wildcard query for recency-only retrieval
  sources: {
    vector: { limit: 0 }, // Disable vector search
    facts: { limit: 0 }, // Disable facts search
  },
  limit: 20,
});

// Results sorted by recency
result.items.forEach((item) => {
  console.log(
    `${item.content} (created: ${new Date(item.memory?.createdAt || 0)})`,
  );
});

Conceptual Example: Internal Implementation

Conceptual Example

The following shows how temporal search works internally. Use cortex.memory.recall() in your application code.

// Internal Convex query implementation (conceptual)
export const recentSearch = query({
  args: {
    memorySpaceId: v.string(),
    filters: v.any(),
  },
  handler: async (ctx, args) => {
    let q = ctx.db
      .query("memories")
      .withIndex("by_memorySpace_created", (q) =>
        q.eq("memorySpaceId", args.memorySpaceId),
      )
      .order("desc");

    // Filter by importance (flattened field)
    if (args.filters.minImportance) {
      q = q.filter((q) =>
        q.gte(q.field("importance"), args.filters.minImportance),
      );
    }

    // Filter by user
    if (args.filters.userId) {
      q = q.filter((q) => q.eq(q.field("userId"), args.filters.userId));
    }

    // Filter by participant (Hive Mode)
    if (args.filters.participantId) {
      q = q.filter((q) =>
        q.eq(q.field("participantId"), args.filters.participantId),
      );
    }

    const results = await q.take(args.filters.limit || 20);

    return results.map((memory, index) => ({
      ...memory,
      score: 1 - index / results.length, // Recency score
      strategy: "recent",
    }));
  },
});

Strategy 5: Orchestrated Search (recall())

The cortex.memory.recall() method automatically combines multiple search strategies:

// Automatic multi-strategy search
const result = await cortex.memory.recall({
  memorySpaceId,
  query: "What are the user's preferences?",
  embedding: await embed("What are the user's preferences?"),
  userId: "user-123",
  limit: 10,
});

// Results include items from multiple sources
console.log(`Found ${result.items.length} items`);
console.log(`- ${result.sources.vector?.count || 0} from vector search`);
console.log(`- ${result.sources.facts?.count || 0} from facts search`);
console.log(`- ${result.sources.graph?.count || 0} from graph expansion`);

// Items are ranked using multi-signal scoring
result.items.forEach((item) => {
  console.log(`${item.content} (score: ${item.score})`);
});

What recall() does automatically:

  1. Searches facts (Layer 3) for structured knowledge
  2. Searches vectors (Layer 2) for semantic matches
  3. Expands via graph connections for related context
  4. Merges and deduplicates results
  5. Ranks using multi-signal scoring algorithm
  6. Returns LLM-ready context string

Conceptual Example: Internal Hybrid Logic

Conceptual Example

The following shows how recall() combines strategies internally. Use cortex.memory.recall() directly in your application code.

// Conceptual: How recall() combines strategies internally
// In practice, use cortex.memory.recall() which handles this automatically

// 1. Search facts (most token-efficient)
const facts = await cortex.facts.search(memorySpaceId, query, {
  minConfidence: 70,
  limit: 20,
});

// 2. Search vectors (semantic similarity)
const vectors = await cortex.vector.search(memorySpaceId, query, {
  embedding,
  limit: 20,
});

// 3. recall() merges, deduplicates, and ranks automatically
const result = await cortex.memory.recall({
  memorySpaceId,
  query,
  embedding,
});
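
The merge-and-deduplicate step is not shown above. A minimal sketch, assuming each result carries an id and a score; when multiple strategies return the same item, the highest-scoring copy wins:

// Sketch: deduplicate by id, keeping the highest-scoring copy,
// then rank by score descending
function mergeAndDeduplicate<T extends { id: string; score: number }>(
  ...resultSets: T[][]
): T[] {
  const best = new Map<string, T>();
  for (const results of resultSets) {
    for (const item of results) {
      const existing = best.get(item.id);
      if (!existing || item.score > existing.score) {
        best.set(item.id, item);
      }
    }
  }
  return [...best.values()].sort((a, b) => b.score - a.score);
}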

Ranking and Scoring Algorithm

The recall() method uses a multi-signal scoring algorithm with the following weights (from resultProcessor.ts):

Signal              | Weight | Description
--------------------|--------|-------------------------------------------
Semantic similarity | 0.35   | Vector similarity score (0-1)
Confidence          | 0.20   | Fact confidence (0-100 → 0-1)
Importance          | 0.15   | Memory importance (0-100 → 0-1)
Recency             | 0.15   | Time decay (exponential, 30-day half-life)
Graph connectivity  | 0.15   | Number of connected entities (logarithmic)

Additional boosts:

  • Highly connected items (>3 entities): ×1.2 multiplier
  • User messages: ×1.1 multiplier (more likely to contain preferences)

Conceptual Example: Scoring Implementation

Conceptual Example

The following shows how scoring works internally. cortex.memory.recall() applies this automatically.

// Conceptual: Multi-signal scoring algorithm (from resultProcessor.ts)
const RANKING_WEIGHTS = {
  semantic: 0.35, // Vector similarity
  confidence: 0.20, // Fact confidence
  importance: 0.15, // Memory importance
  recency: 0.15, // Time decay
  graphConnectivity: 0.15, // Graph connections
};

function calculateScore(item: RecallItem): number {
  let score = 0;

  // Base semantic score
  score += (item.score || 0.5) * RANKING_WEIGHTS.semantic;

  // Confidence (facts) or default (memories)
  const confidence = item.type === "fact" ? item.fact!.confidence / 100 : 0.8;
  score += confidence * RANKING_WEIGHTS.confidence;

  // Importance
  const importance =
    item.type === "memory"
      ? item.memory!.importance / 100
      : item.fact!.confidence / 100; // Use confidence as proxy for facts
  score += importance * RANKING_WEIGHTS.importance;

  // Recency (exponential decay, 30-day half-life)
  const age =
    Date.now() - (item.memory?.createdAt || item.fact?.createdAt || Date.now());
  const recencyScore = Math.pow(2, -age / (30 * 24 * 60 * 60 * 1000));
  score += recencyScore * RANKING_WEIGHTS.recency;

  // Graph connectivity (logarithmic scale)
  const connectedEntities = item.graphContext?.connectedEntities || [];
  const connectivityScore = Math.min(
    1.0,
    Math.log2(connectedEntities.length + 1) / Math.log2(11),
  );
  score += connectivityScore * RANKING_WEIGHTS.graphConnectivity;

  // Boosts
  if (connectedEntities.length > 3) {
    score *= 1.2; // Highly connected boost
  }
  if (item.type === "memory" && item.memory?.messageRole === "user") {
    score *= 1.1; // User message boost
  }

  return Math.min(1.0, Math.max(0, score));
}
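
Ranking is then just a descending sort by this score. A sketch, with items standing in for the merged RecallItem list:

// items is the hypothetical merged list of RecallItems from all strategies
items.sort((a, b) => calculateScore(b) - calculateScore(a));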

Filter Application

Pre-Filtering (Convex Index)

Applied BEFORE search for efficiency. Tenant isolation is automatically handled:

// Pre-filtered (fast) - Conceptual example
await ctx.db
  .query("memories")
  .withIndex("by_embedding", (q) =>
    q
      .similar("embedding", vector, 20)
      .eq("memorySpaceId", memorySpaceId) // ← Pre-filter
      .eq("tenantId", tenantId) // ← Automatic tenant isolation
      .eq("userId", userId), // ← Pre-filter
  )
  .collect();

// Only searches vectors for this memorySpace+tenant+user
// filterFields: ["memorySpaceId", "tenantId", "userId"] enables this

Tenant Isolation

All searches automatically filter by tenantId at the database level. This ensures complete data isolation between tenants. The SDK handles this transparently - you don't need to pass tenantId explicitly.

Post-Filtering (Application Logic)

Applied AFTER search for complex logic that can't be expressed in indexes:

// Post-filter (after vector search) - Conceptual example
const results = await ctx.db
  .query("memories")
  .withIndex("by_embedding", (q) =>
    q
      .similar("embedding", vector, 50) // Get more results
      .eq("memorySpaceId", memorySpaceId),
  )
  .collect();

// Complex filters in code
const filtered = results.filter((memory) => {
  // Importance range
  if (memory.importance < 50 || memory.importance > 90) {
    return false;
  }

  // Tag matching (AND logic)
  if (!memory.metadata?.tags?.includes("password")) {
    return false;
  }

  // Date range
  if (memory.createdAt < someDate) {
    return false;
  }

  // Custom logic
  if (memory.metadata?.customField !== "value") {
    return false;
  }

  return true;
});

return filtered.slice(0, 10);

Rule: Use pre-filtering when possible, post-filtering for complex logic. The SDK automatically applies tenant isolation via pre-filtering.


Universal Filters Implementation

Filter Processing Pipeline

function applyUniversalFilters(
  results: MemoryEntry[],
  filters: UniversalFilters,
): MemoryEntry[] {
  let filtered = results;

  // userId (usually pre-filtered)
  if (filters.userId) {
    filtered = filtered.filter((m) => m.userId === filters.userId);
  }

  // Tags (any or all)
  if (filters.tags && filters.tags.length > 0) {
    if (filters.tagMatch === "all") {
      // Must have ALL tags
      filtered = filtered.filter((m) =>
        filters.tags.every((tag) => m.metadata.tags.includes(tag)),
      );
    } else {
      // Must have ANY tag (default)
      filtered = filtered.filter((m) =>
        filters.tags.some((tag) => m.metadata.tags.includes(tag)),
      );
    }
  }

  // Importance range
  if (filters.importance) {
    if (typeof filters.importance === "number") {
      filtered = filtered.filter(
        (m) => m.metadata.importance === filters.importance,
      );
    } else {
      // RangeQuery
      if (filters.importance.$gte !== undefined) {
        filtered = filtered.filter(
          (m) => m.metadata.importance >= filters.importance.$gte,
        );
      }
      if (filters.importance.$lte !== undefined) {
        filtered = filtered.filter(
          (m) => m.metadata.importance <= filters.importance.$lte,
        );
      }
    }
  }

  // Date ranges
  if (filters.createdAfter) {
    filtered = filtered.filter(
      (m) => m.createdAt >= filters.createdAfter.getTime(),
    );
  }

  if (filters.createdBefore) {
    filtered = filtered.filter(
      (m) => m.createdAt <= filters.createdBefore.getTime(),
    );
  }

  // Source type
  if (filters["source.type"]) {
    filtered = filtered.filter(
      (m) => m.source.type === filters["source.type"],
    );
  }

  // Access patterns (applyRangeFilter applies $gte/$lte bounds, as with importance above)
  if (filters.accessCount) {
    filtered = applyRangeFilter(filtered, "accessCount", filters.accessCount);
  }

  return filtered;
}

Ranking and Scoring

Base Similarity Score

function cosineSimilarity(a: number[], b: number[]): number {
  let dotProduct = 0;
  let normA = 0;
  let normB = 0;

  for (let i = 0; i < a.length; i++) {
    dotProduct += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }

  return dotProduct / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Returns -1 to 1 (1 = identical direction, 0 = orthogonal, -1 = opposite);
// text embeddings in practice typically land in the 0-1 range
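
A quick toy usage of the helper above:

// Two nearby vectors score close to 1
const a = [0.9, 0.1, 0.0];
const b = [0.8, 0.2, 0.0];
console.log(cosineSimilarity(a, b).toFixed(3)); // ≈ 0.991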

Multi-Factor Ranking

Conceptual Example

The following shows simplified scoring logic. The actual implementation in resultProcessor.ts uses the weights shown in the Ranking and Scoring Algorithm section above.

// Conceptual: Simplified scoring (actual weights differ)
function calculateFinalScore(
  memory: MemoryEntry,
  baseScore: number,
  options: SearchOptions,
): number {
  let score = baseScore;

  // Factor 1: Importance (weight: 0.15 in actual implementation)
  if (options.boostImportance) {
    const importanceFactor = memory.importance / 100;
    score = score * 0.85 + importanceFactor * 0.15;
  }

  // Factor 2: Recency (weight: 0.15, exponential decay)
  if (options.boostRecent) {
    const age = Date.now() - memory.createdAt;
    const halfLife = 30 * 24 * 60 * 60 * 1000; // 30 days
    const recencyFactor = Math.pow(2, -age / halfLife);
    score = score * 0.85 + recencyFactor * 0.15;
  }

  // Factor 3: Graph connectivity (weight: 0.15 in actual implementation)
  // This is handled automatically by recall() when graph expansion is enabled

  return score;
}

Strategy Selection Logic

Using recall() for Automatic Strategy Selection

The cortex.memory.recall() method automatically selects and combines strategies:

// Automatic strategy selection via recall()
const result = await cortex.memory.recall({
  memorySpaceId,
  query: "user preferences",
  embedding: await embed("user preferences"), // Optional: enables semantic search
  userId: "user-123",
  limit: 10,
});

// recall() automatically:
// 1. Searches facts (always, most efficient)
// 2. Searches vectors (if embedding provided)
// 3. Expands via graph (if enabled)
// 4. Merges, deduplicates, and ranks results

Conceptual Example: Strategy Selection Logic

Conceptual Example

The following shows how strategy selection works internally. Use cortex.memory.recall() directly.

// Conceptual: How recall() selects strategies internally
// In practice, use cortex.memory.recall() which handles this automatically

async function recall(memorySpaceId: string, query: string, options: any) {
  const results: RecallItem[] = [];

  // Always search facts (most token-efficient)
  const facts = await cortex.facts.search(memorySpaceId, query, options);
  results.push(...facts.map((f) => factToRecallItem(f, "facts")));

  // Search vectors if embedding provided
  if (options.embedding) {
    const vectors = await cortex.vector.search(memorySpaceId, query, options);
    results.push(...vectors.map((v) => memoryToRecallItem(v, "vector")));
  }

  // Graph expansion (if enabled)
  if (options.enableGraph) {
    // ... graph expansion logic
  }

  // Merge, deduplicate, and rank
  return processRecallResults(results);
}

Search Optimization Techniques

1. Limit Before Filter

Conceptual Example

The following shows optimization techniques. The SDK handles these optimizations automatically.

// Slow: Filter all, then limit (conceptual)
const all = await ctx.db.query("memories").collect();
const filtered = all.filter((m) => m.importance >= 80);
const limited = filtered.slice(0, 10);

// Fast: Limit during query (conceptual)
const results = await ctx.db
  .query("memories")
  .withIndex("by_memorySpace", (q) => q.eq("memorySpaceId", memorySpaceId))
  .filter((q) => q.gte(q.field("importance"), 80))
  .take(10); // ← Stop after 10 matches

2. Use Specific Indexes

// Generic index (conceptual)
await ctx.db
  .query("memories")
  .withIndex("by_memorySpace", (q) => q.eq("memorySpaceId", memorySpaceId))
  .filter((q) => q.eq(q.field("source.type"), "a2a"))
  .collect();

// Specific compound index (conceptual)
await ctx.db
  .query("memories")
  .withIndex("by_memorySpace_source", (q) =>
    q.eq("memorySpaceId", memorySpaceId).eq("source.type", "a2a"),
  )
  .collect();

3. Cache Frequent Queries

const searchCache = new Map<string, { results: any[]; timestamp: number }>();

export const cachedSearch = query({
  handler: async (ctx, args) => {
    const cacheKey = JSON.stringify(args);
    const cached = searchCache.get(cacheKey);

    // Cache for 60 seconds
    if (cached && Date.now() - cached.timestamp < 60000) {
      return cached.results;
    }

    // Execute search
    const results = await doSearch(ctx, args);

    // Cache results
    searchCache.set(cacheKey, {
      results,
      timestamp: Date.now(),
    });

    return results;
  },
});

Real-World Search Patterns

Pattern 1: Conversational Context

// Get recent relevant context for conversation
async function getConversationContext(
  memorySpaceId: string,
  userId: string,
  currentMessage: string,
) {
  const embedding = await embed(currentMessage);

  const result = await cortex.memory.recall({
    memorySpaceId,
    query: currentMessage,
    embedding,
    userId, // User-specific
    importance: { $gte: 50 }, // Skip trivial
    limit: 5,
  });

  return result.items;
}

Pattern 2: Knowledge Retrieval

// Find best answer from knowledge base
async function findAnswer(memorySpaceId: string, query: string) {
  const embedding = await embed(query);

  const result = await cortex.memory.recall({
    memorySpaceId,
    query,
    embedding,
    importance: { $gte: 70 }, // Only important articles
    tags: ["kb-article"],
    limit: 3,
  });

  return result.items;
}
Pattern 3: Advanced Search

// Complex search with multiple criteria
async function advancedSearch(memorySpaceId: string, criteria: any) {
  const embedding = await embed(criteria.query);

  const result = await cortex.memory.recall({
    memorySpaceId,
    query: criteria.query,
    embedding,
    userId: criteria.userId,
    tags: criteria.tags,
    tagMatch: "all", // Must have all tags
    importance: { $gte: criteria.minImportance },
    createdAfter: criteria.since,
    "source.type": criteria.sourceType,
    limit: criteria.limit,
  });

  return result.items;
}

Fallback Strategies

Graceful Degradation

async function searchWithFallback(
  memorySpaceId: string,
  query: string,
  embedding?: number[],
) {
  try {
    // Use recall() which automatically combines strategies
    const result = await cortex.memory.recall({
      memorySpaceId,
      query: query || "*", // Wildcard query falls back to recent
      embedding, // Optional: enables semantic search if provided
      limit: 10,
      importance: { $gte: 50 },
    });

    if (result.items.length > 0) {
      return result.items;
    }

    // Fall back to recent only
    const recentResult = await cortex.memory.recall({
      memorySpaceId,
      query: "*",
      sources: {
        vector: { limit: 0 }, // Disable vector search
        facts: { limit: 0 }, // Disable facts search
      },
      limit: 10,
      importance: { $gte: 50 },
    });

    return recentResult.items;
  } catch (error) {
    console.error("All search strategies failed:", error);
    return [];
  }
}

Performance Benchmarks

Search Latency by Strategy

Strategy | Index Used    | Typical Latency | Dataset Size Impact
---------|---------------|-----------------|--------------------------------
Semantic | Vector index  | 50-100ms        | Logarithmic (with filterFields)
Keyword  | Search index  | 20-50ms         | Logarithmic
Recent   | Regular index | 10-20ms         | Logarithmic
Hybrid   | Multiple      | 100-200ms       | Logarithmic (parallel)

Throughput

Queries per second (estimated):

  • Semantic: 50-100 QPS (embedding generation is bottleneck)
  • Keyword: 200-500 QPS (no embedding needed)
  • Recent: 500-1000 QPS (simple index lookup)

Optimizations:

  • Cache query embeddings (5-10× faster)
  • Batch embedding generation
  • Use Convex reactive queries (cached by Convex)
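
A minimal sketch of query-embedding caching, assuming the same hypothetical embed() helper used in the examples above:

// Cache query embeddings so repeated queries skip the embedding API call
const embeddingCache = new Map<string, number[]>();

async function embedCached(query: string): Promise<number[]> {
  const key = query.trim().toLowerCase(); // Normalize to improve hit rate
  const hit = embeddingCache.get(key);
  if (hit) return hit;

  const embedding = await embed(query);
  embeddingCache.set(key, embedding);
  return embedding;
}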

Questions? Ask in GitHub Discussions.