# Resilience Layer

All Cortex operations include automatic:

- Rate limiting - Smooths traffic bursts
- Concurrency control - Respects Convex limits (16 concurrent on the Free plan)
- Circuit breaking - Fast-fails when the backend is unhealthy
- Priority queuing - Critical operations execute first
## Why It Matters

Without protection, a traffic spike can overwhelm your database: requests pile up, work past the concurrency limit is rejected, and failures cascade. The resilience layer smooths bursts, queues excess work by priority, and fails fast when the backend is down.
## Quick Start

### Zero Configuration

Resilience is enabled by default with safe settings for the Convex Free plan:

```typescript
import { Cortex } from '@cortexmemory/sdk';

const cortex = new Cortex({
  convexUrl: process.env.CONVEX_URL!,
  // Resilience enabled automatically
});

// All operations automatically protected
await cortex.memory.remember({...});
```
### Environment Variable

Set `CONVEX_PLAN` to auto-configure for your subscription:

```bash
# .env
CONVEX_PLAN=free          # 16 concurrent (default)
CONVEX_PLAN=professional  # 256 concurrent
```

```typescript
import { getPresetForPlan } from '@cortexmemory/sdk';

const cortex = new Cortex({
  convexUrl: process.env.CONVEX_URL!,
  resilience: getPresetForPlan(), // Reads CONVEX_PLAN
});
```
## Configuration Options

| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| `enabled` | boolean | No | `true` | Enable/disable resilience layer |
| `rateLimiter.bucketSize` | number | No | `100` | Rate limiter burst capacity |
| `rateLimiter.refillRate` | number | No | `50` | Tokens per second (rate limit) |
| `concurrency.maxConcurrent` | number | No | `20` | Max simultaneous operations |
| `concurrency.queueSize` | number | No | `1000` | Max queued requests |
| `concurrency.timeout` | number | No | `30000` | Request timeout (ms) |
| `circuitBreaker.failureThreshold` | number | No | `5` | Failures before circuit opens |
| `circuitBreaker.successThreshold` | number | No | `2` | Successes in half-open to close circuit |
| `circuitBreaker.timeout` | number | No | `30000` | Time to wait in open state before half-open (ms) |
| `circuitBreaker.halfOpenMax` | number | No | `3` | Max test requests allowed in half-open state |
| `queue.maxSize` | object | No | See defaults | Max queue size per priority level (`Partial<Record<Priority, number>>`) |
| `retry.maxRetries` | number | No | `3` | Maximum number of retry attempts |
| `retry.baseDelayMs` | number | No | `500` | Base delay between retries (ms) |
| `retry.maxDelayMs` | number | No | `10000` | Maximum delay between retries (ms) |
| `retry.exponentialBase` | number | No | `2.0` | Exponential backoff base |
| `retry.jitter` | boolean | No | `true` | Add jitter to prevent thundering herd |
### Custom Configuration

```typescript
const cortex = new Cortex({
  convexUrl: process.env.CONVEX_URL!,
  resilience: {
    enabled: true,
    rateLimiter: {
      bucketSize: 100,        // Max burst
      refillRate: 50,         // Per second
    },
    concurrency: {
      maxConcurrent: 16,      // Convex Free plan limit
      queueSize: 1000,
      timeout: 30000,         // 30s
    },
    circuitBreaker: {
      failureThreshold: 5,    // Open after 5 failures
      successThreshold: 2,    // Close after 2 successes
      timeout: 30000,         // 30s recovery wait
      halfOpenMax: 3,         // Max test requests in half-open
    },
    queue: {
      maxSize: {              // Per-priority limits
        critical: 100,
        high: 500,
        normal: 1000,
        low: 2000,
        background: 5000,
      },
    },
    retry: {
      maxRetries: 3,          // Retry up to 3 times
      baseDelayMs: 500,       // Start with 0.5s delay
      maxDelayMs: 10000,      // Cap at 10s
      exponentialBase: 2.0,   // Double delay each attempt
      jitter: true,           // Prevent thundering herd
    },
  },
});
```
## Presets

### `default`

Safe defaults for Convex Free/Starter (16 concurrent limit):

```typescript
import { ResiliencePresets } from '@cortexmemory/sdk';

const cortex = new Cortex({
  convexUrl: process.env.CONVEX_URL!,
  resilience: ResiliencePresets.default,
});
```

Settings: max concurrent 16, bucket size 100, refill rate 50/sec, circuit threshold 5 failures.

### `realTimeAgent`

Low-latency settings for chat apps:

```typescript
resilience: ResiliencePresets.realTimeAgent
```

Settings: max concurrent 8 (conservative), timeout 5s (fail fast), circuit threshold 3 (trip quickly).

### `batchProcessing`

High throughput for bulk operations (requires the Professional plan):

```typescript
resilience: ResiliencePresets.batchProcessing
```

Settings: max concurrent 64, queue size 10,000, timeout 60s.

### `hiveMode`

Extreme concurrency for agent swarms (requires the Professional plan):

```typescript
resilience: ResiliencePresets.hiveMode
```

Settings: max concurrent 128, queue size 50,000, timeout 120s.

Note: the `batchProcessing` and `hiveMode` presets require the Convex Professional plan (256+ concurrent limit). Using them on the Free plan will cause heavy queuing.
## Architecture

Four protection layers execute in sequence: rate limiting, concurrency control, priority queuing, and circuit breaking.

### Token Bucket Rate Limiter

Controls the request rate. The bucket starts full, each request consumes a token, and tokens refill over time.
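As an illustration of those refill mechanics, here is a minimal sketch; `TokenBucket` is a hypothetical class for this example, not the SDK's internal implementation:

```typescript
// Minimal token-bucket sketch (illustrative, not the SDK's internal class).
class TokenBucket {
  private tokens: number;
  private lastRefill: number;

  constructor(
    private bucketSize: number, // burst capacity
    private refillRate: number, // tokens per second
    now: number = Date.now(),
  ) {
    this.tokens = bucketSize;   // bucket starts full
    this.lastRefill = now;
  }

  // Refill based on elapsed time, then try to consume one token.
  tryAcquire(now: number = Date.now()): boolean {
    const elapsedSec = (now - this.lastRefill) / 1000;
    this.tokens = Math.min(this.bucketSize, this.tokens + elapsedSec * this.refillRate);
    this.lastRefill = now;
    if (this.tokens >= 1) {
      this.tokens -= 1;
      return true;
    }
    return false;
  }
}
```

A bucket of size 100 refilling at 50/sec (the defaults above) absorbs a burst of 100 requests, then sustains 50 requests per second.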
When tokens are depleted or retries occur, requests wait with exponential backoff:

- Attempt 1: immediate (0ms)
- Attempt 2: ~500ms delay (`baseDelayMs`)
- Attempt 3: ~1000ms delay (`baseDelayMs` × 2)
- Attempt 4: ~2000ms delay (`baseDelayMs` × 4)

The default `baseDelayMs` is 500ms, with jitter (a random factor of 0.5-1.5×) to prevent a thundering herd.
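That schedule can be expressed as a small helper; `backoffDelay` is illustrative only, not an SDK export:

```typescript
// Illustrative backoff calculation (assumed behavior, not the SDK's exact code).
function backoffDelay(
  attempt: number,        // 1-based attempt number
  baseDelayMs = 500,
  exponentialBase = 2.0,
  maxDelayMs = 10_000,
  jitter = true,
): number {
  if (attempt <= 1) return 0; // first attempt runs immediately
  const exponential = baseDelayMs * exponentialBase ** (attempt - 2);
  const capped = Math.min(exponential, maxDelayMs);
  // Jitter multiplies by a random factor in [0.5, 1.5) to spread retries out.
  return jitter ? capped * (0.5 + Math.random()) : capped;
}
```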
### Concurrency Limiter

Controls how many requests run simultaneously (max 16 concurrent on the Free plan). Excess requests queue by priority.
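Conceptually this is an async counting semaphore. A minimal sketch follows; `Semaphore` is a hypothetical class, and the SDK's real limiter layers priorities and timeouts on top:

```typescript
// Minimal async counting-semaphore sketch (illustrative only).
class Semaphore {
  private waiters: Array<() => void> = [];
  active = 0;

  constructor(private maxConcurrent: number) {}

  // Resolves immediately if under the limit, otherwise waits in a FIFO queue.
  async acquire(): Promise<void> {
    if (this.active < this.maxConcurrent) {
      this.active++;
      return;
    }
    await new Promise<void>((resolve) => this.waiters.push(resolve));
    this.active++;
  }

  // Frees a slot and wakes the next queued request, if any.
  release(): void {
    this.active--;
    this.waiters.shift()?.();
  }
}
```

Each SDK operation would acquire a permit before hitting Convex and release it afterwards, keeping in-flight requests at or below the plan limit.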
### Priority Queue

| Priority | Operations | Queue Size |
|---|---|---|
| `critical` | Circuit breaker tests | 100 |
| `high` | User-facing reads | 500 |
| `normal` | Standard writes | 1,000 |
| `low` | Analytics | 2,000 |
| `background` | Cleanup | 5,000 |
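A per-priority FIFO is enough to model this behavior: higher levels always drain first, and each level has its own capacity. This is a sketch with a hypothetical `PriorityQueue` class, not the SDK's internal one:

```typescript
type Priority = "critical" | "high" | "normal" | "low" | "background";
const ORDER: Priority[] = ["critical", "high", "normal", "low", "background"];

// Minimal priority-queue sketch: one FIFO bucket per priority level.
class PriorityQueue<T> {
  private buckets = new Map<Priority, T[]>(ORDER.map((p): [Priority, T[]] => [p, []]));

  constructor(private maxSize: Partial<Record<Priority, number>> = {}) {}

  // Returns false when the level is full (the SDK surfaces this as QueueFullError).
  enqueue(item: T, priority: Priority = "normal"): boolean {
    const bucket = this.buckets.get(priority)!;
    const limit = this.maxSize[priority] ?? Infinity;
    if (bucket.length >= limit) return false;
    bucket.push(item);
    return true;
  }

  // Always drain higher priorities first; FIFO within a level.
  dequeue(): T | undefined {
    for (const p of ORDER) {
      const bucket = this.buckets.get(p)!;
      if (bucket.length > 0) return bucket.shift();
    }
    return undefined;
  }
}
```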
### Circuit Breaker

Prevents cascading failures when the backend is unhealthy.

State transitions:

- CLOSED → OPEN: after 5 failures
- OPEN → HALF-OPEN: after the 30s timeout
- HALF-OPEN → CLOSED: after 2 successes
- HALF-OPEN → OPEN: on any failure
## Monitoring

```typescript
const metrics = cortex.getResilienceMetrics();

console.log(metrics);
// {
//   rateLimiter: {
//     tokensAvailable: 87,
//     requestsThrottled: 15,
//     avgWaitTimeMs: 234
//   },
//   concurrency: {
//     active: 12,
//     waiting: 5,
//     maxReached: 16,
//     timeouts: 0
//   },
//   circuitBreaker: {
//     state: 'closed',
//     failures: 0,
//     lastStateChangeAt: 1234567890,
//     totalOpens: 0
//   },
//   queue: {
//     total: 5,
//     byPriority: {
//       critical: 0,
//       high: 2,
//       normal: 3,
//       low: 0,
//       background: 0
//     },
//     processed: 1234,
//     dropped: 5,
//     oldestRequestAgeMs: 1234
//   },
//   timestamp: 1234567890
// }

// Alert if the circuit opens
if (metrics.circuitBreaker.state === 'open') {
  console.error('Backend unhealthy!');
}

// Alert if the queue is backing up
if (metrics.queue.total > 500) {
  console.warn(`Queue depth: ${metrics.queue.total}`);
}
```
## Error Handling

```typescript
import { CircuitOpenError, QueueFullError, AcquireTimeoutError } from '@cortexmemory/sdk';

try {
  await cortex.memory.remember({...});
} catch (error) {
  if (error instanceof CircuitOpenError) {
    // Backend unhealthy, circuit tripped
    console.error('Service unavailable, try later');
  } else if (error instanceof QueueFullError) {
    // System overloaded
    console.error('Too many requests, try later');
  } else if (error instanceof AcquireTimeoutError) {
    // Waited too long for a permit
    console.error('Request timed out');
  }
}
```
## Best Practices

Start with the defaults and only tune after measuring:

```typescript
// Good: Use defaults tuned for the Convex Free plan
const cortex = new Cortex({ convexUrl: process.env.CONVEX_URL! });

// Avoid: Over-customizing without measurement
const cortex = new Cortex({
  resilience: { concurrency: { maxConcurrent: 100 } }, // Exceeds Convex limits!
});
```

Match the preset to your plan:

```typescript
// Free/Starter: use default or realTimeAgent
resilience: ResiliencePresets.default

// Professional: can use batchProcessing or hiveMode
resilience: ResiliencePresets.batchProcessing
```

Wire up circuit-state callbacks for alerting:

```typescript
const cortex = new Cortex({
  resilience: {
    onCircuitOpen: () => console.error('Circuit OPEN'),
    onCircuitClose: () => console.log('Circuit CLOSED'),
  },
});
```
## Graceful Shutdown

```typescript
// Wait for in-flight requests (with optional timeout)
await cortex.shutdown(30000); // Default: 30s timeout

// In a server shutdown handler
process.on('SIGTERM', async () => {
  await cortex.shutdown();
  process.exit(0);
});
```