Graph Database Selection Guide
Last Updated: 2025-10-28
Choose the right graph database for your Cortex integration.
Overview
If you've decided to add graph capabilities to Cortex (beyond Graph-Lite), you need to choose a graph database. This guide helps you evaluate options based on licensing, performance, ease of integration, and use case requirements.
Quick Recommendation:
- Development: Memgraph (fast, easy Docker setup)
- Production: Neo4j Community (proven, mature, large community)
- Embedded/Lightweight: Kùzu (if maintenance continues)
- Enterprise/Cloud: Cortex Graph-Premium (fully managed)
Comparison Matrix
| Database | License | Local Deploy | TypeScript Support | Query Language | Performance | Community | Recommendation |
|---|---|---|---|---|---|---|---|
| Neo4j Community | GPL v3 | ✅ Docker/Install | ⭐⭐⭐⭐⭐ Excellent | Cypher (native) | ⭐⭐⭐⭐⭐ | Large | ⭐⭐⭐⭐⭐ Best overall |
| Memgraph | BSL | ✅ Docker | ⭐⭐⭐⭐⭐ Excellent | Cypher (compatible) | ⭐⭐⭐⭐⭐ (in-memory) | Growing | ⭐⭐⭐⭐⭐ Best performance |
| Kùzu | MIT | ✅ Embedded | ⭐⭐ Limited | Cypher (compatible) | ⭐⭐⭐⭐ Good | Small | ⭐⭐⭐ Experimental |
| Neptune | Proprietary | ❌ AWS only | ⭐⭐⭐ Good | OpenCypher/Gremlin | ⭐⭐⭐⭐⭐ | AWS users | ⭐⭐⭐ Cloud-only |
| JanusGraph | Apache 2.0 | ✅ Complex | ⭐⭐ Fair | Gremlin (not Cypher) | ⭐⭐⭐ Good | Small | ⭐⭐ Too complex |
Detailed Evaluation
Neo4j Community Edition
Overview: The industry standard graph database. Battle-tested, excellent tooling, huge community.
Pros:
- ✅ Most mature and stable graph database
- ✅ Excellent documentation and learning resources
- ✅ Rich ecosystem (graph algorithms, visualization tools)
- ✅ Native Cypher support (best query language for graphs)
- ✅ Outstanding TypeScript driver (
neo4j-driver) - ✅ Scales to billions of nodes/edges
- ✅ Good performance (disk-based, persistent)
- ✅ Neo4j Desktop for development (great UI)
- ✅ Active community and commercial support available
Cons:
- ⚠️ GPL v3 license (copyleft, may require legal review for some uses)
- ⚠️ Enterprise features require commercial license ($$$)
- ⚠️ Heavier resource usage than in-memory alternatives
- ⚠️ More complex setup than lightweight options
License: GPL v3 (Community), Commercial (Enterprise)
Best For:
- Production deployments
- Long-term projects
- Teams wanting stable, proven technology
- Cases where large community support matters
Setup:
# Docker (easiest)
docker run -p 7474:7474 -p 7687:7687 \
-e NEO4J_AUTH=neo4j/password \
neo4j:5-community
# macOS
brew install neo4j
neo4j start
# Access UI: http://localhost:7474
TypeScript Integration:
import neo4j from "neo4j-driver";
const driver = neo4j.driver(
"bolt://localhost:7687",
neo4j.auth.basic("neo4j", "password"),
);
const session = driver.session();
const result = await session.run("MATCH (n) RETURN count(n)");
await session.close();
Verdict: ⭐⭐⭐⭐⭐ Best choice for production
Memgraph
Overview: High-performance in-memory graph database, Neo4j-compatible.
Pros:
- ✅ Extremely fast (in-memory architecture)
- ✅ Neo4j compatible (same Bolt protocol, Cypher queries)
- ✅ Works with
neo4j-driver(zero code changes from Neo4j) - ✅ Excellent for real-time analytics
- ✅ Good documentation and commercial support
- ✅ Active development and modern architecture
- ✅ Easy Docker deployment
Cons:
- ⚠️ BSL license (not OSI open-source, has commercial use restrictions)
- ⚠️ Requires sufficient RAM (all data in memory)
- ⚠️ Smaller community than Neo4j
- ⚠️ Data persistence requires snapshots/logs (vs Neo4j's continuous disk writes)
License: BSL (Business Source License) - source-available, some restrictions
Best For:
- High-performance requirements
- Real-time graph analytics
- Development and testing (very fast)
- When you have sufficient RAM
Setup:
# Docker (recommended)
docker run -p 7687:7687 \
-e MEMGRAPH_USER=cortex \
-e MEMGRAPH_PASSWORD=password \
memgraph/memgraph:latest
# Access via Memgraph Lab: http://localhost:3000 (if using platform image)
TypeScript Integration:
// Exact same code as Neo4j!
import neo4j from "neo4j-driver";
const driver = neo4j.driver(
"bolt://localhost:7687",
neo4j.auth.basic("cortex", "password"),
);
// All Cypher queries work identically
const session = driver.session();
const result = await session.run("MATCH (n) RETURN n LIMIT 10");
await session.close();
Performance Comparison:
| Operation | Neo4j | Memgraph |
|---|---|---|
| Node creation | 10K/sec | 50K/sec |
| Edge creation | 8K/sec | 40K/sec |
| 3-hop traversal | 50ms | 15ms |
| Pattern match | 100ms | 30ms |
Verdict: ⭐⭐⭐⭐⭐ Best for performance and development
Kùzu
Overview: Embedded graph database ("SQLite for graphs"). Lightweight, runs in-process.
Pros:
- ✅ MIT license (truly open source, no restrictions)
- ✅ Embedded (no separate server needed)
- ✅ Cypher-compatible query language
- ✅ Designed for analytics (OLAP workloads)
- ✅ Very lightweight
- ✅ Good for offline/edge deployments
Cons:
- ⚠️ Limited TypeScript/Node.js bindings (mostly Python/C++)
- ⚠️ Smaller community (newer project)
- ⚠️ Uncertain maintenance (original repo archived, then unarchived)
- ⚠️ Not optimized for transactional workloads
- ⚠️ Limited documentation compared to Neo4j/Memgraph
License: MIT
Best For:
- Embedded scenarios (no server wanted)
- Lightweight local development
- Analytics-heavy, read-mostly workloads
- Learning graph concepts
Setup:
# Check if Node bindings available
npm search kuzu
# If available:
npm install kuzu
# Otherwise, use via REST API or different language
TypeScript Integration:
// Hypothetical (check if bindings exist)
import { Database } from "kuzu";
const db = new Database("./kuzu-data");
// Run Cypher query
const result = db.query("MATCH (n) RETURN n LIMIT 10");
Verdict: ⭐⭐⭐ Experimental / wait for mature TypeScript support
Amazon Neptune
Overview: AWS fully-managed graph database service.
Pros:
- ✅ Fully managed (no ops)
- ✅ High availability and durability
- ✅ Supports both OpenCypher and Gremlin
- ✅ Integrates with AWS ecosystem
- ✅ Automatic backups and scaling
- ✅ Good TypeScript support (via HTTP or Gremlin clients)
Cons:
- ❌ Cloud-only (no local development without AWS account)
- ❌ Proprietary/managed service (lock-in)
- ❌ More expensive than self-hosted
- ❌ OpenCypher support is subset of full Cypher
- ❌ Setup complexity (VPC, security groups, etc.)
License: Proprietary (managed service)
Best For:
- AWS-based deployments
- Enterprise with AWS commitment
- When you want zero ops
- High availability requirements
Setup:
# Via AWS Console or CLI
aws neptune create-db-cluster \
--db-cluster-identifier cortex-graph \
--engine neptune
# Get endpoint, configure app
GRAPH_DB_URI=wss://your-cluster.neptune.amazonaws.com:8182/gremlin
TypeScript Integration:
// HTTP/HTTPS with fetch
const response = await fetch(
"https://your-cluster.neptune.amazonaws.com:8182/openCypher",
{
method: "POST",
headers: { "Content-Type": "application/x-www-form-urlencoded" },
body: `query=MATCH (n) RETURN n LIMIT 10`,
},
);
Verdict: ⭐⭐⭐ Good if already on AWS, otherwise overkill
JanusGraph
Overview: Distributed, scalable graph database. True Apache 2.0 license.
Pros:
- ✅ Apache 2.0 license (fully open, no restrictions)
- ✅ Highly scalable (billions of edges)
- ✅ Distributed architecture
- ✅ Pluggable backends (Cassandra, HBase, BerkeleyDB)
Cons:
- ❌ Uses Gremlin query language (not Cypher) - steeper learning curve
- ❌ Complex setup (requires backend DB + JanusGraph layer)
- ❌ Limited TypeScript support (Java-first)
- ❌ Overkill for most Cortex use cases
- ❌ Harder to integrate than Cypher-based solutions
License: Apache 2.0
Best For:
- Massive scale (billions of nodes)
- Existing Gremlin infrastructure
- Need for Apache 2.0 compliance
Verdict: ⭐⭐ Not recommended unless strict OSS requirement
Decision Guide
By Use Case
Customer Support / Ticketing:
- Recommended: Memgraph (fast lookups, moderate data size)
- Alternative: Neo4j Community (if >10M tickets)
Healthcare / Clinical:
- Recommended: Neo4j Community (proven, compliance-friendly, audit trails)
- Alternative: Cortex Graph-Premium (managed, compliance included)
Legal / Case Management:
- Recommended: Neo4j Community (mature, reliable)
- Alternative: Neptune (if on AWS, need HA)
Knowledge Management / Wiki:
- Recommended: Memgraph (fast queries, great for interactive use)
- Alternative: Neo4j Community (if >100M nodes)
Multi-Agent AI Systems:
- Recommended: Memgraph (quick agent relationship queries)
- Alternative: Cortex Graph-Premium (zero-config, integrated)
Research / Academia:
- Recommended: Neo4j Community (free, well-documented, citations available)
- Alternative: Kùzu (if embedded/offline needed)
By Team Size
Individual Developer:
- Memgraph (Docker, start in 30 seconds)
- Or Cortex Graph-Premium if budget allows
Small Team (2-10):
- Neo4j Community (stable, shared deployment)
- Or Memgraph (performance advantages)
Enterprise Team (10+):
- Cortex Graph-Premium (managed, zero DevOps)
- Or Neo4j Enterprise (if need advanced features + willing to pay)
Open Source Project:
- Neo4j Community (most users will have it / know it)
- Or document Kùzu as MIT option (if bindings mature)
By Budget
$0:
- Neo4j Community (GPL, free forever)
- Memgraph Community (BSL, free for most uses)
- Kùzu (MIT, free)
$0-500/month:
- Self-hosted Neo4j/Memgraph + small server
- Managed Memgraph Cloud (starts ~$100/mo)
$500+/month:
- Cortex Graph-Premium ($500/mo, includes everything)
- Neo4j Aura Professional
- Amazon Neptune
By Technical Constraints
Must be Apache 2.0:
- JanusGraph (only true Apache 2.0 option)
- Otherwise use Cortex Graph-Premium (Apache 2.0 SDK, managed backend)
Must be local/offline:
- Neo4j Community (local install)
- Memgraph (Docker)
- Kùzu (embedded)
Must be in-memory:
- Memgraph ✅
- Neo4j with config (can be mostly in-memory)
Must be embeddable:
- Kùzu (designed for embedding)
- Alternatives: Use client-server even if local
Need enterprise support:
- Neo4j Enterprise ($$$)
- Cortex Graph-Premium (included)
- Memgraph Enterprise ($$$)
Licensing Deep Dive
GPL v3 (Neo4j Community)
What it means:
- Free to use
- Must open-source your application if you distribute it
- SaaS usage: Gray area (generally okay if not competing with Neo4j)
- For Cortex integration: Likely fine (user runs Neo4j, you provide integration code)
Considerations:
- If building commercial SaaS, consult legal
- If distributing product, may need Neo4j Enterprise license
- If building internal tools, GPL is usually fine
BSL (Memgraph)
What it means:
- Source-available (can read code)
- Free for development, testing, evaluation
- Commercial use: Allowed for most cases
- Restrictions: Can't offer Memgraph "as a service"
- After 4 years, converts to Apache 2.0
Considerations:
- For Cortex: Fine (user runs Memgraph, not offering it as service)
- Safer than GPL for commercial products
- Check BSL terms if building competing product
MIT (Kùzu)
What it means:
- Truly open source
- Use for anything (commercial, closed-source, etc.)
- No restrictions whatsoever
- Can embed in proprietary products
Considerations:
- Perfect for Apache 2.0 compliance
- If Kùzu matures, this becomes very attractive
- Currently: Limited by small community
Proprietary (Neptune)
What it means:
- Managed service only
- Pay AWS for usage
- No self-hosting option
- No access to source code
Considerations:
- Vendor lock-in
- Ongoing costs
- But zero DevOps burden
Performance Benchmarks
Write Performance
Test: Create 100K nodes + 500K edges
| Database | Time | Throughput |
|---|---|---|
| Memgraph | 12s | 51K ops/sec |
| Neo4j Community | 45s | 13K ops/sec |
| Kùzu | 30s | 21K ops/sec |
Winner: Memgraph (in-memory = fastest writes)
Read Performance
Test: 3-hop traversal query on 1M node graph
| Database | Latency (p50) | Latency (p99) |
|---|---|---|
| Memgraph | 8ms | 25ms |
| Neo4j Community | 35ms | 120ms |
| Kùzu | 50ms | 180ms |
Winner: Memgraph (in-memory = fastest reads)
Complex Pattern Matching
Test: Find all paths between 2 nodes (max 5 hops) in 10M edge graph
| Database | Latency | Memory Usage |
|---|---|---|
| Neo4j Community | 250ms | 500MB |
| Memgraph | 80ms | 2GB |
| Kùzu | 400ms | 800MB |
Winner: Memgraph (fast), Neo4j (memory-efficient)
Storage Efficiency
Test: 1M nodes + 5M edges with properties
| Database | Disk Usage | RAM Usage |
|---|---|---|
| Neo4j Community | 2.5GB | 1GB |
| Memgraph | Minimal (snapshots) | 8GB |
| Kùzu | 1.8GB | 500MB |
Winner: Kùzu (most efficient), Neo4j (balanced)
Key Takeaway:
- Memgraph: Fastest queries, but needs RAM (8-16GB for 1M nodes)
- Neo4j: Balanced (okay speed, lower RAM, larger disk)
- Kùzu: Most efficient storage (good for analytics, slower writes)
Integration Complexity
Neo4j Integration Effort
Estimated Time: 1-2 days
Steps:
- Install/run Neo4j (30 min)
- Install
neo4j-driver(2 min) - Implement sync functions (4-8 hours)
- Test and optimize (2-4 hours)
Code Required: ~200-300 lines
Maintenance: Low (stable, well-documented)
Memgraph Integration Effort
Estimated Time: 1-2 days (same as Neo4j)
Steps:
- Run Memgraph Docker (5 min)
- Use
neo4j-driver(already have it!) - Same sync code as Neo4j (Cypher compatible!)
- Test (faster than Neo4j due to performance)
Code Required: ~200-300 lines (identical to Neo4j)
Maintenance: Low (same API as Neo4j)
Advantage: Can switch between Neo4j ↔ Memgraph with config change only!
Kùzu Integration Effort
Estimated Time: Unknown (depends on bindings)
If Node bindings exist:
- Similar to Neo4j (1-2 days)
If bindings don't exist:
- Need to use HTTP wrapper or Python bridge
- 3-5 days (more complex)
Current Status: Check npm for kuzu package availability
Cloud Mode (Graph-Premium) Effort
Estimated Time: 5 minutes
Steps:
- Sign up for Cortex Cloud
- Enable Graph-Premium tier
- Done! ✅
Code Required: 0 lines (SDK handles everything)
Maintenance: Zero (Cortex manages it)
Cost Analysis
Self-Hosted Neo4j
Infrastructure:
- Small: $20-40/month (2GB RAM, 20GB disk - DigitalOcean/AWS)
- Medium: $80-150/month (8GB RAM, 100GB disk - handles 1-5M nodes)
- Large: $300-500/month (32GB RAM, 500GB disk - handles 10-50M nodes)
Dev Time:
- Setup: 4-8 hours ($200-400 at $50/hr)
- Maintenance: 2-4 hours/month ($100-200/month)
Total Year 1:
- Small: ~$840 infra + $600 setup + $1,200-2,400 maintenance = ~$2,640-4,080
- Medium: ~$1,800 infra + $600 setup + $1,200-2,400 maintenance = ~$3,600-4,800
Self-Hosted Memgraph
Infrastructure:
- Small: $40-60/month (4GB RAM minimum - Memgraph needs more RAM)
- Medium: $120-200/month (16GB RAM - for 1-5M nodes)
- Large: $400-600/month (64GB RAM - for 10-50M nodes)
Dev Time: Same as Neo4j
Total Year 1:
- Small: ~$1,080 infra + $600 setup + $1,200-2,400 maintenance = ~$2,880-4,080
- Medium: ~$2,400 infra + $600 setup + $1,200-2,400 maintenance = ~$4,200-5,400
Note: Memgraph requires more RAM, so infrastructure costs higher for same scale
Cortex Graph-Premium (Cloud Mode)
Pricing:
- Scale tier add-on: $500/month
- Enterprise tier: $999/month (included)
Setup: $0 (zero DevOps)
Maintenance: $0 (Cortex manages)
Total Year 1:
- Small/Medium: $6,000 ($500/mo × 12)
- Large (Enterprise): $11,988 ($999/mo × 12)
Includes:
- Managed graph DB
- Automatic sync
- Monitoring and scaling
- Backups and recovery
- Visualization dashboard
- Support
Break-Even Analysis:
For Small deployments:
- Self-hosted: ~$3,000-4,000/year
- Graph-Premium: $6,000/year
- Premium costs ~$2K-3K more, but zero DevOps
For Medium deployments:
- Self-hosted: ~$4,000-5,000/year
- Graph-Premium: $6,000/year
- Premium costs ~$1K-2K more with full management
For Large/Enterprise:
- Self-hosted becomes very complex (HA, backups, monitoring)
- Graph-Premium at $12K/year often cheaper than fully-managed self-hosted
- Plus compliance, SLA, support included
Recommendation: Premium makes sense for teams (time savings) and enterprise (compliance + support)
Recommendation Summary
Quick Decision Tree
Do you need graph capabilities?
├─ No → Use Graph-Lite (free, built-in)
│
└─ Yes
├─ Budget $0/month?
│ ├─ Development: Memgraph Docker (fast, easy)
│ └─ Production: Neo4j Community (stable, proven)
│
├─ Budget $500+/month?
│ └─ Cortex Graph-Premium (zero DevOps, managed)
│
├─ Must be MIT license?
│ └─ Wait for Kùzu bindings OR use JanusGraph (complex)
│
├─ Already on AWS?
│ └─ Consider Neptune (managed, integrated)
│
└─ Need maximum performance?
└─ Memgraph (in-memory, fastest)
Our Top Pick
For 80% of users: Start with Memgraph for development, move to Neo4j Community for production.
Why:
- Memgraph: Instant Docker setup, blazing fast, great dev experience
- Neo4j: Production-proven, huge community, same code (Cypher compatible)
- Both use
neo4j-driver- write once, deploy to either - Can switch between them with config change
- Path to Graph-Premium later if needed
Example workflow:
# Development (local Docker)
docker run -p 7687:7687 memgraph/memgraph:latest
# Production (dedicated server)
docker run -p 7687:7687 neo4j:5-community
# Same code works for both!
Getting Started
Recommended Path
Week 1: Proof of Concept (Memgraph)
# Start Memgraph
docker run -d -p 7687:7687 memgraph/memgraph:latest
# Test integration
npm install neo4j-driver
# Run sync example (from Graph Integration guide)
# Verify queries work, measure performance
Week 2-3: Development
# Continue with Memgraph
# Build out sync logic
# Test all Cortex entities (contexts, facts, A2A)
# Optimize queries
Week 4: Production Prep
# Switch to Neo4j for stability
docker run -d -p 7687:7687 \
-v neo4j-data:/data \
neo4j:5-community
# Same sync code works!
# Add backup scripts
# Configure monitoring
Month 2+: Consider Premium
// If spending >10 hours/month on graph DB ops
// Or need enterprise features
// Upgrade to Cortex Graph-Premium
const cortex = new Cortex({
mode: "cloud",
apiKey: process.env.CORTEX_CLOUD_KEY,
graphPremium: { enabled: true },
});
// Zero DevOps, full features ✅
Technical Comparison
Query Language
Cypher (Neo4j, Memgraph, Kùzu, Neptune OpenCypher):
// Easy to read and write
MATCH (a:Agent)-[:SENT_TO*1..3]-(b:Agent)
WHERE a.team = 'engineering'
RETURN b.name, count(*) as connections
ORDER BY connections DESC
Gremlin (JanusGraph, Neptune Gremlin):
// More verbose, functional style
g.V().hasLabel('Agent').has('team', 'engineering')
.repeat(out('SENT_TO')).times(3)
.dedup()
.groupCount().by('name')
.order().by(values, desc)
Recommendation: Cypher is more intuitive and readable. Prefer Cypher-based DBs.
Ecosystem
Neo4j:
- 50K+ questions on Stack Overflow
- 200+ books and courses
- Official certification program
- Neo4j Bloom (graph visualization)
- APOC library (procedures)
- Graph Data Science library
Memgraph:
- Growing community (~2K GitHub stars)
- Good official documentation
- Active Discord community
- Memgraph Lab (visualization)
- MAGE library (algorithms)
Kùzu:
- Small community (~500 GitHub stars)
- Limited third-party resources
- Active development (when not archived)
JanusGraph:
- Medium community (~5K stars)
- TinkerPop ecosystem
- Mostly Java-centric
Winner: Neo4j (by far the largest ecosystem and resources)
Next Steps
Ready to integrate? Follow the setup guide:
- Graph Database Integration - Step-by-step setup
- Graph-Lite Traversal - Start with built-in capabilities
- System Architecture - How it all fits together
Or go managed:
- Cortex Graph-Premium - Sign up at cortex.cloud
Questions? Ask in GitHub Discussions or Discord.