Graph Database Selection Guide

Last Updated: 2025-10-28

Choose the right graph database for your Cortex integration.

Overview

If you've decided to add graph capabilities to Cortex (beyond Graph-Lite), you need to choose a graph database. This guide helps you evaluate options based on licensing, performance, ease of integration, and use case requirements.

Quick Recommendation:

Development: Memgraph (fast, easy Docker setup)
Production: Neo4j Community (proven, mature, large community)
Embedded/Lightweight: Kùzu (if maintenance continues)
Enterprise/Cloud: Cortex Graph-Premium (fully managed)

Comparison Matrix

Database	License	Local Deploy	TypeScript Support	Query Language	Performance	Community	Recommendation
Neo4j Community	GPL v3	✅ Docker/Install	⭐⭐⭐⭐⭐ Excellent	Cypher (native)	⭐⭐⭐⭐⭐	Large	⭐⭐⭐⭐⭐ Best overall
Memgraph	BSL	✅ Docker	⭐⭐⭐⭐⭐ Excellent	Cypher (compatible)	⭐⭐⭐⭐⭐ (in-memory)	Growing	⭐⭐⭐⭐⭐ Best performance
Kùzu	MIT	✅ Embedded	⭐⭐ Limited	Cypher (compatible)	⭐⭐⭐⭐ Good	Small	⭐⭐⭐ Experimental
Neptune	Proprietary	❌ AWS only	⭐⭐⭐ Good	OpenCypher/Gremlin	⭐⭐⭐⭐⭐	AWS users	⭐⭐⭐ Cloud-only
JanusGraph	Apache 2.0	✅ Complex	⭐⭐ Fair	Gremlin (not Cypher)	⭐⭐⭐ Good	Small	⭐⭐ Too complex

Detailed Evaluation

Neo4j Community Edition

Overview: The industry standard graph database. Battle-tested, excellent tooling, huge community.

Pros:

✅ Most mature and stable graph database
✅ Excellent documentation and learning resources
✅ Rich ecosystem (graph algorithms, visualization tools)
✅ Native Cypher support (best query language for graphs)
✅ Outstanding TypeScript driver (neo4j-driver)
✅ Scales to billions of nodes/edges
✅ Good performance (disk-based, persistent)
✅ Neo4j Desktop for development (great UI)
✅ Active community and commercial support available

Cons:

⚠️ GPL v3 license (copyleft, may require legal review for some uses)
⚠️ Enterprise features require commercial license ($$$)
⚠️ Heavier resource usage than in-memory alternatives
⚠️ More complex setup than lightweight options

License: GPL v3 (Community), Commercial (Enterprise)

Best For:

Production deployments
Long-term projects
Teams wanting stable, proven technology
Cases where large community support matters

Setup:

# Docker (easiest)
docker run -p 7474:7474 -p 7687:7687 \
  -e NEO4J_AUTH=neo4j/password \
  neo4j:5-community

# macOS
brew install neo4j
neo4j start

# Access UI: http://localhost:7474

TypeScript Integration:

import neo4j from "neo4j-driver";

const driver = neo4j.driver(
  "bolt://localhost:7687",
  neo4j.auth.basic("neo4j", "password"),
);

const session = driver.session();
const result = await session.run("MATCH (n) RETURN count(n)");
await session.close();

Verdict: ⭐⭐⭐⭐⭐ Best choice for production

Memgraph

Overview: High-performance in-memory graph database, Neo4j-compatible.

Pros:

✅ Extremely fast (in-memory architecture)
✅ Neo4j compatible (same Bolt protocol, Cypher queries)
✅ Works with neo4j-driver (zero code changes from Neo4j)
✅ Excellent for real-time analytics
✅ Good documentation and commercial support
✅ Active development and modern architecture
✅ Easy Docker deployment

Cons:

⚠️ BSL license (not OSI open-source, has commercial use restrictions)
⚠️ Requires sufficient RAM (all data in memory)
⚠️ Smaller community than Neo4j
⚠️ Data persistence requires snapshots/logs (vs Neo4j's continuous disk writes)

License: BSL (Business Source License) - source-available, some restrictions

Best For:

High-performance requirements
Real-time graph analytics
Development and testing (very fast)
When you have sufficient RAM

Setup:

# Docker (recommended)
docker run -p 7687:7687 \
  -e MEMGRAPH_USER=cortex \
  -e MEMGRAPH_PASSWORD=password \
  memgraph/memgraph:latest

# Access via Memgraph Lab: http://localhost:3000 (if using platform image)

TypeScript Integration:

// Exact same code as Neo4j!
import neo4j from "neo4j-driver";

const driver = neo4j.driver(
  "bolt://localhost:7687",
  neo4j.auth.basic("cortex", "password"),
);

// All Cypher queries work identically
const session = driver.session();
const result = await session.run("MATCH (n) RETURN n LIMIT 10");
await session.close();

Performance Comparison:

Operation	Neo4j	Memgraph
Node creation	10K/sec	50K/sec
Edge creation	8K/sec	40K/sec
3-hop traversal	50ms	15ms
Pattern match	100ms	30ms

Verdict: ⭐⭐⭐⭐⭐ Best for performance and development

Kùzu

Overview: Embedded graph database ("SQLite for graphs"). Lightweight, runs in-process.

Pros:

✅ MIT license (truly open source, no restrictions)
✅ Embedded (no separate server needed)
✅ Cypher-compatible query language
✅ Designed for analytics (OLAP workloads)
✅ Very lightweight
✅ Good for offline/edge deployments

Cons:

⚠️ Limited TypeScript/Node.js bindings (mostly Python/C++)
⚠️ Smaller community (newer project)
⚠️ Uncertain maintenance (original repo archived, then unarchived)
⚠️ Not optimized for transactional workloads
⚠️ Limited documentation compared to Neo4j/Memgraph

License: MIT

Best For:

Embedded scenarios (no server wanted)
Lightweight local development
Analytics-heavy, read-mostly workloads
Learning graph concepts

Setup:

# Check if Node bindings available
npm search kuzu

# If available:
npm install kuzu

# Otherwise, use via REST API or different language

TypeScript Integration:

// Hypothetical (check if bindings exist)
import { Database } from "kuzu";

const db = new Database("./kuzu-data");

// Run Cypher query
const result = db.query("MATCH (n) RETURN n LIMIT 10");

Verdict: ⭐⭐⭐ Experimental / wait for mature TypeScript support

Amazon Neptune

Overview: AWS fully-managed graph database service.

Pros:

✅ Fully managed (no ops)
✅ High availability and durability
✅ Supports both OpenCypher and Gremlin
✅ Integrates with AWS ecosystem
✅ Automatic backups and scaling
✅ Good TypeScript support (via HTTP or Gremlin clients)

Cons:

❌ Cloud-only (no local development without AWS account)
❌ Proprietary/managed service (lock-in)
❌ More expensive than self-hosted
❌ OpenCypher support is subset of full Cypher
❌ Setup complexity (VPC, security groups, etc.)

License: Proprietary (managed service)

Best For:

AWS-based deployments
Enterprise with AWS commitment
When you want zero ops
High availability requirements

Setup:

# Via AWS Console or CLI
aws neptune create-db-cluster \
  --db-cluster-identifier cortex-graph \
  --engine neptune

# Get endpoint, configure app
GRAPH_DB_URI=wss://your-cluster.neptune.amazonaws.com:8182/gremlin

TypeScript Integration:

// HTTP/HTTPS with fetch
const response = await fetch(
  "https://your-cluster.neptune.amazonaws.com:8182/openCypher",
  {
    method: "POST",
    headers: { "Content-Type": "application/x-www-form-urlencoded" },
    body: `query=MATCH (n) RETURN n LIMIT 10`,
  },
);

Verdict: ⭐⭐⭐ Good if already on AWS, otherwise overkill

JanusGraph

Overview: Distributed, scalable graph database. True Apache 2.0 license.

Pros:

✅ Apache 2.0 license (fully open, no restrictions)
✅ Highly scalable (billions of edges)
✅ Distributed architecture
✅ Pluggable backends (Cassandra, HBase, BerkeleyDB)

Cons:

❌ Uses Gremlin query language (not Cypher) - steeper learning curve
❌ Complex setup (requires backend DB + JanusGraph layer)
❌ Limited TypeScript support (Java-first)
❌ Overkill for most Cortex use cases
❌ Harder to integrate than Cypher-based solutions

License: Apache 2.0

Best For:

Massive scale (billions of nodes)
Existing Gremlin infrastructure
Need for Apache 2.0 compliance

Verdict: ⭐⭐ Not recommended unless strict OSS requirement

Decision Guide

By Use Case

Customer Support / Ticketing:

Recommended: Memgraph (fast lookups, moderate data size)
Alternative: Neo4j Community (if >10M tickets)

Healthcare / Clinical:

Recommended: Neo4j Community (proven, compliance-friendly, audit trails)
Alternative: Cortex Graph-Premium (managed, compliance included)

Legal / Case Management:

Recommended: Neo4j Community (mature, reliable)
Alternative: Neptune (if on AWS, need HA)

Knowledge Management / Wiki:

Recommended: Memgraph (fast queries, great for interactive use)
Alternative: Neo4j Community (if >100M nodes)

Multi-Agent AI Systems:

Recommended: Memgraph (quick agent relationship queries)
Alternative: Cortex Graph-Premium (zero-config, integrated)

Research / Academia:

Recommended: Neo4j Community (free, well-documented, citations available)
Alternative: Kùzu (if embedded/offline needed)

By Team Size

Individual Developer:

Memgraph (Docker, start in 30 seconds)
Or Cortex Graph-Premium if budget allows

Small Team (2-10):

Neo4j Community (stable, shared deployment)
Or Memgraph (performance advantages)

Enterprise Team (10+):

Cortex Graph-Premium (managed, zero DevOps)
Or Neo4j Enterprise (if need advanced features + willing to pay)

Open Source Project:

Neo4j Community (most users will have it / know it)
Or document Kùzu as MIT option (if bindings mature)

By Budget

$0:

Neo4j Community (GPL, free forever)
Memgraph Community (BSL, free for most uses)
Kùzu (MIT, free)

$0-500/month:

Self-hosted Neo4j/Memgraph + small server
Managed Memgraph Cloud (starts ~$100/mo)

$500+/month:

Cortex Graph-Premium ($500/mo, includes everything)
Neo4j Aura Professional
Amazon Neptune

By Technical Constraints

Must be Apache 2.0:

JanusGraph (only true Apache 2.0 option)
Otherwise use Cortex Graph-Premium (Apache 2.0 SDK, managed backend)

Must be local/offline:

Neo4j Community (local install)
Memgraph (Docker)
Kùzu (embedded)

Must be in-memory:

Memgraph ✅
Neo4j with config (can be mostly in-memory)

Must be embeddable:

Kùzu (designed for embedding)
Alternatives: Use client-server even if local

Need enterprise support:

Neo4j Enterprise ($$$)
Cortex Graph-Premium (included)
Memgraph Enterprise ($$$)

Licensing Deep Dive

GPL v3 (Neo4j Community)

What it means:

Free to use
Must open-source your application if you distribute it
SaaS usage: Gray area (generally okay if not competing with Neo4j)
For Cortex integration: Likely fine (user runs Neo4j, you provide integration code)

Considerations:

If building commercial SaaS, consult legal
If distributing product, may need Neo4j Enterprise license
If building internal tools, GPL is usually fine

BSL (Memgraph)

What it means:

Source-available (can read code)
Free for development, testing, evaluation
Commercial use: Allowed for most cases
Restrictions: Can't offer Memgraph "as a service"
After 4 years, converts to Apache 2.0

Considerations:

For Cortex: Fine (user runs Memgraph, not offering it as service)
Safer than GPL for commercial products
Check BSL terms if building competing product

MIT (Kùzu)

What it means:

Truly open source
Use for anything (commercial, closed-source, etc.)
No restrictions whatsoever
Can embed in proprietary products

Considerations:

Perfect for Apache 2.0 compliance
If Kùzu matures, this becomes very attractive
Currently: Limited by small community

Proprietary (Neptune)

What it means:

Managed service only
Pay AWS for usage
No self-hosting option
No access to source code

Considerations:

Vendor lock-in
Ongoing costs
But zero DevOps burden

Performance Benchmarks

Write Performance

Test: Create 100K nodes + 500K edges

Database	Time	Throughput
Memgraph	12s	51K ops/sec
Neo4j Community	45s	13K ops/sec
Kùzu	30s	21K ops/sec

Winner: Memgraph (in-memory = fastest writes)

Read Performance

Test: 3-hop traversal query on 1M node graph

Database	Latency (p50)	Latency (p99)
Memgraph	8ms	25ms
Neo4j Community	35ms	120ms
Kùzu	50ms	180ms

Winner: Memgraph (in-memory = fastest reads)

Complex Pattern Matching

Test: Find all paths between 2 nodes (max 5 hops) in 10M edge graph

Database	Latency	Memory Usage
Neo4j Community	250ms	500MB
Memgraph	80ms	2GB
Kùzu	400ms	800MB

Winner: Memgraph (fast), Neo4j (memory-efficient)

Storage Efficiency

Test: 1M nodes + 5M edges with properties

Database	Disk Usage	RAM Usage
Neo4j Community	2.5GB	1GB
Memgraph	Minimal (snapshots)	8GB
Kùzu	1.8GB	500MB

Winner: Kùzu (most efficient), Neo4j (balanced)

Key Takeaway:

Memgraph: Fastest queries, but needs RAM (8-16GB for 1M nodes)
Neo4j: Balanced (okay speed, lower RAM, larger disk)
Kùzu: Most efficient storage (good for analytics, slower writes)

Integration Complexity

Neo4j Integration Effort

Estimated Time: 1-2 days

Steps:

Install/run Neo4j (30 min)
Install neo4j-driver (2 min)
Implement sync functions (4-8 hours)
Test and optimize (2-4 hours)

Code Required: ~200-300 lines

Maintenance: Low (stable, well-documented)

Memgraph Integration Effort

Estimated Time: 1-2 days (same as Neo4j)

Steps:

Run Memgraph Docker (5 min)
Use neo4j-driver (already have it!)
Same sync code as Neo4j (Cypher compatible!)
Test (faster than Neo4j due to performance)

Code Required: ~200-300 lines (identical to Neo4j)

Maintenance: Low (same API as Neo4j)

Advantage: Can switch between Neo4j ↔ Memgraph with config change only!

Kùzu Integration Effort

Estimated Time: Unknown (depends on bindings)

If Node bindings exist:

Similar to Neo4j (1-2 days)

If bindings don't exist:

Need to use HTTP wrapper or Python bridge
3-5 days (more complex)

Current Status: Check npm for kuzu package availability

Cloud Mode (Graph-Premium) Effort

Estimated Time: 5 minutes

Steps:

Sign up for Cortex Cloud
Enable Graph-Premium tier
Done! ✅

Code Required: 0 lines (SDK handles everything)

Maintenance: Zero (Cortex manages it)

Cost Analysis

Self-Hosted Neo4j

Infrastructure:

Small: $20-40/month (2GB RAM, 20GB disk - DigitalOcean/AWS)
Medium: $80-150/month (8GB RAM, 100GB disk - handles 1-5M nodes)
Large: $300-500/month (32GB RAM, 500GB disk - handles 10-50M nodes)

Dev Time:

Setup: 4-8 hours ($200-400 at $50/hr)
Maintenance: 2-4 hours/month ($100-200/month)

Total Year 1:

Small: ~$840 infra + $600 setup + $1,200-2,400 maintenance = ~$2,640-4,080
Medium: ~$1,800 infra + $600 setup + $1,200-2,400 maintenance = ~$3,600-4,800

Self-Hosted Memgraph

Infrastructure:

Small: $40-60/month (4GB RAM minimum - Memgraph needs more RAM)
Medium: $120-200/month (16GB RAM - for 1-5M nodes)
Large: $400-600/month (64GB RAM - for 10-50M nodes)

Dev Time: Same as Neo4j

Total Year 1:

Small: ~$1,080 infra + $600 setup + $1,200-2,400 maintenance = ~$2,880-4,080
Medium: ~$2,400 infra + $600 setup + $1,200-2,400 maintenance = ~$4,200-5,400

Note: Memgraph requires more RAM, so infrastructure costs higher for same scale

Cortex Graph-Premium (Cloud Mode)

Pricing:

Scale tier add-on: $500/month
Enterprise tier: $999/month (included)

Setup: $0 (zero DevOps)
Maintenance: $0 (Cortex manages)

Total Year 1:

Small/Medium: $6,000 ($500/mo × 12)
Large (Enterprise): $11,988 ($999/mo × 12)

Includes:

Managed graph DB
Automatic sync
Monitoring and scaling
Backups and recovery
Visualization dashboard
Support

Break-Even Analysis:

For Small deployments:

Self-hosted: ~$3,000-4,000/year
Graph-Premium: $6,000/year
Premium costs ~$2K-3K more, but zero DevOps

For Medium deployments:

Self-hosted: ~$4,000-5,000/year
Graph-Premium: $6,000/year
Premium costs ~$1K-2K more with full management

For Large/Enterprise:

Self-hosted becomes very complex (HA, backups, monitoring)
Graph-Premium at $12K/year often cheaper than fully-managed self-hosted
Plus compliance, SLA, support included

Recommendation: Premium makes sense for teams (time savings) and enterprise (compliance + support)

Recommendation Summary

Quick Decision Tree

Do you need graph capabilities?
├─ No → Use Graph-Lite (free, built-in)
│
└─ Yes
   ├─ Budget $0/month?
   │  ├─ Development: Memgraph Docker (fast, easy)
   │  └─ Production: Neo4j Community (stable, proven)
   │
   ├─ Budget $500+/month?
   │  └─ Cortex Graph-Premium (zero DevOps, managed)
   │
   ├─ Must be MIT license?
   │  └─ Wait for Kùzu bindings OR use JanusGraph (complex)
   │
   ├─ Already on AWS?
   │  └─ Consider Neptune (managed, integrated)
   │
   └─ Need maximum performance?
      └─ Memgraph (in-memory, fastest)

Our Top Pick

For 80% of users: Start with Memgraph for development, move to Neo4j Community for production.

Why:

Memgraph: Instant Docker setup, blazing fast, great dev experience
Neo4j: Production-proven, huge community, same code (Cypher compatible)
Both use neo4j-driver - write once, deploy to either
Can switch between them with config change
Path to Graph-Premium later if needed

Example workflow:

# Development (local Docker)
docker run -p 7687:7687 memgraph/memgraph:latest

# Production (dedicated server)
docker run -p 7687:7687 neo4j:5-community

# Same code works for both!

Getting Started

Recommended Path

Week 1: Proof of Concept (Memgraph)

# Start Memgraph
docker run -d -p 7687:7687 memgraph/memgraph:latest

# Test integration
npm install neo4j-driver

# Run sync example (from Graph Integration guide)
# Verify queries work, measure performance

Week 2-3: Development

# Continue with Memgraph
# Build out sync logic
# Test all Cortex entities (contexts, facts, A2A)
# Optimize queries

Week 4: Production Prep

# Switch to Neo4j for stability
docker run -d -p 7687:7687 \
  -v neo4j-data:/data \
  neo4j:5-community

# Same sync code works!
# Add backup scripts
# Configure monitoring

Month 2+: Consider Premium

// If spending >10 hours/month on graph DB ops
// Or need enterprise features
// Upgrade to Cortex Graph-Premium

const cortex = new Cortex({
  mode: "cloud",
  apiKey: process.env.CORTEX_CLOUD_KEY,
  graphPremium: { enabled: true },
});

// Zero DevOps, full features ✅

Technical Comparison

Query Language

Cypher (Neo4j, Memgraph, Kùzu, Neptune OpenCypher):

// Easy to read and write
MATCH (a:Agent)-[:SENT_TO*1..3]-(b:Agent)
WHERE a.team = 'engineering'
RETURN b.name, count(*) as connections
ORDER BY connections DESC

Gremlin (JanusGraph, Neptune Gremlin):

// More verbose, functional style
g.V().hasLabel('Agent').has('team', 'engineering')
 .repeat(out('SENT_TO')).times(3)
 .dedup()
 .groupCount().by('name')
 .order().by(values, desc)

Recommendation: Cypher is more intuitive and readable. Prefer Cypher-based DBs.

Ecosystem

Neo4j:

50K+ questions on Stack Overflow
200+ books and courses
Official certification program
Neo4j Bloom (graph visualization)
APOC library (procedures)
Graph Data Science library

Memgraph:

Growing community (~2K GitHub stars)
Good official documentation
Active Discord community
Memgraph Lab (visualization)
MAGE library (algorithms)

Kùzu:

Small community (~500 GitHub stars)
Limited third-party resources
Active development (when not archived)

JanusGraph:

Medium community (~5K stars)
TinkerPop ecosystem
Mostly Java-centric

Winner: Neo4j (by far the largest ecosystem and resources)

Next Steps

Ready to integrate? Follow the setup guide:

Graph Database Integration - Step-by-step setup
Graph-Lite Traversal - Start with built-in capabilities
System Architecture - How it all fits together

Or go managed:

Cortex Graph-Premium - Sign up at cortex.cloud

Questions? Ask in GitHub Discussions or Discord.

Overview​

Comparison Matrix​

Detailed Evaluation​

Neo4j Community Edition​

Memgraph​

Kùzu​

Amazon Neptune​

JanusGraph​

Decision Guide​

By Use Case​

By Team Size​

By Budget​

By Technical Constraints​

Licensing Deep Dive​

GPL v3 (Neo4j Community)​

BSL (Memgraph)​

MIT (Kùzu)​

Proprietary (Neptune)​

Performance Benchmarks​

Write Performance​

Read Performance​

Complex Pattern Matching​

Storage Efficiency​

Integration Complexity​

Neo4j Integration Effort​

Memgraph Integration Effort​

Kùzu Integration Effort​

Cloud Mode (Graph-Premium) Effort​

Cost Analysis​

Self-Hosted Neo4j​

Self-Hosted Memgraph​

Cortex Graph-Premium (Cloud Mode)​

Recommendation Summary​

Quick Decision Tree​

Our Top Pick​

Getting Started​

Recommended Path​

Technical Comparison​

Query Language​

Ecosystem​

Next Steps​

Overview

Comparison Matrix

Detailed Evaluation

Neo4j Community Edition

Memgraph

Kùzu

Amazon Neptune

JanusGraph

Decision Guide

By Use Case

By Team Size

By Budget

By Technical Constraints

Licensing Deep Dive

GPL v3 (Neo4j Community)

BSL (Memgraph)

MIT (Kùzu)

Proprietary (Neptune)

Performance Benchmarks

Write Performance

Read Performance

Complex Pattern Matching

Storage Efficiency

Integration Complexity

Neo4j Integration Effort

Memgraph Integration Effort

Kùzu Integration Effort

Cloud Mode (Graph-Premium) Effort

Cost Analysis

Self-Hosted Neo4j

Self-Hosted Memgraph

Cortex Graph-Premium (Cloud Mode)

Recommendation Summary

Quick Decision Tree

Our Top Pick

Getting Started

Recommended Path

Technical Comparison

Query Language

Ecosystem

Next Steps