VEKTOR vs Letta (MemGPT): AI Agent Memory Compared (2026)

Quick comparison

Feature | VEKTOR | Letta
Storage | Local SQLite (your machine) | Letta Cloud or self-host
Memory model | Persistent graph that survives all sessions | Tiered: core/archival/recall memory
Data ownership | 100% local, never leaves your server | Cloud tier sends data to Letta
Recall latency | 8ms avg, <50ms p95 | 100–500ms (cloud + LLM hop)
Pricing | $9/mo flat | Usage-based cloud tier
MCP server | ✓ Native (Claude, Cursor, Windsurf) | No first-party MCP server
Primary language | Node.js / TypeScript | Python-first
Long-horizon tasks | Graph traversal across sessions | Purpose-built, backed by MemGPT research
Auto-curation | AUDN (ADD/UPDATE/DELETE/NO_OP) | Archival search + LLM routing
Background compression | REM cycle (50:1, async) | No equivalent
Embedding cost | $0 (uses your existing LLM key) | Billed on cloud tier
Open source | Vex + Vek-Sync OSS / SDK commercial | Letta OSS on GitHub

Architecture

VEKTOR — Persistent MAGMA Graph

VEKTOR stores every agent memory as a node in a 4-layer SQLite graph (semantic, causal, temporal, entity). Memories accumulate across all sessions indefinitely — the graph never resets. The AUDN curation layer prevents bloat by evaluating every new input against existing nodes before writing.

The REM cycle runs asynchronously while the agent is idle, compressing 50 raw fragments into 3 distilled insights. Retrieval is always local: 8ms average, no network hop.
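To make the AUDN step concrete, here is a minimal sketch of how a curation layer might pick one of the four operations before writing. The types, the `curate` function, and the exact-match rule on layer and subject are all assumptions made for illustration; they are not VEKTOR's actual API, and a real curation layer would presumably compare meaning rather than exact strings.

```typescript
// Illustrative sketch only: the types, names, and matching rule below are
// assumptions for this example, not VEKTOR's actual API.
type Layer = "semantic" | "causal" | "temporal" | "entity";
type AudnOp = "ADD" | "UPDATE" | "DELETE" | "NO_OP";

interface MemoryNode {
  id: number;
  layer: Layer;
  subject: string; // what the memory is about
  value: string;   // what is currently stored about it
}

interface Candidate {
  layer: Layer;
  subject: string;
  value: string | null; // null means the fact was retracted
}

interface Curated {
  op: AudnOp;
  nodes: MemoryNode[];
}

// Evaluate a new input against existing nodes and return the chosen
// operation plus the resulting node list (the input array is not mutated).
function curate(nodes: MemoryNode[], input: Candidate): Curated {
  const match: MemoryNode | undefined = nodes.filter(
    n => n.layer === input.layer && n.subject === input.subject
  )[0];
  const value = input.value;

  if (match === undefined) {
    // Nothing stored yet: a retraction has nothing to delete, otherwise add.
    if (value === null) return { op: "NO_OP", nodes };
    const id = nodes.reduce((max, n) => Math.max(max, n.id), 0) + 1;
    const added = { id, layer: input.layer, subject: input.subject, value };
    return { op: "ADD", nodes: nodes.concat([added]) };
  }
  if (value === null) {
    return { op: "DELETE", nodes: nodes.filter(n => n !== match) };
  }
  // An identical value would only bloat the graph, so skip the write.
  if (match.value === value) return { op: "NO_OP", nodes };
  const updated = nodes.map(n =>
    n === match ? { id: n.id, layer: n.layer, subject: n.subject, value } : n
  );
  return { op: "UPDATE", nodes: updated };
}
```

Feeding the same fact twice yields ADD followed by NO_OP, which is the anti-bloat behaviour the paragraph above describes.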

Letta — Tiered Memory (MemGPT)

Letta's architecture comes from the MemGPT research paper. It divides memory into three tiers: core memory (always in-context), recall memory (searchable conversation history), and archival memory (long-term external storage). An LLM-driven routing layer decides what to retrieve and when.

This tiered model excels for agents that carry out long, complex tasks across many turns — the original MemGPT paper showed a 3.4× improvement on long-horizon benchmarks. The trade-off is latency: every archival retrieval involves an LLM classification step before the actual lookup.
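The three tiers and the routing step can be sketched roughly as follows. This is a toy model written in TypeScript to match the rest of this page, not Letta's SDK (which is Python-first): `TieredMemory`, `Router`, `buildContext`, and the keyword-based `keywordRouter` are hypothetical names, and the keyword check merely stands in for the LLM classification step.

```typescript
// Illustrative model of a three-tier memory; not Letta's actual API.
type Tier = "recall" | "archival";

interface TieredMemory {
  core: string[];     // always placed in the context window
  recall: string[];   // searchable conversation history
  archival: string[]; // long-term external storage
}

// The router decides which tier to search, or null when core memory is enough.
// In the design described above this decision is made by an LLM.
type Router = (query: string) => Tier | null;

// Naive keyword search standing in for the real lookup.
function searchTier(items: string[], query: string): string[] {
  const terms = query.toLowerCase().split(/\s+/).filter(t => t.length > 2);
  return items.filter(item =>
    terms.some(t => item.toLowerCase().indexOf(t) !== -1)
  );
}

// Core memory is always included; at most one other tier is consulted.
function buildContext(mem: TieredMemory, query: string, route: Router): string[] {
  const tier = route(query); // the extra hop that adds latency
  const retrieved = tier === null ? [] : searchTier(mem[tier], query);
  return mem.core.concat(retrieved);
}

// Stand-in router: keywords instead of an LLM call.
const keywordRouter: Router = query => {
  if (/earlier|said|yesterday/i.test(query)) return "recall";
  if (/spec|document|report/i.test(query)) return "archival";
  return null;
};
```

The point of the sketch is the ordering: the routing decision happens before any lookup, which is where the added latency comes from.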

The core trade-off

Letta's tiered model is agent-centric — it treats the LLM's context window as the primary interface, routing memory in and out automatically. This works beautifully for deeply stateful agents doing multi-step reasoning.

VEKTOR's model is graph-centric: all memory accumulates in a persistent SQLite graph regardless of what the LLM is doing. Retrieval is explicit: your agent calls memory.recall() and gets back the most relevant context in about 8ms on average. The graph structure (causal, temporal, and entity edges) gives you richer retrieval signals than a flat vector search.
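As a rough illustration of why edges add signal, the sketch below implements an in-memory stand-in with a `recall()` method. Only the name `memory.recall()` comes from the text above; the `GraphMemory` class, its `remember` and `link` methods, the keyword scoring, and the one-hop expansion are assumptions for this example and may differ from the real SDK.

```typescript
// Hypothetical in-memory stand-in; not VEKTOR's implementation.
type EdgeKind = "causal" | "temporal" | "entity";

interface GraphNode { id: string; text: string; }
interface GraphEdge { from: string; to: string; kind: EdgeKind; }

class GraphMemory {
  private nodes: GraphNode[] = [];
  private edges: GraphEdge[] = [];

  remember(id: string, text: string): void {
    this.nodes.push({ id, text });
  }

  link(from: string, to: string, kind: EdgeKind): void {
    this.edges.push({ from, to, kind });
  }

  // Rank nodes by shared query terms, then follow outgoing edges one hop
  // so related memories surface even when they share no keywords.
  recall(query: string, limit: number = 5): GraphNode[] {
    const terms = query.toLowerCase().split(/\s+/).filter(t => t.length > 2);
    const score = (n: GraphNode): number =>
      terms.filter(t => n.text.toLowerCase().indexOf(t) !== -1).length;

    const hits = this.nodes
      .map(n => ({ node: n, score: score(n) }))
      .filter(h => h.score > 0)
      .sort((a, b) => b.score - a.score)
      .map(h => h.node);

    const out: GraphNode[] = [];
    const add = (n: GraphNode): void => {
      if (out.indexOf(n) === -1) out.push(n);
    };
    for (const hit of hits) {
      add(hit);
      for (const e of this.edges) {
        if (e.from !== hit.id) continue;
        const neighbour = this.nodes.filter(n => n.id === e.to)[0];
        if (neighbour !== undefined) add(neighbour);
      }
    }
    return out.slice(0, limit);
  }
}

const agentMemory = new GraphMemory();
agentMemory.remember("a", "Deploy failed on Tuesday");
agentMemory.remember("b", "Database migration was missing an index");
agentMemory.link("a", "b", "causal");

// Node "b" shares no words with the query but is reached via the causal edge.
const recalled = agentMemory.recall("why did the deploy fail");
```

A flat keyword or vector lookup would return only the deploy node here; following the causal edge also brings back the likely cause.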

Neither is wrong — they solve different problems. Letta shines for autonomous agents with long task horizons. VEKTOR shines for agents that need fast, private, cost-predictable memory in a Node.js MCP-native stack.

Pricing

VEKTOR is $9/month flat regardless of query volume, memory size, or LLM provider. Letta's cloud tier is usage-based — fine for experimentation, unpredictable at scale. Self-hosting Letta is available but requires managing the infrastructure.
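A quick way to reason about flat versus usage-based pricing is to compute the break-even volume. In the sketch below, the $9 figure comes from this page, while the usage rate is a parameter you would fill in from the vendor's current price list; the function names are made up for this example, and no actual Letta price is assumed.

```typescript
// The usage rate is a caller-supplied placeholder, not a published price.
const FLAT_MONTHLY_USD = 9; // flat fee quoted in this comparison

// Monthly cost under a simple per-thousand-queries usage model.
function usageCostUsd(queries: number, usdPerThousandQueries: number): number {
  return (queries / 1000) * usdPerThousandQueries;
}

// Query volume at which usage-based billing equals the flat fee.
function breakEvenQueries(usdPerThousandQueries: number): number {
  return (FLAT_MONTHLY_USD / usdPerThousandQueries) * 1000;
}
```

Below the break-even volume the usage-based plan is cheaper; above it the flat fee wins, and the gap widens linearly with traffic.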

MCP support

VEKTOR ships a native MCP server — one config line to connect Claude Desktop, Cursor, Windsurf, or VS Code. Letta doesn't currently have a first-party MCP integration. For teams in MCP-native environments, this is the sharpest practical difference between the two.
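For orientation, MCP clients such as Claude Desktop register servers in a JSON config under an `mcpServers` map, where each entry names a `command` and its `args`. The sketch below builds such an entry; the `vektor` key, the `npx` launch command, and the package placeholder are assumptions, so check the VEKTOR docs for the real values.

```typescript
// Shape of an MCP client config entry (command plus args per server).
interface McpServerEntry { command: string; args: string[]; }
interface McpClientConfig { mcpServers: { [serverName: string]: McpServerEntry }; }

// Placeholder: substitute the real package name from the VEKTOR docs.
const VEKTOR_MCP_PACKAGE = "<vektor-mcp-package>";

// Assumes the server is launched through npx; this is a guess, not documented here.
function vektorMcpConfig(pkg: string): McpClientConfig {
  return { mcpServers: { vektor: { command: "npx", args: ["-y", pkg] } } };
}

// JSON text that could be merged into the client's MCP config file.
const configJson = JSON.stringify(vektorMcpConfig(VEKTOR_MCP_PACKAGE), null, 2);
```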

When Letta is the better choice

Your agent runs long, autonomous, multi-step tasks. Letta's tiered memory and LLM routing are purpose-built for exactly this, and the MemGPT research backs the architecture.
You're building in Python. Letta is Python-native with a mature SDK and rich documentation.
You need open-source infrastructure. Letta's core is on GitHub; the OSS community is active.
Managed cloud is acceptable. Data residency isn't a constraint and you don't want to run local processes.

When VEKTOR is the better choice

You're building in Node.js / TypeScript. VEKTOR is native end to end — no adapter layers.
MCP clients are your environment. Claude Desktop, Cursor, Windsurf, Cline: of the two tools compared here, only VEKTOR ships a native MCP server.
Data never leaves your machine. Zero egress, local SQLite, your server only.
You need sub-10ms recall. No LLM routing step means 8ms average vs 100–500ms.
Flat, predictable pricing. $9/mo regardless of how heavily your agent queries memory.

Bottom line

Letta is the strongest choice if your agent is running complex autonomous workflows in Python and you need the research-proven tiered memory model. VEKTOR is the strongest choice for Node.js MCP-native stacks where speed, privacy, and pricing predictability matter. They rarely compete head-to-head — pick based on your runtime, not the marketing.

Try VEKTOR

Local-first. 8ms recall. MCP-native. $9/month flat.

