Name: VEKTOR Memory
Price: 9 USD/month
Author: VEKTOR Memory

Benchmark · LongMemEval

Built to perform. Verified by benchmark.

Adjusted accuracy

LongMemEval benchmark

Long-context memory recall

8ms

Avg. recall latency

42× faster than cloud

Local SQLite · zero network hop

99.1%

Graph accuracy

Drift rate near zero

MAGMA graph engine

MAGMA Graph

4-layer memory architecture

Semantic

Causal

Temporal

Entity

BM25 + vector RRF dual-recall

31/31

Causal inference tests

All passing

G-formula · MSM · IV · RCA

Judge GPT-4o-mini

Build Slipstream v1.6.3

Metric Adjusted accuracy
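
The "BM25 + vector RRF dual-recall" mentioned above refers to reciprocal rank fusion, a standard way to merge a keyword ranking with a vector ranking. The sketch below is a generic illustration, not VEKTOR's internal code: the function name, the memory ids, and the constant k = 60 (a value commonly used for RRF) are all assumptions.

```javascript
// Reciprocal rank fusion (RRF): each ranked list contributes 1 / (k + rank)
// to a document's fused score. k = 60 is a commonly used constant, assumed here.
function reciprocalRankFusion(rankedLists, k = 60) {
  const scores = new Map();
  for (const list of rankedLists) {
    list.forEach((id, index) => {
      const rank = index + 1; // ranks are 1-based
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank));
    });
  }
  // Highest fused score first.
  return [...scores.entries()]
    .sort((a, b) => b[1] - a[1])
    .map(([id, score]) => ({ id, score }));
}

// Hypothetical result ids from a keyword (BM25) pass and a vector pass.
const bm25Hits = ["m3", "m1", "m7"];
const vectorHits = ["m1", "m9", "m3"];
const fused = reciprocalRankFusion([bm25Hits, vectorHits]);
// Fused order: m1, m3, m9, m7
```

Items that rank well in both lists (m1, m3) rise above items that appear in only one, which is the point of fusing the two recall paths.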

The Problem

Standard RAG is amnesia with extra steps.

WITHOUT VEKTOR // SESSION AMNESIA

SESSION_001

SESSION_002

✗ MEMORY WIPED — context lost

SESSION_003

✗ MEMORY WIPED — starting over again

SESSION_N

✗ Agent has no idea who you are

WITH VEKTOR // ASSOCIATIVE GRAPH

SESSION_001

→ STORED · AUDN: ADD

SESSION_002

→ GRAPH UPDATED +3 NODES +7 EDGES

SESSION_003

→ GRAPH: 247 NODES · 7,180 EDGES

SESSION_N

→ COMPLETE ASSOCIATIVE MEMORY INTACT

// 01 · recall speed

8ms

average recall latency

Instant memory retrieval

Local SQLite lookup — no API roundtrip, no cloud latency. Your agent gets context in 8ms avg, under 50ms p95.

live · cloud vendors avg 200–800ms

// 02 · graph growth

247

nodes · 7,180 edges · growing

Self-organising MAGMA graph

Semantic · Causal · Temporal · Entity. Every remember() call wires new edges. The graph builds itself while your agent works.

live · AUDN curation · zero duplicates

// 03 · rem compression

50:1

fragment compression ratio

Gets smarter while idle

7-phase REM dream cycle runs while your agent sleeps. 50 raw fragments → 1 core insight. 98% noise removed. Signal preserved.

async · never blocks your agent

THE ARCHITECTURE

Architecture

Raw input → AUDN curation → persistent graph.

INPUT_LAYER

Raw Input

Conversation turns, tool outputs, observations. Any unstructured agent context fed in as text.

CONVERSATION · TOOL_OUTPUT · OBSERVATION

→

AUDN_LAYER

AUDN Curation

Every memory is evaluated: ADD new info, UPDATE existing, DELETE contradictions, or NO_OP if already known. Zero duplicates.

ADD · UPDATE · DELETE · NO_OP
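
The four AUDN outcomes can be pictured as a small decision function. This is a minimal sketch under stated assumptions: the fact shape ({ subject, attribute, value }), the `retracted` flag, and exact key matching are all hypothetical. A real curation step would more likely judge equivalence with an LLM or embeddings.

```javascript
// Hypothetical fact shape: { subject, attribute, value, retracted? }.
// Exact key matching keeps this sketch deterministic.
function decideAudn(store, incoming) {
  const key = `${incoming.subject}::${incoming.attribute}`;
  const existing = store.get(key);

  if (incoming.retracted) {
    // The input says the fact no longer holds.
    return existing ? { op: "DELETE", key } : { op: "NO_OP", key };
  }
  if (!existing) return { op: "ADD", key, value: incoming.value };
  if (existing.value === incoming.value) return { op: "NO_OP", key };
  return { op: "UPDATE", key, value: incoming.value };
}

// Applies the decision to the store and returns the operation name.
function applyAudn(store, incoming) {
  const decision = decideAudn(store, incoming);
  if (decision.op === "ADD" || decision.op === "UPDATE") {
    store.set(decision.key, { value: decision.value });
  } else if (decision.op === "DELETE") {
    store.delete(decision.key);
  }
  return decision.op;
}

const factStore = new Map();
const ops = [
  applyAudn(factStore, { subject: "user", attribute: "language", value: "JavaScript" }),
  applyAudn(factStore, { subject: "user", attribute: "language", value: "JavaScript" }),
  applyAudn(factStore, { subject: "user", attribute: "language", value: "TypeScript" }),
  applyAudn(factStore, { subject: "user", attribute: "language", retracted: true }),
];
// ops: ["ADD", "NO_OP", "UPDATE", "DELETE"]
```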

→

MAGMA_LAYER

MAGMA Graph

Persisted across 4 graph types in SQLite. Survives all session resets. REM cycle compresses while idle.

SEMANTIC · CAUSAL · TEMPORAL · ENTITY

MAGMA Graph Types

Four layers. One mind.

LAYER_01

Semantic

Similarity between memories. Finds related concepts across your full context history.

LAYER_02

Causal

Cause → Effect relationships. Understands why things happened, not just what.

LAYER_03

Temporal

Before → After sequences. Tracks how knowledge evolves and decays over time.

LAYER_04

Entity

Named entity co-occurrence. Connects people, projects, and events automatically.
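
One way to picture the four layers is as a single graph whose edges carry a layer label, walked a fixed number of hops from a starting node. The class below is a hypothetical in-memory sketch for illustration only. VEKTOR persists its graph in SQLite, and none of these names come from its API.

```javascript
// Hypothetical in-memory model of a multi-layer memory graph.
const LAYERS = ["semantic", "causal", "temporal", "entity"];

class MemoryGraph {
  constructor() {
    this.adjacency = new Map(); // node -> [{ to, layer }]
  }

  addEdge(from, to, layer) {
    if (!LAYERS.includes(layer)) throw new Error(`unknown layer: ${layer}`);
    if (!this.adjacency.has(from)) this.adjacency.set(from, []);
    if (!this.adjacency.has(to)) this.adjacency.set(to, []);
    this.adjacency.get(from).push({ to, layer });
  }

  // Breadth-first walk that stops after `hops` edges from the start node.
  neighbours(start, { hops = 1, layers = LAYERS } = {}) {
    const seen = new Set([start]);
    let frontier = [start];
    for (let depth = 0; depth < hops; depth++) {
      const next = [];
      for (const node of frontier) {
        for (const edge of this.adjacency.get(node) ?? []) {
          if (layers.includes(edge.layer) && !seen.has(edge.to)) {
            seen.add(edge.to);
            next.push(edge.to);
          }
        }
      }
      frontier = next;
    }
    seen.delete(start);
    return [...seen];
  }
}

const demoGraph = new MemoryGraph();
demoGraph.addEdge("TypeScript", "strict mode", "semantic");
demoGraph.addEdge("TypeScript", "build failure", "causal");
demoGraph.addEdge("build failure", "rollback", "temporal");
demoGraph.addEdge("rollback", "Sarah", "entity");

const twoHops = demoGraph.neighbours("TypeScript", { hops: 2 });
// ["strict mode", "build failure", "rollback"]
const causalOnly = demoGraph.neighbours("TypeScript", { hops: 2, layers: ["causal"] });
// ["build failure"]
```

Restricting the walk to one layer answers a narrower question ("what did this cause?") than walking all four.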

The Core Difference

Two paradigms. One winner.

Most vector stores are passive. They store what you put in and return what you ask for. VEKTOR is an active memory layer — it evolves, curates, and reasons about what your agent should remember.

PASSIVE STORE

The File Cabinet

Standard RAG vector stores

Stores vectors. Returns nearest neighbors. That's it.
No understanding of relationships between memories
Grows forever — no curation, no decay, no prioritization
Requires you to engineer retrieval logic from scratch
Cloud dependency, monthly billing, data leaves your server
Retrieves the past. Cannot reason about the present.

MENTAL MODEL A drawer full of notes. You ask, it searches. Nothing more.

VS

ACTIVE MEMORY LAYER

The State Machine

VEKTOR Memory

MAGMA graph maps relationships: semantic, causal, temporal, entity
Memories evolve — importance scores decay, conflicts resolve
Auto-curates: duplicate collapse, contradiction detection, pruning
Retrieval is intelligent: returns what's relevant now, not just similar
Local-first SQLite. $9/month. Your data, your server.
Knows what the agent learned, forgot, and should prioritize next.

MENTAL MODEL A mind that thinks about what it knows — and gets smarter over time.
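
The claim that importance scores decay can be illustrated with an exponential half-life. Both the formula and the numbers below (30-day half-life, 0.1 pruning threshold) are assumptions chosen for the example, not VEKTOR's documented scoring.

```javascript
// Exponential decay with a half-life. Formula and constants are illustrative.
const DAY_MS = 24 * 60 * 60 * 1000;

function decayedImportance(baseScore, lastAccessedMs, nowMs, halfLifeDays = 30) {
  const ageDays = Math.max(0, nowMs - lastAccessedMs) / DAY_MS;
  return baseScore * Math.pow(0.5, ageDays / halfLifeDays);
}

// Keep only memories whose decayed score is still at or above the threshold.
function pruneStale(items, nowMs, threshold = 0.1) {
  return items.filter(
    (m) => decayedImportance(m.score, m.lastAccessedMs, nowMs) >= threshold
  );
}

const referenceNow = Date.UTC(2025, 0, 31);
const storedNotes = [
  { text: "prefers TypeScript", score: 0.9, lastAccessedMs: referenceNow - 1 * DAY_MS },
  { text: "old staging URL", score: 0.3, lastAccessedMs: referenceNow - 90 * DAY_MS },
];
const keptNotes = pruneStale(storedNotes, referenceNow);
// The 90-day-old note decays to 0.3 * 0.5^3 = 0.0375 and is dropped.
```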

Skeptical devs ask: "Why not just use a vector store with a wrapper?" Because a vector store wrapper gives your agent a search bar, not a memory. VEKTOR installs once, runs locally, and uses the LLM provider you already pay for — no cloud, no per-call fees.

THE CORE

Core Systems

Built different. By design.

MAGMA · Live Retrieval

Memory recalls in real time

Spec-decoding retrieval — bi-encoder shortlist re-ranked by cross-encoder. Two-stage precision. Ranked, scored, graph-aware.

SCORE	MEMORY	AGE
0.97	user prefers TypeScript over JavaScript	2m ago
0.91	meeting with Sarah, Friday 3pm	14m ago
0.88	project: data pipeline · Python	1h ago
0.74	active: 4,100 edges: 22,496	3h ago
0.61	dreams: 11, REM last run 04:12	1d ago
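
The two-stage retrieval described above (a bi-encoder shortlist re-ranked by a cross-encoder) follows a general pattern that can be sketched without any model. In this sketch the vectors are toy values and `overlapScore` is only a stand-in for a cross-encoder. All function names are hypothetical.

```javascript
function cosine(a, b) {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Stage 1: cheap vector similarity over precomputed embeddings (bi-encoder style).
// Stage 2: a slower pairwise scorer re-ranks only the shortlist (cross-encoder style).
function twoStageRecall(queryText, queryVec, corpus, rerank, { shortlist = 3, topK = 2 } = {}) {
  const candidates = corpus
    .map((m) => ({ ...m, coarse: cosine(queryVec, m.vector) }))
    .sort((a, b) => b.coarse - a.coarse)
    .slice(0, shortlist);

  return candidates
    .map((m) => ({ text: m.text, score: rerank(queryText, m.text) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, topK);
}

// Stand-in for a cross-encoder: fraction of query words found in the memory text.
function overlapScore(query, text) {
  const words = query.toLowerCase().split(/\s+/);
  const haystack = text.toLowerCase();
  return words.filter((w) => haystack.includes(w)).length / words.length;
}

const corpus = [
  { text: "user prefers TypeScript over JavaScript", vector: [0.9, 0.1, 0.0] },
  { text: "meeting with Sarah on Friday", vector: [0.0, 0.2, 0.9] },
  { text: "project data pipeline uses Python", vector: [0.7, 0.6, 0.1] },
  { text: "TypeScript strict mode enabled", vector: [0.8, 0.3, 0.1] },
];
const results = twoStageRecall("prefers typescript", [1, 0.2, 0], corpus, overlapScore);
```

The expensive scorer only ever sees the shortlist, which is why the pattern scales to large memory stores.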

REM Compression

Gets smarter while idle

7-phase dream cycle. 50 raw fragments → 1 core insight. 98% noise removed. Signal preserved.

Before REM — 50 raw fragments

After REM — core signal retained

50:1

COMPRESSION

RATIO

SELFORG · Zettelkasten Engine

Graph that wires itself

On every remember() call, a background agent extracts keywords, finds related memories, classifies the edge type — SUPPORTS, EXTENDS, CONTRASTS, PREREQUISITE — and writes a Zettelkasten context note linking it to everything connected. Async. Never blocks your agent.

SUPPORTS · EXTENDS · CONTRASTS · PREREQUISITE

MEMORY_GRAPH // LIVE

AUDN · Autonomous curation

Memory that edits itself

Every new input is evaluated: ADD new info, UPDATE existing entries, DELETE contradictions and stale facts, or NO_OP if already known. Zero drift. Zero bloat. Graph stays clean automatically.

Counters: added · updated · deleted · no-op · ops processed

graph accuracy: 99.1%

drift rate: 0.00% (memory deviation/cycle)

Also tracked: bloat pruned (stale data removed) · token cost saved (vs naive full-context)

zero drift · zero bloat

THE ECOSYSTEM

Integrations

Works with every stack.

LangChain

Drop-in memory layer for LangChain agents.
recall() returns context, remember() stores.
v1 + v2 adapters included.

OpenAI Agents SDK

Persistent memory for OpenAI agent loops.
Recalled context injected into system prompt.
GPT-4o and o-series models supported.
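
Injecting recalled context into a system prompt might look like the sketch below. It assumes recall() yields an array of { text, score } items. The actual return shape of memory.recall() is not documented on this page, so treat the helper and its thresholds as hypothetical.

```javascript
// Assumed shape of recalled items: { text, score }. Adapt to the real API.
function buildSystemPrompt(basePrompt, recalled, { maxItems = 5, minScore = 0.5 } = {}) {
  const lines = recalled
    .filter((m) => m.score >= minScore)
    .sort((a, b) => b.score - a.score)
    .slice(0, maxItems)
    .map((m) => `- ${m.text}`);

  if (lines.length === 0) return basePrompt;
  return `${basePrompt}\n\nKnown context from earlier sessions:\n${lines.join("\n")}`;
}

// In an agent loop this would come from: await memory.recall(userMessage)
const recalled = [
  { text: "user prefers TypeScript over JavaScript", score: 0.97 },
  { text: "meeting with Sarah, Friday 3pm", score: 0.91 },
  { text: "unrelated stale note", score: 0.2 },
];
const systemPrompt = buildSystemPrompt("You are a coding assistant.", recalled);
const messages = [
  { role: "system", content: systemPrompt },
  { role: "user", content: "Set up a new project for me." },
];
```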

Claude MCP Server

Full MCP module — vektor_recall, vektor_store,
vektor_graph, vektor_delta tools.
Connect Claude Desktop in minutes.

Gemini / Groq / Ollama / OpenRouter

Provider-agnostic single config switch.
Key pooling for Gemini — up to 9 API keys,
waterfall rotation, zero rate-limit downtime.
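
Waterfall rotation across a pool of keys can be sketched as "always use the earliest key that is not cooling down". The class name, method names, and cooldown length below are assumptions for illustration, not VEKTOR's configuration.

```javascript
// Waterfall rotation: prefer the earliest key in the list that is not
// cooling down. The cooldown length is an assumed value.
class KeyPool {
  constructor(keys, cooldownMs = 60000) {
    if (keys.length === 0) throw new Error("at least one key is required");
    this.keys = keys;
    this.cooldownMs = cooldownMs;
    this.blockedUntil = new Map(); // key -> timestamp when it may be used again
  }

  // Returns the first usable key, or null when every key is rate-limited.
  acquire(nowMs = Date.now()) {
    for (const key of this.keys) {
      if ((this.blockedUntil.get(key) ?? 0) <= nowMs) return key;
    }
    return null;
  }

  // Call this when the provider answers with a rate-limit error (e.g. HTTP 429).
  reportRateLimit(key, nowMs = Date.now()) {
    this.blockedUntil.set(key, nowMs + this.cooldownMs);
  }
}

const pool = new KeyPool(["key-a", "key-b", "key-c"], 1000);
const firstKey = pool.acquire(0);        // "key-a"
pool.reportRateLimit(firstKey, 0);
const secondKey = pool.acquire(10);      // falls through to "key-b"
const recoveredKey = pool.acquire(1000); // "key-a" again after the cooldown
```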

Mistral MCP

vektor_memoire HTTP tool for Le Chat
and Mistral API agents. Local bridge on
localhost:3847. French-first sovereign memory.

CLOAK

53-tool MCP layer for Claude Desktop.
Stealth browser, credential vault, CAPTCHA solving,
behaviour injection. Zero cloud. One install.

Integration

NEW · v1.6.3

Causal Inference Engine

Your agent now knows why memories are connected — not just what. Four-phase causal engine: G-Formula, MSM/IPW, IV Bounds, and Root Cause Analysis. Traces agent failures backwards through the causal chain, scores root causes by impact, and predicts the fix. No other memory layer does this.

G-Formula · MSM/IPW · IV Bounds · RCA

DeepFlow v2 — Deterministic Research

Deep research that never goes off-script. 8-step deterministic pipeline replaces the old unbounded loop: DECOMPOSE → VAULT-FIRST → SWEEP → LOCI → COMMIT → ADVERSARIAL → SYNTHESISE → CRITIC+PATCH. Every run is auditable, reproducible, and hallucination-resistant.

deep:true · adversarial_search · loci_rank · patch

JOT — Write With Your Memory

Two-pass whitepaper generation via Groq LLaMA with APA7 citation infrastructure. Your notes surface relevant memories as you write. Ghost-text autocomplete, briefing scheduler, post-generation citation scanner. Long-form thinking that lives alongside your agent — all local, all yours.

Notes RAG · Two-pass · APA7 · Briefing

UPDATED · v1.6.3

JOT — Notes & Writing

Integrated notes layer with TAG pill, notes RAG, and two-pass article generation via Groq LLaMA. APA7 citation infrastructure, post-generation citation scanner, ghost-text autocomplete, and briefing scheduler. Notes live alongside memories in local SQLite and never leave your machine.

Notes · RAG · Synthesis · Citations · Briefing

Install

Drop into any Node.js agent in minutes.

QUICKSTART · javascript

// 1. Install
// npm install vektor-slipstream

import { createMemory } from 'vektor-slipstream';

// 2. Initialise
const memory = await createMemory({
  provider: 'gemini',
  apiKey:   process.env.GEMINI_API_KEY,
  agentId:  'my-agent',
  dbPath:   './my-agent.db',
});

// 3. Remember — AUDN decides ADD/UPDATE/DELETE
await memory.remember("User prefers TypeScript");

// 4. Recall
const ctx = await memory.recall("coding preferences");

// 5. Traverse the graph
const g = await memory.graph("TypeScript", { hops: 2 });

// 6. What changed in 7 days?
const d = await memory.delta("architecture", 7);

01

No external services

Pure SQLite. No cloud dependency, no API keys for memory. Your memory graph never leaves your server. LLM providers process queries per their own privacy policies.

02

Model agnostic

Claude, Gemini, Groq, Mistral, OpenAI, Ollama, OpenRouter. Switch provider with one config change. Key pooling for Gemini — waterfall rotation across up to 9 keys.

03

AUDN keeps it clean

Automatic curation loop prevents contradictions and duplicates. The graph stays consistent without any manual management.

04

REM Cycle

Background process compresses 50 fragments into 1 core insight. Runs while your agent is idle. Run via vektor rem from the CLI.

Built on Research

Implementation original. Concepts drawn from published research.

ARXIV // 2601.03236

MAGMA: A Multi-Graph-based Agentic Memory Architecture for AI Agents

The graph type system underpinning VEKTOR's four memory layers — semantic, causal, temporal, and entity.

GRAPH ARCHITECTURE

ARXIV // 2601.02163

EverMemOS: A Self-Organizing Memory Operating System for Structured Long-Horizon Reasoning

Self-organizing memory architecture for structured long-horizon reasoning. Informs VEKTOR's REM cycle approach to memory consolidation and lifecycle management across extended agent sessions.

MEMORY LIFECYCLE

ARXIV // 2504.19413

Mem0: Building Production-Ready AI Agents with Scalable Long-Term Memory

Scalable long-term memory via dynamic extraction, consolidation, and graph-based retrieval. Informs VEKTOR's AUDN curation loop and REM compression cycle. Mem0 benchmarks show 90%+ token cost reduction — consistent with VEKTOR's synthesis approach.

MEMORY COMPRESSION

LETTA.COM

Letta / MemGPT: LLM as Operating System

The memory-as-OS paradigm that inspired VEKTOR's approach to agent context management and recall.

AGENT OS

// THE REAL COST OF AI MEMORY

Two bills.
Or one price. Forever.

Cloud memory APIs charge twice: a subscription for the service, and an embedding API fee on every single store and recall operation. Those embedding calls add up fast — at production agent volume they often exceed the subscription itself. VEKTOR runs on your machine and routes through the LLM provider you already pay for. No second bill. No hidden meter.

// CLOUD MEMORY API

Bill 1 — Monthly subscription
Bill 2 — Embedding fee per operation
Bill 3 — Egress & storage at scale
Your data lives on their servers.

ONGOING COST → GROWS WITH USAGE

// VEKTOR — LOCAL-FIRST

$9/month, cancel any time.

Zero embedding fees: uses your provider

Zero egress: SQLite stays on your machine

Your graph. Your server. Your rules.

FLAT COST → NO USAGE-BASED FEES

// NOTE Embedding costs vary by provider and model. At modest agent volume — hundreds of daily memory operations — embedding API charges typically run $5–$40/month on top of any memory subscription. This estimate is illustrative; your actual cost depends on your provider, model, and call frequency. VEKTOR does not eliminate your LLM provider costs — it eliminates the memory subscription and the dedicated embedding overhead on top of it.

// OPEN SOURCE — APACHE 2.0

Vex — Vector Exchange

Cross-standard vector DB migration. Export, import, and migrate agent memory between any vector store using the open .vmig.jsonl interchange format. One file. Any store. No lock-in.

✓ Zero re-embedding: pure matrix projection, no API cost

✓ 12 connectors: Qdrant, Pinecone, Redis, Milvus, Neo4j + more

✓ Portable .vmig.jsonl format: vendor-neutral, inspectable

✓ Apache 2.0 licensed: use in commercial projects free, forever

12

connectors

$0

API cost

Apache 2.0

licence

npx vex migrate --from vektor --to qdrant

// CONNECTORS

STORE	EXPORT	IMPORT
vektor	✓	✓
jsonl	✓	✓
pinecone	✓	✓
qdrant	✓	✓
chroma	✓	✓
weaviate	✓	✓
pgvector	✓	✓
redis	✓	✓
milvus	✓	✓
neo4j	✓	✓
claude-export	✓	—
chatgpt-export	✓	—

Apache 2.0 · Node.js ≥18 · zero dependencies
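
A JSON Lines interchange file such as .vmig.jsonl holds one JSON record per line, which is what makes it inspectable with ordinary text tools. The round trip below is a sketch only: the field names (id, text, vector, model) are hypothetical, and the actual schema should be taken from the Vex repository.

```javascript
// Serialises records to JSON Lines: one JSON object per line.
function toJsonl(records) {
  return records.map((r) => JSON.stringify(r)).join("\n") + "\n";
}

// Parses JSON Lines back into objects, skipping blank lines.
function fromJsonl(text) {
  return text
    .split("\n")
    .filter((line) => line.trim() !== "")
    .map((line, i) => {
      try {
        return JSON.parse(line);
      } catch (err) {
        throw new Error(`invalid JSON in record ${i + 1}: ${err.message}`);
      }
    });
}

// Hypothetical record shape, for illustration only.
const exported = [
  { id: "m1", text: "user prefers TypeScript", vector: [0.12, 0.98], model: "bge-small" },
  { id: "m2", text: "meeting with Sarah", vector: [0.77, 0.05], model: "bge-small" },
];
const fileBody = toJsonl(exported);
const imported = fromJsonl(fileBody);
```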

// MIGRATION IN PROGRESS

vektor → .vmig.jsonl → qdrant

// NEW — PHASE 4

@vektormemory/vex-adapter

Translate vectors between embedding model spaces using pre-trained linear projection weights — no API calls, no re-embedding, pure matrix multiply. Switch models without losing your memory.

bge-small → text-embedding-3-small
bge-small → text-embedding-3-large
bge-base → text-embedding-3-small
e5-large → text-embedding-3-large
+ 3 more bundled pairs
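
Translating a vector with linear projection weights amounts to one matrix-vector multiply. The sketch below uses a toy 3 → 2 weight matrix. The function name and the L2 normalisation step are assumptions. Real weights would be learned from paired embeddings and have one row per output dimension.

```javascript
// y = W x, followed by L2 normalisation. W has one row per output dimension.
function project(vector, weights) {
  const out = weights.map((row) => {
    if (row.length !== vector.length) throw new Error("dimension mismatch");
    return row.reduce((sum, w, i) => sum + w * vector[i], 0);
  });
  const norm = Math.sqrt(out.reduce((sum, v) => sum + v * v, 0));
  return norm === 0 ? out : out.map((v) => v / norm);
}

// Toy 3 -> 2 projection for illustration.
const W = [
  [1, 0, 1],
  [0, 2, 0],
];
const projected = project([3, 2, 0], W);
// W x = [3, 4], which normalises to [0.6, 0.8]
```

No embedding API is called at any point, which is where the "no re-embedding" property comes from.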

npm install -g @vektormemory/vex-adapter

// OPEN SOURCE — APACHE 2.0

Vek-Sync — MCP Config Sync

Keep your MCP server configurations in sync across every AI editor you use. One source of truth for all your mcp.json configs. Edit once, sync everywhere. No drift, no duplication.

✓ 11 editors supported: Claude, Cursor, VS Code, Windsurf + more

✓ AES-256-GCM Passport Vault: credentials encrypted at rest

✓ Single mcp.json source of truth: edit once, propagate everywhere

✓ Apache 2.0: free forever, zero cloud dependency

11

editors

1

source file

$0

forever

npm install -g @vektormemory/vek-sync

// UNIQUE FEATURE

AES-256-GCM Passport Vault

Your MCP credentials — API keys, tokens, secrets — are encrypted at rest using AES-256-GCM with OS-bound key derivation. No plaintext config files. No secrets in git. Credentials travel with the sync, not around it.

AES-256-GCM · OS-BOUND KEYS · ZERO PLAINTEXT
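
Encrypting a credential at rest with AES-256-GCM can be sketched with Node's built-in crypto module. This is not Vek-Sync's implementation. Mixing the hostname into the scrypt input is only a stand-in for "OS-bound" key derivation, and the payload layout (salt.iv.tag.ciphertext in base64) is an assumption. The example uses CommonJS require.

```javascript
const crypto = require("node:crypto");
const os = require("node:os");

// Derives a 256-bit key with scrypt. The hostname is mixed in as a simple
// stand-in for OS-bound derivation. A real vault might use the platform keychain.
function deriveKey(passphrase, salt) {
  return crypto.scryptSync(`${passphrase}:${os.hostname()}`, salt, 32);
}

function encryptSecret(plaintext, passphrase) {
  const salt = crypto.randomBytes(16);
  const iv = crypto.randomBytes(12); // 96-bit nonce, the recommended size for GCM
  const cipher = crypto.createCipheriv("aes-256-gcm", deriveKey(passphrase, salt), iv);
  const ciphertext = Buffer.concat([cipher.update(plaintext, "utf8"), cipher.final()]);
  const tag = cipher.getAuthTag();
  // Store everything needed for decryption except the passphrase.
  return [salt, iv, tag, ciphertext].map((b) => b.toString("base64")).join(".");
}

function decryptSecret(payload, passphrase) {
  const [salt, iv, tag, ciphertext] = payload
    .split(".")
    .map((part) => Buffer.from(part, "base64"));
  const decipher = crypto.createDecipheriv("aes-256-gcm", deriveKey(passphrase, salt), iv);
  decipher.setAuthTag(tag);
  // final() throws if the ciphertext or tag was tampered with.
  return Buffer.concat([decipher.update(ciphertext), decipher.final()]).toString("utf8");
}

const sealed = encryptSecret("sk-example-token", "local passphrase");
const opened = decryptSecret(sealed, "local passphrase");
```

Because GCM is authenticated, a wrong passphrase or a modified payload fails loudly at decryption instead of returning garbage.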

// CONNECTORS

EDITOR	CONFIG TARGET	SYNC
Claude Desktop	Claude Desktop app	✓
Cursor	Cursor editor	✓
VS Code	.vscode/mcp.json	✓
Windsurf	Windsurf by Codeium	✓
Claude Code	Claude Code CLI	✓
Cline	saoudrizwan.claude-dev	✓
Roo Code	rooveterinaryinc.roo-cline	✓
Gemini	Gemini CLI	✓
Copilot	GitHub Copilot CLI	✓
Continue	continue.continue	✓
Codex	Codex CLI — TOML	✓

Apache 2.0 · Node.js ≥18 · zero dependencies

// SYNC IN ACTION

SOURCE → SYNCING → 11 EDITORS

// BLOG ARTICLE

MCP Sync: One Config File to Rule Them All

How Vek-Sync eliminates config drift across every AI editor on your machine.

// OPEN SOURCE — APACHE 2.0

Via — Universal AI Integration

Route context, tasks, and memory across every AI tool you use. Connect Claude, Cursor, Windsurf, ChatGPT, and LangChain to a shared bus — so your work follows you across every tool, every session, every machine.

✓ Codebase graph indexing: instant project context for any agent

✓ Shared context bus: Claude, Cursor, Windsurf, ChatGPT in sync

✓ MCP server with 8 tools: file conversion, watch, scaffold + more

✓ Apache 2.0: free forever, zero cloud dependency

5+

AI tools

8

MCP tools

$0

forever

npm install -g @vektormemory/via

// UNIQUE FEATURE

Codebase Graph Indexing

Via scans your project and builds a token-aware file anatomy index. Every connected agent gets instant context — no manual briefing, no re-explaining. Drop it into any project and every tool knows where everything is.

FILE WATCHER · GRAPH INDEX · ZERO SETUP

// TOOLS & INTEGRATIONS

TOOL	DESCRIPTION
Claude	Shared context + memory bus
Cursor	Codebase graph + task routing
Windsurf	Session context sync
ChatGPT	Cross-tool memory handoff
LangChain	Agent context injection
File watcher	Auto-index on change
Scaffold	Project structure templates
File convert	Format conversion MCP tool

Apache 2.0 · Node.js ≥18 · zero dependencies

// CONTEXT ROUTING

PROJECT → ROUTING → ALL TOOLS

// FULL PRODUCT · EVERYTHING INCLUDED

One price.
Own it forever.

No cloud. No embedding bill. No data handshake.
VEKTOR runs on your machine, under your control, permanently.

Zero-knowledge architecture · Self-organising MAGMA graph · Spec-decoding retrieval · Sovereign identity & Cloak vault · Slipstream SDK (npm install) · $9/month, cancel any time

GET VEKTOR — $9/mo → FULL SPECS →

Your memory graph is a portable SQLite file — no lock-in, ever.

Vector Memory for AI Agents — Local-First MCP Server
