VEKTOR is a privacy-first organisation  ·  NO TRACKING  ·  NO COOKIES  ·  NO PAYWALLS  ·  ONLY CLOUDFLARE'S STANDARD PROXIES
Persistent vector memory for agentic systems

Vector Memory for AI Agents — Local-First MCP Server

// why vektor

Local-first vector memory with a self-organising 4-layer graph. Spec-decoding retrieval. Zettelkasten-linked edges.

Own your data: zero egress, zero API dependency
8ms recall vs 340ms cloud — 42× faster in production
Flat $9/month, no per-operation embedding bill
Remembers why things connect, not just what they are
8ms avg recall · <50ms p95 latency · $0 embedding cost · 100% local, zero egress
Cloud memory vendors average 200–800ms recall · VEKTOR: 8ms
$9/month · Cancel any time. Built on peer-reviewed research
// memory recall · live
recall speed: 8ms avg · cloud: 340ms
accuracy: 97.3% recall precision
Free Install · Google Play
Benchmark · LongMemEval
Built to perform. Verified by benchmark.
Adjusted accuracy · LongMemEval benchmark · long-context memory recall
Avg. recall latency · 42× faster than cloud · local SQLite, zero network hop
Graph accuracy · drift rate near zero · MAGMA graph engine
MAGMA Graph · 4-layer memory architecture
Semantic · Causal · Temporal · Entity
BM25 + vector RRF dual-recall
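The "BM25 + vector RRF dual-recall" label most likely refers to reciprocal rank fusion, where a keyword ranking and a vector ranking are merged by summing 1 / (k + rank) per item. The sketch below shows that general technique only. The function name, the memory ids, and k = 60 (a common default) are assumptions, not VEKTOR internals.

```javascript
// Reciprocal rank fusion over any number of ranked id lists.
// k = 60 is a widely used default, not a documented VEKTOR setting.
function reciprocalRankFusion(rankedLists, k = 60) {
  const scores = new Map();
  for (const list of rankedLists) {
    list.forEach((id, index) => {
      const rank = index + 1; // ranks are 1-based
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank));
    });
  }
  // Highest fused score first.
  return [...scores.entries()]
    .sort((a, b) => b[1] - a[1])
    .map(([id, score]) => ({ id, score }));
}

// Hypothetical rankings from a keyword (BM25) pass and a vector pass.
const bm25Ranking = ["m3", "m1", "m7"];
const vectorRanking = ["m1", "m4", "m3"];
const fused = reciprocalRankFusion([bm25Ranking, vectorRanking]);
console.log(fused.map((r) => r.id)); // items found by both passes rise to the top
```

Items that appear in both rankings accumulate two contributions, which is why fusion tends to favour memories that are both lexically and semantically close to the query.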
Causal inference tests · 31 tests, all passing · G-formula · MSM · IV · RCA
Judge: GPT-4o-mini · Build: Slipstream v1.6.3 · Metric: Adjusted accuracy
// LOCAL-FIRST
Zero cloud. Zero data leakage.
// PEER-REVIEWED
Built on published memory research
// COMMERCIAL
Production licence. Email support.
// OPEN SOURCE
Vex & Vek-Sync on GitHub
// INTEGRATIONS
LangChain · Claude · OpenAI · Mistral
THE PROBLEM
Standard RAG is amnesia with extra steps.
WITHOUT VEKTOR // SESSION AMNESIA
SESSION_001
SESSION_002
✗ MEMORY WIPED — context lost
SESSION_003
✗ MEMORY WIPED — starting over again
SESSION_N
✗ Agent has no idea who you are
WITH VEKTOR // ASSOCIATIVE GRAPH
SESSION_001
→ STORED · AUDN: ADD
SESSION_002
→ GRAPH UPDATED +3 NODES +7 EDGES
SESSION_003
→ GRAPH: 247 NODES · 7,180 EDGES
SESSION_N
→ COMPLETE ASSOCIATIVE MEMORY INTACT
// 01 · recall speed
8ms average recall latency
Instant memory retrieval
Local SQLite lookup — no API roundtrip, no cloud latency. Your agent gets context in 8ms avg, under 50ms p95.
live · cloud vendors avg 200–800ms
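To check figures like "8ms avg, under 50ms p95" against your own workload, one option is to time recall calls and summarise the samples. This is a minimal sketch: the helper name is hypothetical, the sample values are invented for illustration, and the percentile uses the nearest-rank method.

```javascript
// Summarise latency samples (milliseconds) as mean and p95.
function summariseLatencies(samplesMs) {
  if (samplesMs.length === 0) throw new Error("no samples");
  const sorted = [...samplesMs].sort((a, b) => a - b);
  const mean = sorted.reduce((sum, v) => sum + v, 0) / sorted.length;
  const rank = Math.ceil(0.95 * sorted.length); // 1-based nearest rank
  return { mean, p95: sorted[rank - 1] };
}

// Illustrative samples only; these are not measurements of VEKTOR.
const samples = [6, 7, 7, 8, 8, 8, 9, 9, 10, 48];
const summary = summariseLatencies(samples);
console.log(summary);

// Speed-up implied by the page's own figures: 340 ms cloud vs 8 ms local.
const speedup = 340 / 8;
console.log(speedup); // 42.5, quoted on the page as 42×
```

A single slow outlier moves the mean noticeably, which is why the page quotes average and p95 separately.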
// 02 · graph growth
247 nodes · 7,180 edges · growing
Self-organising MAGMA graph
Semantic · Causal · Temporal · Entity. Every remember() call wires new edges. The graph builds itself while your agent works.
live · AUDN curation · zero duplicates
// 03 · rem compression
50:1 fragment compression ratio
Gets smarter while idle
7-phase REM dream cycle runs while your agent sleeps. 50 raw fragments → 1 core insight. 98% noise removed. Signal preserved.
async · never blocks your agent
THE ARCHITECTURE
Architecture
Raw input → AUDN curation → persistent graph.
INPUT_LAYER

Raw Input

Conversation turns, tool outputs, observations. Any unstructured agent context fed in as text.

CONVERSATION · TOOL_OUTPUT · OBSERVATION
AUDN_LAYER

AUDN Curation

Every memory is evaluated: ADD new info, UPDATE existing, DELETE contradictions, or NO_OP if already known. Zero duplicates.

ADD · UPDATE · DELETE · NO_OP
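The four AUDN outcomes can be pictured as a small decision function. The sketch below is a toy model, not VEKTOR's curation logic. It assumes facts are keyed by subject and attribute, and the `retracted` flag and both function names are hypothetical.

```javascript
// Toy AUDN decision over a Map keyed by "subject:attribute".
// Fact shape and rules are assumptions made for illustration.
function decideOperation(store, incoming) {
  const key = `${incoming.subject}:${incoming.attribute}`;
  const existing = store.get(key);
  if (incoming.retracted) return existing ? "DELETE" : "NO_OP";
  if (!existing) return "ADD";
  return existing.value === incoming.value ? "NO_OP" : "UPDATE";
}

// Applies the decision so the store never holds two values for one key.
function applyFact(store, incoming) {
  const op = decideOperation(store, incoming);
  const key = `${incoming.subject}:${incoming.attribute}`;
  if (op === "ADD" || op === "UPDATE") store.set(key, { value: incoming.value });
  if (op === "DELETE") store.delete(key);
  return op;
}

const store = new Map();
const ops = [
  applyFact(store, { subject: "user", attribute: "language", value: "JavaScript" }),
  applyFact(store, { subject: "user", attribute: "language", value: "JavaScript" }),
  applyFact(store, { subject: "user", attribute: "language", value: "TypeScript" }),
  applyFact(store, { subject: "user", attribute: "language", retracted: true }),
];
console.log(ops);
```

In a real system the comparison step would presumably be done by the LLM rather than by strict equality, but the shape of the loop (look up, decide, apply) stays the same.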
MAGMA_LAYER

MAGMA Graph

Persisted across 4 graph types in SQLite. Survives all session resets. REM cycle compresses while idle.

SEMANTIC · CAUSAL · TEMPORAL · ENTITY
MAGMA Graph Types
Four layers. One mind.
LAYER_01

Semantic

Similarity between memories. Finds related concepts across your full context history.

LAYER_02

Causal

Cause → Effect relationships. Understands why things happened, not just what.

LAYER_03

Temporal

Before → After sequences. Tracks how knowledge evolves and decays over time.

LAYER_04

Entity

Named entity co-occurrence. Connects people, projects, and events automatically.

The Core Difference
Two paradigms. One winner.

Most vector stores are passive. They store what you put in and return what you ask for. VEKTOR is an active memory layer — it evolves, curates, and reasons about what your agent should remember.

PASSIVE STORE

The File Cabinet

Standard RAG vector stores

  • Stores vectors. Returns nearest neighbors. That's it.
  • No understanding of relationships between memories
  • Grows forever — no curation, no decay, no prioritization
  • Requires you to engineer retrieval logic from scratch
  • Cloud dependency, monthly billing, data leaves your server
  • Retrieves the past. Cannot reason about the present.
MENTAL MODEL: A drawer full of notes. You ask, it searches. Nothing more.
VS
ACTIVE MEMORY LAYER

The State Machine

VEKTOR Memory

  • MAGMA graph maps relationships: semantic, causal, temporal, entity
  • Memories evolve — importance scores decay, conflicts resolve
  • Auto-curates: duplicate collapse, contradiction detection, pruning
  • Retrieval is intelligent: returns what's relevant now, not just similar
  • Local-first SQLite. $9/month. Your data, your server.
  • Knows what the agent learned, forgot, and should prioritize next.
MENTAL MODEL: A mind that thinks about what it knows, and gets smarter over time.
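One way to read "importance scores decay" and "relevant now, not just similar" is to weight similarity by a time-decayed importance. The sketch below uses exponential half-life decay as a stand-in. The 30-day half-life, the field names, and the multiplication rule are all assumptions, since the page does not state the actual formula.

```javascript
// Exponential decay of an importance score with a fixed half-life.
// halfLifeDays = 30 is a made-up parameter, not a VEKTOR default.
function decayedImportance(initialScore, ageDays, halfLifeDays = 30) {
  return initialScore * Math.pow(0.5, ageDays / halfLifeDays);
}

// Rank memories by similarity weighted by decayed importance.
function rankByRelevance(memories, nowDay) {
  return memories
    .map((m) => ({
      ...m,
      relevance: m.similarity * decayedImportance(m.importance, nowDay - m.createdDay),
    }))
    .sort((a, b) => b.relevance - a.relevance);
}

// Hypothetical memories: an old close match and a fresh weaker match.
const ranked = rankByRelevance(
  [
    { id: "old-but-similar", similarity: 0.9, importance: 1, createdDay: 0 },
    { id: "fresh", similarity: 0.7, importance: 1, createdDay: 90 },
  ],
  90
);
console.log(ranked.map((m) => m.id));
```

With these numbers the fresh memory outranks the older one despite lower raw similarity, which is the behaviour the comparison above describes.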
Skeptical devs ask: "Why not just use a vector store with a wrapper?" Because a vector store wrapper gives your agent a search bar, not a memory. VEKTOR installs once, runs locally, and uses the LLM provider you already pay for — no cloud, no per-call fees.
THE CORE
Core Systems
Built different. By design.
MAGMA · Live Retrieval
Memory recalls in real time
Spec-decoding retrieval — bi-encoder shortlist re-ranked by cross-encoder. Two-stage precision. Ranked, scored, graph-aware.
0.97  user prefers TypeScript over JavaScript  (2m ago)
0.91  meeting with Sarah, Friday 3pm  (14m ago)
0.88  project: data pipeline · Python  (1h ago)
0.74  active: 4,100 edges: 22,496  (3h ago)
0.61  dreams: 11, REM last run 04:12  (1d ago)
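The two-stage pattern described above (a cheap shortlist followed by a more precise re-rank) can be sketched without any model at all. Here cosine similarity over toy vectors plays the bi-encoder, and a word-overlap function stands in for the cross-encoder. Every name, vector, and parameter below is hypothetical.

```javascript
// Cosine similarity between two equal-length vectors.
function cosine(a, b) {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

// Stage 1: vector similarity picks a shortlist.
// Stage 2: a costlier pairwise scorer re-orders only the shortlist.
function twoStageRecall(queryVec, queryText, memories, pairScorer, shortlistSize = 3, topK = 2) {
  const shortlist = memories
    .map((m) => ({ ...m, coarse: cosine(queryVec, m.vector) }))
    .sort((a, b) => b.coarse - a.coarse)
    .slice(0, shortlistSize);
  return shortlist
    .map((m) => ({ ...m, fine: pairScorer(queryText, m.text) }))
    .sort((a, b) => b.fine - a.fine)
    .slice(0, topK);
}

// Stand-in for a cross-encoder: fraction of query words found in the text.
function wordOverlapScorer(query, text) {
  const words = query.toLowerCase().split(/\s+/);
  const haystack = new Set(text.toLowerCase().split(/\s+/));
  return words.filter((w) => haystack.has(w)).length / words.length;
}

const memories = [
  { text: "user prefers TypeScript over JavaScript", vector: [0.9, 0.1, 0.0] },
  { text: "project data pipeline uses Python", vector: [0.8, 0.3, 0.1] },
  { text: "meeting with Sarah Friday 3pm", vector: [0.0, 0.1, 0.9] },
  { text: "user dislikes tabs", vector: [0.7, 0.2, 0.2] },
];
const results = twoStageRecall([1, 0, 0], "user prefers TypeScript", memories, wordOverlapScorer);
console.log(results.map((r) => r.text));
```

The second stage only ever sees `shortlistSize` candidates, so its cost stays bounded no matter how large the memory store grows.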
REM Compression
Gets smarter while idle
7-phase dream cycle. 50 raw fragments → 1 core insight. 98% noise removed. Signal preserved.
Before REM — 50 raw fragments
After REM — core signal retained
50:1 COMPRESSION RATIO
SELFORG · Zettelkasten Engine
Graph that wires itself
On every remember() call, a background agent extracts keywords, finds related memories, classifies the edge type — SUPPORTS, EXTENDS, CONTRASTS, PREREQUISITE — and writes a Zettelkasten context note linking it to everything connected. Async. Never blocks your agent.
SUPPORTS EXTENDS CONTRASTS PREREQUISITE
MEMORY_GRAPH // LIVE
[Graph view: ROOT linked to SEMANTIC, CAUSAL, TEMPORAL, and ENTITY layers · MEM 001–084 · MEM 085–147 · MEM 148–215 · MEM 216–343 · SEMANTIC 340 · CAUSAL 190 · TEMPORAL 215 · ENTITY 127]
AUDN · Autonomous curation
Memory that edits itself
Every new input is evaluated: ADD new info, UPDATE existing facts, DELETE contradicted or stale facts, or NO_OP if already known. Zero drift. Zero bloat. Graph stays clean automatically.
graph accuracy: 99.1%
drift rate: 0.00% memory deviation/cycle
zero drift · zero bloat
THE ECOSYSTEM
Integrations
Works with every stack.

LangChain

Drop-in memory layer for LangChain agents.
recall() returns context, remember() stores.
v1 + v2 adapters included.

OpenAI Agents SDK

Persistent memory for OpenAI agent loops.
Recalled context injected into system prompt.
GPT-4o and o-series models supported.

Claude MCP Server

Full MCP module — vektor_recall, vektor_store,
vektor_graph, vektor_delta tools.
Connect Claude Desktop in minutes.

Gemini / Groq / Ollama / OpenRouter

Provider-agnostic single config switch.
Key pooling for Gemini — up to 9 API keys,
waterfall rotation, zero rate-limit downtime.

Mistral MCP

vektor_memoire HTTP tool for Le Chat
and Mistral API agents. Local bridge on
localhost:3847. French-first sovereign memory.

CLOAK

53-tool MCP layer for Claude Desktop.
Stealth browser, credential vault, CAPTCHA solving,
behaviour injection. Zero cloud. One install.
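The Gemini key pooling mentioned above (up to 9 keys, waterfall rotation) can be pictured as an ordered pool that falls through to the next key whenever one is rate-limited. The class, its method names, and the cooldown length are assumptions made for this sketch. VEKTOR's actual rotation logic is not documented on this page.

```javascript
// Waterfall key pool: keys are tried in order, skipping any that are cooling down.
class KeyPool {
  constructor(keys, cooldownMs = 60_000) {
    if (keys.length === 0 || keys.length > 9) throw new Error("expected 1 to 9 keys");
    this.keys = keys;
    this.cooldownMs = cooldownMs; // hypothetical cooldown after a rate limit
    this.blockedUntil = new Map();
  }

  // Returns the first key not cooling down, or null if all are rate-limited.
  next(now = Date.now()) {
    return this.keys.find((k) => (this.blockedUntil.get(k) ?? 0) <= now) ?? null;
  }

  // Call when the provider reports a rate limit for this key.
  markRateLimited(key, now = Date.now()) {
    this.blockedUntil.set(key, now + this.cooldownMs);
  }
}

// Timestamps are passed explicitly so the example is deterministic.
const pool = new KeyPool(["key-a", "key-b", "key-c"], 1000);
const first = pool.next(0);
pool.markRateLimited(first, 0);
const second = pool.next(10);
const recovered = pool.next(1000);
console.log(first, second, recovered);
```

Because the pool always restarts from the top, the first key is reused as soon as its cooldown ends, which keeps most traffic on one key and treats the rest as overflow.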

Integration
NEW · v1.6.3
Causal Inference Engine

Your agent now knows why memories are connected — not just what. Four-phase causal engine: G-Formula, MSM/IPW, IV Bounds, and Root Cause Analysis. Traces agent failures backwards through the causal chain, scores root causes by impact, and predicts the fix. No other memory layer does this.

G-Formula  ·  MSM/IPW  ·  IV Bounds  ·  RCA
DeepFlow v2 — Deterministic Research

Deep research that never goes off-script. 8-step deterministic pipeline replaces the old unbounded loop: DECOMPOSE → VAULT-FIRST → SWEEP → LOCI → COMMIT → ADVERSARIAL → SYNTHESISE → CRITIC+PATCH. Every run is auditable, reproducible, and hallucination-resistant.

deep:true  ·  adversarial_search  ·  loci_rank  ·  patch
JOT — Write With Your Memory

Two-pass whitepaper generation via Groq LLaMA with APA7 citation infrastructure. Your notes surface relevant memories as you write. Ghost-text autocomplete, briefing scheduler, post-generation citation scanner. Long-form thinking that lives alongside your agent — all local, all yours.

Notes RAG  ·  Two-pass  ·  APA7  ·  Briefing
Full changelog →
UPDATED · v1.6.3
JOT — Notes & Writing

Integrated notes layer with TAG pill, notes RAG, and two-pass article generation via Groq LLaMA. APA7 citation infrastructure, post-generation citation scanner, ghost-text autocomplete, and briefing scheduler. Notes live alongside memories in local SQLite and never leave your machine.

Notes  ·  RAG  ·  Synthesis  ·  Citations  ·  Briefing
View docs →
Install
Drop into any Node.js agent in minutes.
QUICKSTART · javascript
// 1. Install
// npm install vektor-slipstream

import { createMemory } from 'vektor-slipstream';

// 2. Initialise
const memory = await createMemory({
  provider: 'gemini',
  apiKey:   process.env.GEMINI_API_KEY,
  agentId:  'my-agent',
  dbPath:   './my-agent.db',
});

// 3. Remember — AUDN decides ADD/UPDATE/DELETE
await memory.remember("User prefers TypeScript");

// 4. Recall
const ctx = await memory.recall("coding preferences");

// 5. Traverse the graph
const g = await memory.graph("TypeScript", { hops: 2 });

// 6. What changed in 7 days?
const d = await memory.delta("architecture", 7);
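For intuition about what a call like `memory.graph("TypeScript", { hops: 2 })` might do, here is a hop-limited breadth-first traversal over typed edges. It is a sketch only: the edge shape `{ from, to, layer }`, the undirected treatment of edges, and the function name are assumptions, and the SDK's real return format is not shown on this page.

```javascript
// Hop-limited breadth-first traversal, optionally restricted to some layers.
function neighboursWithin(edges, start, hops, layers = null) {
  const adjacency = new Map();
  for (const { from, to, layer } of edges) {
    if (layers && !layers.includes(layer)) continue;
    // Edges are treated as undirected for traversal.
    for (const [a, b] of [[from, to], [to, from]]) {
      if (!adjacency.has(a)) adjacency.set(a, []);
      adjacency.get(a).push(b);
    }
  }
  const distance = new Map([[start, 0]]);
  const queue = [start];
  while (queue.length > 0) {
    const node = queue.shift();
    const d = distance.get(node);
    if (d === hops) continue; // do not expand beyond the hop limit
    for (const next of adjacency.get(node) ?? []) {
      if (!distance.has(next)) {
        distance.set(next, d + 1);
        queue.push(next);
      }
    }
  }
  distance.delete(start);
  return distance; // Map of node -> hop count
}

// Hypothetical edges across two of the four layers.
const edges = [
  { from: "TypeScript", to: "user", layer: "entity" },
  { from: "user", to: "data pipeline", layer: "entity" },
  { from: "data pipeline", to: "Python", layer: "semantic" },
  { from: "TypeScript", to: "JavaScript", layer: "semantic" },
];
const reachable = neighboursWithin(edges, "TypeScript", 2);
const semanticOnly = neighboursWithin(edges, "TypeScript", 2, ["semantic"]);
console.log([...reachable.keys()], [...semanticOnly.keys()]);
```

Raising `hops` widens the context an agent receives, at the cost of pulling in more loosely related memories.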
01

No external services

Pure SQLite. No cloud dependency, no API keys for memory. Your memory graph never leaves your server. LLM providers process queries per their own privacy policies.

02

Model agnostic

Claude, Gemini, Groq, Mistral, OpenAI, Ollama, OpenRouter. Switch provider with one config change. Key pooling for Gemini — waterfall rotation across up to 9 keys.

03

AUDN keeps it clean

Automatic curation loop prevents contradictions and duplicates. The graph stays consistent without any manual management.

04

REM Cycle

Background process compresses 50 fragments into 1 core insight. Runs while your agent is idle, or on demand via vektor rem from the CLI.

Built on Research
Implementation original. Concepts peer-reviewed.
READ FULL RESEARCH BREAKDOWN →
// THE REAL COST OF AI MEMORY

Two bills.
Or one price. Forever.

Cloud memory APIs charge twice: a subscription for the service, and an embedding API fee on every single store and recall operation. Those embedding calls add up fast — at production agent volume they often exceed the subscription itself. VEKTOR runs on your machine and routes through the LLM provider you already pay for. No second bill. No hidden meter.

// CLOUD MEMORY API
Bill 1 — Monthly subscription
Bill 2 — Embedding fee per operation
Bill 3 — Egress & storage at scale
Your data lives on their servers.
ONGOING COST → GROWS WITH USAGE
// VEKTOR — LOCAL-FIRST
$9/month — cancel any time.
Zero embedding fees — uses your provider
Zero egress — SQLite stays on your machine
Your graph. Your server. Your rules.
FLAT COST → NO HIDDEN METER
// NOTE Embedding costs vary by provider and model. At modest agent volume — hundreds of daily memory operations — embedding API charges typically run $5–$40/month on top of any memory subscription. This estimate is illustrative; your actual cost depends on your provider, model, and call frequency. VEKTOR does not eliminate your LLM provider costs — it eliminates the memory subscription and the dedicated embedding overhead on top of it.
// OPEN SOURCE — APACHE 2.0

Vex — Vector Exchange

Cross-standard vector DB migration. Export, import, and migrate agent memory between any vector store using the open .vmig.jsonl interchange format. One file. Any store. No lock-in.

Zero re-embedding — pure matrix projection, no API cost
12 connectors: Qdrant, Pinecone, Redis, Milvus, Neo4j + more
Portable .vmig.jsonl format — vendor-neutral, inspectable
Apache 2.0 licensed — use in commercial projects free, forever
12 connectors · $0 API cost · Apache 2.0 licence
npx vex migrate --from vektor --to qdrant
// CONNECTORS
STORES: vektor · jsonl · pinecone · qdrant · chroma · weaviate · pgvector · redis · milvus · neo4j · claude-export · chatgpt-export
Apache 2.0 · Node.js ≥18 · zero dependencies
// MIGRATION IN PROGRESS
vektor → .vmig.jsonl → qdrant · 247 records
// NEW — PHASE 4

@vektormemory/vex-adapter

Translate vectors between embedding model spaces using pre-trained linear projection weights — no API calls, no re-embedding, pure matrix multiply. Switch models without losing your memory.

bge-small → text-embedding-3-small
bge-small → text-embedding-3-large
bge-base → text-embedding-3-small
e5-large → text-embedding-3-large
+ 3 more bundled pairs
npm install -g @vektormemory/vex-adapter
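The "pure matrix multiply" idea behind the adapter can be shown with a tiny linear projection: each target vector is a weight matrix applied to the source vector. The sketch below uses a made-up 3×2 matrix, whereas real embedding spaces have hundreds or thousands of dimensions. The function names are hypothetical, and the normalisation step is an assumption rather than something the page documents.

```javascript
// Project a vector into another space: target = W * source.
// W has one row per target dimension and one column per source dimension.
function projectVector(source, weights) {
  if (weights.some((row) => row.length !== source.length)) {
    throw new Error("shape mismatch between weights and source vector");
  }
  return weights.map((row) => row.reduce((sum, w, j) => sum + w * source[j], 0));
}

// Rescale to unit length, which many cosine-based stores expect.
function l2Normalise(v) {
  const norm = Math.sqrt(v.reduce((s, x) => s + x * x, 0));
  return v.map((x) => x / norm);
}

// Toy weights: 2-dimensional source, 3-dimensional target.
const W = [
  [1, 0],
  [0, 1],
  [1, 1],
];
const projected = projectVector([3, 4], W);
const unit = l2Normalise(projected);
console.log(projected, unit);
```

Since no embedding model is called, migrating a store this way costs only local compute. How closely the projected vectors match true re-embeddings depends on the pre-trained weights.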
// OPEN SOURCE — APACHE 2.0

Vek-Sync — MCP Config Sync

Keep your MCP server configurations in sync across every AI editor you use. One source of truth for all your mcp.json configs. Edit once, sync everywhere. No drift, no duplication.

11 editors supported — Claude, Cursor, VS Code, Windsurf + more
AES-256-GCM Passport Vault — credentials encrypted at rest
Single mcp.json source of truth — edit once, propagate everywhere
Apache 2.0 — free forever, zero cloud dependency
11 editors · 1 source file · $0 forever
npm install -g @vektormemory/vek-sync
// UNIQUE FEATURE
AES-256-GCM Passport Vault

Your MCP credentials — API keys, tokens, secrets — are encrypted at rest using AES-256-GCM with OS-bound key derivation. No plaintext config files. No secrets in git. Credentials travel with the sync, not around it.

AES-256-GCM · OS-BOUND KEYS · ZERO PLAINTEXT
// CONNECTORS
EDITOR · CONFIG PATH
Claude Desktop · Claude Desktop app
Cursor · Cursor editor
VS Code · .vscode/mcp.json
Windsurf · Windsurf by Codeium
Claude Code · Claude Code CLI
Cline · saoudrizwan.claude-dev
Roo Code · rooveterinaryinc.roo-cline
Gemini · Gemini CLI
Copilot · GitHub Copilot CLI
Continue · continue.continue
Codex · Codex CLI (TOML)
Apache 2.0 · Node.js ≥18 · zero dependencies
// SYNC IN ACTION
SOURCE → SYNCING → 11 EDITORS
// BLOG ARTICLE
MCP Sync: One Config File to Rule Them All

How Vek-Sync eliminates config drift across every AI editor on your machine.

Read Article →
// OPEN SOURCE — APACHE 2.0

Via — Universal AI Integration

Route context, tasks, and memory across every AI tool you use. Connect Claude, Cursor, Windsurf, ChatGPT, and LangChain to a shared bus — so your work follows you across every tool, every session, every machine.

Codebase graph indexing — instant project context for any agent
Shared context bus — Claude, Cursor, Windsurf, ChatGPT in sync
MCP server with 8 tools — file conversion, watch, scaffold + more
Apache 2.0 — free forever, zero cloud dependency
5+ AI tools · 8 MCP tools · $0 forever
npm install -g @vektormemory/via
// UNIQUE FEATURE
Codebase Graph Indexing

Via scans your project and builds a token-aware file anatomy index. Every connected agent gets instant context — no manual briefing, no re-explaining. Drop it into any project and every tool knows where everything is.

FILE WATCHER · GRAPH INDEX · ZERO SETUP
// TOOLS & INTEGRATIONS
TOOL · DESCRIPTION
Claude · Shared context + memory bus
Cursor · Codebase graph + task routing
Windsurf · Session context sync
ChatGPT · Cross-tool memory handoff
LangChain · Agent context injection
File watcher · Auto-index on change
Scaffold · Project structure templates
File convert · Format conversion MCP tool
Apache 2.0 · Node.js ≥18 · zero dependencies
// CONTEXT ROUTING
PROJECT → ROUTING → ALL TOOLS
// OPEN SOURCE
Universal AI Integration Layer

Route context and tasks across Claude, Cursor, Windsurf, ChatGPT, and LangChain from one shared bus.

View on GitHub →
// FULL PRODUCT — EVERYTHING INCLUDED

One price.
Own it forever.

No cloud. No embedding bill. No data handshake.
VEKTOR runs on your machine, under your control, permanently.

Zero-knowledge architecture · Self-organising MAGMA graph · Spec-decoding retrieval · Sovereign identity & Cloak vault · Slipstream SDK (npm install) · $9/month, cancel any time
GET VEKTOR ($9/mo) →
FULL SPECS →

Your memory graph is a portable SQLite file — no lock-in, ever.