✓ Live on Google Play
Documentation

How VEKTOR Notes works

VEKTOR Notes is a local-first AI note-taking app for Android. Your notes and memories are stored and searched on your device. AI features call the LLM provider you configure, directly from your device. No uploads to VEKTOR servers. No tracking.

The interface has two modes you swipe between, JOT for capture and CHAT for retrieval, plus a GRAPH view that shows how your memories connect. Under the hood, a four-layer memory graph (MAGMA) builds a persistent, searchable representation of everything you write.

Android · Free · No ads · Local-first · MAGMA graph

Getting started

First launch

Install from Google Play and open the app. You land directly on the JOT surface — a blank text area. No account required. No onboarding wizard. The app is immediately functional for capture and local storage without any configuration.

To enable AI features (ghost suggestions and CHAT responses), you need to add an API key from a supported LLM provider.

Configuration

Setting your API key

VEKTOR Notes uses your own provider keys. Your key is stored in SecureStore on-device — it never touches VEKTOR servers. API calls go directly from your device to the provider.

01
Open Settings
Tap the Settings icon in the bottom toolbar.
02
Choose a provider
Anthropic (Claude), OpenAI (GPT), Google (Gemini), or Groq (LLaMA). Each has different speed and cost trade-offs.
03
Paste your API key
Copy your key from the provider dashboard. The app validates and stores it locally in encrypted SecureStore.
04
Optionally route tasks to different models
The model picker lets you use a cheaper model for JOT ghost suggestions and a more capable model for CHAT synthesis.

Your API key is stored in Android's encrypted SecureStore. It is never sent to VEKTOR servers. You are not paying VEKTOR for LLM usage — costs go directly to your provider.

Supported providers

Provider                 Best for                               Cost guide
Anthropic Claude Haiku   CHAT: best memory synthesis quality    ~$0.003/1K tokens
OpenAI GPT-4o-mini       Balanced JOT + CHAT                    ~$0.0002/1K tokens
Groq LLaMA               Fastest JOT ghost suggestions          Free tier available
Gemini Flash             Long context CHAT                      Free tier available

Core modes

JOT — Zen capture

JOT is the primary writing surface. A clean text area with no menus, no formatting toolbar, no friction. You type. The app watches quietly. After 900 milliseconds of silence — long enough to fire only when you have genuinely paused, short enough that the suggestion arrives before you lose the thread — the ghost suggestion engine offers a completion.
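
The pause rule can be pictured as a small debounce. The sketch below is an illustration only, not the app's actual code: the class `PauseDetector` and its method names are hypothetical, and the only value taken from the text is the 900 ms threshold. Timestamps are passed in explicitly so the logic is easy to follow and test.

```python
from dataclasses import dataclass
from typing import Optional

PAUSE_MS = 900  # silence threshold stated in the docs


@dataclass
class PauseDetector:
    """Hypothetical helper: tracks keystrokes and reports when a suggestion is due."""

    last_keystroke_ms: Optional[int] = None
    fired: bool = False

    def on_keystroke(self, now_ms: int) -> None:
        # Any new keystroke restarts the quiet period and re-arms the trigger.
        self.last_keystroke_ms = now_ms
        self.fired = False

    def should_suggest(self, now_ms: int) -> bool:
        # Fire at most once per pause, and only after PAUSE_MS of silence.
        if self.last_keystroke_ms is None or self.fired:
            return False
        if now_ms - self.last_keystroke_ms >= PAUSE_MS:
            self.fired = True
            return True
        return False
```

Firing once per pause, rather than on every timer tick, is one way to keep a single pause from producing repeated LLM calls.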

Ghost suggestions

Ghost suggestions appear as faded text beneath your current content. Two actions only:

Accept
Tap to append the suggestion. Ghost text becomes real text.
Dismiss
Tap elsewhere or keep typing. The suggestion clears with no trace.

Suggestions are deliberately capped at 30 words. If you want to develop an idea further, that is what CHAT is for.

JOT actions

Action       What it does
Synthesise   Extracts key ideas and surfaces one unexpected connection. Terse, no preamble.
Expand       Adds one concrete example and one implication. Stays in your voice.
Clean        Fixes grammar, tightens prose, keeps every idea. Returns only the cleaned text.
Connect      Names 2–3 concepts this note links to and explains why.

Save paths

Quick Save
Routes raw text directly into the memory database. No extraction. Text preserved verbatim, importance score 0.75.
Synthesise + Save
Triggers a structured LLM call: extracts title, tags, entities, a summary, and layer classification. Writes as a full MAGMA node. Raw text also persists.
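
To make the difference between the two save paths concrete, here is a minimal sketch using Python's built-in `sqlite3`. Everything about the schema is an assumption made for illustration: the column names, the `note_id` link, and the JSON encoding of tags and entities are hypothetical. The structured LLM call is not shown; the `extracted` dict stands in for its response. Only the fixed importance of 0.75 and the list of extracted fields come from the text.

```python
import json
import sqlite3

QUICK_SAVE_IMPORTANCE = 0.75  # value stated in the docs


def open_db() -> sqlite3.Connection:
    # Hypothetical, simplified schema for illustration only.
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE notes (id INTEGER PRIMARY KEY, content TEXT NOT NULL)")
    db.execute(
        "CREATE TABLE memories (id INTEGER PRIMARY KEY, note_id INTEGER, "
        "content TEXT NOT NULL, title TEXT, summary TEXT, tags TEXT, "
        "entities TEXT, layer TEXT, importance REAL)"
    )
    return db


def quick_save(db: sqlite3.Connection, text: str) -> int:
    # No LLM call: the text is stored verbatim with a fixed importance.
    cur = db.execute(
        "INSERT INTO memories (content, importance) VALUES (?, ?)",
        (text, QUICK_SAVE_IMPORTANCE),
    )
    return cur.lastrowid


def synthesise_and_save(db: sqlite3.Connection, text: str, extracted: dict) -> int:
    # The raw text persists in `notes`; the structured fields become a node.
    # The importance of synthesised nodes is not stated in the docs, so it is left NULL.
    note_id = db.execute("INSERT INTO notes (content) VALUES (?)", (text,)).lastrowid
    cur = db.execute(
        "INSERT INTO memories (note_id, content, title, summary, tags, entities, layer) "
        "VALUES (?, ?, ?, ?, ?, ?, ?)",
        (note_id, text, extracted["title"], extracted["summary"],
         json.dumps(extracted["tags"]), json.dumps(extracted["entities"]),
         extracted["layer"]),
    )
    return cur.lastrowid
```
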
Core modes

CHAT — Talk to your memory

Swipe left from JOT to enter CHAT mode. Ask questions. The app searches everything you have written using a dual-channel retrieval pipeline and answers from your own context — not the internet.

  • “What did I think about the authentication bug last week?”
  • “What connects my notes on sleep and my notes on decision-making?”
  • “What was I working on before I made that architecture decision?”
  • “Summarise everything I've written about LLM memory systems.”
Core modes

GRAPH — See connections

Swipe up from JOT to open the memory graph. Every saved memory is a node. Every inferred relationship is an edge. Nodes are coloured by layer type.

Semantic
Facts, ideas, concepts. Connected by similarity.
Temporal
Events and sequences. Preserves ordering in time.
Causal
Cause-and-effect, reasoning chains, decisions.
Entity
People, organisations, projects, locations.
How it works

The retrieval pipeline

When you ask something in CHAT, two parallel retrieval paths run before the LLM sees your question:

01
BM25 keyword search
Your query is tokenised and run against the SQLite FTS5 index. Single-digit milliseconds. Excels at exact term matching.
02
Vector similarity search
Your query is embedded and the sqlite-vec ANN index returns the k most semantically similar nodes. Bridges vocabulary gaps — “login problems” finds “authentication flow breaking in production.”
03
Reciprocal Rank Fusion
Both result sets are merged using RRF: each result's score is the sum of 1/(k + rank) over every list it appears in, with k = 60. Documents ranked highly in both lists score best.
04
Context injection
The top 5–10 fused memory nodes are prepended to your question in the LLM prompt. The LLM sees only your question plus these curated memories.
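
The fusion and injection steps can be sketched in a few lines of Python. This is an illustrative sketch, not the app's code: the function names `rrf_fuse` and `build_prompt`, the memory IDs, and the prompt wording are all hypothetical. The two input lists stand in for the BM25 and vector results, each already ordered best-first. The scoring rule and k = 60 follow the description above.

```python
from __future__ import annotations

from collections import defaultdict

RRF_K = 60  # constant stated in the docs


def rrf_fuse(rankings: list[list[str]], k: int = RRF_K) -> list[tuple[str, float]]:
    """Merge ranked ID lists; each list contributes 1 / (k + rank), rank starting at 1."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, node_id in enumerate(ranking, start=1):
            scores[node_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID for a stable order.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


def build_prompt(question: str, memories: dict[str, str],
                 fused: list[tuple[str, float]], top_n: int = 5) -> str:
    """Prefix the top fused memories to the user's question (hypothetical wording)."""
    lines = [f"- {memories[node_id]}" for node_id, _ in fused[:top_n]]
    return "Relevant memories:\n" + "\n".join(lines) + f"\n\nQuestion: {question}"


bm25_hits = ["m3", "m1", "m7"]    # keyword channel, best first
vector_hits = ["m1", "m9", "m3"]  # semantic channel, best first
fused = rrf_fuse([bm25_hits, vector_hits])
# m1 and m3 appear in both lists, so they outrank m9 and m7.
```

Because RRF uses only ranks, the BM25 scores and vector distances never need to be put on a common scale.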
Architecture

MAGMA — The memory graph

MAGMA is the four-layer graph architecture that separates VEKTOR Notes from a notes app with a chat window bolted on. Each layer represents a different type of relationship, which determines how memories are retrieved.

Semantic layer
Facts, ideas, concepts. Relationships persist as explicit graph edges — not vector lookups recomputed on every query.
Temporal layer
Events, sequences, time-based context. Preserves ordering. Answers “what was I thinking before that decision?”
Causal layer
Cause and effect, reasoning chains. Edges are directional and typed. The layer most agent memory systems skip.
Entity layer
People, organisations, projects, locations. Named entities get their own nodes with edges to every memory that mentions them.

Storage

The entire graph lives in a single SQLite file on your device. No cloud database. No GPU. Queries execute in milliseconds, and the file is included in your device backup.

notes        -- raw note content + timestamps
memories     -- MAGMA nodes (content, layer, importance)
edges        -- typed directional relationships
entities     -- extracted named entities

FTS5 index   -- BM25 full-text search
vec_memories -- float32 embeddings (sqlite-vec ANN)
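
As a rough picture of how typed, directional edges can be stored and walked in plain SQLite, the sketch below builds two of the tables and follows a causal chain with a recursive query. It is an assumption-heavy illustration: the column names, the edge type labels (`causes`, `similar_to`), and the 10-hop limit are hypothetical, and the `notes`, `entities`, FTS5 and `vec_memories` structures are left out to keep it short.

```python
import sqlite3

# Hypothetical column layout; only the table names come from the docs.
SCHEMA = """
CREATE TABLE memories (
    id INTEGER PRIMARY KEY,
    content TEXT NOT NULL,
    layer TEXT CHECK (layer IN ('semantic', 'temporal', 'causal', 'entity')),
    importance REAL
);
CREATE TABLE edges (
    src INTEGER NOT NULL REFERENCES memories(id),
    dst INTEGER NOT NULL REFERENCES memories(id),
    type TEXT NOT NULL
);
"""


def causal_chain(db: sqlite3.Connection, start_id: int) -> list:
    """Follow 'causes' edges forward from one memory, returned in hop order."""
    rows = db.execute(
        """
        WITH RECURSIVE chain(id, depth) AS (
            SELECT ?, 0
            UNION
            SELECT e.dst, chain.depth + 1
            FROM edges e JOIN chain ON e.src = chain.id
            WHERE e.type = 'causes' AND chain.depth < 10
        )
        SELECT m.content FROM chain JOIN memories m ON m.id = chain.id
        ORDER BY chain.depth
        """,
        (start_id,),
    ).fetchall()
    return [content for (content,) in rows]
```

Because the edges are stored rows rather than similarities recomputed per query, a question like "what led to that decision?" becomes an ordinary graph walk.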
Advanced

MCP bridge (Pro)

The MCP bridge connects VEKTOR Notes to Claude Desktop, Cursor, or any MCP-compatible agent on your computer over your private Tailscale network. The memories stored on your phone become available to desktop AI tools.

01
Install Tailscale on both devices
Get your machine's Tailscale IP: tailscale ip -4
02
Run the bridge on your machine
VEKTOR_DB_PATH=/path/to/vektor.db node vektor-notes-mcp-bridge.js
Runs on port 3747 by default.
03
Connect in Settings
Settings → MCP Bridge → enter your machine's Tailscale IP and the bridge port, e.g. 100.x.x.x:3747
04
Add to Claude Desktop
The bridge prints its MCP config on startup — copy it into claude_desktop_config.json.
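
A mistyped address is an easy way for step 03 to fail, so a quick sanity check can help. The helper below is a hypothetical sketch, not part of the app or the bridge. It assumes the address has the `IP:port` form shown above, falls back to the default port 3747, and checks that the IP falls in 100.64.0.0/10, the range Tailscale assigns IPv4 addresses from.

```python
from __future__ import annotations

import ipaddress

TAILSCALE_V4 = ipaddress.ip_network("100.64.0.0/10")  # Tailscale's IPv4 range
DEFAULT_PORT = 3747  # default stated in the docs


def parse_bridge_address(value: str) -> tuple[str, int]:
    """Split 'host[:port]' and check the host is a Tailscale IPv4 address."""
    host, _, port_text = value.strip().partition(":")
    port = int(port_text) if port_text else DEFAULT_PORT
    if not 1 <= port <= 65535:
        raise ValueError(f"port out of range: {port}")
    # ip_address() itself raises ValueError for anything that is not an IP.
    if ipaddress.ip_address(host) not in TAILSCALE_V4:
        raise ValueError(f"{host} is not in {TAILSCALE_V4}")
    return host, port
```

A LAN address such as 192.168.1.5 is rejected here, which catches the common mistake of entering the Wi-Fi IP instead of the one printed by `tailscale ip -4`.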
Configuration

LLM provider routing

The model picker in Settings lets you route tasks to different models. JOT ghost suggestions can run on a faster, cheaper model than CHAT synthesis — you are not paying GPT-4 rates for a 30-word autocomplete.

Recommended: Groq LLaMA for JOT (fastest, free tier available), Claude Haiku for CHAT (best memory synthesis quality at low cost).
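
The recommended routing amounts to a small lookup table. The sketch below is illustrative only: the `Route` type, the task keys `jot_ghost` and `chat`, and the `estimate_cost` helper are hypothetical names, and the per-token rate is copied from the cost guide on this page rather than from any provider's price list.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Route:
    provider: str
    model: str
    usd_per_1k_tokens: Optional[float]  # None where the guide only says "free tier available"


# Task keys are hypothetical; the pairing follows the recommendation above.
ROUTES = {
    "jot_ghost": Route("Groq", "LLaMA", None),
    "chat": Route("Anthropic", "Claude Haiku", 0.003),
}


def estimate_cost(task: str, tokens: int) -> Optional[float]:
    """Rough cost in USD for one call, or None when no rate is listed."""
    route = ROUTES[task]
    if route.usd_per_1k_tokens is None:
        return None
    return tokens / 1000 * route.usd_per_1k_tokens
```

With these figures, a 2,000-token CHAT request comes to about $0.006, while ghost suggestions on a free tier add nothing.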

Privacy

What stays on your device

  • All notes and memories are stored in SQLite on-device only
  • Your API key is stored in Android's encrypted SecureStore
  • API calls go directly from your device to your LLM provider — not via VEKTOR servers
  • VEKTOR does not log your conversations, queries, or note content
  • No analytics SDK, no ad network
  • The app works fully offline for capture and local search

We do not own the models this app runs on, and we never will. You configure whichever provider you want, paste in your own API key, and the app uses it directly.