Documentation

How VEKTOR Notes works

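VEKTOR Notes is a local-first AI note-taking app for Android. Your notes, memories, and search indexes are stored and queried on your device. AI features call the LLM provider you configure, directly from your device, using your own API key. No uploads to VEKTOR servers. No tracking.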

The interface has two main modes you swipe between, JOT for capture and CHAT for retrieval, plus a GRAPH view of how your memories connect. Under the hood, a four-layer memory graph (MAGMA) builds a persistent, searchable representation of everything you write.

Android · Free · No ads · Local-first · MAGMA graph

Getting started

First launch

Install from Google Play and open the app. You land directly on the JOT surface — a blank text area. No account required. No onboarding wizard. The app is immediately functional for capture and local storage without any configuration.

To enable AI features (ghost suggestions and CHAT responses), you need to add an API key from a supported LLM provider.

Configuration

Setting your API key

VEKTOR Notes uses your own provider keys. Your key is stored in SecureStore on-device — it never touches VEKTOR servers. API calls go directly from your device to the provider.

Open Settings

Tap the Settings icon in the bottom toolbar.

Choose a provider

Anthropic (Claude), OpenAI (GPT), Google (Gemini), or Groq. Each has different speed and cost tradeoffs.

Paste your API key

Copy your key from the provider dashboard. The app validates and stores it locally in encrypted SecureStore.

Optionally route tasks to different models

The model picker lets you use a cheaper model for JOT ghost suggestions and a more capable model for CHAT synthesis.

You are not paying VEKTOR for LLM usage. Costs go directly to your provider.

Supported providers

Provider	Best for	Cost guide
Anthropic Claude Haiku	CHAT — best memory synthesis quality	~$0.003/1K tokens
OpenAI GPT-4o-mini	Balanced JOT + CHAT	~$0.0002/1K tokens
Groq LLaMA	Fastest JOT ghost suggestions	Free tier available
Gemini Flash	Long context CHAT	Free tier available

Core modes

JOT — Zen capture

JOT is the primary writing surface. A clean text area with no menus, no formatting toolbar, no friction. You type. The app watches quietly. After 900 milliseconds of silence — long enough to fire only when you have genuinely paused, short enough that the suggestion arrives before you lose the thread — the ghost suggestion engine offers a completion.
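
The 900 ms pause rule is a debounce. A minimal sketch of that logic, assuming a hypothetical PauseDetector class and caller-supplied timestamps in milliseconds (neither comes from the app's source):

```python
from typing import Optional

PAUSE_MS = 900  # pause length stated in the text above


class PauseDetector:
    """Signals once after pause_ms of silence following the last keystroke."""

    def __init__(self, pause_ms: int = PAUSE_MS) -> None:
        self.pause_ms = pause_ms
        self.last_key_ms: Optional[int] = None
        self.fired = False

    def on_keystroke(self, now_ms: int) -> None:
        # Every keystroke restarts the wait and re-arms the detector.
        self.last_key_ms = now_ms
        self.fired = False

    def should_suggest(self, now_ms: int) -> bool:
        # True at most once per pause, and only after pause_ms of silence.
        if self.last_key_ms is None or self.fired:
            return False
        if now_ms - self.last_key_ms >= self.pause_ms:
            self.fired = True
            return True
        return False
```

Typing again before the 900 ms elapse resets the timer, so a suggestion is requested only on a genuine pause.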

Ghost suggestions

Ghost suggestions appear as faded text beneath your current content. Two actions only:

Accept

Tap the suggestion to append it. Ghost text becomes real text.

Dismiss

Tap elsewhere or keep typing. The suggestion clears with no trace.

Suggestions are deliberately capped at 30 words or fewer. If you want to develop an idea further, that is what CHAT is for.
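
One way to enforce the 30-word cap on the client side is to truncate whatever the model returns. This is a sketch only: cap_suggestion is a hypothetical helper, and the app may instead enforce the limit through its prompt.

```python
MAX_WORDS = 30  # cap stated in the text above


def cap_suggestion(text: str, max_words: int = MAX_WORDS) -> str:
    """Keep at most max_words whitespace-separated words."""
    words = text.split()
    # Joining with single spaces also normalises runs of whitespace.
    return " ".join(words[:max_words])
```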

JOT actions

Action	What it does
Synthesise	Extracts key ideas and surfaces one unexpected connection. Terse, no preamble.
Expand	Adds one concrete example and one implication. Stays in your voice.
Clean	Fixes grammar, tightens prose, keeps every idea. Returns only the cleaned text.
Connect	Names 2–3 concepts this note links to and explains why.

Save paths

Quick Save

Routes raw text directly into the memory database. No extraction. Text preserved verbatim, importance score 0.75.

Synthesise + Save

Triggers a structured LLM call: extracts title, tags, entities, a summary, and layer classification. Writes as a full MAGMA node. Raw text also persists.
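
The difference between the two save paths can be sketched as a single record type that is either filled minimally or fully. Field names, the extract callable, and the fallback importance for synthesised nodes are assumptions for illustration, not the app's actual schema.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

QUICK_SAVE_IMPORTANCE = 0.75  # value stated in the text above


@dataclass
class MemoryNode:
    raw_text: str                    # always preserved verbatim
    importance: float
    title: Optional[str] = None
    summary: Optional[str] = None
    layer: Optional[str] = None      # semantic, temporal, causal, or entity
    tags: List[str] = field(default_factory=list)
    entities: List[str] = field(default_factory=list)


def quick_save(text: str) -> MemoryNode:
    # No extraction: only the raw text and the fixed importance score.
    return MemoryNode(raw_text=text, importance=QUICK_SAVE_IMPORTANCE)


def synthesise_and_save(text: str, extract: Callable[[str], Dict]) -> MemoryNode:
    # `extract` stands in for the structured LLM call (hypothetical).
    fields = extract(text)
    return MemoryNode(
        raw_text=text,
        # Assumption: the extractor may supply importance; otherwise reuse 0.75.
        importance=fields.get("importance", QUICK_SAVE_IMPORTANCE),
        title=fields.get("title"),
        summary=fields.get("summary"),
        layer=fields.get("layer"),
        tags=list(fields.get("tags", [])),
        entities=list(fields.get("entities", [])),
    )
```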

CHAT — Talk to your memory

Swipe left from JOT to enter CHAT mode. Ask questions. The app searches everything you have written using a dual-channel retrieval pipeline and answers from your own context — not the internet.

“What did I think about the authentication bug last week?”
“What connects my notes on sleep and my notes on decision-making?”
“What was I working on before I made that architecture decision?”
“Summarise everything I've written about LLM memory systems.”

GRAPH — See connections

Swipe up from JOT to open the memory graph. Every saved memory is a node. Every inferred relationship is an edge. Nodes are coloured by layer type.

Semantic

Facts, ideas, concepts. Connected by similarity.

Temporal

Events and sequences. Preserves ordering in time.

Causal

Cause-and-effect, reasoning chains, decisions.

Entity

People, organisations, projects, locations.

How it works

The retrieval pipeline

When you ask something in CHAT, two parallel retrieval paths run before the LLM sees your question:

BM25 keyword search

Your query is tokenised and run against the SQLite FTS5 index. Single-digit milliseconds. Excels at exact term matching.
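
A minimal sketch of BM25 keyword search with SQLite FTS5, using Python's built-in sqlite3 module. The table name notes_fts and the OR-joined query are assumptions, and the example requires an SQLite build with FTS5 enabled.

```python
import sqlite3


def build_index(notes):
    """Create an in-memory FTS5 index over a list of note strings."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE VIRTUAL TABLE notes_fts USING fts5(content)")
    conn.executemany(
        "INSERT INTO notes_fts(content) VALUES (?)", [(n,) for n in notes]
    )
    return conn


def keyword_search(conn, query, limit=10):
    """Return (rowid, content) rows, best BM25 match first."""
    # Quote each term so punctuation is not parsed as FTS5 query syntax.
    terms = " OR ".join('"' + t.replace('"', '""') + '"' for t in query.split())
    if not terms:
        return []
    # bm25() returns lower values for better matches, so ascending order
    # puts the most relevant rows first.
    return conn.execute(
        "SELECT rowid, content FROM notes_fts "
        "WHERE notes_fts MATCH ? ORDER BY bm25(notes_fts) LIMIT ?",
        (terms, limit),
    ).fetchall()
```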

Vector similarity search

Your query is embedded and the sqlite-vec ANN index returns the k most semantically similar nodes. Bridges vocabulary gaps — “login problems” finds “authentication flow breaking in production.”
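
The idea behind vector search can be shown with an exhaustive cosine-similarity scan. This is a stand-in for illustration, not the sqlite-vec index the app uses, and the two-dimensional vectors in the usage note are toy values rather than real embeddings.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec, node_vecs, k=5):
    """Return the ids of the k nodes most similar to query_vec."""
    scored = sorted(
        node_vecs.items(),
        key=lambda item: cosine(query_vec, item[1]),
        reverse=True,
    )
    return [node_id for node_id, _ in scored[:k]]
```

With vectors {"auth": [1.0, 0.1], "login": [0.9, 0.2], "sleep": [0.0, 1.0]} and the query [1.0, 0.0], the two nearest nodes are "auth" and "login", even though the query shares no keywords with either.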

Reciprocal Rank Fusion

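Both result sets are merged using Reciprocal Rank Fusion (RRF). Each result's score is the sum of 1/(k + rank) over every list it appears in, with the constant k = 60 (unrelated to the k in vector search). Results ranked highly in both lists score best.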
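
A minimal sketch of the fusion step, assuming ranks start at 1 and ties are broken by id for determinism (the source states only the formula and k = 60):

```python
from collections import defaultdict

RRF_K = 60  # constant stated in the text above


def reciprocal_rank_fusion(ranked_lists, k=RRF_K):
    """Merge ranked lists of ids into one list, best fused score first."""
    scores = defaultdict(float)
    for ranking in ranked_lists:
        for rank, doc_id in enumerate(ranking, start=1):
            # Each appearance contributes 1 / (k + rank) to the id's score.
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by descending score; break ties by id so output is stable.
    return sorted(scores, key=lambda d: (-scores[d], str(d)))
```

For keyword results ["a", "b", "c"] and vector results ["b", "d", "a"], "b" scores 1/62 + 1/61 and "a" scores 1/61 + 1/63, so both outrank "d" and "c", which each appear in only one list.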

Context injection

Top 5–10 fused memory nodes are prefixed to your question in the LLM prompt. The LLM sees your question + curated relevant memories only.
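
Context injection amounts to string assembly. A sketch under assumptions: the labels "Relevant memories:" and "Question:" and the numbering format are illustrative, not the app's actual prompt.

```python
def build_prompt(question, memories, max_memories=10):
    """Prefix the top fused memories to the user's question."""
    selected = memories[:max_memories]  # the text above gives 5 to 10 nodes
    lines = ["Relevant memories:"]
    lines += [f"[{i}] {m}" for i, m in enumerate(selected, start=1)]
    lines += ["", f"Question: {question}"]
    return "\n".join(lines)
```

Capping the number of memories keeps the prompt small, which bounds both latency and per-query token cost.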

Architecture

MAGMA — The memory graph

MAGMA is the four-layer graph architecture that separates VEKTOR Notes from a notes app with a chat window bolted on. Each layer represents a different type of relationship, which determines how memories are retrieved.

Semantic layer

Facts, ideas, concepts. Relationships persist as explicit graph edges — not vector lookups recomputed on every query.

Temporal layer

Events, sequences, time-based context. Preserves ordering. Answers “what was I thinking before that decision?”

Causal layer

Cause and effect, reasoning chains. Edges are directional and typed. The layer most agent memory systems skip.

Entity layer

People, organisations, projects, locations. Named entities get their own nodes with edges to every memory that mentions them.
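
Directional, typed edges make questions like "what did this lead to?" a graph walk rather than a similarity lookup. A minimal in-memory sketch, where the Edge type and the edge labels ("causes", "mentions") are hypothetical:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edge:
    source: str
    target: str
    kind: str  # hypothetical labels, e.g. "causes", "precedes", "mentions"


def outgoing(edges, node_id, kind):
    """Targets of edges of the given kind that start at node_id."""
    return [e.target for e in edges if e.source == node_id and e.kind == kind]


def causal_chain(edges, start):
    """Follow 'causes' edges forward from start without revisiting nodes."""
    seen, order, stack = {start}, [], [start]
    while stack:
        node = stack.pop()
        for nxt in outgoing(edges, node, "causes"):
            if nxt not in seen:
                seen.add(nxt)
                order.append(nxt)
                stack.append(nxt)
    return order
```

Because edges are directional, walking from "outage" reaches what it caused, not what caused it.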

Storage

The entire graph lives in a single SQLite file on your device. No cloud database. No GPU. Queries execute in milliseconds, and the file is included in your device backup.

notes        -- raw note content + timestamps
memories     -- MAGMA nodes (content, layer, importance)
edges        -- typed directional relationships
entities     -- extracted named entities

FTS5 index   -- BM25 full-text search
vec_memories -- float32 embeddings (sqlite-vec ANN)
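
The four core tables could be declared roughly as follows. Only the table names and the listed fields (content, timestamps, layer, importance) come from the layout above; every other column, constraint, and the open_db helper are assumptions. The FTS5 and sqlite-vec tables are omitted so the sketch runs on any SQLite build.

```python
import sqlite3

SCHEMA = """
CREATE TABLE notes (
    id         INTEGER PRIMARY KEY,
    content    TEXT NOT NULL,
    created_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
);
CREATE TABLE memories (
    id         INTEGER PRIMARY KEY,
    note_id    INTEGER REFERENCES notes(id),
    content    TEXT NOT NULL,
    layer      TEXT NOT NULL
               CHECK (layer IN ('semantic', 'temporal', 'causal', 'entity')),
    importance REAL NOT NULL
);
CREATE TABLE edges (
    id        INTEGER PRIMARY KEY,
    source_id INTEGER NOT NULL REFERENCES memories(id),
    target_id INTEGER NOT NULL REFERENCES memories(id),
    kind      TEXT NOT NULL
);
CREATE TABLE entities (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL UNIQUE
);
"""


def open_db(path=":memory:"):
    """Open a database and create the sketch schema in it."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn
```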

Advanced

MCP bridge (Pro)

The MCP bridge connects VEKTOR Notes to Claude Desktop, Cursor, or any MCP-compatible agent on your computer over your local Tailscale network. Your phone's memory becomes available to desktop AI tools.

Install Tailscale on both devices

Get your machine's Tailscale IP: tailscale ip -4

Run the bridge on your machine

VEKTOR_DB_PATH=/path/to/vektor.db node vektor-notes-mcp-bridge.js

Runs on port 3747 by default.

Connect in Settings

Settings → MCP Bridge → enter your machine's Tailscale IP and the bridge port, for example 100.x.x.x:3747

Add to Claude Desktop

The bridge prints its MCP config on startup — copy it into claude_desktop_config.json.

LLM provider routing

The model picker in Settings lets you route tasks to different models. JOT ghost suggestions can run on a faster, cheaper model than CHAT synthesis — you are not paying GPT-4 rates for a 30-word autocomplete.

Recommended: Groq LLaMA for JOT (fastest, free tier available), Claude Haiku for CHAT (best memory synthesis quality at low cost).
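
Per-task routing can be pictured as a small lookup table. The task keys, the route helper, and the model identifiers below are placeholders, not real model ids or the app's settings format.

```python
# Placeholder routing table: substitute real model ids from your provider.
ROUTES = {
    "jot": {"provider": "groq", "model": "placeholder-llama-model-id"},
    "chat": {"provider": "anthropic", "model": "placeholder-claude-haiku-model-id"},
}


def route(task):
    """Return the provider and model configured for a task."""
    try:
        return ROUTES[task]
    except KeyError:
        raise ValueError(f"no model configured for task: {task!r}") from None
```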

Privacy

What stays on your device

All notes and memories are stored in SQLite on-device only
Your API key is stored in Android's encrypted SecureStore
API calls go directly from your device to your LLM provider — not via VEKTOR servers
VEKTOR does not log your conversations, queries, or note content
No analytics SDK, no ad network
The app works fully offline for capture and local search

We do not own the models this app runs on, and we never will. You configure whichever provider you want, paste in your own API key, and the app uses it directly.