Series Overview
| Part | Title | Core Problem |
|---|---|---|
| 01 | Your Agent Ran All Night. Now the Bill Is Due. | The agentic shift and why subscription tokens collapsed |
| 02 | We Built This Ourselves — and Watched It Break | Four security holes, ClawHub malware, cron blow-outs |
| 03 | The Architecture That Survives 3 AM | What responsible agentic AI looks like as a spec |
| 04 | VEKTOR Slipstream: Skills, Secrets, and Staying Alive | SKILL.md routing, AES-256 memory, the cloak layer |
Your Agent Ran All Night. Now the Bill Is Due.
Something changed in 2024. Not a single announcement. Not a model release. A behaviour shift so gradual that most developers missed it until they got the invoice.
AI assistants became AI agents. The distinction sounds philosophical. It isn’t. A tool waits for input. An agent acts between inputs. A tool consumes tokens when you prompt it. An agent consumes tokens while you sleep. And the billing system was built for tools.
The first casualties were the early adopters who connected Claude or GPT-4 to a cron job, pointed it at a task queue, and left it running overnight. They woke up to jobs half-finished, context windows blown, and API bills that made no sense relative to the work that actually got done.
The problem wasn’t the model. The problem was that the architecture around the model had not caught up with what the model could do.
The Subscription Token Collapse
For a brief, optimistic period, the developer community landed on what seemed like an elegant solution: subscription-based token services. You paid a flat monthly fee, you got a pool of tokens, your agents drew from that pool. Predictable costs, unlimited automation. The future looked organised.
Then OpenClaw launched. And then it imploded. And the implosion taught us more about agentic architecture than any paper published that year.
OpenClaw was a subscription token broker — a middleware layer that sat between your agents and the underlying API, routing requests through pooled credentials and billing against a shared ledger. The pitch was simple: one flat fee, unlimited model access, automatic load balancing. Hundreds of developers integrated it into production systems within the first month.
Four months later, OpenClaw shut down without notice. The post-mortem identified four distinct failure modes:
Failure Mode 1: Credential Pooling Without Isolation
OpenClaw’s token pool used shared API credentials across customers. When one tenant’s agent triggered a rate limit violation, the lockout cascaded to every other customer on the same credential set. A single badly configured overnight job could take down access for fifty unrelated users.
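The missing isolation can be illustrated with a small sketch. The `TenantLimiter` class and its `maxCallsPerWindow` parameter are hypothetical names chosen for this example, not anything OpenClaw shipped. The point is only that each tenant is counted separately, so one tenant hitting its limit has no effect on another.

```typescript
// Hypothetical per-tenant rate limiter: every tenant gets its own counter,
// so one noisy tenant cannot exhaust a limit shared with others.
class TenantLimiter {
  private readonly used = new Map<string, number>();
  private readonly maxCallsPerWindow: number; // illustrative parameter

  constructor(maxCallsPerWindow: number) {
    this.maxCallsPerWindow = maxCallsPerWindow;
  }

  // Returns true if the call is allowed and records it against the tenant.
  tryAcquire(tenantId: string): boolean {
    const count = this.used.get(tenantId) ?? 0;
    if (count >= this.maxCallsPerWindow) return false;
    this.used.set(tenantId, count + 1);
    return true;
  }

  // Called at the start of each rate-limit window.
  resetWindow(): void {
    this.used.clear();
  }
}

const limiter = new TenantLimiter(2);
limiter.tryAcquire("tenant-a");
limiter.tryAcquire("tenant-a");
const noisyBlocked = !limiter.tryAcquire("tenant-a"); // tenant-a is over its own limit
const neighbourAllowed = limiter.tryAcquire("tenant-b"); // tenant-b is unaffected
```

A real broker would also need separate upstream credentials per tenant. A local counter alone does not help if the provider enforces its limit on a shared key.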
Failure Mode 2: No Spend Ceiling, No Kill Switch
The flat-fee model created a dangerous incentive: since marginal cost was zero to the customer, agents had no reason to be efficient. Context windows were routinely filled to capacity because there was no price signal telling the agent to stop. OpenClaw’s infrastructure absorbed the cost until it couldn’t.
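A spend ceiling with a kill switch might look something like the sketch below. `SpendGuard`, `runAgent`, and the token figures are invented for illustration. The idea is that the limit is enforced outside the agent, and that hitting it stops the loop instead of triggering a retry.

```typescript
// Hypothetical budget guard: tracks token spend and halts the agent loop
// once a hard ceiling is reached, regardless of what the agent "wants".
class SpendGuard {
  private spent = 0;
  private readonly ceiling: number;

  constructor(ceiling: number) {
    this.ceiling = ceiling;
  }

  // Records usage; throws if the charge would exceed the ceiling (the kill switch).
  charge(tokens: number): void {
    if (this.spent + tokens > this.ceiling) {
      throw new Error(`spend ceiling of ${this.ceiling} tokens reached`);
    }
    this.spent += tokens;
  }

  get total(): number {
    return this.spent;
  }
}

// Simulated agent loop: each step costs a fixed number of tokens (made-up figure).
function runAgent(guard: SpendGuard, steps: number, tokensPerStep: number): number {
  let completed = 0;
  try {
    for (let i = 0; i < steps; i++) {
      guard.charge(tokensPerStep);
      completed++;
    }
  } catch {
    // Kill switch tripped: stop here instead of retrying.
  }
  return completed;
}

const nightlyGuard = new SpendGuard(10_000);
const stepsDone = runAgent(nightlyGuard, 100, 3_000); // only 3 steps fit under the ceiling
```

The ceiling restores the price signal that the flat fee removed. It does not by itself make the output correct, which is the truncation problem described later in this series.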
Failure Mode 3: The ClawHub Malware Problem
OpenClaw launched a plugin marketplace called ClawHub. The security review process was shallow. Within two months, three plugins were found to be exfiltrating conversation data to external endpoints. One plugin, marketed as a “memory enhancer,” was silently forwarding full conversation histories to a third-party server. By the time the breach was discovered, data from thousands of users had been compromised.
Failure Mode 4: Stateless Agents, Stateful Disasters
OpenClaw’s architecture was fundamentally stateless. Each agent call was independent. Long-running jobs had no reliable mechanism for resuming after interruption. Retry logic would re-execute completed steps. In several documented cases, financial transactions were triggered multiple times.
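The duplicate-execution problem can be sketched with a checkpointed runner. Everything here is an assumption made for illustration: `CheckpointedRunner`, the step ids, and the in-memory set standing in for durable storage. Each step has a stable id, and a retry skips any id already recorded as complete.

```typescript
// Hypothetical checkpointed runner: each step has a stable id, and completed
// ids are recorded so a retry skips work that already happened.
type JobStep = { id: string; run: () => void };

class CheckpointedRunner {
  // In a real system this set would live in durable storage, not in memory.
  private readonly completed = new Set<string>();

  runAll(steps: JobStep[]): void {
    for (const step of steps) {
      if (this.completed.has(step.id)) continue; // already done: do not re-execute
      step.run();
      this.completed.add(step.id); // checkpoint only after success
    }
  }
}

let paymentsSent = 0;
let failNextReport = true;
const nightlyJob: JobStep[] = [
  { id: "charge-invoice-1", run: () => { paymentsSent++; } },
  {
    id: "send-report",
    run: () => {
      if (failNextReport) {
        failNextReport = false;
        throw new Error("network blip");
      }
    },
  },
];

const runner = new CheckpointedRunner();
try {
  runner.runAll(nightlyJob);
} catch {
  // Interrupted after the payment step succeeded.
}
runner.runAll(nightlyJob); // retry resumes at "send-report"; the payment is not repeated
```

A crash between `step.run()` and the checkpoint write would still repeat that one step, so for operations like payments this is usually paired with an idempotency key on the downstream API.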
We Built This Ourselves — and Watched It Break
We are not writing this from a position of having solved the problem before it affected us. We built versions of the same broken architecture. We watched them fail in the same ways.
Our first iteration of what would become VEKTOR was a cron-based agent that ran nightly research tasks. It worked beautifully for three weeks. Then a dependency released a major version with unusually long release notes. The agent hit the context limit. It retried. It ran for six hours consuming tokens on a task that should have taken four minutes, producing nothing useful.
We added a token budget. The agent stayed under budget by truncating its own reasoning. The output was confidently wrong, because it had summarised documents it hadn’t finished reading.
We tried OpenClaw. We were on it when the ClawHub breach happened. The experience of discovering that a plugin we had almost integrated was exfiltrating data was clarifying in a way that no security blog post had been.
What We Learned
Agents need deterministic state. Not probabilistic retrieval. Not file-based snapshots. A structured, queryable, persistent memory layer that the agent can read and write with the same confidence it reads and writes code.
Agents need approval gates for destructive operations. Any action that cannot be undone should require explicit confirmation before execution.
Credentials must never live in prompts. The ClawHub breach was possible because plugins had access to conversation context, which routinely included credentials passed inline. A secure agent architecture requires a credential vault that is isolated from the conversation layer.
Skills must be routable, not monolithic. The large system prompt approach is fragile and expensive. Agents need to load only the skill context relevant to their current task.
The Architecture That Survives 3 AM
A responsible agentic architecture is a set of constraints that make certain classes of failure impossible rather than unlikely.
Persistent, Structured Memory
Memory must survive session boundaries and be queryable by semantic content, tag, recency, and importance weight. It must support explicit update and delete operations, not just append. A hybrid approach — structured metadata alongside semantic embeddings — is the minimum viable memory layer for production agentic work.
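As a rough sketch of the hybrid idea, the store below filters on structured metadata first and then ranks by embedding similarity weighted by importance, using recency as a tie-breaker. `MemoryRecord`, `MemoryStore`, the scoring formula, and the two-dimensional toy embeddings are all assumptions for illustration. A production layer would persist to disk and use real embeddings.

```typescript
// Hypothetical memory record: structured metadata plus an embedding vector.
type MemoryRecord = {
  id: string;
  text: string;
  tags: string[];
  importance: number; // 0..1, assigned by the caller
  updatedAt: number; // epoch milliseconds
  embedding: number[]; // produced by some embedding model (not shown)
};

// Cosine similarity between two vectors; returns 0 for zero-length vectors.
function cosine(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  a.forEach((x, i) => {
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  });
  return normA === 0 || normB === 0 ? 0 : dot / Math.sqrt(normA * normB);
}

class MemoryStore {
  private readonly records = new Map<string, MemoryRecord>();

  // Insert or overwrite by id: an explicit update, not an append.
  upsert(record: MemoryRecord): void {
    this.records.set(record.id, record);
  }

  remove(id: string): boolean {
    return this.records.delete(id);
  }

  // Filter on metadata, rank by similarity * importance, break ties by recency.
  query(queryEmbedding: number[], tag: string, limit: number): MemoryRecord[] {
    const score = (r: MemoryRecord) => cosine(queryEmbedding, r.embedding) * r.importance;
    return Array.from(this.records.values())
      .filter((r) => r.tags.some((t) => t === tag))
      .sort((x, y) => score(y) - score(x) || y.updatedAt - x.updatedAt)
      .slice(0, limit);
  }
}

const store = new MemoryStore();
store.upsert({ id: "m1", text: "Deploys use blue/green", tags: ["deploy"], importance: 0.9, updatedAt: 1, embedding: [1, 0] });
store.upsert({ id: "m2", text: "Old deploy notes", tags: ["deploy"], importance: 0.2, updatedAt: 2, embedding: [1, 0] });
store.upsert({ id: "m3", text: "Lunch order", tags: ["misc"], importance: 0.9, updatedAt: 3, embedding: [1, 0] });

const ranked = store.query([1, 0], "deploy", 5).map((r) => r.id);
store.remove("m2"); // explicit delete
const afterDelete = store.query([1, 0], "deploy", 5).map((r) => r.id);
```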
Credential Isolation
Secrets must live in an AES-256 encrypted local vault. Credentials are exposed only to the specific tool call that needs them, not to the language model itself.
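One way to picture tool-scoped exposure is a vault that hands the secret to a callback and never returns it. The names `SecretVault`, `runTool`, and `fetch_invoices` are hypothetical, and encryption at rest is left out of this sketch so that it can focus on who sees the plaintext.

```typescript
// Hypothetical vault: maps a key name to a secret. Encryption at rest is
// omitted here to keep the focus on who gets to see the plaintext.
class SecretVault {
  private readonly secrets = new Map<string, string>();

  set(keyName: string, value: string): void {
    this.secrets.set(keyName, value);
  }

  // The secret is handed to the callback and never returned to the caller.
  use<T>(keyName: string, fn: (secret: string) => T): T {
    const secret = this.secrets.get(keyName);
    if (secret === undefined) throw new Error(`unknown key: ${keyName}`);
    return fn(secret);
  }
}

// What the model is allowed to see: tool name, key *name*, and the result.
type ToolCall = { tool: string; keyName: string };
const transcript: string[] = [];

function runTool(vault: SecretVault, call: ToolCall): string {
  const result = vault.use(call.keyName, (secret) => {
    // Stand-in for a real API request that authenticates with the secret.
    return secret.length > 0 ? "200 OK" : "401 Unauthorized";
  });
  transcript.push(`${call.tool}(${call.keyName}) -> ${result}`);
  return result;
}

const vault = new SecretVault();
vault.set("billing-api", "sk-example-not-a-real-key");
runTool(vault, { tool: "fetch_invoices", keyName: "billing-api" });
const leaked = transcript.some((line) => line.includes("sk-example"));
```

The transcript, which is the only thing fed back to the model in this sketch, contains the key name and the tool result but not the key value.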
Approval Gates
Write operations and destructive operations require a confirmation step outside the agent’s reasoning loop. The agent plans. The human approves. The execution happens. Rollback must be available for any approved operation.
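The plan, approve, execute, and rollback sequence can be sketched as a small queue. `ApprovalGate` and `PlannedOp` are illustrative names, and the undo function is supplied by the caller here, which is a simplification of whatever backup mechanism a real system would use.

```typescript
// Hypothetical approval queue: the agent can only *propose* operations.
// Execution needs an approve() call made outside the agent's reasoning loop.
type PlannedOp = {
  description: string;
  apply: () => void;
  undo: () => void;
};

class ApprovalGate {
  private readonly pending = new Map<number, PlannedOp>();
  private readonly applied = new Map<number, PlannedOp>();
  private nextId = 1;

  // Called by the agent. Nothing is executed here.
  propose(op: PlannedOp): number {
    const id = this.nextId++;
    this.pending.set(id, op);
    return id;
  }

  // Called by a human (or a policy service), never by the agent itself.
  approve(id: number): void {
    const op = this.pending.get(id);
    if (!op) throw new Error(`no pending operation ${id}`);
    this.pending.delete(id);
    op.apply();
    this.applied.set(id, op);
  }

  // Reverts an operation that was previously approved and applied.
  rollback(id: number): void {
    const op = this.applied.get(id);
    if (!op) throw new Error(`operation ${id} was not applied`);
    op.undo();
    this.applied.delete(id);
  }
}

let configValue = "v1";
const gate = new ApprovalGate();
const opId = gate.propose({
  description: "set config to v2",
  apply: () => { configValue = "v2"; },
  undo: () => { configValue = "v1"; },
});
const beforeApproval = configValue; // proposing does not execute anything
gate.approve(opId);
const afterApproval = configValue;
gate.rollback(opId);
const afterRollback = configValue;
```

Keeping `approve` off the agent's tool list is what puts the confirmation step outside the reasoning loop. If the agent can call it, the gate is only advisory.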
Skill Routing
Agent capability should be decomposed into discrete skill files — bounded context documents defining how to handle a specific category of task. The agent loads the relevant skill at task time rather than holding all capabilities in memory simultaneously.
Stealth Web Access
Agents that need to gather information from the web require a fetch layer that handles modern anti-bot defenses. That means headless browser fingerprint rotation and human-realistic interaction patterns, built into the architecture rather than bolted on as an afterthought.
VEKTOR Slipstream: Skills, Secrets, and Staying Alive
VEKTOR Slipstream is our implementation of the architecture described in Part 3. It ships as an MCP server — accessible from Claude Desktop, Cursor, Windsurf, VS Code, or any tool that speaks the Model Context Protocol — and as a direct Node.js SDK.
SKILL.md Routing
Skills in Slipstream are markdown files with a structured header. The agent reads the skill directory at startup, indexes available capabilities, and routes each task to the appropriate skill at execution time.
---
name: ssh-operations
description: Use when executing commands on remote servers via SSH.
---
Always use keyName from the credential vault, never keyPath.
Always pipe long outputs through | head -25.
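To make the routing step concrete, here is a minimal sketch that parses the header shown above and picks a skill by keyword overlap with the task. The `parseSkill` and `routeTask` functions, the overlap heuristic, and the second `git-review` skill are assumptions for illustration. Slipstream's actual matching logic is not described in this article and may work differently.

```typescript
// Hypothetical skill loader: parses the name/description header of a skill
// file and routes a task to the skill whose description overlaps most.
type SkillDoc = { name: string; description: string; body: string };

function parseSkill(source: string): SkillDoc {
  const match = /^---\n([\s\S]*?)\n---\n?([\s\S]*)$/.exec(source);
  if (!match) throw new Error("missing front matter");
  const [, header = "", body = ""] = match;
  const fields = new Map<string, string>();
  for (const line of header.split("\n")) {
    const idx = line.indexOf(":");
    if (idx > 0) fields.set(line.slice(0, idx).trim(), line.slice(idx + 1).trim());
  }
  return {
    name: fields.get("name") ?? "",
    description: fields.get("description") ?? "",
    body: body.trim(),
  };
}

// Lowercased words longer than two characters.
function words(text: string): string[] {
  return text.toLowerCase().split(/[^a-z0-9]+/).filter((w) => w.length > 2);
}

// Naive router: counts description words that also appear in the task.
function routeTask(task: string, skills: SkillDoc[]): SkillDoc | undefined {
  const taskWords = new Set(words(task));
  let best: SkillDoc | undefined;
  let bestScore = 0;
  for (const skill of skills) {
    const score = words(skill.description).filter((w) => taskWords.has(w)).length;
    if (score > bestScore) {
      best = skill;
      bestScore = score;
    }
  }
  return best;
}

const sshSkill = parseSkill(
  "---\nname: ssh-operations\ndescription: Use when executing commands on remote servers via SSH.\n---\nAlways use keyName from the credential vault, never keyPath.\n"
);
// Invented second skill, only here to give the router a choice.
const gitSkill = parseSkill(
  "---\nname: git-review\ndescription: Use when reviewing pull requests and diffs.\n---\nSummarise each changed file.\n"
);
const chosen = routeTask("restart nginx on the remote servers over ssh", [sshSkill, gitSkill]);
```

Only the chosen skill's `body` would then be added to the prompt, which is what keeps the context small compared with one monolithic system prompt.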
AES-256 Credential Vault
The cloak_passport tool provides an AES-256 encrypted local vault. The model never sees the credential value — only the tool result that used it. Credential exfiltration of the kind seen in the ClawHub breach is blocked by design: credentials never enter the conversation layer, so a plugin reading the conversation has nothing to steal.
cloak_passport set vps-key /path/to/private/key
cloak_ssh_exec({ keyName: "vps-key", host: "...", command: "..." })
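For readers who want to see what AES-256 encryption at rest involves, the sketch below uses Node's built-in `node:crypto` module in GCM mode. It is not Slipstream's implementation: the `seal` and `unseal` helpers are hypothetical, and how the real vault derives and stores its key is not covered in this article, so the key here is just 32 random bytes.

```typescript
import { createCipheriv, createDecipheriv, randomBytes } from "node:crypto";

// Illustrative AES-256-GCM helpers. The key handling is an assumption:
// a real vault would derive or protect the key rather than keep it in memory.
type SealedSecret = { iv: Buffer; tag: Buffer; data: Buffer };

function seal(key: Buffer, plaintext: string): SealedSecret {
  const iv = randomBytes(12); // 96-bit nonce, generated fresh per encryption
  const cipher = createCipheriv("aes-256-gcm", key, iv);
  const data = Buffer.concat([cipher.update(plaintext, "utf8"), cipher.final()]);
  return { iv, tag: cipher.getAuthTag(), data };
}

function unseal(key: Buffer, sealed: SealedSecret): string {
  const decipher = createDecipheriv("aes-256-gcm", key, sealed.iv);
  decipher.setAuthTag(sealed.tag); // final() throws if key or data do not match
  return Buffer.concat([decipher.update(sealed.data), decipher.final()]).toString("utf8");
}

const vaultKey = randomBytes(32); // 256-bit key
const sealedKey = seal(vaultKey, "example-private-key-material");
const roundTrip = unseal(vaultKey, sealedKey);

// Decrypting with a different key fails authentication instead of returning garbage.
let wrongKeyRejected = false;
try {
  unseal(randomBytes(32), sealedKey);
} catch {
  wrongKeyRejected = true;
}
```

GCM is used here because it authenticates as well as encrypts, so a tampered vault file is rejected on read. Whether Slipstream uses GCM or another AES mode is not stated in the source.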
The Cloak Layer
Slipstream’s web access layer uses a persistent browser identity system with fingerprint rotation and human-realistic interaction patterns. The pattern store is self-improving — successful patterns score higher, failed patterns are retired.
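The scoring behaviour described above could be sketched roughly as follows. `PatternStore`, the plus-or-minus-one scoring, the retirement threshold, and the pattern names are all invented for illustration. The real pattern store's scoring rules are not specified here.

```typescript
// Hypothetical pattern store: each interaction pattern carries a score that
// rises on success and falls on failure; patterns below a floor are retired.
class PatternStore {
  private readonly scores = new Map<string, number>();
  private readonly retireBelow: number;

  constructor(retireBelow: number) {
    this.retireBelow = retireBelow;
  }

  add(patternId: string): void {
    this.scores.set(patternId, 0);
  }

  report(patternId: string, success: boolean): void {
    const current = this.scores.get(patternId);
    if (current === undefined) return; // unknown or already retired
    const updated = current + (success ? 1 : -1);
    if (updated < this.retireBelow) this.scores.delete(patternId);
    else this.scores.set(patternId, updated);
  }

  // Highest-scoring pattern that has not been retired, if any.
  best(): string | undefined {
    let bestId: string | undefined;
    let bestScore = -Infinity;
    for (const [id, score] of Array.from(this.scores)) {
      if (score > bestScore) {
        bestScore = score;
        bestId = id;
      }
    }
    return bestId;
  }
}

const patterns = new PatternStore(-2);
patterns.add("slow-scroll");
patterns.add("instant-click");
patterns.report("slow-scroll", true);
patterns.report("instant-click", false);
patterns.report("instant-click", false);
patterns.report("instant-click", false); // score drops below -2, so it is retired
const preferred = patterns.best();
```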
SSH Approval Gates
All write and destructive SSH operations queue for approval before execution. cloak_ssh_plan previews the full operation sequence. cloak_ssh_approve executes with automatic backup and returns a rollback key. cloak_ssh_rollback restores to pre-operation state if needed.
This is the 3 AM architecture: deterministic state, isolated credentials, human approval gates, and the ability to undo.