memkeeper · blog

◆ memkeeper

Field notes from building a memory that lasts.

memkeeper is a persistent memory layer for AI agents. These are notes on how it works and why it is built the way it is: retrieval, measurement, local-first design, and the tools around it.

BenchmarksPositioning2026

A harder benchmark, and it still tells you when it can't answer

memkeeper on LongMemEval, where each question carries about fifty sessions of history: competitive answer accuracy, local and zero marginal cost, and the full abstention number, 0.833, the category most leaderboards leave out.
Token economicsTools2026

What your AI coding subscription actually buys

A receipt for your AI coding plan: about 41x list-price-equivalent leverage on a $100 subscription, and the surprise that 96% of the tokens are the model re-reading context it has already seen.
BenchmarksPositioning2026

Cheap, local, and it tells you when it can't answer

memkeeper matches the published answer-accuracy of commercial agent-memory systems on LoCoMo, running locally at zero marginal cost, and reports its abstention behavior, including the adversarial category most leaderboards leave out.
ArchitectureDesign2026

Documents aren't memories

Point your memory layer at your docs folder and it quietly gets worse. Why memkeeper keeps curated memories and ingested documents in separate tiers, with a one-way bridge between them.
BenchmarksEngineering2026

Benchmarking AI agent memory on LoCoMo

Most memory systems assert that they recall well. We ran memkeeper against a long-term conversational QA benchmark and published the numbers, the method, and the script to reproduce them.
RetrievalDesign2026

Why hybrid retrieval beats pure vector search

Vector search is necessary but not sufficient for agent memory. Here is why memkeeper pairs dense embeddings with BM25 and a cross-encoder reranker, and what each stage covers that the others miss.
PositioningDesign2026

Local-first memory for AI agents

Your agent's memory is your context: decisions, preferences, facts about your systems. The default in this space is a hosted vector database. We think the default should be your own machine.
WardenSecurity2026

A capability broker for AI coding agents

We hand coding agents the shell, the filesystem, and the network. Warden is a deny-by-default gate that decides what an agent is actually allowed to do. We logged 27,860 real decisions, listed the evasions it provably stops, and wrote down where its edges are.