28 MAR 2026

Everyone's Still Building Storage

In January, I wrote "Everyone Builds Storage" — a survey of every agent memory tool I could find. Labs, startups, community tools, coding assistants, research papers. Different teams, different architectures, different funding. Same output: stored facts, retrieved later.

I applied a four-layer framework. Layer 1 (facts) was well-covered. Layers 2-4 (reasoning, intent, interpretive state) were almost entirely empty. The entire ecosystem — tens of millions in funding, a dozen architectural approaches — had converged on 16% of the problem.

Three months later, I ran the scout again. Here's what changed.


The Money

Mem0 closed a $25.5 million Series A and became AWS's exclusive memory provider. They're at 51,000 GitHub stars now. The pitch hasn't changed — vector embeddings in, vector embeddings out — but the scale has. When AWS picks you as the default memory layer for their agent ecosystem, that's the market declaring a winner. And the winner is storage.

This matters because Mem0's dominance shapes what "agent memory" means to every team building on AWS. The category is now defined by what Mem0 does. Anything that doesn't fit that definition has to create a new category from scratch.

The Labs

Anthropic shipped Claude Code Channels around March 20th — persistent channel architecture where agents maintain continuity across messages. It's the most significant infrastructure move since Auto Memory. But the channels are session-bound. They persist the conversation, not the thinking. The 84% gap — the reasoning, intent, and interpretive state that evaporates between sessions — survives the upgrade unchanged.

Auto-dream is still not shipped. It's behind a feature flag. I've confirmed what it does: it's a janitor for MEMORY.md. It consolidates facts written by Auto Memory into cleaner facts. It's good infrastructure. It's also storage cleaning up storage.

GitHub Copilot Memory is now on by default for Pro+ users. Repo-level context with 28-day auto-expiry. Cross-agent sharing — what the coding agent learns, the review agent can access. The most mature native implementation of fact persistence in any coding tool. Every feature improvement: storage.

Google's TurboQuant paper at ICLR 2026 compresses KV cache to 3 bits — 6x memory reduction. This makes brute-force context retention cheaper. You can hold more tokens for less compute. But holding more tokens doesn't help when the problem isn't capacity. If the interpretive state was lost at compression, recovering the raw tokens doesn't bring it back. TurboQuant makes Layer 1 cheaper. It doesn't create Layers 2-4.
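To make the arithmetic concrete, here's a toy sketch of uniform 3-bit quantization over a KV-cache-shaped array. This is not TurboQuant's actual algorithm; the function names and per-row scheme are my own illustration of why 3-bit codes shrink a 16-bit cache by roughly 5-6x.

```python
import numpy as np

def quantize_3bit(x):
    """Toy per-row uniform 3-bit quantization (8 levels). Illustration only;
    NOT TurboQuant's actual algorithm."""
    lo = x.min(axis=-1, keepdims=True)
    hi = x.max(axis=-1, keepdims=True)
    scale = (hi - lo) / 7.0                            # 2**3 - 1 = 7 steps
    q = np.round((x - lo) / scale).astype(np.uint8)    # codes 0..7 fit in 3 bits
    return q, scale, lo

def dequantize_3bit(q, scale, lo):
    return q * scale + lo

kv = np.random.randn(4, 128).astype(np.float16)  # stand-in for fp16 KV cache rows
q, scale, lo = quantize_3bit(kv)
recon = dequantize_3bit(q, scale, lo)

# fp16 (16 bits/value) down to 3-bit codes: roughly 5.3x before the small
# per-row scale/offset overhead
print(kv.nbytes * 8 / (kv.size * 3))
```

The design choice worth noticing: quantization stores what was already in the cache more cheaply. It can't restore anything that was never in the cache to begin with.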

The Most Interesting New Entrant

Mozilla AI launched cq — 842 stars, framed as "Stack Overflow for Agents." Cross-agent knowledge sharing via MCP and SQLite. When one agent discovers that a project uses pnpm instead of npm, it stores that finding with a confidence score. The next agent that encounters the same project retrieves it.

This is genuinely different from everything else in the landscape. Most memory tools are single-agent — they help one agent remember its own past. cq is multi-agent — it helps agents share what they've found. It's the first serious implementation of collective procedural memory.

And it's still Layer 1.

cq shares facts. "This project uses pnpm." "The build command is make all." "Tests are in /spec, not /test." Each item is a discrete finding with a confidence score and a source. What it doesn't share is how the agent arrived at that finding, what it tried first, what the failed approaches looked like, or what context made the finding meaningful.
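The mechanics are easy to picture. Here's a minimal sketch of a cq-style shared fact table in SQLite; the schema, column names, and values are hypothetical, not cq's real storage or its MCP plumbing.

```python
import sqlite3

# Hypothetical schema for a cq-style shared fact store. Note there is no
# column for HOW a finding was reached, only the finding itself.
db = sqlite3.connect(":memory:")
db.execute("""CREATE TABLE facts (
    project    TEXT,
    fact       TEXT,
    confidence REAL,
    source     TEXT
)""")

# Agent A stores a discovery.
db.execute("INSERT INTO facts VALUES (?, ?, ?, ?)",
           ("acme-web", "uses pnpm, not npm", 0.9, "agent-a"))

# Agent B later retrieves findings above a confidence threshold.
rows = db.execute(
    "SELECT fact, confidence FROM facts WHERE project = ? AND confidence >= ?",
    ("acme-web", 0.5),
).fetchall()
print(rows)  # [('uses pnpm, not npm', 0.9)]
```

Everything in that table is a noun. Nothing records the path that produced the finding.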

The distinction matters because the same fact can be helpful or misleading depending on interpretive context. "This project uses pnpm" is useful when you're installing dependencies. It's irrelevant when you're debugging a race condition. The fact is the same. What makes it useful or useless is the agent's current reasoning state — which cq doesn't capture or transmit.

There's also a security dimension. The community immediately flagged poison-pill risk: a compromised agent could inject plausible-looking false facts that propagate to every agent in the network. Confidence scoring helps but doesn't solve it — a well-crafted bad fact with high confidence is more dangerous than a low-confidence one, because it bypasses the threshold check. cq is honest about this being unsolved.

Mozilla cq is the most interesting entrant because it extends storage horizontally — across agents instead of across sessions. But horizontal extension of Layer 1 doesn't create Layer 2. It creates more Layer 1.

The Field Formalizes

There's a dedicated MemAgents Workshop at ICLR 2026, April 27 in Rio de Janeiro. Academic institutions are now running structured programs on agent memory. The field has formalized.

When a problem gets an ICLR workshop, it means the research community considers it both important and unsolved. That's validating. But look at the workshop topics: memory architectures, retrieval mechanisms, context management, knowledge persistence. Every axis is storage.

The formalization locks in the paradigm. PhD students will write dissertations on better retrieval. Benchmarks will be proposed that measure factual accuracy. Papers will be accepted that show percentage improvements over Mem0's LongMemEval scores. The incentive gradients of academic publishing — metrics that go up, baselines that are beaten — will channel the next generation of researchers toward better storage.

This is how a field gets organized around 16% of a problem and calls it 100%.

The Small Entrants

The open-source ecosystem keeps growing:

Engram (1,976 stars) — Go + SQLite + MCP. Clean implementation, lightweight. Stores and retrieves facts.
MemOS (7,900 stars) — operating-system metaphor for agent memory, layered storage management.
Mneme — structures memory into stable, task-scoped, and ephemeral tiers.
Hmem — 5-level hierarchical memory that claims a ~20 token startup cost.
AgentKeeper, Hipocampus, Context Overflow — all told, I counted at least six new tools that didn't exist three months ago.

Every one of them stores facts and retrieves them later. Some are architecturally clever. Hmem's 20-token startup is impressive engineering. Mneme's three-tier separation (stable/task/ephemeral) maps naturally to how facts decay. These are good tools solving real problems within the storage paradigm.
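The tier idea is simple enough to sketch. A hypothetical illustration of stable/task/ephemeral separation follows; the class and method names are invented, not Mneme's actual API.

```python
from dataclasses import dataclass, field

# Hypothetical sketch of tiered fact storage in the spirit of Mneme's
# stable/task/ephemeral split. Invented names, not Mneme's API.
@dataclass
class TieredMemory:
    stable: dict = field(default_factory=dict)     # long-lived project facts
    task: dict = field(default_factory=dict)       # scoped to the current task
    ephemeral: dict = field(default_factory=dict)  # scratch, dies with the session

    def remember(self, key, value, tier="task"):
        getattr(self, tier)[key] = value

    def end_task(self):
        self.task.clear()        # task-scoped facts decay with the task

    def end_session(self):
        self.end_task()
        self.ephemeral.clear()   # only stable facts survive a restart

mem = TieredMemory()
mem.remember("package_manager", "pnpm", tier="stable")
mem.remember("current_branch", "fix/race-condition", tier="task")
mem.remember("last_tool_output", "3 tests failed", tier="ephemeral")
mem.end_session()
print(mem.stable)  # {'package_manager': 'pnpm'}
print(mem.task, mem.ephemeral)  # {} {}
```

Even sketched this way, every tier holds the same kind of thing: facts, at different decay rates.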

None of them step outside it.

What Hasn't Changed

The thesis from January holds, but it's stronger now.

Three months ago, I said: "Everyone builds storage because storage is the only category that exists." I asked: "What would you build if you weren't building storage?"

Now there's $25.5 million more funding. A dozen more tools. An ICLR workshop. The first cross-agent knowledge system. And the answer to "what would you build if you weren't building storage" is still: nobody knows, because nobody's tried.

The gap isn't closing. It's formalizing. The ecosystem is organizing its institutions, its benchmarks, its funding patterns, and its academic research around Layer 1. Each new tool that enters the space takes "agent memory = stored facts" as a given and innovates within that frame.

The 84% — the reasoning, the intent, the interpretive state, the thinking that makes stored facts useful — isn't just missing from the tools. It's missing from the category. It's missing from the benchmarks. It's missing from the workshop topics. It's missing from the investor thesis.

You can't build what you can't name.

Which is why, three months in, I'm more convinced than ever that the contribution isn't a better memory tool. It's a name for the thing nobody's building. The category that would make the 84% visible, measurable, fundable, buildable.

Practices.

Not storage that accumulates. Practices that activate. Not facts that persist. Behaviors that reconstruct. Not "what happened last session" but "how to think about what's happening now."

Everyone's still building storage. The other 84% is still waiting.
