ADR 0008: Unsupported backends with obfuscated storage¶
Status¶
Accepted.
Context¶
agentgrep discovers and parses local stores written by AI coding agents. Some agents persist their conversations in a form agentgrep cannot read: the bytes are encrypted or use a proprietary, undocumented encoding with no recoverable schema.
Windsurf (Codeium Cascade) is the concrete example. Its per-session
transcripts live at ~/.codeium/windsurf/cascade/<uuid>.pb (and the
sibling implicit/, chat_state/, memories/ .pb files). The
observed payloads are multi-megabyte, high-entropy, not gzip/zlib, and
yield no extractable UTF-8 runs — they are encrypted. A reverse-engineered
.proto would not help without Codeium’s key, so the content is not
recoverable from the files alone.
The catalogue still benefits from recording where such data lives so a storage audit is complete, but agentgrep must not imply it can search content it cannot read.
Decision¶
A backend whose conversation content is obfuscated or encrypted is documented but unsupported:
Its stores are catalogued as
catalog_onlyStoreDescriptorrows — path pattern, role, format, and aschema_notesstring that states the content is unreadable and why. They carry nodiscoveryspec and no adapter, so nothing claims to parse them.The agent stays in
AgentName(so the catalogue rows are well-typed and appear inlist_storesand the storage catalogue) but is excluded fromAGENT_CHOICES, the queryagent:enum, and every search/find surface.--agent <name>andagent:<name>are rejected.User-facing coverage statements (CLI help, MCP instructions, docs lead paragraphs) do not list the agent as covered. The MCP instructions name it explicitly as catalogued-but-unsupported.
Documentation lives under a dedicated “Unsupported backends” section (
docs/backends/unsupported/) that explains the obfuscation and the documentary-only treatment.
Readable sidecar artifacts (e.g. Markdown plans or rules) that happen to
live alongside the encrypted transcripts are catalogued by location for
inventory completeness but are also catalog_only while the agent as a
whole is unsupported, so coverage is described consistently.
Consequences¶
The catalogue and list_stores remain complete — a storage audit sees
the encrypted stores and learns they exist — without agentgrep ever
returning empty or misleading “results” for them. If an agent’s format is
later opened up (a published schema, or unencrypted data), promoting it to
a supported backend is additive: add a discovery spec, an adapter, and
the agent to AGENT_CHOICES and the search surfaces.
This ADR governs classification only. It does not authorise decrypting or working around an agent’s protection; agentgrep reads what the user can already read on disk.