# agentgrep find Source: https://agentgrep.org/cli/find/ (cli-find)= # agentgrep find The `agentgrep find` command enumerates the on-disk prompt and history stores agentgrep can read — Codex session files, Claude Code JSONL transcripts, Cursor SQLite databases, Gemini history. Use it to inspect what agentgrep sees before running a search, or to feed a catalog into another tool. The flag grammar mirrors `fd`: the positional PATTERN is treated as a regex by default, with `-F` (literal), `-g` (glob), and `--exact` modifiers; `-t` filters by record kind; `-e` filters by file extension; `-l` switches to a long-format output; `-0` separates output with NUL for `xargs -0` consumers. `-g` matches against the file basename by default; `--full-path` opts into matching the absolute path (fd's `-p`). The default output is **one path per line** — the fd-faithful shape. Use `-l/--list-details` to add metadata (agent, kind, store, adapter_id) as tab-separated columns. ## Examples List every store agentgrep can read (no positional pattern needed — fd parity): ```console $ agentgrep find ``` Narrow to one agent: ```console $ agentgrep find codex ``` Filter by literal substring (the legacy default before fd alignment): ```console $ agentgrep find -F sessions ``` Glob the file basename: ```console $ agentgrep find -g '*.jsonl' ``` Glob the absolute path (fd's `-p` parity): ```console $ agentgrep find -g '*/sessions/*.jsonl' --full-path ``` Restrict to one record kind and one file extension: ```console $ agentgrep find -t prompts -e jsonl ``` Long format for column-aware downstream tools: ```console $ agentgrep find -l ``` NUL-separated output for `xargs -0`: ```console $ agentgrep find -0 | xargs -0 -n1 ls -l ``` Open the Textual explorer pre-filled with the find query: ```console $ agentgrep find -t prompts --ui ``` Silence the source-discovery spinner: ```console $ agentgrep find --no-progress codex ``` ## Live streaming `find` consumes the {ref}`library-event-stream` directly — text, NDJSON, and `--print0` output emit each path as the engine discovers it, with stdout flushed when stdout is a TTY. `--json` and `--list-details` buffer because their output shape benefits from the full record list up front. ## Command ```{eval-rst} .. argparse:: :module: agentgrep :func: build_docs_parser :prog: agentgrep :path: find :nodescription: ``` (cli-find-json-output)= ## JSON output Pass `--json` to emit a single JSON document containing every discovered store: ```console $ agentgrep find --json ``` The envelope carries a list of {class}`~agentgrep.FindRecord` entries. Each record carries an `agent` tag, the absolute path, the store kind, and discovery metadata. `--json` is the right mode when the caller wants to parse the entire catalog at once — for example a wrapping agent that decides which stores to read before issuing a `search` call. ## NDJSON output Pass `--ndjson` to stream one JSON object per line: ```console $ agentgrep find --ndjson | jq -r '.path' ``` Each line is a single {class}`~agentgrep.FindRecord`. Use this mode when piping into `jq`, into another CLI, or into a non-MCP agent that consumes the catalog incrementally. ## Filtering by agent `--agent` is repeatable and limits discovery to specific backends: ```console $ agentgrep find --agent claude --agent codex ``` Pass `--agent all` (or omit the flag) to enumerate every available backend. ## Query language `find` accepts the same Lucene-style field syntax as the other subcommands. Source-level field predicates (`agent:`, `path:`, `store:`, `mtime:`) prune sources before they're emitted: ```console $ agentgrep find agent:codex ``` ```console $ agentgrep find 'path:~/.codex agent:codex' ``` ```console $ agentgrep find 'mtime:>2026-01-01' ``` Record-level fields (`type:`, `timestamp:`, `model:`, `role:`) are accepted by the parser but don't filter find output since find emits one record per source. Use `agentgrep grep` if you need record-level filtering. See {ref}`library-query-language` for the full grammar. --- # agentgrep fuzzy Source: https://agentgrep.org/cli/fuzzy/ (cli-fuzzy)= # agentgrep fuzzy The `agentgrep fuzzy` command is a non-interactive fuzzy filter shaped like `fzf --filter`. It reads candidate lines from stdin, scores them against your query, and emits the matches in descending score order. Use it as the narrowing stage at the tail of a pipeline. Defaults follow fzf: fuzzy matching with the v2 algorithm, smart-case (case-insensitive unless the query has uppercase), extended-search syntax (`foo !bar`, `^foo`, `bar$`, `'foo`), and score-descending sort. ## Examples Narrow grep output to lines that fuzzy-match a phrase: ```console $ agentgrep grep -F . | agentgrep fuzzy 'config bliss' ``` Exact substring matching instead of fuzzy: ```console $ agentgrep fuzzy --exact -i 'design notes' < transcript.txt ``` Match in a specific tab-separated column (1-indexed): ```console $ agentgrep find -l | agentgrep fuzzy --delimiter $'\t' --nth 4 jsonl ``` Print the query as the first output line (fzf's `--print-query`): ```console $ agentgrep fuzzy --print-query design < transcript.txt ``` Open the Textual explorer pre-filled with the fuzzy query: ```console $ agentgrep fuzzy design --ui ``` ## No-input behavior When `agentgrep fuzzy` is invoked with no QUERY positional, no `-f` filter, AND no piped stdin (i.e. you typed `agentgrep fuzzy` into an interactive shell), the subcommand prints its usage and exits with status `2`. There is no interactive fzf-style TUI fallback — for interactive browsing reach for `agentgrep ui` or the `--ui` overlay on any other subcommand. ## Command ```{eval-rst} .. argparse:: :module: agentgrep :func: build_docs_parser :prog: agentgrep :path: fuzzy :nodescription: ``` ## Exit codes - `0` — at least one input line matched the filter - `1` — no lines matched - `2` — invalid usage (no QUERY and no piped stdin) ## Extended-search syntax The default `--no-extended` flag toggles fzf's extended-search grammar: - A bare token must appear in the line (substring match) - `!token` excludes lines containing `token` - `^token` anchors to the line prefix - `token$` anchors to the line suffix - `'token` forces an exact substring match (no fuzzy fallback) Tokens are whitespace-separated. A line matches when every positive token's predicate is satisfied and no negative token's predicate is. Pass `--no-extended` to treat the query as a single literal pattern. ## Field selection `--delimiter`, `--nth`, and `--with-nth` mirror fzf's field-selection model. With `--delimiter $'\t'` and `--nth 2`, only the second tab-separated field is scored. With `--with-nth 3`, only the third field is displayed even though scoring happens against the full line. Together they let you fuzzy-narrow tabular pipelines (e.g. agentgrep find `--list-details`) by column. ## NUL-delimited I/O Pass `--read0` to treat stdin as NUL-delimited (for input from `agentgrep find --print0`, `xargs -0`, etc.). Pass `--print0` to separate output records with NUL — useful when piping back into another `xargs -0` consumer or any tool that needs paths with embedded spaces. ## Interactive UI Pass `--ui` to open the {ref}`Textual explorer ` pre-filled with the fuzzy query. --- # agentgrep grep Source: https://agentgrep.org/cli/grep/ (cli-grep)= # agentgrep grep The `agentgrep grep` command searches normalized prompt and history records with the flag grammar and output behavior of `ripgrep` and `the_silver_searcher`. If you already reach for `rg -i` or `ag -F` without thinking, the same flags work here against your AI history. Defaults follow rg: smart-case (case-insensitive unless the pattern contains uppercase), regex pattern interpretation, color on TTY, and line-aware output. Each matching record emits one row per matching line. By default each row is just `path:text` (rg's default pipe shape); pass `-n` / `--line-number` to add line numbers, `--column` to add column numbers (implies `-n`), and `--vimgrep` for the `path:line:col:text` shape with one row per match span. On TTY a per-record heading line opens with agent · timestamp · path. The one deliberate divergence is session deduplication — see {ref}`cli-grep-dedupe` below. ## Examples A literal single-pattern search across every agent: ```console $ agentgrep grep bliss ``` Force case-insensitive matching: ```console $ agentgrep grep -i 'serene bliss' ``` Treat the pattern as a literal substring (not a regex): ```console $ agentgrep grep -F --type history 'v1.2.3' ``` Stream an rg-style event stream as JSON: ```console $ agentgrep grep --json design ``` Drop session dedup for the raw rg-faithful view: ```console $ agentgrep grep --no-dedupe foo ``` Open the Textual explorer pre-filled with the grep query: ```console $ agentgrep grep -i foo --ui ``` Silence the stderr spinner: ```console $ agentgrep grep --no-progress bliss ``` ## Output format By default `grep` emits one stdout line per matching line within a record, with the matched substring highlighted. The shape mirrors `rg`: - **On TTY** (heading mode, default): a per-record heading line carries `agent · timestamp · path`, then each matching line follows as `text` (or `line:text` with `-n`, or `line:col:text` with `--column`). Records are separated by a blank line. Toggle off with `--no-heading`. - **On pipe** (flat mode, default when stdout isn't a TTY): every match emits as `path:text` (rg's default), so `agentgrep grep foo | jq` or `... | awk` see one line per match. `-n` adds `:line:` after the path; `--column` adds `:col:` (implies `-n`). Toggle the heading on with `--heading`. The `--vimgrep` flag forces flat mode and emits `path:line:col:text` with one row per match span (rather than one per match line), so a line with two hits produces two rows — useful for `vim` `:cfile` and other editors that consume the `file:line:col:message` format. `--only-matching` / `-o` collapses output to just the matched substrings, one per line — the per-record heading separator is suppressed under `-o`, so the stream is exactly bare matches back-to-back (`rg -o` parity). `-l` / `--files-with-matches` emits only the deduplicated paths. `-c` emits `path:N` per matching record with the count of matching lines (or just `N` when exactly one record matched), matching `rg -c`. ## Live streaming `grep` consumes the {ref}`library-event-stream` directly — text and NDJSON output emit each match as the engine finds it, then flush so your terminal sees rows live. On a slow store the first matches appear within milliseconds, not after the whole scan finishes. The eager output modes (`--json`, `-c`, `-l`, `-L`, `-v`) buffer because their output shape needs the final tally or cross-record deduplication. ## Progress The stderr progress spinner (when stderr is a TTY) lets you know a search is still running on slow stores. Silence it with `--no-progress` or the equivalent `--progress=never`: ```console $ agentgrep grep --no-progress bliss ``` Progress always writes to stderr, so it never collides with stdout output — `agentgrep grep foo | less` won't see the spinner in the piped buffer. ## Command ```{eval-rst} .. argparse:: :module: agentgrep :func: build_docs_parser :prog: agentgrep :path: grep :nodescription: ``` ## Exit codes `agentgrep grep` follows grep's conventions: - `0` — at least one matching record was found - `1` — no matches - `2` — error during search (invalid regex, unreadable store, …) Use these in shell scripts the same way you'd use `rg`'s exit codes. (cli-grep-error-handling)= ## Error handling Invalid regex patterns are caught at the argparse layer and surfaced with the standard argparse error shape, then exit 2: ```console $ agentgrep grep '[' usage: agentgrep grep [...] agentgrep grep: error: invalid regex '[': unterminated character set at position 0 ``` The check runs before the engine starts so a malformed pattern never emits partial output and never escapes as a Python traceback. `-F` (fixed-strings) skips the check — its patterns are literal substrings, not regex. Empty patterns are also rejected at parse time (git-grep parity): ```console $ agentgrep grep '' usage: agentgrep grep [...] agentgrep grep: error: pattern cannot be empty ``` The check applies to every term — a valid pattern followed by an empty one (`agentgrep grep foo ''`) still fails. `-v` / `--invert-match` for plain text output is not yet implemented and is refused at parse time: ```console $ agentgrep grep -v bliss usage: agentgrep grep [...] agentgrep grep: error: --invert-match for text output is not yet implemented (see https://github.com/tony/agentgrep/issues/8); use -c or -L ``` The flag is still honored under `-c` (returns `0` if any record matched, `1` if none) and `-L` (lists sources with no matches), since both reduce to a "did anything match?" question that the engine's current output supports. Tracking issue: [tony/agentgrep#8](https://github.com/tony/agentgrep/issues/8). ## Files without matches `-L` / `--files-without-match` lists every planned source whose records produced no matches — the rg-style "what did I miss?" view: ```console $ agentgrep grep -L bliss ~/.codex/sessions/2026/05/b.jsonl ``` Exit code follows `rg`: `0` when at least one path is printed (the listed paths are themselves the "match" for `-L`), `1` when every planned source contains a match. (cli-grep-dedupe)= ## Session deduplication By default `grep` deduplicates matches by session so a single conversation that repeats near-identical text doesn't drown the output. This is the one place where `agentgrep grep` deliberately diverges from `rg`'s raw behavior — AI history stores often replay the same message text many times across one session, which makes the raw rg view noisier than a filesystem grep. Pass `--no-dedupe` to disable the per-session dedup and get every matching record back, exactly matching rg's "every line is its own match" convention: ```console $ agentgrep grep --no-dedupe foo ``` ## JSON output Pass `--json` to emit an rg-shaped per-line event stream: ```console $ agentgrep grep --json deploy ``` The output is a JSON document whose `events` array carries one `begin` event opening each matching record, one `match` event per matching line within that record, an `end` event closing the record, and a final `summary` event with the total match count. Each `match` event mirrors `rg`'s per-line shape: ```json {"type":"match","data":{ "path":{"text":"~/.codex/.../sample.jsonl"}, "line_number":1, "lines":{"text":"The bliss primitive ships with serene defaults"}, "submatches":[{"match":{"text":"bliss"},"start":4,"end":9}]}} ``` `submatches` carries byte offsets within the line so consumers can slice the matched substring directly. Tools written against `rg`'s JSON contract can consume agentgrep's stream with the same parser. ## NDJSON output Pass `--ndjson` for one event per line: ```console $ agentgrep grep --ndjson foo | jq 'select(.type == "match") | .data.lines.text' ``` This mode is the right pick when piping into another CLI, into `jq`, or into a non-MCP agent that consumes results incrementally. ## Interactive UI Pass `--ui` to open the {ref}`Textual explorer ` pre-filled with the grep query. ## Query language `grep` accepts the same Lucene-style field syntax `search` does — mix field predicates with text patterns inline: ```console $ agentgrep grep agent:codex bliss ``` ```console $ agentgrep grep '(agent:codex OR agent:cursor) AND deploy' ``` The text portion (`bliss`, `deploy`) feeds grep's existing line- aware matching; the field predicates (`agent:`, `path:`, `timestamp:`) prune sources and filter records around it. A query with only field predicates and no text errors out — `grep` needs text to match lines against, so steer to `agentgrep find` for source-level enumeration. See {ref}`library-query-language` for the full grammar. --- # CLI Source: https://agentgrep.org/cli/ (cli)= (cli-index)= # CLI The `agentgrep` CLI is the fastest path to your local AI agent prompt and history archives from a terminal. It wraps the same read-only discovery and parsing layer the MCP server exposes — grep, find stores, filter by agent — and lets you pipe everything through `--json` or `--ndjson` so any script or non-MCP agent can consume the results. Bare `agentgrep` (no subcommand) prints a colorized directory of choices listing every subcommand with example invocations — the same `tmuxp` / `vcspull` pattern. See {ref}`tui` for the interactive Textual explorer. ```{note} Versions before 0.1.0a5 silently rewrote `agentgrep ` as `agentgrep grep ` and bare `agentgrep` as `agentgrep ui`. Both shortcuts are gone — every subcommand must be named explicitly. `agentgrep bliss` is now an `invalid choice` error rather than a grep; reach for `agentgrep grep bliss`. ``` ```{cli-install} ``` ::::{grid} 1 2 2 3 :gutter: 2 2 3 3 :::{grid-item-card} agentgrep grep :link: grep :link-type: doc Content search with rg/ag-shaped flags, output, and exit codes. ::: :::{grid-item-card} agentgrep find :link: find :link-type: doc Enumerate on-disk stores with fd-shaped flag grammar. ::: :::{grid-item-card} agentgrep fuzzy :link: fuzzy :link-type: doc Non-interactive fuzzy match on stdin, shaped like `fzf --filter`. ::: :::{grid-item-card} API Reference :link: reference :link-type: doc CLI argument types, serialization helpers, and command entry points. ::: :::: ## --ui overlay Every search-shaped subcommand accepts `--ui`: pass it to open the {ref}`Textual explorer ` pre-filled with the same query you'd otherwise run as a one-shot. This is the `tig`-shaped overlay model — `agentgrep grep -i foo --ui` is to `agentgrep grep -i foo` what `tig log` is to `git log`. ## Use from another agent The CLI is a first-class consumer for any agent that doesn't speak MCP. Two flags govern machine-readable output: - `--json` emits a single JSON document with an `envelope` carrying the record list. Best when the caller wants to parse the whole result at once. - `--ndjson` streams one JSON object per line. Best for piping into `jq`, into another CLI, or into an agent that consumes results incrementally. Both flags work on `grep` and `find`. See [](#cli-find-json-output) for the record shapes. Agents that already speak MCP should prefer [`agentgrep-mcp`](../mcp/index.md) — same discovery and parsing surface, but exposed as MCP tools with typed schemas. ## Examples Search prompts with rg-shaped flags: ```console $ agentgrep grep bliss ``` Combine multiple patterns with an agent filter: ```console $ agentgrep grep serene bliss --agent codex ``` Stream history matches as NDJSON: ```console $ agentgrep grep prompt history --type history --ndjson ``` List stores for one agent as JSON: ```console $ agentgrep find cursor --json ``` Open the directory of choices: ```console $ agentgrep ``` ## Command: `agentgrep` ```{eval-rst} .. argparse:: :module: agentgrep :func: build_docs_parser :prog: agentgrep :nosubcommands: :nodescription: ``` ```{toctree} :hidden: grep find fuzzy reference ``` --- # API Reference Source: https://agentgrep.org/cli/reference/ (cli-reference)= # API Reference CLI argument types, serialization helpers, and command entry points. ## Argument types ```{eval-rst} .. autoclass:: agentgrep.FindArgs :members: ``` ## Serialization ```{eval-rst} .. autofunction:: agentgrep.serialize_search_record .. autofunction:: agentgrep.serialize_find_record .. autofunction:: agentgrep.serialize_source_handle .. autofunction:: agentgrep.build_envelope ``` ## Entry points ```{eval-rst} .. autofunction:: agentgrep.run_find_command .. autofunction:: agentgrep.run_ui_command .. autofunction:: agentgrep.main ``` --- # Benchmark harness Source: https://agentgrep.org/dev/benchmark/ (benchmark)= # Benchmark harness `scripts/benchmark.py` is a cross-commit benchmark runner that walks hyperfine across HEAD, trunk, a range, the last N commits, an explicit list of tags / SHAs, or just HEAD vs. trunk. It surfaces uniform cold-start regressions that are easy to miss by eyeballing — promoting the one-off shell-script prototype to a first-class repo tool turns "where did `grep` slow down?" into a one-liner. The harness is a **development-only tool** — it isn't shipped with the agentgrep wheel and isn't installed by `pip install agentgrep`. It lives in `scripts/` alongside `scripts/mcp_swap.py` and runs straight from a repo checkout via `uv run`. The script is a **PEP 723 self-contained file** — its third-party deps (typer / rich / pydantic) resolve through `uv run`'s transient venv, so the harness is portable across any uv-managed project. Drop in your own `scripts/benchmark.toml` and it works. ## Invocation recipes Every subcommand is reachable via `uv run scripts/benchmark.py`: ```console $ uv run scripts/benchmark.py run --target HEAD ``` Single-commit bench against the configured `[bench.*]` entries. ```console $ uv run scripts/benchmark.py run --target trunk ``` Single-commit bench against the trunk ref (default `master`, override in `[settings].trunk`). ```console $ uv run scripts/benchmark.py run --head-vs-trunk ``` HEAD versus trunk — the most common "did my branch regress?" shape. ```console $ uv run scripts/benchmark.py run --range master..HEAD ``` Walk every commit on the current branch since it diverged from trunk. ```console $ uv run scripts/benchmark.py run --lookback 10 ``` The last 10 commits from HEAD, oldest first. ```console $ uv run scripts/benchmark.py run --from-trunk-back 5 ``` Trunk and the five commits before it — useful for establishing a baseline that pre-dates a refactor. ```console $ uv run scripts/benchmark.py run --tags ``` Every git tag, sorted by `--sort=v:refname`. ```console $ uv run scripts/benchmark.py run --commits abc1234,def5678 ``` Explicit list of refs (SHAs, tags, or branches). ```console $ uv run scripts/benchmark.py compare HEAD master ``` Sugar for `run --commits HEAD,master`. ## Discovery subcommands ```console $ uv run scripts/benchmark.py list-commits --lookback 10 ``` Print the commits a selector would resolve to, without running any benchmarks — handy for sanity-checking `--range` and `--lookback` arguments before committing to a long run. ```console $ uv run scripts/benchmark.py list-commands ``` Print every configured `[bench.X]` entry after applying the TOML layers — useful when iterating on `benchmark.local.toml`. ```console $ uv run scripts/benchmark.py show-config ``` Dump the post-layering config as JSON, the closest thing to a "what will actually run" inspection. ## Output formats `--format` selects the renderer; the default is a rich terminal table: | Format | Use case | |---|---| | `rich` | Interactive terminal viewing (default) | | `json` | Single JSON document with raw samples preserved | | `ndjson` | One JSON object per line — pipe-friendly for `jq` | | `md` | Markdown tables (mirrors the prototype `performance.md`) | | `csv` | Flat row-per-measurement; raw samples joined by `;` | ```console $ uv run scripts/benchmark.py run --lookback 50 --format md --output performance.md ``` `--show-percentiles` picks the subset of stat labels rendered in the visible cells: ```console $ uv run scripts/benchmark.py run --target HEAD --show-percentiles min,avg,p90,max ``` Accepted labels: `min`, `max`, `avg`, and any `p` (e.g. `p50`, `p90`, `p95`, `p99`). ## Config layering Four layers compose into the effective config, in this precedence order (each layer overlays the previous): 1. **Built-in pydantic defaults** — `runs = 3`, `warmup = 1`, `trunk = "master"`, etc. 2. **`scripts/benchmark.toml`** — the committed defaults shipped with the repo. Defines `[bench.grep]`, `[bench.find]`, `[bench.search]`, `[bench.import-time]` for agentgrep against `libtmux`. 3. **`scripts/benchmark.local.toml`** — per-machine overrides (gitignored). Copy `scripts/benchmark.local.toml.example` to start. 4. **CLI flags** — `--runs N`, `--warmup N`, `--query STR`, `--commands grep,find` always trump the TOML. Deep-merge semantics: only the keys you set in a higher layer are replaced. So adding `[bench.fuzzy]` in `benchmark.local.toml` extends the bench set without disturbing the existing entries. ## Templating Command strings in `[bench.X].command` support these placeholders: | Token | Value | |---|---| | `{venv}` | `[settings].venv` (default `.venv`) | | `{query}` | per-bench `default_query`, or `--query STR` | | `{sha}` | full commit SHA being benchmarked | | `{short_sha}` | first seven chars | | `{repo}` | repo root absolute path | Unknown placeholders raise — the harness fails loud rather than emit a broken command. ## Safety - **Dirty-tree refusal.** The script aborts on uncommitted changes unless `--allow-dirty` is passed. Per-commit benchmarks shred the worktree via `git checkout`; running them on top of uncommitted work would silently roll your edits forward into the wrong commit. - **HEAD restore on exit.** An `atexit` hook plus SIGINT / SIGTERM trap restores the original ref (branch name or HEAD SHA) on every exit path. Pass `--keep-checkout` to disable the trap when you want to iterate on a checked-out commit after the run. - **Per-commit isolation.** Each iteration runs `git checkout` + (by default) `uv sync` to rebuild `.venv` against the current source. `--no-sync` skips the sync if the wheel happens to be ABI-compatible across the range you're testing. ## When to skip the harness The harness is a developer tool, not a CI gate. It expects: - A clean worktree (or `--allow-dirty`). - A repo where every commit in the range builds — failed `uv sync` marks the row `sync_fail` and moves on, but a wide swath of unbuildable history makes the output less useful. - Hyperfine on PATH for the default timing path. The pure-Python fallback (`--no-hyperfine`, or simply not having hyperfine installed) is portable but produces noisier samples. --- # Development Source: https://agentgrep.org/dev/ (development)= # Development Contributing and internals references — material for people poking at agentgrep's source, writing new adapters, or running benchmarks across the commit history. None of this ships in the published wheel. ::::{grid} 1 1 2 2 :gutter: 2 :::{grid-item-card} Benchmark harness :link: benchmark :link-type: doc Cross-commit `hyperfine` sweeps across HEAD, trunk, ranges, lookback, tags, or explicit commit lists. PEP 723 self-contained script in `scripts/benchmark.py`. ::: :::{grid-item-card} Storage catalogue :link: storage-catalog :link-type: doc On-disk store layouts for Codex, Claude Code, Cursor, and Gemini CLI — useful for adapter authors and anyone tracing why a record was or wasn't found. ::: :::: ```{toctree} :hidden: benchmark storage-catalog ``` --- # Storage catalogue Source: https://agentgrep.org/dev/storage-catalog/ (storage-catalog)= # Storage catalogue agentgrep keeps an explicit catalogue of every on-disk store it knows about, modelled as Pydantic {class}`~agentgrep.stores.StoreDescriptor` rows aggregated under one {class}`~agentgrep.stores.StoreCatalog`. The catalogue is **descriptive**: it documents *where* each agent's data lives and *what* the records look like. Search-policy decisions — whether agentgrep actually opens a particular store by default — are captured per-row and may be deferred (`search_by_default=None`) when no adapter consumes them yet. The catalogue is the single source of truth that downstream adapters consume. When upstream renames a path or changes a record shape, the fix is to update one {class}`~agentgrep.stores.StoreDescriptor` and bump the catalogue version; adapters pick the new metadata up automatically. ```{contents} :local: :depth: 2 ``` ## Why a catalogue Three reasons we did not bake paths into the adapters: 1. **Provenance.** Each row carries an `observed_version` and `observed_at` stamp. A reader can tell at a glance whether the schema notes are still current or stale. 2. **Drift.** Codex renames `history.jsonl`, Cursor adds a CLI agent layout, Gemini reorganises its `tmp/` tree. With paths catalogued centrally, those changes diff cleanly in code review. 3. **Overlap.** Several stores live in adjacent paths but play different roles — Codex `history.jsonl` (user prompts only) vs. `sessions/*.jsonl` (full per-thread transcripts); Gemini `tmp//chats/` (live) vs. `history//` (post-retention archive). The `distinguishes_from` field on each descriptor names the sibling and explains the difference. ## Reading a descriptor ```python from agentgrep.store_catalog import CATALOG claude_session = CATALOG.by_id("claude.projects.session") claude_session.path_pattern # '${HOME}/.claude/projects//.jsonl' ``` Path patterns use `${HOME}` and `${}` tokens; resolving them against a concrete environment is the consumer's job, so the catalogue stays portable. `env_overrides` lists the env vars that change the root (Codex respects `CODEX_HOME`; Gemini respects `GEMINI_CLI_HOME`). ## Stores by agent ### Claude Code `observed_version`: ``claude-code v2.1.143`` (2026-05-15). Claude's primary chat record lives at `${HOME}/.claude/projects//.jsonl`. The file format is JSONL with multiple record types per line — `type: "user"`, `type: "assistant"`, `type: "attachment"`, `type: "permission-mode"`. Sub-agent dispatches nest under `/subagents/`. The auto-memory feature stores markdown notes under `/memory/`. ### Cursor Two distinct surfaces, both catalogued and both searched: - **Cursor CLI agent** (`cursor-agent`): transcripts live at `${HOME}/.cursor/projects//agent-transcripts//.jsonl` and are parsed by `cursor.cli_jsonl.v1`. Records are Anthropic-style `{role, message.content[]}` with `text` and `tool_use` content blocks; tool outputs are sometimes `[REDACTED]` in older `cursor-agent` builds. There is no native per-turn timestamp, so agentgrep backfills the file's mtime. - **Cursor IDE**: parsed by `cursor.state_vscdb_modern.v1` / `cursor.state_vscdb_legacy.v1` via `state.vscdb` (SQLite). The catalogue keeps the IDE path separate from the CLI agent so the two never collide. `cursor.cli.worktrees` is catalogued explicitly with `role=SOURCE_TREE` and `search_by_default=False` so the adapter does not index multi-gigabyte git working trees as chat history. ### Codex `observed_version`: ``github.com/openai/codex@4c89772`` (2026-05-16). Schemas are pinned directly to the upstream Rust types: - {attr}`~agentgrep.stores.StoreFormat.JSONL` `history.jsonl` → `HistoryEntry { session_id: String, ts: u64, text: String }` ([`codex-rs/message-history/src/lib.rs:54-58`](https://github.com/openai/codex/blob/4c89772/codex-rs/message-history/src/lib.rs#L54)). - Per-thread `sessions/YYYY/MM/DD/rollout-…jsonl` → tagged enum `RolloutItem` with variants `SessionMeta`, `ResponseItem`, `Compacted`, `TurnContext`, `EventMsg` ([`codex-rs/protocol/src/protocol.rs:2783`](https://github.com/openai/codex/blob/4c89772/codex-rs/protocol/src/protocol.rs#L2783)). The two `_N.sqlite` files at the Codex root — `state_5.sqlite` and `logs_2.sqlite` — belong to the Codex CLI. Their filenames come from `STATE_DB_FILENAME` and `LOGS_DB_FILENAME` in [`codex-rs/state/src/lib.rs`](https://github.com/openai/codex/blob/4c89772/codex-rs/state/src/lib.rs#L70-L71). ### Gemini CLI `observed_version`: ``gemini-cli v0.42.0`` stable (2026-05-12); types from `v0.44.0-nightly` HEAD `77e65c0d`. Three adapters cover the three on-disk shapes: - `gemini.tmp_chats_jsonl.v1` parses `tmp//chats/session-*.jsonl`. Each file opens with a `SessionMetadataRecord` (`sessionId`, `projectHash`, `startTime`, `lastUpdated`, `kind`); subsequent lines are `MessageRecord` turns interleaved with `MetadataUpdateRecord` updates (`{$set: {…}}`). Real files surface `type` values `user` and `gemini`; upstream types also declare `info`/`error`/`warning` plus `RewindRecord` and `PartialMetadataRecord`, but those records did not appear in sampling. `gemini`-typed turns whose `content` is empty have their searchable text drawn from `thoughts[*].subject`/ `description` and `toolCalls[*].name`/`description`, joined into one record per turn. - `gemini.tmp_chats_legacy_json.v1` parses pre-Feb 2026 `tmp//chats/session-*.json` single-file sessions. Upstream still reads this shape via the `isLegacyRecord` discriminator at [`chatRecordingService.ts:941`](https://github.com/google-gemini/gemini-cli/blob/77e65c0d/packages/core/src/services/chatRecordingService.ts#L941); the legacy file holds session metadata at the top level and the full conversation under a `messages` array. - `gemini.tmp_logs_json.v1` parses `tmp//logs.json` — a flat JSON array of `LogEntry` records (user-prompt audit log). Gemini's [`sessionCleanup.ts`](https://github.com/google-gemini/gemini-cli/blob/77e65c0d/packages/cli/src/utils/sessionCleanup.ts) hard-deletes expired sessions via `fs.unlink()` — there is no `history/` archive. The Antigravity files some installs carry under `~/.gemini/antigravity/conversations/` are written by the [Antigravity IDE](https://github.com/google-gemini/gemini-cli/blob/77e65c0d/packages/core/src/ide/detect-ide.ts), a separate Google product — Gemini CLI only detects Antigravity as an IDE launcher and does not read or write the protobuf conversation files. Both stores are out of scope for the Gemini adapters. The `project_hash` is `sha256(absolute_project_root)`. agentgrep exposes a Python mirror via {func}`~agentgrep.store_catalog.gemini_project_hash` so the CLI can answer "which Gemini sessions belong to *this* repo?". ## Adding or updating a store 1. Edit `src/agentgrep/store_catalog.py`. Stamp `observed_version` and `observed_at` against the version you actually inspected. 2. Add an `upstream_ref` (preferred) or a `sample_record` so future readers can verify the schema. 3. If the new store overlaps a sibling, name it in `distinguishes_from` and explain the difference in `schema_notes`. 4. Capture a redacted fixture under `tests/samples///`. 5. Bump `catalog_version` in the same commit that changes descriptor shape. 6. Run `uv run pytest tests/test_stores.py`. ## See also - {mod}`agentgrep.stores` — model definitions - {mod}`agentgrep.store_catalog` — concrete registry --- # MCP Clients Source: https://agentgrep.org/getting-started/clients/ (clients)= # MCP Clients agentgrep exposes a local stdio MCP server. Any MCP client that can launch a command can run it. ## Codex CLI From a development checkout: ```toml [mcp_servers.agentgrep] command = "uv" args = ["run", "agentgrep-mcp"] cwd = "/path/to/agentgrep" ``` For an installed package: ```toml [mcp_servers.agentgrep] command = "agentgrep-mcp" ``` ## Claude Code From a development checkout: ```console $ claude mcp add agentgrep --cwd /path/to/agentgrep -- uv run agentgrep-mcp ``` For an installed package: ```console $ claude mcp add agentgrep -- agentgrep-mcp ``` ## Claude Desktop and Cursor Use a JSON `mcpServers` entry: ```json { "mcpServers": { "agentgrep": { "command": "uv", "args": ["run", "agentgrep-mcp"], "cwd": "/path/to/agentgrep" } } } ``` For an installed package: ```json { "mcpServers": { "agentgrep": { "command": "agentgrep-mcp" } } } ``` ## FastMCP The repository includes `fastmcp.json`: ```console $ uv run fastmcp run fastmcp.json ``` Inspect the server surface: ```console $ uv run fastmcp inspect fastmcp.json ``` --- # Configuration Source: https://agentgrep.org/getting-started/configuration/ (configuration)= # Configuration agentgrep is intentionally low-configuration. It reads known local agent stores under the current user's home directory and never mutates them. ## Agent selection Use `--agent` one or more times to limit search or discovery: ```console $ uv run agentgrep grep "cache" --agent codex ``` Supported agents are `codex`, `claude`, `cursor`, and `gemini`. Omitting `--agent` searches all supported agents. ## Search type Use `--type` to choose records: ```console $ uv run agentgrep grep "docs deploy" --type prompts ``` Allowed values are `prompts`, `history`, and `all`. ## Output Text output is optimized for terminal reading: ```console $ uv run agentgrep grep "release" ``` Use JSON or NDJSON for scripts: ```console $ uv run agentgrep grep "release" --json ``` ```console $ uv run agentgrep grep "release" --ndjson ``` ## Progress and early answers Human text searches show progress by default. Press Enter on a blank line to return the matches collected so far. ```console $ uv run agentgrep grep "bliss" --progress always ``` Disable progress when scripting: ```console $ uv run agentgrep grep "bliss" --progress never ``` ## Privacy Serialized paths are protected before leaving the process. Home-relative paths are displayed as `~/...`, and directory paths keep a trailing `/`, for example `~/.codex/sessions/`. ## MCP capabilities MCP clients can read `agentgrep://capabilities` to inspect supported agents, adapters, tools, resources, prompts, and selected optional backends. --- # Getting Started Source: https://agentgrep.org/getting-started/ (getting-started)= (quickstart)= # Getting Started One path from a checkout to a useful search result. ## 1. Install dependencies From the repository root: ```console $ uv sync --all-groups ``` ## 2. Search local agent history Search all supported stores: ```console $ uv run agentgrep grep "release notes" ``` Search one agent's prompt records: ```console $ uv run agentgrep grep "deploy docs" --agent codex --type prompts ``` ## 3. Inspect the stores See which files and databases agentgrep can read: ```console $ uv run agentgrep find ``` Filter discovery output: ```console $ uv run agentgrep find sessions --agent codex ``` ## 4. Use MCP Run the local stdio server: ```console $ uv run agentgrep-mcp ``` Or run the FastMCP config: ```console $ uv run fastmcp run fastmcp.json ``` See {ref}`clients` for MCP client snippets. ## Next steps - {doc}`../library/tutorial` walks through CLI search in more detail. - {doc}`../mcp/tools` documents the MCP tool payloads. - {doc}`configuration` explains output, progress, privacy, and source selection. ```{toctree} :hidden: installation clients configuration ``` --- # Installation Source: https://agentgrep.org/getting-started/installation/ (installation)= # Installation ## Requirements - Python 3.14 - [uv](https://docs.astral.sh/uv/) for the development workflow - Optional command-line backends: `fd`, `rg`, `ag`, and `jq` agentgrep falls back to Python implementations when optional read-only command-line backends are unavailable. ## Development install ```console $ git clone https://github.com/tony/agentgrep.git ``` ```console $ cd agentgrep ``` ```console $ uv sync --all-groups ``` Run the CLI from the checkout: ```console $ uv run agentgrep grep "bliss" ``` Run the MCP server: ```console $ uv run agentgrep-mcp ``` ## Package install When published, install the package into an environment that can read your local agent stores: ```console $ uv pip install agentgrep ``` or: ```console $ pip install agentgrep ``` ## Optional tools agentgrep detects available read-only helpers at runtime: - `fd` for source discovery - `rg` or `ag` for prefiltering text sources - `jq` for JSON string flattening The selected helpers are reported through the MCP capabilities resource. --- # Changelog Source: https://agentgrep.org/history/ (changes)= (changelog)= (history)= ```{include} ../CHANGES ``` --- # agentgrep Source: https://agentgrep.org/ (index)= # agentgrep Read-only search for local AI agent prompts and history across Codex, Claude Code, Cursor, and Gemini. ```{warning} **Pre-alpha.** APIs may change. [Feedback welcome](https://github.com/tony/agentgrep/issues). ``` ```{cli-install} :variant: compact ``` ```{mcp-install} :variant: compact ``` ::::{grid} 1 1 2 3 :gutter: 2 2 3 3 :::{grid-item-card} Quickstart :link: getting-started/index :link-type: doc Run a first search and inspect the result shape. ::: :::{grid-item-card} CLI :link: cli/index :link-type: doc Search and find from the terminal. Pipe `--json` / `--ndjson` for scripts and agents. ::: :::{grid-item-card} TUI :link: tui/index :link-type: doc Interactive Textual explorer for browsing prompts and history. ::: :::{grid-item-card} MCP :link: mcp/index :link-type: doc Tools, resources, and prompts for MCP clients. ::: :::{grid-item-card} Library :link: library/index :link-type: doc Tutorial, how-to, reference, and examples for the Python library. ::: :::{grid-item-card} Client Setup :link: getting-started/clients :link-type: doc Config snippets for local MCP clients. ::: :::{grid-item-card} Configuration :link: getting-started/configuration :link-type: doc Search behavior, privacy, output, and progress controls. ::: :::: ## What you can do ### Prompt Search Find full prompt and history records by literal term or regular expression. search ### Discovery List the stores, session files, and SQLite databases that agentgrep can read. find ### MCP guidance Use prompts for common agent workflows: {ref}`fastmcp-prompt-search-prompts` · {ref}`fastmcp-prompt-search-history` · {ref}`fastmcp-prompt-inspect-stores` ```{toctree} :hidden: getting-started/index cli/index tui/index library/index mcp/index dev/index history GitHub ``` --- # Event-stream engine Source: https://agentgrep.org/library/event-stream/ (library-event-stream)= # Event-stream engine agentgrep's search and find engines produce **typed event streams** — sync generators that yield pydantic discriminated-union events as they walk the user's stores. The same producer feeds the CLI's live output path, the Textual TUI's worker, and the MCP server's response collector. Three frontends, one engine. ## Why a stream A short scan completes before the user notices. A long one — broad patterns, deep history, slow stores — can take seconds. The legacy list-return path (`agentgrep.run_search_query`) buffers every match until the scan finishes, then returns the list. That hides the engine's progress from the consumer and forces a "wait, then dump" UX in the CLI. The event stream solves both: - **Per-record liveness.** Each match emits as {class}`~agentgrep.events.RecordEmitted` the moment the engine decides "unique-and-included." The CLI grep / find text paths consume the stream and print + flush per record; users see the first matches within milliseconds. - **Single source of truth.** Search progress (which source is active, how many records seen / matched) and the matches themselves are the same event stream, not two parallel side channels. - **Decoupling.** The engine doesn't know about stdout, Textual, or fastmcp. It yields events. Consumers translate. ## Architecture ``` ┌────────────────────────────────────────────────────────────────┐ │ PRODUCER (agentgrep._engine) │ │ │ │ def iter_search_events(home, query, *, control=None) │ │ -> Iterator[SearchEvent]: │ │ │ │ yield SearchStarted(source_count=...) │ │ for source in discovered: │ │ yield SourceStarted(adapter_id, index, total) │ │ for record in iter_source_records(source): │ │ if matches(record, query) and unique: │ │ yield RecordEmitted(record=record) │ │ yield SourceFinished(adapter_id, records_seen, ...) │ │ yield SearchFinished(match_count, elapsed_seconds) │ └──────────────────────────┬─────────────────────────────────────┘ │ ┌───────────────────┼───────────────────┐ ▼ ▼ ▼ ┌──────────────┐ ┌──────────────────┐ ┌──────────────────┐ │ CLI (sync) │ │ TUI (Textual) │ │ MCP (sync) │ │ │ │ │ │ │ │ for ev in │ │ @work(thread= │ │ list(records │ │ stream: │ │ True) consumes │ │ for ev in │ │ if Record │ │ via to_thread │ │ stream if │ │ print() │ │ │ │ isinstance │ │ flush() │ │ │ │ RecordEmitted) │ └──────────────┘ └──────────────────┘ └──────────────────┘ ``` ### Sync producer The engine is a synchronous generator. Async consumers wrap it in {func}`asyncio.to_thread` with one line; sync consumers iterate directly. Tests exercise the producer without an event loop, which keeps the test surface small. ### Pydantic events Events are frozen `pydantic.BaseModel` subclasses tagged with a `Literal["..."]` discriminator field. The union types {data}`~agentgrep.events.SearchEvent` and {data}`~agentgrep.events.FindEvent` carry `pydantic.Field(discriminator="type")` so runtime validation routes each payload to the correct variant and `isinstance` narrowing works in consumer loops. Events embed agentgrep's existing {class}`~agentgrep.SearchRecord` / {class}`~agentgrep.FindRecord` dataclasses directly via `arbitrary_types_allowed=True`. Consumers read record attributes without an extra conversion step. Transport- layer consumers (a future HTTP SSE endpoint, for example) should serialise records through {class}`~agentgrep.mcp.models.SearchRecordModel` / {class}`~agentgrep.mcp.models.FindRecordModel` at the boundary so the dataclass-typed field doesn't block `model_dump_json()`. ## Search events The {data}`~agentgrep.events.SearchEvent` union has five members. Their guaranteed sequence: ``` SearchStarted → (SourceStarted → RecordEmitted* → SourceFinished)* → SearchFinished ``` - {class}`~agentgrep.events.SearchStarted` — exactly once at the head. Carries `source_count` (the number of candidate sources after prefiltering). - {class}`~agentgrep.events.SourceStarted` — once per source, in source-discovery order (mtime descending). Carries `adapter_id`, `index`, `total`. - {class}`~agentgrep.events.RecordEmitted` — the hot-path event. Fires only after the per-session dedup decided unique-and-included. - {class}`~agentgrep.events.SourceFinished` — once per source, paired with its `SourceStarted`. Carries `records_seen` (every record parsed) and `matches_seen` (the subset that matched before dedup). - {class}`~agentgrep.events.SearchFinished` — exactly once at the tail. Carries `match_count` (total emitted) and `elapsed_seconds`. Even on empty input the `Started` / `Finished` envelope fires so cleanup code is uniform. ## Find events The {data}`~agentgrep.events.FindEvent` union has three members. Find has no per-source scan loop — each discovered source produces exactly one record — so the sequence simplifies: ``` FindStarted → FindRecordEmitted* → FindFinished ``` - {class}`~agentgrep.events.FindStarted` - {class}`~agentgrep.events.FindRecordEmitted` - {class}`~agentgrep.events.FindFinished` ## Consumer recipes ### Print records as they arrive (the CLI pattern) ```python import sys import agentgrep from agentgrep import events def stream_to_stdout(home, query) -> int: is_tty = sys.stdout.isatty() count = 0 for event in agentgrep.iter_search_events(home, query): if isinstance(event, events.RecordEmitted): print(event.record.text) if is_tty: sys.stdout.flush() count += 1 return 0 if count > 0 else 1 ``` ### Collect to a list (the MCP / TUI snapshot pattern) ```python import agentgrep from agentgrep import events def collect_records(home, query): return [ event.record for event in agentgrep.iter_search_events(home, query) if isinstance(event, events.RecordEmitted) ] ``` ### Update a UI as events arrive (the Textual TUI pattern) ```python import asyncio import agentgrep from agentgrep import events async def update_ui(home, query, render_record): def _drain() -> list[events.SearchEvent]: return list(agentgrep.iter_search_events(home, query)) for event in await asyncio.to_thread(_drain): if isinstance(event, events.RecordEmitted): render_record(event.record) ``` For finer-grained live updates inside Textual, run the generator on a `@work(thread=True)`-decorated method and post a message per event rather than draining first. ### Cancel mid-scan Pass a {class}`~agentgrep.SearchControl` and flip its `request_answer_now()` flag to break out at the next per-record boundary: ```python control = agentgrep.SearchControl() # … on a keypress / timeout / user action: control.request_answer_now() ``` The generator still emits `SearchFinished` so cleanup runs. ## Slice boundaries This page documents Slice 1 — the sync iterator surface used by the CLI's live streaming. Two follow-up slices are planned: - **Slice 2**: an `aiter_search_events` async wrapper that bridges the sync producer via a bounded `asyncio.Queue` and a thread- backed producer task. Cancellation propagates through `CancelledError`. The TUI moves to the async surface; the CLI keeps using the sync iterator. - **Slice 3**: source-level parallelism via `asyncio.TaskGroup` over `asyncio.to_thread(parse_source, src)`. Each source's events merge into a single output stream via the queue. Cancellation propagates through task cancel. Both slices preserve the public event surface — consumers written today continue to work without changes. ## Reference The events module's full API is documented at {mod}`agentgrep.events`. The iterators are at {func}`agentgrep.iter_search_events` and {func}`agentgrep.iter_find_events`. --- # Examples Source: https://agentgrep.org/library/examples/ (package-agentgrep-examples)= # Examples ## CLI search ```console $ uv run agentgrep grep "database migration" --agent codex --limit 10 ``` ## CLI discovery ```console $ uv run agentgrep find cursor --agent cursor --json ``` ## MCP search call ```json { "tool": "search", "arguments": { "terms": ["database migration"], "agent": "codex", "search_type": "prompts", "limit": 10 } } ``` ## MCP discovery call ```json { "tool": "find", "arguments": { "pattern": "sessions", "agent": "codex", "limit": 50 } } ``` ## MCP resource reads ```text agentgrep://capabilities agentgrep://sources agentgrep://sources/codex ``` --- # How to Source: https://agentgrep.org/library/how-to/ (package-agentgrep-how-to)= # How to ## Find stores before searching ```console $ uv run agentgrep find ``` Filter discovery to Codex session files: ```console $ uv run agentgrep find sessions --agent codex ``` ## Cap result count ```console $ uv run agentgrep grep "migration" --limit 5 ``` ## Answer before the scan finishes Interactive text searches show a progress line. Press Enter on a blank line to stop scanning and print the matches collected so far. ```console $ uv run agentgrep grep "bliss" ``` ## Keep scripts quiet Use structured output and disable progress: ```console $ uv run agentgrep grep "release" --json --progress never ``` Progress, when enabled for JSON or NDJSON output, is written to stderr only. ## Use MCP from a client Connect the `agentgrep` MCP server, then ask the client to search local agent history. The client can call: - {tool}`search` for full records - {tool}`find` for store discovery - `agentgrep://capabilities` for server metadata --- # Library Source: https://agentgrep.org/library/ (library)= # Library Use `agentgrep` as a Python library from your own scripts and tools. The same search, discovery, parsing, serialization, and path-privacy layer powers the terminal CLI and the MCP server, so anything you can do from the command line you can drive directly in code. ## Install Pick an install method below — the snippet copies straight into your terminal, and the runnable quickstart mirrors the same search query you'd run from the CLI. ```{library-install} ``` ::::{grid} 1 1 2 2 :gutter: 2 :::{grid-item-card} Tutorial :link: tutorial :link-type: doc Run the CLI from first search to structured output. ::: :::{grid-item-card} How to :link: how-to :link-type: doc Common workflows for search, discovery, progress, and scripting. ::: :::{grid-item-card} API Reference :link: reference :link-type: doc Core data types, discovery, search pipeline, and progress APIs. ::: :::{grid-item-card} Examples :link: examples :link-type: doc CLI and MCP request examples. ::: :::: ```{toctree} :hidden: tutorial how-to event-stream query-language reference examples ``` --- # Query language Source: https://agentgrep.org/library/query-language/ (library-query-language)= # Query language `agentgrep grep`, `agentgrep grep`, and `agentgrep find` accept a Lucene-style query language for inline field predicates, boolean composition, and date ranges. The same syntax works across all three subcommands; each interprets the predicates against its natural record shape. The query language is **opt-in**: a bare positional like `agentgrep grep bliss` keeps the legacy fast path with zero overhead. Detection is a single-character scan for `:` in the positional tokens — if absent, the query module is never loaded. ## Grammar ``` query := disjunction disjunction := conjunction ("OR" conjunction)* conjunction := negation ("AND"? negation)* negation := ("NOT" | "-" | "+")? primary primary := group | field-expr | term group := "(" disjunction ")" field-expr := IDENT ":" field-value field-value := comparison | range | exact-value comparison := (">" | "<" | ">=" | "<=") TERM range := "[" TERM "TO" TERM "]" ; inclusive | "{" TERM "TO" TERM "}" ; exclusive exact-value := TERM term := TERM ``` Implicit AND between bare terms is preserved: `agentgrep grep foo bar` matches records containing both `foo` and `bar`. Explicit `AND` / `OR` / `NOT` are case-insensitive and must be whole words. The sigils `-` and `+` are shortcuts for `NOT` and "required" respectively. `+` is currently a no-op (implicit AND already requires every term); it's accepted for rg compatibility. ## Field registry The default registry ships ten fields, split across two evaluation layers: ### Source-level fields These can be decided from a `SourceHandle` alone, so source-level predicates prune sources before any file is opened. | Field | Kind | Notes | |---|---|---| | `agent` | enum | One of `codex`, `claude`, `cursor`, `gemini` | | `store` | string | Substring against the source's store name | | `adapter_id` | string | Substring; alias `adapter` | | `path` | path | Glob (with `*` / `?` / `[…]`) or substring | | `mtime` | date | Source-file mtime; supports `>`/`<`/`>=`/`<=` and `[a TO b]` | ### Record-level fields These need the parsed record, so they filter after the source predicate has admitted the source. | Field | Kind | Notes | |---|---|---| | `type` | enum | One of `prompts`, `history` | | `timestamp` | date | Record timestamp; supports comparison + range; alias `date` | | `model` | string | Substring against `record.model` | | `role` | string | Substring against `record.role` | | `text` | string | Substring; implicit field for bare positional terms | Unknown field names error at parse time with a clean message listing the registered fields. ## Date literals The `mtime` and `timestamp` fields accept three forms: - **ISO 8601**: `2026-05-22`, `2026-05`, `2026`, `2026-05-22T14:30:00`, `2026-05-22T14:30:00Z`, `2026-05-22T14:30:00+02:00`. - **Relative**: `today`, `yesterday`, `tomorrow`, `Nd`, `Nw`, `Nm`, `Ny` (with optional trailing ` ago`), `N(d|w|m|y) from now`. Month ≈ 30 days, year ≈ 365 days. - **Unbounded marker**: the literal `*` inside a range (`field:[* TO 2026-05]`). Bare-day equality expands to a half-open 24-hour range; bare-month to the calendar month; bare-year to the calendar year. Exact-time literals (`2026-05-22T14:30:00`) match the precise instant. ## Two-layer filtering The compiler classifies each predicate into a source-layer pass and a record-layer pass. Source-layer predicates prune sources before any file is opened; record-layer predicates filter parsed records afterward. For boolean composition: - **AND of any layers**: source-layer children prune; record-layer children filter. Each layer evaluates its own children. - **OR of same-layer children**: the OR runs cleanly at that layer. - **OR mixing source-level and record-level**: the source pass uses three-valued logic and conservatively lets the source through (the record pass decides). One OR-mixed query is the only perf cliff in the design. - **NOT** propagates per layer; a `NOT` over a mixed subtree falls back to record-only evaluation, same as OR-mixed. ## Examples ```console $ agentgrep grep agent:codex bliss ``` Records from codex matching "bliss". Claude / cursor / gemini sources are never opened. ```console $ agentgrep grep '(agent:codex OR agent:cursor) AND deploy' ``` Records from either codex or cursor mentioning "deploy". Claude / gemini are pruned at source level. ```console $ agentgrep grep '-agent:claude bliss' ``` Records from anyone except claude that mention "bliss". ```console $ agentgrep grep 'timestamp:>2026-01-01 bliss' ``` Records after 2026-01-01 mentioning "bliss". The timestamp filter runs at the record layer. ```console $ agentgrep grep 'timestamp:[2026-01 TO 2026-03] model:claude' ``` Records in Q1 2026 from any claude-* model. ```console $ agentgrep find path:~/.codex agent:codex ``` Codex-agent sources under `~/.codex/`. ```console $ agentgrep grep agent:codex bliss ``` Grep over codex records for "bliss" — same line-aware output as plain `agentgrep grep bliss`, but with the codex prefilter. ## Flag / field collisions `agentgrep` rejects ambiguous combinations of CLI flags and inline field predicates: ```console $ agentgrep grep --agent codex agent:claude bliss agentgrep grep: error: cannot combine --agent flag with agent: field predicate; pick one syntax ``` Currently checked: `--agent` × `agent:`, `--type` × `type:`. Other flags don't yet have query-field counterparts. ## Performance When the positionals contain no `:`, the query module is never imported and zero work is added — the legacy fast path runs exactly as before. When the syntax is used: - **Parse + compile** is sub-millisecond for typical queries. - **Source pruning** is O(predicates) per `SourceHandle`. Pruning saves multiple seconds on multi-thousand-file trees when a single field rules out most sources. - **Record filtering** runs in the existing per-record hot loop and short-circuits as soon as a child predicate fails. The net effect on records that pass is sub-5% overhead; rejected records save time vs. the legacy path because no haystack is built. The one perf cliff is **OR-mixed**: an OR that straddles source- and record-level predicates can't push down past the source-prune boundary. The compiler degrades safely (lets the source through; the record pass decides) — it just costs the file read. ## Known limitations ### Leading `-` on a field predicate A field predicate that begins with a bare `-` (e.g. `-agent:claude` as the negation shortcut for `NOT agent:claude`) collides with argparse's short-option collapse rule. The argv token `-agent:claude` would otherwise parse as the combined short options `-a -g -e nt:claude` because each leading character matches a defined short flag, silently turning the user's intent into a totally different command. agentgrep rejects this argv shape at parse time with a clear error and two workarounds: ```console $ agentgrep find -agent:claude agentgrep: error: argument '-agent:claude' looks like a field predicate but argparse parses the leading '-' as combined short options. Use one of: -- positional separator: agentgrep ... -- -agent:claude keyword negation: agentgrep ... 'NOT agent:claude' ``` Pick the form that fits your scripting style. The `NOT` keyword is the most readable; `--` is the most surgical. Note that shell-level quoting (`'-agent:claude'`) does **not** help — the shell strips quotes before argparse runs, so the quoted token arrives at argparse identically to the unquoted form and the pre-scan rejects both. Use `NOT` or `--`. ### `field:` with no inline value The query `agent: bliss` parses as a single `FieldEq(agent, "bliss")` predicate, not as "missing value followed by separate term `bliss`". The tokenizer emits `ident("agent"), colon` and the next term token becomes the value. Defensible (the colon's `:` separator is a contiguous operator, the space after is just whitespace) but unintuitive when typing. If you want the bare term `bliss` plus a separate `agent` predicate, write `agent:codex bliss` (filled-in value) or `bliss` (no `agent:` predicate at all). --- # API Reference Source: https://agentgrep.org/library/reference/ (package-agentgrep-reference)= # API Reference Core data types, discovery functions, and the search pipeline used by every surface (CLI, TUI, MCP). ## Core data ```{eval-rst} .. autoclass:: agentgrep.PrivatePath :members: .. autofunction:: agentgrep.format_display_path .. autoclass:: agentgrep.BackendSelection :members: .. autoclass:: agentgrep.SearchQuery :members: .. autoclass:: agentgrep.SourceHandle :members: .. autoclass:: agentgrep.SearchRecord :members: .. autoclass:: agentgrep.FindRecord :members: ``` ## Search control and progress ```{eval-rst} .. autoclass:: agentgrep.SearchControl :members: .. autoclass:: agentgrep.SearchProgress :members: .. autoclass:: agentgrep.NoopSearchProgress :members: .. autoclass:: agentgrep.ConsoleSearchProgress :members: ``` ## Discovery and search ```{eval-rst} .. autofunction:: agentgrep.select_backends .. autofunction:: agentgrep.discover_sources .. autofunction:: agentgrep.run_search_query .. autofunction:: agentgrep.search_sources .. autofunction:: agentgrep.run_find_query .. autofunction:: agentgrep.find_sources ``` --- # Tutorial Source: https://agentgrep.org/library/tutorial/ (package-agentgrep-tutorial)= # Tutorial ## Search prompts Search user prompts across all supported stores: ```console $ uv run agentgrep grep "draft pr" ``` Search only Codex prompts: ```console $ uv run agentgrep grep "draft pr" --agent codex --type prompts ``` ## Search history Search assistant and command history: ```console $ uv run agentgrep grep "pytest" --type history ``` Search prompts and history together: ```console $ uv run agentgrep grep "docs" --type all ``` ## Combine terms Require every term: ```console $ uv run agentgrep grep docs deploy ``` Match any term (all patterns use AND by default): ```console $ uv run agentgrep grep docs deploy ``` Use regular expressions (regex is the default): ```console $ uv run agentgrep grep "docs?.*deploy" ``` ## Return structured output Pretty JSON: ```console $ uv run agentgrep grep "release" --json ``` Line-delimited JSON: ```console $ uv run agentgrep grep "release" --ndjson ``` --- # MCP Source: https://agentgrep.org/mcp/ (mcp)= # MCP agentgrep's MCP server exposes a read-only search surface over stdio. It does not mutate local agent stores, open SQLite in write mode, or execute arbitrary shell commands. ## Install Pick a client, install method, and config scope. The snippet copies directly into your terminal or config file. ```{mcp-install} ``` ::::{grid} 1 1 3 3 :gutter: 2 :::{grid-item-card} Tools :link: tools :link-type: doc Invoke search and discovery. ::: :::{grid-item-card} Resources :link: resources :link-type: doc Read capabilities and source inventories. ::: :::{grid-item-card} Prompts :link: prompts :link-type: doc Reusable client-side search recipes. ::: :::{grid-item-card} API Reference :link: reference :link-type: doc Payload models, server factory, and MCP helpers. ::: :::: ## Search Tool search ## Discovery find ```{toctree} :hidden: tools resources prompts reference ``` --- # Prompts Source: https://agentgrep.org/mcp/prompts/ (mcp-prompts)= # Prompts MCP prompts are reusable recipes a client can render before calling tools. ## Search prompts ```{fastmcp-prompt} search_prompts ``` Use this when the user wants matching user prompts. ```{fastmcp-prompt-input} search_prompts ``` ## Search history ```{fastmcp-prompt} search_history ``` Use this when the user wants assistant or command history records. ```{fastmcp-prompt-input} search_history ``` ## Inspect stores ```{fastmcp-prompt} inspect_stores ``` Use this when the user wants to inspect discovered local stores before searching. ```{fastmcp-prompt-input} inspect_stores ``` --- # API Reference Source: https://agentgrep.org/mcp/reference/ (mcp-reference)= # API Reference FastMCP server factory, payload models, and MCP helpers. ## Payload models ```{eval-rst} .. autoclass:: agentgrep.mcp.SearchRecordModel :members: .. autoclass:: agentgrep.mcp.FindRecordModel :members: .. autoclass:: agentgrep.mcp.SourceRecordModel :members: .. autoclass:: agentgrep.mcp.SearchToolQuery :members: .. autoclass:: agentgrep.mcp.SearchToolResponse :members: .. autoclass:: agentgrep.mcp.FindToolQuery :members: .. autoclass:: agentgrep.mcp.FindToolResponse :members: .. autoclass:: agentgrep.mcp.BackendAvailabilityModel :members: .. autoclass:: agentgrep.mcp.CapabilitiesModel :members: ``` ## Server helpers ```{eval-rst} .. autofunction:: agentgrep.mcp.normalize_agent_selection .. autofunction:: agentgrep.mcp.list_source_models .. autofunction:: agentgrep.mcp.build_capabilities .. autofunction:: agentgrep.mcp.build_mcp_server .. autofunction:: agentgrep.mcp.main ``` --- # Resources Source: https://agentgrep.org/mcp/resources/ (mcp-resources)= # Resources MCP resources expose passive read-only data at `agentgrep://` URIs. Clients read them with `resources/read`. ## Capability summary ```{fastmcp-resource} agentgrep_capabilities ``` Read `agentgrep://capabilities` to see supported agents, adapters, tools, resources, prompts, and optional backend selections. ## Sources ```{fastmcp-resource} agentgrep_sources ``` Read `agentgrep://sources` to list every discovered source. ## Sources by agent ```{fastmcp-resource-template} agentgrep_sources_by_agent ``` Read `agentgrep://sources/codex`, `agentgrep://sources/claude`, `agentgrep://sources/cursor`, or `agentgrep://sources/gemini` to filter discovery by agent. ## Store catalog ```{fastmcp-resource} agentgrep_catalog ``` Read `agentgrep://catalog` for the canonical catalog of every store agentgrep knows about — role, format, upstream reference, and schema notes per entry. ## Store roles ```{fastmcp-resource} agentgrep_store_roles ``` Read `agentgrep://store-roles` for the enumeration of role values (`primary_chat`, `prompt_history`, `app_state`, …) with one-line descriptions. ## Store formats ```{fastmcp-resource} agentgrep_store_formats ``` Read `agentgrep://store-formats` for the enumeration of on-disk format values (`jsonl`, `sqlite`, `md_frontmatter`, …) with one-line descriptions. --- # Tools Source: https://agentgrep.org/mcp/tools/ (mcp-tools)= # Tools agentgrep's tools are read-only. They return structured Pydantic models and protect private paths before serialization. ## Prompt and History Search ```{fastmcp-tool} search :no-index: ``` **Use when** you need full prompt or history records matching one or more terms. **Returns:** query metadata plus normalized records with agent, store, adapter, path, text, title, role, timestamp, model, session ID, conversation ID, and metadata. **Example:** ```json { "tool": "search", "arguments": { "terms": ["release notes"], "agent": "all", "search_type": "prompts", "limit": 20 } } ``` ```{fastmcp-tool-input} search ``` ## Time-Windowed Activity ```{fastmcp-tool} recent_sessions ``` **Use when** you want the most-recently modified sources for an agent — newest-first, optionally bounded by a time window. **Returns:** the cutoff timestamp plus source records ordered by ``mtime_ns`` descending. ```{fastmcp-tool-input} recent_sessions ``` ## Store Discovery ```{fastmcp-tool} find ``` **Use when** you need to inspect which stores, session files, and databases agentgrep can read. **Returns:** query metadata plus source records with agent, store, adapter, protected path, path kind, and metadata. **Example:** ```json { "tool": "find", "arguments": { "pattern": "sessions", "agent": "codex", "limit": 50 } } ``` ```{fastmcp-tool-input} find ``` ## Structured Source Listing ```{fastmcp-tool} list_sources ``` **Use when** you want a structured listing of discovered sources with optional path-kind / source-kind filters. ```{fastmcp-tool-input} list_sources ``` ## Required-Pattern Filtering ```{fastmcp-tool} filter_sources ``` **Use when** you want to narrow discovered sources by required substring pattern (a stricter ``find``). ```{fastmcp-tool-input} filter_sources ``` ## Discovery Counts ```{fastmcp-tool} summarize_discovery ``` **Use when** you want aggregate counts of discovered sources by agent, format, and path-kind. ```{fastmcp-tool-input} summarize_discovery ``` ## Catalog ```{fastmcp-tool} list_stores ``` **Use when** you want the canonical catalog of on-disk stores agentgrep knows about — including stores that are not searched by default. ```{fastmcp-tool-input} list_stores ``` ```{fastmcp-tool} get_store_descriptor ``` **Use when** you need the full descriptor (role, format, upstream reference, schema notes) for a single store id. ```{fastmcp-tool-input} get_store_descriptor ``` ```{fastmcp-tool} inspect_record_sample ``` **Use when** you want a few raw records from one adapter+path to validate parser output or discover schema variations. ```{fastmcp-tool-input} inspect_record_sample ``` ## Diagnostics ```{fastmcp-tool} validate_query ``` **Use when** you want to dry-run a regex or literal pattern against sample text before issuing a broad cross-agent search. ```{fastmcp-tool-input} validate_query ``` --- # TUI Source: https://agentgrep.org/tui/ (tui)= # TUI The `agentgrep ui` command launches the interactive Textual explorer over the same Codex, Claude Code, Cursor, and Gemini stores the rest of the CLI walks. It is read-only — agentgrep never mutates the source stores. Bare `agentgrep` prints the directory of choices, so the explorer always needs the explicit `ui` subcommand. ```{note} Versions before 0.1.0a5 made bare `agentgrep` equivalent to `agentgrep ui`. That shortcut is gone. Reach the explorer through the explicit `ui` subcommand, or use the `--ui` overlay on `agentgrep grep` / `find` / `fuzzy` to open it pre-filled with that subcommand's query. ``` ## Examples Open the explorer with no seed query: ```console $ agentgrep ui ``` Seed the search bar with an initial query so the explorer dispatches a backend search immediately: ```console $ agentgrep ui bliss ``` ## Command ```{eval-rst} .. argparse:: :module: agentgrep :func: build_docs_parser :prog: agentgrep :path: ui :nodescription: ``` ## Key interactions The top input is the **search bar**. Pressing `Enter` dispatches a fresh backend search; pressing `Enter` again while a search is in flight signals the previous worker to wrap up before the next one starts, so re-querying mid-stream does not pile up cancellations. Empty / whitespace-only input parks the explorer in an idle state instead of issuing a no-op backend search. Below the results list sits a **sticky in-list filter**. Every keystroke narrows the already-loaded records without re-running the backend search, so refining a large result set is instant. Plain `up` on the filter returns focus to the search bar; plain `right` on an empty filter releases focus to the detail pane, so the full arrow-key perimeter walks the three columns without reaching for `Ctrl-L`. A non-empty `right` keeps cursor-in-input semantics. Each pane carries a footer **status line**. The results footer shows match count, cursor position, and a tig-style scroll percent that reads `100%` when the view fits; the detail footer shows the compact source path and the same scroll percent. Result-row timestamps render in the viewer's local timezone with offset (`YYYY-MM-DD HH:MM ±HHMM`), formatted via {func}`~agentgrep.format_timestamp_tig`. ::::{grid} 1 1 2 2 :gutter: 2 :::{grid-item-card} API Reference :link: reference :link-type: doc UIArgs, entry points, filter and display helpers. ::: :::: ## See also - {ref}`cli` — the `--ui` flag on any search-shaped subcommand opens the same explorer pre-seeded with that subcommand's query (e.g. `agentgrep grep bliss --agent codex --ui`). ```{toctree} :hidden: reference ``` --- # API Reference Source: https://agentgrep.org/tui/reference/ (tui-reference)= # API Reference The `agentgrep.ui` subpackage holds the streaming Textual explorer. Textual is imported lazily inside {func}`~agentgrep.ui.app.build_streaming_ui_app` via `importlib.import_module`, so bare `import agentgrep` does not pull Textual into the importing process; the import error is deferred to the moment the factory is called. The subpackage's `__init__` re-exports {func}`~agentgrep.ui.app.run_ui` and {func}`~agentgrep.ui.app.build_streaming_ui_app` at the `agentgrep.ui` namespace for convenience, and the top-level `agentgrep` package provides matching lazy wrappers so callers can use `agentgrep.run_ui()` without reaching into `agentgrep.ui.app`. ## Argument type ```{eval-rst} .. autoclass:: agentgrep.UIArgs :members: ``` ## Entry points ```{eval-rst} .. autofunction:: agentgrep.ui.app.run_ui .. autofunction:: agentgrep.ui.app.build_streaming_ui_app ``` ## Filter and display helpers ```{eval-rst} .. autofunction:: agentgrep.cached_haystack .. autofunction:: agentgrep.clear_haystack_cache .. autofunction:: agentgrep.compute_filter_matches .. autofunction:: agentgrep.format_timestamp_tig .. autofunction:: agentgrep.ui.app.scroll_percent ``` ---