ADR 0012: Reusable TUI widget architecture

Status

Accepted. Implementation proceeds as the strangler-fig sequence in § Implementation sequence; each step lands behind the completion gate and keeps app.py a thin facade.

Context

The interactive explorer is assembled by build_streaming_ui_app, which lazily imports Textual (textual.app, textual.containers, textual.widgets) and then defines class AgentGrepApp(app_type) inside the factory closure. That closure-defined class is a single god-object: search and filter worker dispatch, the detail-pane rendering pipeline (header build, JSON/Markdown/plain body, LRU caches, wrap-aware find-in-detail), responsive layout, pane focus routing, staged Ctrl-C exit, slash-command dispatch, and every cross-widget message handler all live in one body. It can only be constructed through the factory, so it is neither independently unit-testable nor reusable, and it is the single largest obstacle to evolving the UI. The dynamic base is also why the tree carries # ty: ignore[unsupported-base].

The leaf widgets, by contrast, are already extracted: agentgrep.ui.widgets holds plain Textual subclasses — SearchResultsList over OptionList, DetailScroll over VerticalScroll, the Input-derived SearchInput/FilterInput/DetailFindInput, HistoryRecall over ModalScreen, and the Static-based status chrome — each imported only inside the factory per ADR 0010 and each unit-tested without a live App. The non-blocking spine is already codified by ADR 0011 (NB-1…NB-10) and the agentgrep.ui._runtime primitives. So the remaining work is finishing the strangler extraction of the App object, not building a framework.

A second motivation is a recurring question: should agentgrep adopt the widget surface of richer terminal UIs such as pi (its from-scratch differential-render library @earendil-works/pi-tui) and ink (React-reconciler-over-terminal)? This ADR records the comparative analysis and the answer: most of that capability is either already present or a small generalization on top of Textual, a handful of items are genuine but out-of-scope for a read-only search tool, and Textual lacks no architectural component that agentgrep needs. The capability mapping is recorded so future contributors do not re-open the question or import an unneeded abstraction.

This ADR does not introduce a generic widget framework. agentgrep ships exactly one frontend, so “reusable” here means independently typed, testable, and ADR-0011-guarded leaf widgets behind a narrow engine seam — not a plugin base class with no second consumer.

Decision

The TUI is organized as five layers with a one-directional dependency flow (engine seam ← view widgets ← app shell; theme and message contracts are shared leaves). The App shell owns composition and dispatch; it owns no rendering or matching logic.

Layer	Responsibility	Key types
App shell	Screen composition, global `BINDINGS`, lifecycle, worker dispatch, cooperative cancellation, stale-event generation tokens.	`ExplorerApp(App)`; `SearchControl`; generation token (`int`); `run_worker(thread=True, exclusive=True, group=...)`
View widgets	Reusable leaf Textual subclasses that render normalized records and emit typed messages; pump-thread behavior only.	`SearchResultsList(OptionList)`; `DetailScroll(VerticalScroll)`; `DebouncedQueryInput(Input)`; `HistoryRecall(ModalScreen)`; `CompletionDropdown(OptionList)`; status chrome over `Static`
Message / view-model contracts	Typed `Message` subclasses carrying pre-shaped dataclass / `NamedTuple` payloads, so widgets never reach into the app.	`SearchRequested`; `FilterRequested`; `FilterCompleted`; `ResultsScrollChanged`; `DetailScrollChanged`; `ResultsStatusSnapshot`
Engine seam (Protocols)	Narrow `Protocol`s the app/widgets call instead of importing engine internals; faked in tests.	`SearchInvoker(Protocol)` (shipped); `PreviewProvider(Protocol)` (OPTIONAL — deferred until a second selector); `SearchRecord`; `FindRecord`
Theme / styles	pi-lite palette tokens, terminal transparency, docked layout — centralized, not per-widget.	`theme.py` token maps; `styles.tcss`

The following invariants govern the layer (RW for reusable widget), in the enumerated style of ADR 0011:

RW-1 — Widgets consume normalized records, never engine internals. A view widget imports agentgrep.records (SearchRecord / FindRecord) and the engine-seam Protocols only; it must not import agentgrep._engine, agentgrep.query, or agentgrep.stores. Search is reached through SearchInvoker; a PreviewProvider preview seam is deferred until a second selector consumer exists (RW-7).
RW-2 — State leaves a widget as a typed Message, not a back-reference. A widget posts a Message subclass carrying a pre-shaped dataclass / NamedTuple; it does not mutate sibling widgets through self.app. This matches the existing agentgrep.ui.widgets.messages module.
RW-3 — Every widget is constructable and testable without a live App. Pure construction plus App.run_test() + Pilot driving, with syrupy snapshots of rendered Content. No tty, filesystem, subprocess, or live engine in widget tests; the engine seam is faked. This is the pattern already in tests/test_ui_widgets.py, tests/test_ui_history_modal.py, and tests/test_tui_non_blocking.py.
RW-4 — Widget state is typed reactive. Every reactive attribute is annotated reactive[Concrete]; no bare Any; every handler names its precise Message subtype. The only permitted suppressions are the two already in the tree: the ty unsupported-base ignore for the dynamic App base, and the ModalScreen[T] runtime-subscript noqa.
RW-5 — Widgets honor the non-blocking catalog. Engine work runs in thread=True, exclusive=True workers behind a stable group; high-frequency results return via call_from_thread; pump callables stay O(1). This is ADR 0011 NB-1…NB-10, unchanged — RW-5 binds the widgets to it rather than restating it.
RW-6 — The App shell owns composition and dispatch only. No rendering, matching, ranking, or record-detail construction logic lives on the App; those belong to view widgets or the engine behind the seam.
RW-7 — Optional pi-parity widgets are gated. Any widget marked OPTIONAL below ships only behind its own issue/ADR with a measured baseline first, per the measurement-first rule of ADR 0003. No differential-render or frame-time performance claim is made without a named baseline; the explorer relies on Textual’s compositor and OptionList line caching, and measures with the hang-fuzz harness and scripts/profile_engine.py before optimizing.
RW-8 — app.py stays a thin importing facade during extraction. Each strangler step keeps app.py re-exporting the moved symbol so the step is independently revertable, per ADR 0010.
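
RW-1 and RW-2 can be illustrated without Textual. The following is a minimal, standard-library-only sketch, not agentgrep's actual code. Only the names SearchInvoker and SearchRecord come from this ADR; the search method signature, the record fields, the ResultsBatch payload, and FakeInvoker are assumptions made for the example.

```python
from __future__ import annotations

from collections.abc import Iterator
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class SearchRecord:
    """Stand-in for a normalized record; the real fields are not specified here."""

    path: str
    text: str


class SearchInvoker(Protocol):
    """Narrow seam the UI calls; this method signature is an assumption."""

    def search(self, query: str) -> Iterator[list[SearchRecord]]: ...


@dataclass(frozen=True)
class ResultsBatch:
    """Hypothetical pre-shaped payload that a typed Message would carry."""

    query: str
    rows: tuple[str, ...]


class FakeInvoker:
    """Test double: satisfies SearchInvoker structurally, with no engine import."""

    def __init__(self, records: list[SearchRecord], batch_size: int = 2) -> None:
        self._records = records
        self._batch_size = batch_size

    def search(self, query: str) -> Iterator[list[SearchRecord]]:
        hits = [r for r in self._records if query in r.text]
        for start in range(0, len(hits), self._batch_size):
            yield hits[start : start + self._batch_size]


def shape_batches(invoker: SearchInvoker, query: str) -> list[ResultsBatch]:
    """Consume the seam (RW-1) and produce immutable payloads (RW-2)."""
    return [
        ResultsBatch(query=query, rows=tuple(f"{r.path}: {r.text}" for r in batch))
        for batch in invoker.search(query)
    ]


records = [
    SearchRecord("a.jsonl", "fix the parser"),
    SearchRecord("b.jsonl", "parser rewrite"),
    SearchRecord("c.jsonl", "unrelated"),
    SearchRecord("d.jsonl", "parser tests"),
]
batches = shape_batches(FakeInvoker(records), "parser")
```

Because the consumer depends only on the Protocol, a test can substitute the fake without touching engine modules, which is the property RW-3 relies on.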

Reusable widget catalog

CORE widgets are required for search now and mostly already exist; OPTIONAL widgets are pi-parity nice-to-haves gated by RW-7.

Widget	Tier	Base	Role
`SearchResultsList`	CORE	`OptionList`	Append-only streaming result rows; `append_records` is O(batch), no prior-row relayout.
`DetailScroll`	CORE	`VerticalScroll`	Record detail with per-record scroll memory; heavy renderables built off-thread (NB-9).
`DebouncedQueryInput` family	CORE	`Input`	`SearchInput` / `FilterInput` / `DetailFindInput`; debounced typed-message emit (NB-3).
`HistoryRecall` → `FuzzySelectorModal`	CORE → OPTIONAL	`ModalScreen`	`HistoryRecall` already ships the Ctrl-R recall selector (in-memory fuzzy filter + preview + focus trap). Generalizing it into a reusable `FuzzySelectorModal` is RW-7-gated: `HistoryRecall` is the only `ModalScreen` and `^p` is Textual’s built-in palette, so there is one consumer — defer the base class until a second selector lands.
`CompletionDropdown` + `QuerySuggester`	CORE	`OptionList` / `Suggester`	Slash/field completion over the in-process `FieldRegistry`.
Status chrome	CORE	`Static`	`PaneHeader` / `ResultsHeader` / `SearchingPanel` / `SpinnerWidget` / `MeterWidget`; O(1) updates from pre-shaped snapshots.
`MarkdownRecordDetail`	OPTIONAL	`DetailScroll` + `Markdown`	Static render of an already-persisted record; no token-stream reflow path.
`ConversationScrollbackLog`	OPTIONAL	`RichLog`	`scope=conversations` transcript browsing; append-only with a retention cap.
`KillRingTextArea`	OPTIONAL/CUT	`TextArea`	Emacs kill-ring multiline editor; cut — a single-line search box does not need it.
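
The debounced emit named in the DebouncedQueryInput row can be sketched with plain asyncio. This is an illustration of the technique only: the Debouncer class, its delay value, and the callback shape are hypothetical, and the real widgets would presumably use Textual's own timers and post a typed Message rather than call a function.

```python
from __future__ import annotations

import asyncio
from collections.abc import Callable


class Debouncer:
    """Hypothetical helper: call `callback` once input has been quiet for `delay`."""

    def __init__(self, delay: float, callback: Callable[[str], object]) -> None:
        self._delay = delay
        self._callback = callback
        self._handle: asyncio.TimerHandle | None = None

    def submit(self, value: str) -> None:
        if self._handle is not None:
            # A newer keystroke supersedes the pending emit.
            self._handle.cancel()
        loop = asyncio.get_running_loop()
        self._handle = loop.call_later(self._delay, self._callback, value)


async def type_query() -> list[str]:
    emitted: list[str] = []
    debouncer = Debouncer(0.05, emitted.append)
    for partial in ("p", "pa", "par"):
        debouncer.submit(partial)  # three rapid keystrokes, no awaits between them
    await asyncio.sleep(0.2)  # let the quiet period elapse
    return emitted


emitted = asyncio.run(type_query())
```

Only the final value survives the burst, so the search worker is dispatched once per pause in typing rather than once per keystroke.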

Capability mapping

What ink has that Textual does not

The honest finding: nothing architectural that agentgrep needs. ink’s declarative-React machinery is replaced by Textual’s reactive descriptors, compose(), and the compositor; the one genuine model difference is layout (Yoga flexbox vs. Textual CSS), and for a fixed docked shell Textual’s model is the better fit.

ink concept	ink source (v7.1.0)	Textual counterpart (v8.2.6)	Verdict
React reconciler / vDOM diff	`reconciler.ts`, `dom.ts`	`_compositor.py` + per-region invalidation	Parity / superior — no reconciler to port; keep watchers O(1) (NB-5).
Hooks (`useState`/`useEffect`)	`hooks/`	`reactive.py` + `watch_`/`compute_`	Parity — map each hook to a typed `reactive[Concrete]`.
`<Static>` append-only output	`components/Static.tsx`	`OptionList.add_option` / `RichLog.write`	Parity — `SearchResultsList.append_records` already maps this.
`<Box>`/`<Text>` + Yoga flexbox	`components/Box.tsx`, `styles.ts`	TCSS dock/grid/`fr` + `containers.py`	Gap in model — flexbox idioms do not port 1:1; CSS+dock is superior for the docked shell. Layout lives in `styles.tcss`.
`useInput` keyboard	`hooks/use-input.ts`	`BINDINGS` + `action_*` + `on_key`	Parity — declarative and typed.
`useFocus` / `useFocusManager`	`use-focus.ts`, `use-focus-manager.ts`	Focus chain on `widget.py` + `ModalScreen` trap/restore	Superior — tab-order and modal focus restore are free.
Alt-screen / raw-mode lifecycle	`ink.tsx`	`App` manages it	Superior — delete the concern entirely.
Streaming-token markdown reflow	ink re-reflows on each render	`Markdown.update` / `RichLog.write`	Gap, out of scope — agentgrep has no live token producer; records are static, so partial-fence handling never arises.

pi capability → Textual

Every notable capability of pi-tui and pi’s interactive components, mapped to the Textual path. pi’s differential renderer is a from-scratch cell diff; Textual’s compositor already provides the equivalent, so it is not reproduced.

pi capability	pi source (v0.80.2)	Textual path	Effort	Tier
Incremental result streaming	engine-driven	`SearchResultsList.append_records` fed by a `thread=True` worker via `call_from_thread`, chunked apply (NB-3/NB-4)	builtin	CORE — already implemented
Append-only scrollback	`tui.ts`	`OptionList.add_option`; `RichLog.write` for the optional transcript	builtin	CORE
Fuzzy selector + live preview	`select-list.ts`, `session-selector.ts`	Shipped concretely as `HistoryRecall` (`textual.fuzzy.Matcher` filter + preview). A reusable `FuzzySelectorModal` + worker-backed `PreviewProvider` is RW-7-gated (one consumer today)	small	OPTIONAL
Slash / field completion	pi autocomplete	`CompletionDropdown` + `QuerySuggester(Suggester)` over `FieldRegistry`	builtin	CORE
Focus traversal + modal trap	pi overlay model	Textual focus chain + `ModalScreen` auto trap/restore	small	CORE — deliverable is the documented focus graph + Pilot tests
Spinner / progress chrome	pi `loader.ts`	`SpinnerWidget(Static)` driven by `set_interval`; snapshots into `SearchingPanel`/`MeterWidget`	builtin	CORE
Cancellation / supersede	pi keybindings	`SearchControl` polled in worker loops; `run_worker(exclusive=True)` (NB-6/NB-7)	builtin	CORE
Terminal-transparent aesthetic	pi themes	`App.ansi_color=True` + ansi-default tokens in `theme.py`; pi-lite rules in `styles.tcss`	builtin	CORE
Streaming-markdown transcript	`markdown.ts`	Static `MarkdownRecordDetail`; full parse off-thread, no per-token path	medium	OPTIONAL — no live token producer
Emacs kill-ring editor	`editor.ts`	`KillRingTextArea(TextArea)`	medium	OPTIONAL/CUT — single-line box; `Input` already covers editing
Full conversation browsing	pi session view	`ConversationScrollbackLog(RichLog)` for `scope=conversations`	medium	OPTIONAL — needs an engine incremental-detail fetch behind its own ADR

Why no native code

pi ships native C (darwin-modifiers.c, win32-console-mode.c) only for key-modifier and console-mode detection, not rendering. Textual already handles raw mode, the alternate screen, and key parsing across platforms, so there is no native boundary to open here; this stays within the no-native-by-default rule of ADR 0003. The non-blocking offload that makes streaming search safe rests on CPython’s concurrent.futures.ThreadPoolExecutor (under Textual’s thread workers), with results handed back to the event loop via Textual’s call_from_thread, which is itself built on asyncio. Everything underneath Textual here is standard library; no accelerator is involved.
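
The offload pattern described above can be reproduced with the standard library alone. The sketch below is an analogue, not Textual's implementation: loop.run_in_executor stands in for a thread=True worker and loop.call_soon_threadsafe stands in for call_from_thread. The function names produce_batches and apply_batch are hypothetical.

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor


def produce_batches(emit, total: int, batch_size: int) -> int:
    """Blocking 'engine' loop: runs on a pool thread and never touches UI state."""
    sent = 0
    for start in range(0, total, batch_size):
        batch = list(range(start, min(start + batch_size, total)))
        emit(batch)  # hand-off only; the loop thread applies the batch
        sent += len(batch)
    return sent


async def main() -> tuple[list[int], int]:
    loop = asyncio.get_running_loop()
    applied: list[int] = []  # stands in for UI state owned by the loop thread

    def apply_batch(batch: list[int]) -> None:
        applied.extend(batch)

    def emit(batch: list[int]) -> None:
        # Thread-safe scheduling of the callback onto the event loop.
        loop.call_soon_threadsafe(apply_batch, batch)

    with ThreadPoolExecutor(max_workers=1) as pool:
        sent = await loop.run_in_executor(pool, produce_batches, emit, 10, 4)
    await asyncio.sleep(0)  # yield once so any still-queued callbacks run
    return applied, sent


applied, sent = asyncio.run(main())
```

The worker thread only schedules callbacks; all mutation of the shared list happens on the loop thread, which is the same ownership rule the ADR 0011 catalog imposes on widgets.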

Engine changes

CORE widgets need no execution-engine change. The engine already streams results incrementally with cooperative SearchControl cancellation and chunked emission per ADR 0004, and the TUI already consumes it inside thread=True workers per ADR 0011. The extraction must preserve, not modify, that boundary. The only addition is a UI-layer import seam — the SearchInvoker Protocol in agentgrep.ui — so the app shell calls a narrow interface instead of importing engine internals; it changes no search semantics and adds no native code. (The PreviewProvider preview seam is deferred with the OPTIONAL selector work.) The OPTIONAL ConversationScrollbackLog would require an engine record-detail fetch that yields a scope=conversations transcript incrementally; the scope parameter already exists, but that fetch is deferred behind its own issue/ADR with a measured baseline.
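
As an illustration of the boundary the extraction must preserve, here is a minimal sketch of cooperative cancellation with chunked emission. It assumes nothing about the real SearchControl API: FakeSearchControl, its cancel method and cancelled property, and run_search are hypothetical stand-ins built on threading.Event.

```python
import threading
from collections.abc import Callable, Iterable


class FakeSearchControl:
    """Hypothetical stand-in for SearchControl: a thread-safe cancel flag."""

    def __init__(self) -> None:
        self._cancelled = threading.Event()

    def cancel(self) -> None:
        self._cancelled.set()

    @property
    def cancelled(self) -> bool:
        return self._cancelled.is_set()


def run_search(
    items: Iterable[str],
    needle: str,
    control: FakeSearchControl,
    emit: Callable[[list[str]], None],
    chunk_size: int = 2,
) -> bool:
    """Scan items and emit hits in chunks; return False if cancelled early."""
    chunk: list[str] = []
    for item in items:
        if control.cancelled:  # polled once per item: cooperative, not preemptive
            return False  # any partial chunk is discarded on cancel
        if needle in item:
            chunk.append(item)
            if len(chunk) == chunk_size:
                emit(chunk)
                chunk = []
    if chunk:
        emit(chunk)  # flush the final partial chunk
    return True


control = FakeSearchControl()
chunks: list[list[str]] = []


def emit_and_cancel_after_first(chunk: list[str]) -> None:
    chunks.append(chunk)
    control.cancel()  # simulate the user superseding the search


finished = run_search(
    ["a1", "b", "a2", "a3", "a4"], "a", control, emit_and_cancel_after_first
)
```

The worker stops at the next poll after cancellation, so a superseded search costs at most one more loop iteration rather than running to completion.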

Implementation sequence

Strangler-fig, one concern per gate-green commit, app.py a thin facade throughout (RW-8). The bite-sized, test-first task breakdown lives in the working plan referenced by the resumable loop prompt; the durable order is:

1. Pin behavior. Confirm characterization tests: pure widget tests plus Pilot/syrupy snapshots for each existing leaf widget, plus the ADR 0011 guard tests. No production code moves.
2. Introduce the engine seam. Define the SearchInvoker Protocol in agentgrep.ui, wire it into the search worker, and add a fake-Protocol test fixture. No behavior change. (The PreviewProvider preview seam is deferred to the OPTIONAL selector work below.)
3. De-closure the App. Lift the App subclass out of build_streaming_ui_app into a module (agentgrep.ui.app_screen), keeping build_streaming_ui_app a thin assembling facade.
4. Normalize CORE widget contracts, one widget per commit (results, detail, the Input family, status chrome, dropdown): typed reactive[Concrete], Message payload dataclasses, NumPy docstrings, a per-widget Pilot+syrupy test.
5. (OPTIONAL, RW-7-gated) Generalize the fuzzy selector. Keep HistoryRecall concrete. Only when a second selector consumer lands (e.g. the OPTIONAL scope=conversations session picker), extract FuzzySelectorModal, re-express HistoryRecall as a thin subclass, and add the PreviewProvider seam, wiring its preview through the same non-blocking spine as search (debounced selection-change → exclusive group="preview" worker → generation-token stale drop → call_from_thread).
6. Document and test the focus graph. Tab order, modal trap/restore, and global-vs-focused key precedence, with Pilot focus-traversal tests.
7. Stop for CORE; gate OPTIONAL work. File MarkdownRecordDetail, ConversationScrollbackLog, and KillRingTextArea as separate issues/ADRs, each requiring a measured baseline before any code (RW-7).
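
The generation-token stale drop named in the OPTIONAL selector step is small enough to sketch in isolation. The PreviewState class and its begin and apply methods are hypothetical; the only element taken from this ADR is the idea of an integer generation token checked when a result returns.

```python
from dataclasses import dataclass, field


@dataclass
class PreviewState:
    """Hypothetical holder for the newest generation and the applied previews."""

    generation: int = 0
    shown: list[str] = field(default_factory=list)

    def begin(self) -> int:
        """Start a new request; every earlier token becomes stale."""
        self.generation += 1
        return self.generation

    def apply(self, token: int, preview: str) -> bool:
        """Apply a result only if it belongs to the newest request."""
        if token != self.generation:
            return False  # stale: a newer selection superseded this one
        self.shown.append(preview)
        return True


state = PreviewState()
first = state.begin()  # user highlights row A
second = state.begin()  # user moves to row B before A's preview returns

late = state.apply(first, "preview of A")  # arrives late and is dropped
fresh = state.apply(second, "preview of B")  # current, so it is applied
```

The check is a single integer comparison on the loop thread, so it keeps the pump callable O(1) even when many superseded workers report back.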

Per-step exit criterion (every step): the gate sequence rm -rf docs/_build; uv run ruff check . --fix --show-fixes; uv run ruff format .; uv run ty check; uv run py.test --reruns 0 -vvv; just build-docs runs to completion with every command exiting 0, and app.py still imports every extracted symbol.

Consequences

The UI gains a typed, testable widget layer with a documented engine seam, and the closure god-object shrinks step by step into an App shell that only composes and dispatches. Each step is independently revertable because app.py keeps re-exporting moved symbols, and each is provable because the characterization snapshots and the ADR 0011 guards run at every gate. The capability question is settled in writing, so contributors neither re-derive the pi/ink comparison nor import an unneeded reconciler, flexbox engine, or kill-ring editor.

The chief risks are extraction drift (the closure captures many locals — mitigated by characterization-tests-first and the facade re-export), ty Any-leaks from untyped reactives or dynamic handlers (mitigated by RW-4 and the two existing suppressions), and scope creep from the OPTIONAL pi-parity widgets (mitigated by RW-7 gating them behind their own baselines). No performance claim is made without a named baseline.

Final position

The reusable-widget layer is finished by extraction, not invention: CORE widgets ship now behind the SearchInvoker seam under RW-1…RW-8 and the ADR 0011 non-blocking catalog; OPTIONAL pi-parity widgets — including the generalized FuzzySelectorModal + PreviewProvider, which RW-7 gates on a second selector consumer — stay deferred behind their own measured baselines. Textual supplies every architectural primitive agentgrep needs, so no native code, reconciler, or flexbox engine is adopted.