ADR 0012: Reusable TUI widget architecture
Status
Accepted. Implementation proceeds as the strangler-fig sequence in § Implementation sequence; each step lands behind the completion gate and keeps app.py a thin facade.
Context
The interactive explorer is assembled by build_streaming_ui_app, which lazily imports Textual (textual.app, textual.containers, textual.widgets) and then defines class AgentGrepApp(app_type) inside the factory closure. That closure-defined class is a single god-object: search and filter worker dispatch, the detail-pane rendering pipeline (header build, JSON/Markdown/plain body, LRU caches, wrap-aware find-in-detail), responsive layout, pane focus routing, staged Ctrl-C exit, slash-command dispatch, and every cross-widget message handler all live in one body. It can only be constructed through the factory, so it is neither independently unit-testable nor reusable, and it is the single largest obstacle to evolving the UI. The dynamic base is also why the tree carries # ty: ignore[unsupported-base].
The leaf widgets, by contrast, are already extracted: agentgrep.ui.widgets holds plain Textual subclasses — SearchResultsList over OptionList, DetailScroll over VerticalScroll, the Input-derived SearchInput/FilterInput/DetailFindInput, HistoryRecall over ModalScreen, and the Static-based status chrome — each imported only inside the factory per ADR 0010 and each unit-tested without a live App. The non-blocking spine is already codified by ADR 0011 (NB-1…NB-10) and the agentgrep.ui._runtime primitives. So the remaining work is finishing the strangler extraction of the App object, not building a framework.
A second motivation is a recurring question: should agentgrep adopt the widget surface of richer terminal UIs such as pi (its from-scratch differential-render library @earendil-works/pi-tui) and ink (React-reconciler-over-terminal)? This ADR records the comparative analysis and the answer: most of that capability is either already present or a small generalization on top of Textual, a handful of items are genuine but out-of-scope for a read-only search tool, and Textual lacks no architectural component that agentgrep needs. The capability mapping is recorded so future contributors do not re-open the question or import an unneeded abstraction.
This ADR does not introduce a generic widget framework. agentgrep ships exactly one frontend, so “reusable” here means independently typed, testable, and ADR-0011-guarded leaf widgets behind a narrow engine seam — not a plugin base class with no second consumer.
Decision
The TUI is organized as five layers with a one-directional dependency flow (engine seam ← view widgets ← app shell; theme and message contracts are shared leaves). The App shell owns composition and dispatch; it owns no rendering or matching logic.
| Layer | Responsibility | Key types |
|---|---|---|
| App shell | Screen composition, global | |
| View widgets | Reusable leaf Textual subclasses that render normalized records and emit typed messages; pump-thread behavior only. | |
| Message / view-model contracts | Typed | |
| Engine seam (Protocols) | Narrow | |
| Theme / styles | pi-lite palette tokens, terminal transparency, docked layout; centralized, not per-widget. | |
The following invariants govern the layer (RW for reusable widget), in the enumerated style of ADR 0011:
RW-1: Widgets consume normalized records, never engine internals. A view widget imports agentgrep.records (SearchRecord/FindRecord) and the engine-seam Protocols only; it must not import agentgrep._engine, agentgrep.query, or agentgrep.stores. Search is reached through SearchInvoker; a PreviewProvider preview seam is deferred until a second selector consumer exists (RW-7).
RW-2: State leaves a widget as a typed Message, not a back-reference. A widget posts a Message subclass carrying a pre-shaped dataclass / NamedTuple; it does not mutate sibling widgets through self.app. This matches the existing agentgrep.ui.widgets.messages module.
RW-3: Every widget is constructable and testable without a live App. Pure construction plus App.run_test() + Pilot driving, with syrupy snapshots of rendered Content. No tty, filesystem, subprocess, or live engine in widget tests; the engine seam is faked. This is the pattern already in tests/test_ui_widgets.py, tests/test_ui_history_modal.py, and tests/test_tui_non_blocking.py.
RW-4: Widget state is typed reactive. Every reactive attribute is annotated reactive[Concrete]; no bare Any; every handler names its precise Message subtype. The only permitted ty suppressions are the two already in the tree (unsupported-base for the dynamic App base, the ModalScreen[T] runtime-subscript noqa).
RW-5: Widgets honor the non-blocking catalog. Engine work runs in thread=True, exclusive=True workers behind a stable group; high-frequency results return via call_from_thread; pump callables stay O(1). This is ADR 0011 NB-1…NB-10, unchanged; RW-5 binds the widgets to it rather than restating it.
RW-6: The App shell owns composition and dispatch only. No rendering, matching, ranking, or record-detail construction logic lives on the App; those belong to view widgets or the engine behind the seam.
RW-7: Optional pi-parity widgets are gated. Any widget marked OPTIONAL below ships only behind its own issue/ADR with a measured baseline first, per the measurement-first rule of ADR 0003. No differential-render or frame-time performance claim is made without a named baseline; the explorer relies on Textual’s compositor and OptionList line caching, and measures with the hang-fuzz harness and scripts/profile_engine.py before optimizing.
RW-8: app.py stays a thin importing facade during extraction. Each strangler step keeps app.py re-exporting the moved symbol so the step is independently revertable, per ADR 0010.
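RW-1 is mechanical enough to be enforced by a guard test. The following is a minimal sketch, assuming such a test parses widget source with the standard-library ast module; the helper name forbidden_imports is hypothetical and not part of the tree, and the forbidden module list is copied from the invariant text.

```python
import ast

# Modules RW-1 forbids view widgets from importing (from the invariant text).
FORBIDDEN = ("agentgrep._engine", "agentgrep.query", "agentgrep.stores")


def forbidden_imports(source: str) -> list[str]:
    """Return the forbidden module names imported by ``source``.

    Hypothetical helper for illustration. It inspects ``import`` and
    absolute ``from ... import`` statements anywhere in the module,
    including imports nested inside functions.
    """
    found: set[str] = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # "from agentgrep import query" names the submodule via the alias.
            names = [node.module]
            names += [f"{node.module}.{alias.name}" for alias in node.names]
        else:
            continue
        for name in names:
            if any(name == bad or name.startswith(bad + ".") for bad in FORBIDDEN):
                found.add(name)
    return sorted(found)
```

A guard of this shape would read each file under the widgets package and assert that the returned list is empty. Relative imports are deliberately ignored here, which is a simplification of the sketch.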
Reusable widget catalog
CORE widgets are required for search now and mostly already exist; OPTIONAL widgets are pi-parity nice-to-haves gated by RW-7.
| Widget | Tier | Base | Role |
|---|---|---|---|
| | CORE | | Append-only streaming result rows; |
| | CORE | | Record detail with per-record scroll memory; heavy renderables built off-thread (NB-9). |
| | CORE | | |
| | CORE → OPTIONAL | | |
| | CORE | | Slash/field completion over the in-process |
| Status chrome | CORE | | |
| | OPTIONAL | | Static render of an already-persisted record; no token-stream reflow path. |
| | OPTIONAL | | |
| | OPTIONAL/CUT | | Emacs kill-ring multiline editor; cut, because a single-line search box does not need it. |
Capability mapping
What ink has that Textual does not
The honest finding: nothing architectural that agentgrep needs. ink’s declarative-React machinery is replaced by Textual’s reactive descriptors, compose(), and the compositor; the one genuine model difference is layout (Yoga flexbox vs. Textual CSS), and for a fixed docked shell Textual’s model is the better fit.
| ink concept | ink source (v7.1.0) | Textual counterpart (v8.2.6) | Verdict |
|---|---|---|---|
| React reconciler / vDOM diff | | | Parity / superior: no reconciler to port; keep watchers O(1) (NB-5). |
| Hooks ( | | | Parity: map each hook to a typed |
| | | | Parity: |
| | | TCSS dock/grid/ | Gap in model: flexbox idioms do not port 1:1; CSS+dock is superior for the docked shell. Layout lives in |
| | | | Parity: declarative and typed. |
| | | Focus chain on | Superior: tab-order and modal focus restore are free. |
| Alt-screen / raw-mode lifecycle | | | Superior: delete the concern entirely. |
| Streaming-token markdown reflow | ink re-reflows on each render | | Gap, out of scope: agentgrep has no live token producer; records are static, so partial-fence handling never arises. |
pi capability → Textual
Every notable capability of pi-tui and pi’s interactive components, mapped to the Textual path. pi’s differential renderer is a from-scratch cell diff; Textual’s compositor already provides the equivalent, so it is not reproduced.
| pi capability | pi source (v0.80.2) | Textual path | Effort | Tier |
|---|---|---|---|---|
| Incremental result streaming | engine-driven | | builtin | CORE (already implemented) |
| Append-only scrollback | | | builtin | CORE |
| Fuzzy selector + live preview | | Shipped concretely as | small | OPTIONAL |
| Slash / field completion | pi autocomplete | | builtin | CORE |
| Focus traversal + modal trap | pi overlay model | Textual focus chain + | small | CORE: deliverable is the documented focus graph + Pilot tests |
| Spinner / progress chrome | pi | | builtin | CORE |
| Cancellation / supersede | pi keybindings | | builtin | CORE |
| Terminal-transparent aesthetic | pi themes | | builtin | CORE |
| Streaming-markdown transcript | | Static | medium | OPTIONAL: no live token producer |
| Emacs kill-ring editor | | | medium | OPTIONAL/CUT: single-line box; |
| Full conversation browsing | pi session view | | medium | OPTIONAL: needs an engine incremental-detail fetch behind its own ADR |
Why no native code
pi ships native C (darwin-modifiers.c, win32-console-mode.c) only for key-modifier and console-mode detection, not rendering. Textual already handles raw mode, the alternate screen, and key parsing across platforms, so there is no native boundary to open here; this stays within the no-native-by-default rule of ADR 0003. The non-blocking offload that makes streaming search safe rests on CPython’s concurrent.futures.ThreadPoolExecutor (under Textual’s thread workers) handing results back through the loop via call_from_thread, itself built on asyncio — all standard library, no accelerator.
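The handoff described above can be illustrated with the standard library alone. This is a sketch of the underlying mechanism, not Textual's implementation: slow_search and collect are hypothetical names, and loop.call_soon_threadsafe stands in for the role call_from_thread plays in the TUI.

```python
import asyncio
from collections.abc import Callable
from concurrent.futures import ThreadPoolExecutor


def slow_search(emit: Callable[[str], None], needle: str, haystack: list[str]) -> int:
    """Blocking work that runs on a worker thread (hypothetical stand-in)."""
    hits = 0
    for line in haystack:
        if needle in line:
            emit(line)  # hand each hit back to the event-loop thread
            hits += 1
    return hits


async def collect(needle: str, haystack: list[str]) -> list[str]:
    """Run the search off-thread and gather hits on the loop thread."""
    loop = asyncio.get_running_loop()
    results: list[str] = []

    def emit(line: str) -> None:
        # Schedule the append on the loop thread; the worker never
        # mutates loop-owned state directly.
        loop.call_soon_threadsafe(results.append, line)

    with ThreadPoolExecutor(max_workers=1) as pool:
        await loop.run_in_executor(pool, slow_search, emit, needle, haystack)
    # Yield once so any callback queued just before completion has run.
    await asyncio.sleep(0)
    return results
```

Running asyncio.run(collect("a", ["abc", "xyz", "bar"])) gathers the two matching lines in order. The point of the sketch is that the worker only schedules callbacks; every mutation of shared state happens on the loop thread.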
Engine changes
CORE widgets need no execution-engine change. The engine already streams results incrementally with cooperative SearchControl cancellation and chunked emission per ADR 0004, and the TUI already consumes it inside thread=True workers per ADR 0011. The extraction must preserve, not modify, that boundary. The only addition is a UI-layer import seam — the SearchInvoker Protocol in agentgrep.ui — so the app shell calls a narrow interface instead of importing engine internals; it changes no search semantics and adds no native code. (The PreviewProvider preview seam is deferred with the OPTIONAL selector work.) The OPTIONAL ConversationScrollbackLog would require an engine record-detail fetch that yields a scope=conversations transcript incrementally; the scope parameter already exists, but that fetch is deferred behind its own issue/ADR with a measured baseline.
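This ADR names the SearchInvoker seam but does not fix its signature. The sketch below therefore assumes one: a single search method that streams records to a callback and polls a stop predicate for cooperative cancellation. FakeRecord, FakeInvoker, run_query, and the parameter names are hypothetical, chosen only to show how a shell-side caller can depend on a Protocol and be tested with a fake.

```python
from collections.abc import Callable, Iterable
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class FakeRecord:
    """Stand-in for a normalized record; the real types live in agentgrep.records."""

    path: str
    text: str


class SearchInvoker(Protocol):
    """Assumed shape of the seam: stream records until told to stop."""

    def search(
        self,
        query: str,
        on_record: Callable[[FakeRecord], None],
        should_stop: Callable[[], bool],
    ) -> None: ...


class FakeInvoker:
    """Test double: satisfies the Protocol structurally, touches no engine."""

    def __init__(self, records: Iterable[FakeRecord]) -> None:
        self._records = list(records)

    def search(
        self,
        query: str,
        on_record: Callable[[FakeRecord], None],
        should_stop: Callable[[], bool],
    ) -> None:
        for record in self._records:
            if should_stop():  # cooperative cancellation, checked per record
                return
            if query in record.text:
                on_record(record)


def run_query(invoker: SearchInvoker, query: str, limit: int) -> list[FakeRecord]:
    """Shell-side caller: depends on the Protocol only, never on engine modules."""
    seen: list[FakeRecord] = []
    invoker.search(query, seen.append, lambda: len(seen) >= limit)
    return seen
```

Because the Protocol is structural, FakeInvoker needs no inheritance from SearchInvoker, which keeps the test fixture free of any engine import.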
Implementation sequence
Strangler-fig, one concern per gate-green commit, app.py a thin facade throughout (RW-8). The bite-sized, test-first task breakdown lives in the working plan referenced by the resumable loop prompt; the durable order is:
1. Pin behavior. Confirm characterization tests: pure widget tests plus Pilot/syrupy snapshots for each existing leaf widget, plus the ADR 0011 guard tests. No production code moves.
2. Introduce the engine seam. Define the SearchInvoker Protocol in agentgrep.ui, wire it into the search worker, and add a fake-Protocol test fixture. No behavior change. (The PreviewProvider preview seam is deferred to the OPTIONAL selector work below.)
3. De-closure the App. Lift the App subclass out of build_streaming_ui_app into a module (agentgrep.ui.app_screen), keeping build_streaming_ui_app a thin assembling facade.
4. Normalize CORE widget contracts, one widget per commit (results, detail, the Input family, status chrome, dropdown): typed reactive[Concrete], Message payload dataclasses, NumPy docstrings, a per-widget Pilot+syrupy test.
5. (OPTIONAL, RW-7-gated) Generalize the fuzzy selector. Keep HistoryRecall concrete. Only when a second selector consumer lands (e.g. the OPTIONAL scope=conversations session picker), extract FuzzySelectorModal, re-express HistoryRecall as a thin subclass, and add the PreviewProvider seam, wiring its preview through the same non-blocking spine as search (debounced selection-change → exclusive group="preview" worker → generation-token stale drop → call_from_thread).
6. Document and test the focus graph. Tab order, modal trap/restore, and global-vs-focused key precedence, with Pilot focus-traversal tests.
7. Stop for CORE; gate OPTIONAL work. File MarkdownRecordDetail, ConversationScrollbackLog, and KillRingTextArea as separate issues/ADRs, each requiring a measured baseline before any code (RW-7).
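The generation-token stale drop named in the OPTIONAL selector step is the least obvious link in that chain, so a small sketch may help. It assumes a hypothetical PreviewState holder and leaves out the debounce and the worker itself; only the token comparison is shown.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class PreviewState:
    """Hypothetical holder for the latest preview and its generation token."""

    generation: int = 0
    shown: Optional[str] = None
    dropped: list[int] = field(default_factory=list)

    def request(self) -> int:
        """Start a new preview request and return its token."""
        self.generation += 1
        return self.generation

    def deliver(self, token: int, text: str) -> bool:
        """Apply a worker result only if no newer request superseded it."""
        if token != self.generation:
            self.dropped.append(token)  # stale: a later selection won
            return False
        self.shown = text
        return True
```

Each selection change would call request() on the UI thread and pass the token to the worker; the worker's result comes back through deliver(), so a slow preview for an earlier selection can never overwrite a newer one.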
Per-step exit criterion (every step): rm -rf docs/_build; uv run ruff check . --fix --show-fixes; uv run ruff format .; uv run ty check; uv run py.test --reruns 0 -vvv; just build-docs exits 0, and app.py still imports every extracted symbol.
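The final clause of the exit criterion can be asserted in a test rather than checked by eye. A minimal sketch, assuming a hypothetical helper missing_reexports; a real check would pass the imported app.py module and the objects at their new homes.

```python
import types


def missing_reexports(facade: types.ModuleType, moved: dict[str, object]) -> list[str]:
    """Return names the facade fails to re-export as the same object.

    ``moved`` maps each extracted symbol name to the object at its new home.
    Hypothetical helper; identity (``is``) is checked so that a stale copy
    left behind in the facade is reported as well.
    """
    return sorted(
        name for name, obj in moved.items() if getattr(facade, name, None) is not obj
    )
```

An empty list means the step kept the facade intact; any returned name identifies a symbol whose move would break importers of app.py.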
Consequences
The UI gains a typed, testable widget layer with a documented engine seam, and the closure god-object shrinks step by step into an App shell that only composes and dispatches. Each step is independently revertable because app.py keeps re-exporting moved symbols, and each is provable because the characterization snapshots and the ADR 0011 guards run at every gate. The capability question is settled in writing, so contributors neither re-derive the pi/ink comparison nor import an unneeded reconciler, flexbox engine, or kill-ring editor.
The chief risks are extraction drift (the closure captures many locals — mitigated by characterization-tests-first and the facade re-export), ty Any-leaks from untyped reactives or dynamic handlers (mitigated by RW-4 and the two existing suppressions), and scope creep from the OPTIONAL pi-parity widgets (mitigated by RW-7 gating them behind their own baselines). No performance claim is made without a named baseline.
Final position
The reusable-widget layer is finished by extraction, not invention: CORE widgets ship now behind the SearchInvoker seam under RW-1…RW-8 and the ADR 0011 non-blocking catalog; OPTIONAL pi-parity widgets — including the generalized FuzzySelectorModal + PreviewProvider, which RW-7 gates on a second selector consumer — stay deferred behind their own measured baselines. Textual supplies every architectural primitive agentgrep needs, so no native code, reconciler, or flexbox engine is adopted.