Browser rebuilt, delegate act-trail restored, recall fan-out removed

The browser tool is rebuilt from the ground up as 9 flat verbs (open/read/find/click/fill/select/scroll/back/screenshot) on a single persistent Playwright page per delegate run. Every verb returns a uniform {page,data,changed,error} envelope with a mechanical post-action diff, and targeting is visible-text-only via get_by_role/label/text. Screenshots now route through ingest_file() into a screenshots/ subdir with source_type='screenshot', so the web_browse delegate reads them back via the vision tool, replacing the deleted in-browser OCR path. The rebuild also deleted ~750 LOC of overreach: the credential vault, the interaction DSL, the browser_snapshots and browser_credentials tables, and the browser REST API.
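As a rough illustration of the envelope and the post-action diff described above, the sketch below builds a {page,data,changed,error} dict from two page snapshots. The Envelope class, diff_state and make_envelope are hypothetical names, and the snapshot fields (url, title) are assumptions; the real tool's diff may track different state.

```python
from dataclasses import asdict, dataclass
from typing import Any, Optional


@dataclass
class Envelope:
    """Hypothetical shape of the uniform result each verb returns."""
    page: dict            # snapshot after the action, e.g. url and title (assumed fields)
    data: Any             # verb-specific payload
    changed: dict         # mechanical post-action diff
    error: Optional[str]  # None on success


def diff_state(before: dict, after: dict) -> dict:
    """Return only the keys whose values differ between two snapshots."""
    keys = before.keys() | after.keys()
    return {
        k: {"before": before.get(k), "after": after.get(k)}
        for k in sorted(keys)
        if before.get(k) != after.get(k)
    }


def make_envelope(before: dict, after: dict, data: Any = None,
                  error: Optional[str] = None) -> dict:
    """Wrap a verb's outcome in the four-key envelope."""
    env = Envelope(page=after, data=data,
                   changed=diff_state(before, after), error=error)
    return asdict(env)
```

Because the diff is computed from snapshots rather than reported by the verb itself, a click that changes nothing would yield an empty changed dict, which is the kind of signal a delegate can act on.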

web_search and web_browse delegates ran blind to their own act-trail, re-issuing the same searches until max_iterations=50; a live repro of a 2-part factual query burned 12.5 minutes / 123+ delegate LLM calls. Root cause was a flag conflation: both delegates set skip_transcript=True AND skip_input_row=True under a ‘clean context’ banner, but skip_input_row (HiddenInput) is the async-return mechanism, not a delegate property. The dual-True tripped the _setup uid guard, so _render_act_trail short-circuited to an empty string and get_user_prompt returned a constant, results-blind prompt every iteration. Fix: both delegate configs now set skip_transcript=False / skip_input_row=False so _setup writes a per-turn delegate:web_search / delegate:web_browse transcript row and assigns the uid the act-trail needs; suppress_history=True is kept for clean cross-turn context. PatternConfig.get_user_prompt gains a channel NOT LIKE 'delegate:%' guard so internal research loops don’t pollute behavioural pattern analysis.
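The failure mode and the pattern-analysis guard above can be sketched in a few lines. This is a simplified model, not the project's code: DelegateFlags, setup_uid, render_act_trail and pattern_rows are hypothetical stand-ins, the guard condition is an assumption inferred from the description, and the transcript table with a channel column is an assumed schema.

```python
import sqlite3
import uuid
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DelegateFlags:
    skip_transcript: bool
    skip_input_row: bool


def setup_uid(flags: DelegateFlags) -> Optional[str]:
    """Assumed guard: with both skips set, no row is written and no uid exists."""
    if flags.skip_transcript and flags.skip_input_row:
        return None
    return uuid.uuid4().hex


def render_act_trail(uid: Optional[str], actions: List[str]) -> str:
    """Without a uid there is nothing to look up, so the trail renders empty."""
    if uid is None:
        return ""
    return "\n".join(f"{i}. {a}" for i, a in enumerate(actions, 1))


def pattern_rows(conn: sqlite3.Connection) -> List[str]:
    """Select channels for pattern analysis, excluding delegate loops."""
    cur = conn.execute(
        "SELECT channel FROM transcript "
        "WHERE channel NOT LIKE 'delegate:%' ORDER BY channel"
    )
    return [row[0] for row in cur]
```

With both flags True the rendered trail is empty on every iteration, so the prompt never reflects earlier searches; with both False the numbered trail appears. The SQL filter keeps ordinary channels and drops any row whose channel starts with delegate:.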

memory.recall no longer silently fans out to document.search and schedule.search behind the model’s back. Explicit (model-invoked) recall now appends a one-line guardrail steering the model to call those tools on its own judgement, while the turn-0 auto-seed recall stays silent, with no guardrail and no fan-out. The turn-0 seed is also 30% narrower: it uses SEED_RADIUS_BASELINE = 0.35 (vs RECALL_RADIUS_BASELINE = 0.5), selected via caller='seed' in _handle_recall; explicit recalls stay at 0.5. The companion web_browse tool-summary rewrite (TKT-877) advertises the screenshot+vision affordance so the orchestrator no longer refuses to delegate: the delegate takes screenshots it inspects with its own vision, persisted as documents any vision-capable tool can view by doc_id.
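A minimal sketch of the caller-based split described above, assuming the radius and the guardrail are both keyed off the same caller value. The two constants come from the notes; pick_radius, format_recall and the guardrail wording are hypothetical.

```python
from typing import List, Optional

RECALL_RADIUS_BASELINE = 0.5   # explicit, model-invoked recall
SEED_RADIUS_BASELINE = 0.35    # turn-0 auto-seed recall

# Hypothetical wording; the real one-line guardrail is not quoted in the notes.
GUARDRAIL = "If documents or schedules are relevant, call those search tools yourself."


def pick_radius(caller: Optional[str] = None) -> float:
    """Use the narrower radius only for the turn-0 seed."""
    return SEED_RADIUS_BASELINE if caller == "seed" else RECALL_RADIUS_BASELINE


def format_recall(hits: List[str], caller: Optional[str] = None) -> str:
    """Append the guardrail to explicit recalls; keep the seed silent."""
    lines = list(hits)
    if caller != "seed":
        lines.append(GUARDRAIL)
    return "\n".join(lines)


# Relative narrowing of the seed radius: 1 - 0.35 / 0.5, about 0.30
narrowing = 1 - SEED_RADIUS_BASELINE / RECALL_RADIUS_BASELINE
```

The 30% figure is relative to the explicit baseline: the seed radius shrinks by 0.15 out of 0.5.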

Both compaction system prompts are rewritten as fixed-structure contracts. ChatHistoryCompactionSystemPrompt shrinks from 430 to 244 words as a single living-document (Person / Now / Holding / Open / Voice / Last, 200-400 token target, output consumed verbatim). ToolChainCompactionSystemPrompt replaces its free-form ‘dense paragraph’ instruction with a fixed four-part handover (Goal / Done / Failed / Next) plus a never-state-a-value-a-tool-did-not-return hallucination guard. Two contract feature tests pin the slimmed shape against the production entry points; both failed pre-rewrite, pass post.
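A contract test for the four-part handover could look roughly like the check below. It assumes each section is emitted on its own line as "Name: ...", which the notes do not specify; REQUIRED_SECTIONS mirrors the Goal / Done / Failed / Next order, and the function names are hypothetical.

```python
import re
from typing import List

REQUIRED_SECTIONS = ["Goal", "Done", "Failed", "Next"]


def section_headings(text: str) -> List[str]:
    """Collect line-leading 'Name:' labels in the order they appear."""
    return re.findall(r"^([A-Za-z]+):", text, flags=re.MULTILINE)


def follows_contract(text: str) -> bool:
    """True only when exactly the four sections appear, in order."""
    return section_headings(text) == REQUIRED_SECTIONS
```

A free-form dense paragraph has no line-leading labels and fails the check, as does output with a missing, extra, or reordered section.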

Voice model remediation hints now point to Settings, not the installer. Voice ONNX models stopped shipping with the installer in c3633e0c — downloads moved into RuntimeDepsService.enable_voice() triggered by the voice_enabled setting, but two user-facing ‘not ready’ hints (api/voice.py _loading_or_missing_response + voice_health) and the module docstring/comments still told users to re-run the installer. Both hints now say ‘Enable voice in Settings to download voice models.’ A follow-up doc sweep updated 04-ARCHITECTURE, 01-QUICK-START (removing a dead --disable-voice flag), and 03-WEB-INTERFACE TTS paragraph to match.

The MagicMock-saturation scope (TKT-646) is now fully resolved: 9 files were deleted earlier (mock-theater removed), test_memorystore_ttl was split out, and the 2 survivors are rewritten here against the real stack. test_pattern_match_processor now drives 3 real SubconsciousWorker._step_pattern_match() ticks and 51 real save_graph tool calls through the ACT loop (asserting 50 rows land in data_graph). test_document_api gains a real-stack POST /documents/upload feature test via authed_client (real Flask + SQLite + MemoryStore). Zero business-logic mocks remain; only the sanctioned LLM-provider seam and auth/filesystem isolation seams survive.

Three cleanup commits scrub internal-infra references from comments, docstrings, CHANGELOG, docs, and ability examples — stripping ticket IDs, internal tracker doc pointers, the local MCP host, and private test-harness references. Internal scenario filenames and numbered run/scenario IDs are reworded to descriptive prose; ‘nightly scenario/run/suite/harness’ becomes neutral ‘end-to-end scenario/suite’. All taskie MCP-server examples (taskie is open-source) and test fixtures are preserved verbatim. Separately, conftest’s _preload_db_bound_modules is removed — it imported the deleted services.dmn_service and silently swallowed the ModuleNotFoundError, making it a no-op since 2026-05-01; replaced with a breadcrumb comment documenting the per-test pattern that handles the underlying hazard.
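The conftest hazard above comes down to a swallowed import error. The sketch below shows how a preload helper can silently do nothing once one of its modules is deleted; it assumes a single try block around the whole batch, and preload_silently, preload_strict and the module name used in the usage note are hypothetical.

```python
import importlib
from typing import List


def preload_silently(names: List[str]) -> List[str]:
    """Import modules in order; a missing module aborts the batch unnoticed."""
    loaded: List[str] = []
    try:
        for name in names:
            importlib.import_module(name)
            loaded.append(name)
    except ModuleNotFoundError:
        pass  # the failure is swallowed, so callers never see it
    return loaded


def preload_strict(names: List[str]) -> List[str]:
    """Same loop without the except clause: a deleted module fails loudly."""
    loaded: List[str] = []
    for name in names:
        importlib.import_module(name)
        loaded.append(name)
    return loaded
```

Calling preload_silently(["no_such_pkg_for_demo.dmn_service", "json"]) returns an empty list without raising, while preload_strict on the same input raises ModuleNotFoundError at the first name.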

Browser tool rebuilt as 9 flat verbs on a persistent Playwright page; screenshots route through ingest_file() with source_type='screenshot' so the delegate’s vision tool reads them back; ~750 LOC deleted
web_search/web_browse delegate fix: skip_transcript=False / skip_input_row=False so _setup assigns the act-trail uid; live repro was 12.5min / 123+ LLM calls; PatternConfig now excludes delegate:% rows from pattern windows
memory.recall no longer silently fans out to document/schedule; explicit recall appends a guardrail steering the model to call those tools itself; turn-0 seed radius is 30% narrower (0.35 vs 0.5)
Both compaction prompts rewritten as fixed-structure contracts (history: 430→244 words as Person/Now/Holding/Open/Voice/Last; trail: Goal/Done/Failed/Next plus a hallucination guard)
Voice remediation hints now point to Settings (RuntimeDepsService.enable_voice()) instead of the installer; 3 stale docs updated
MagicMock-saturation scope closed: 2 surviving test files rewritten against the real stack with zero business-logic mocks; pytest -m unit: 1091 passed