June 17, 2026

Search gets parseable format; docstring cleanup epic wraps

The search ability got a major redesign (TKT-1067). Results now render as delimited, indexed XML records with score and date attributes, full untruncated summaries, and boundary-defanged content, so a model can parse and cite them reliably. A pure formatter (tools/search/render.py) replaces the minified JSON body. Relevance ranking is now a real per-result embedding cosine computed via the EmbeddingService, sorted best-first, and it fails open. Hybrid enrichment is bounded to blank-summary results and runs post-cut, concurrently, and SSRF-guarded via web_fetch.fetch_text. The schema simplifies to query-only: auto-routing and a global top-5 are now always on, and the provider/limit knobs are gone. All rich-media and image-search code is deleted: _build_rich_card from search.py and news.py, the og_image_service and image_candidate_service (with their tests), and image extraction in tools/search/transformers.py. Feature tests cover render formatting, real EmbeddingService ranking, image-key removal, and a full ToolDispatcher → SearchAbility integration, all with zero mocks.
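
As a rough illustration of the record format described above, here is a minimal sketch of a pure formatter. It is not the contents of tools/search/render.py: the tag names, the result dictionary keys, and the use of XML entity escaping as the defanging step are all assumptions made for the example.

```python
from xml.sax.saxutils import escape, quoteattr


def render_results(results):
    """Render results as indexed XML records (hypothetical layout).

    Each result is assumed to be a dict with title, url, summary,
    score, and date keys. Escaping the text fields is one way to
    defang content that contains record delimiters of its own.
    """
    lines = ["<search_results>"]
    for index, result in enumerate(results, start=1):
        attrs = " ".join(
            [
                f"index={quoteattr(str(index))}",
                f"score={quoteattr(format(result['score'], '.3f'))}",
                f"date={quoteattr(result['date'])}",
            ]
        )
        lines.append(f"<result {attrs}>")
        lines.append(f"<title>{escape(result['title'])}</title>")
        lines.append(f"<url>{escape(result['url'])}</url>")
        # The summary is emitted in full, with no truncation.
        lines.append(f"<summary>{escape(result['summary'])}</summary>")
        lines.append("</result>")
    lines.append("</search_results>")
    return "\n".join(lines)
```

Because the function only maps input data to a string, it can be tested without mocks: a summary containing a literal closing tag comes out as escaped text and cannot terminate its record early.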

The web_search/web_browse delegate tools got three sharp fixes. The goal↔query synonym confusion is healed via the key healer’s VARIANTS ladder instead of a hand-rolled delegate_goal() helper, scoped tightly to web_browse (the only tool declaring ‘goal’) so the 14 tools that declare ‘query’ are untouched. Each ability now reads its own param through the blessed Ability.param() accessor keyed off its own Keys constant, so there is no shared cross-ability extraction and each ability keeps a single responsibility (SRP). Descriptions now disambiguate: web_search researches online via search engines, while web_browse spawns a subagent with full browser control to act on specific sites. The abilities.sqlite and abilities_sha.json files are rebuilt deterministically.
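
The scoping behaviour can be sketched as follows. This is an assumed shape for a variants ladder, not the key healer's real data structure: the VARIANTS contents, the heal_keys name, and the declared_keys argument are hypothetical.

```python
# Hypothetical ladder: canonical key -> misnamed variants to accept.
VARIANTS = {"goal": ("query",)}


def heal_keys(declared_keys, args):
    """Rename a known variant to the canonical key the tool declares.

    A variant is only renamed when the tool declares the canonical key,
    the caller did not already supply it, and the tool does not declare
    the variant itself. Tools that declare 'query' are left untouched.
    """
    healed = dict(args)
    for canonical, variants in VARIANTS.items():
        if canonical not in declared_keys or canonical in healed:
            continue
        for variant in variants:
            if variant in healed and variant not in declared_keys:
                healed[canonical] = healed.pop(variant)
                break
    return healed
```

Under these assumptions, a web_browse call that arrives with 'query' is healed to 'goal', while a web_search call with 'query' passes through unchanged because that tool never declares 'goal'.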

The docstring-cleanup epic (TKT-994) closes out across roughly two dozen batches (TKT-1016 through TKT-1041), trimming docstrings across services, tests, and the broader codebase. The automated qwen agent had a rough patch (empty-while-body syntax errors, unterminated strings, stray em-dashes, partial edits on timeout), so each broken file got reverted and re-run via a sonnet repair pass. The final file in the epic was folder_watcher_worker.py (TKT-1041), a sonnet repair of an emptied while-loop body, which the AST gate marked SAFE.
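
One plausible form of such a gate is sketched below. It is an assumption about how a check like this could work, not the gate used in the epic: it parses both versions, discards docstrings, and reports SAFE only if the remaining syntax trees match. The function names are hypothetical.

```python
import ast

_DOC_OWNERS = (ast.Module, ast.ClassDef, ast.FunctionDef, ast.AsyncFunctionDef)


def _strip_docstrings(tree):
    """Remove leading docstring statements so only code is compared."""
    for node in ast.walk(tree):
        if not isinstance(node, _DOC_OWNERS) or not node.body:
            continue
        first = node.body[0]
        if (
            isinstance(first, ast.Expr)
            and isinstance(first.value, ast.Constant)
            and isinstance(first.value.value, str)
        ):
            # Keep the body non-empty so both sides normalize the same way.
            node.body = node.body[1:] or [ast.Pass()]
    return tree


def ast_gate(before, after):
    """Return 'SAFE' if only docstrings differ, otherwise 'UNSAFE'."""
    try:
        old = _strip_docstrings(ast.parse(before))
        new = _strip_docstrings(ast.parse(after))
    except SyntaxError:
        # Covers empty loop bodies and unterminated strings alike.
        return "UNSAFE"
    return "SAFE" if ast.dump(old) == ast.dump(new) else "UNSAFE"
```

An emptied while-loop body fails at the parse step, which matches the class of breakage described above; an edit that silently changes a statement parses fine but fails the tree comparison.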

  • Search results render as delimited XML records with full summaries and boundary-defanged content, formatted by tools/search/render.py

  • Real per-result embedding-cosine ranking via EmbeddingService, sorted best-first with fail-open; global top-5 after merge

  • Rich-media and image-search code fully deleted: _build_rich_card, og_image_service, image_candidate_service, and image extraction in transformers.py

  • web_search/web_browse fixes: goal/query confusion healed via the key-healer VARIANTS ladder (scoped to web_browse), per-ability param reads via Ability.param(), and a clear description split: search-engine research vs browser subagent

  • Docstring-cleanup epic (TKT-994) closes across ~24 batches, with every delivered file passing the AST safety gate (SAFE)

  • Several batches required revert+sonnet-repair after qwen timeouts or syntax errors (empty while bodies, em-dashes, unterminated strings, partial edits), all ultimately SAFE
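
For the ranking item in the summary above, a minimal sketch of embedding-cosine ranking with fail-open and a global top-5 might look like this. The embed callable is a hypothetical stand-in for the EmbeddingService, whose real interface is not shown here, and the result keys are assumed.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank_results(query, results, embed, top_k=5):
    """Score each result against the query and return the best top_k.

    `embed` is a hypothetical callable mapping text to a vector. If it
    raises, ranking fails open: the results come back in their original
    order, cut to top_k, rather than the search failing outright.
    """
    try:
        query_vec = embed(query)
        scored = [
            (cosine(query_vec, embed(r["summary"] or r["title"])), r)
            for r in results
        ]
    except Exception:
        return results[:top_k]
    # Sort on the score only, best first; ties keep their input order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [dict(r, score=s) for s, r in scored[:top_k]]
```

Applying the cut after scoring the merged list is what makes the top-5 global rather than per provider.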