June 8, 2026

Ollama cleanup lands, markdown escape hatch added

TKT-846 closed out its leftover smell: services/ollama_service

TKT-846 closed out its leftover smell: services/ollama_service.py was deleted (net -131 LOC). The file had become a helper bag straddling two concerns — _validate_model/_validate_host (one real prod caller, OllamaClient.init) and duplicated _parse_chat_response/_ollama_convert_messages whose only importers were two test files. The validators moved into llm_clients/ollama.py alongside their sole consumer, with the CodeQL sanitisation-barrier comments intact; OllamaClient.init now calls them locally. Five stale test imports across test_tool_call_id_uniqueness.py and test_vision_converters.py were repointed at the real prod functions, assertions unchanged. A provider-layer re-audit (anthropic/openai/gemini/llm_service/vision_service) confirmed only ollama carried this smell; behaviour preserved across 1064 unit tests.

The ARCHITECTURE doc still named services/ollama_service.py as if it existed, so the stale reference was scrubbed to reflect that validators live in llm_clients/ollama.py and converters are no longer duplicated.

The gemini id-uniqueness test was carrying a banned mock that was also completely inert: it patched services.llm_clients.gemini.time.time around two _accumulate_part calls, but gemini.py mints ids via uuid4().hex[:8] and never calls time.time() in _accumulate_part (time is only used for send() latency at lines 280/288). The with-patch wrapper and unused import are gone, the stale docstring now describes the real uuid4 contract, and the uniqueness assertion still drives the real production _accumulate_part end-to-end.

Models occasionally leak markdown despite the HTML-only system prompt, so markup.markdown_to_html() now provides a heuristic inline rewrite — bold, italic, under, code. Code spans are masked first so inner markers survive, and snake_case is guarded against false-underline matches. It fires at the single point the final response leaves the ACT loop (MessageProcessor._format_final_response), ahead of _record / write_assistant_row / post-turn hooks and the api-layer sanitize(), and is gated on broadcast_to==‘user’ so background channels (DMN, encoders, compaction) that emit JSON or plain text are never mangled. Four feature tests drive the real _loop→_record path with only the LLM boundary stubbed; 48 markup+message_processor regression tests green.

  • Deleted services/ollama_service.py (net -131 LOC); moved ollama validators into llm_clients/ollama.py with sanitisation-barrier comments intact

  • Repointed 5 stale test imports at the real prod functions in services.llm_clients.ollama

  • Removed inert banned time.time mock from gemini id-uniqueness test; uuid4().hex[:8] path is now the real contract

  • Added markup.markdown_to_html() heuristic fallback (bold/italic/underline/code) applied at _format_final_response

  • Gated on broadcast_to==‘user’ so JSON/plain-text background channels (DMN, encoders, compaction) are untouched

  • Unit gate: 1064 pass, 4 new feature tests, 48 markup+message_processor regressions green