June 12, 2026
Popular providers, durable tool-calls, fact extraction
Added a 'Popular Providers' tab to the Add Provider dialog, sourced from the models
Added a ‘Popular Providers’ tab to the Add Provider dialog, sourced from the models.dev catalog with 111 OpenAI-compatible presets. The backend auto-injects host URLs from catalog data on create, and all catalog providers route through the existing OpenAIService. A new GET /providers/catalog endpoint serves the catalog, and badge-catalog CSS handles dynamically-registered provider platforms.
Made tool-call recording fully durable and dropped the turn-end ephemeral purge that was orphaning rich-media span tags before chat.py could build segments. A boot migration drops the tool_calls.ephemeral sentinel, _purge_ephemeral_tool_calls is gone, and ToolCallService is deleted (zero production callers). The 25k count-based purge is replaced with a 7-day time-based janitor in DecayEngineService. History replay no longer enriches previous messages with tool calls, and review_tool_calls now filters narration rows via a shared NARRATION_TOOL constant promoted in message_processor.
Fixed a live-broadcast rich-media regression: _broadcast_turn_result was looking up the turn’s transcript id via get_recent(‘user’, limit=1), but by broadcast time the assistant reply is already persisted, so the lookup returned the assistant row while tool_calls key to the user-input row — segment fetch came back empty, span tag orphaned, rich cards degraded to plain text on the live path. Now resolves backwards from the assistant row via resolve_tool_call_transcript_ids, the same shared function the /conversation/recent refresh path uses. A new feature test drives _broadcast_turn_result against a real DB with the assistant row persisted first.
Inserted a fact-extraction step into the subconscious worker between consolidate and decay (eight steps total, documented in docs/04 with decay/pattern/DMN step references renumbered). The step drains episodes with facts_extracted_at IS NULL oldest-first at a hard 20-LLM-call budget per tick; per episode the model sees the gist plus top-10 similar data_graph facts and emits constrained ADD/UPDATE/DELETE/NOOP ops via one prompt shared across all tiers. Unparseable output is a counted NOOP with a WARN — never a write. Bi-temporal supersession now stamps the old row’s valid_to from the same instant as the new row’s valid_from (single now_iso, equality by construction), alongside active=0 and the rw*0.5 fast-decay tombstone. New DataGraphService.upsert_fact and invalidate exact-key methods carry the op pipeline. 7 zero-mock feature tests drive the step end-to-end; gate 1439/0/151.
Recalibrated the turn-zero flashback gate, also documented in docs/04: _CONTINUATION_SIMILARITY_THRESHOLD moves 0.82 → 0.55 (corpus median for gte-modernbert-base, cross-linked to embedding_service._MODEL_ID; measured on-topic 0.70-0.89, topic shift 0.28-0.38). A new _is_unresolvable_terse rule treats terse messages with no living-doc Now section in a running conversation as continuation (skip), while terse first messages still fire. _EPISODE_OVERFETCH_FACTOR names the supers-preferred over-fetch multiplier. 6 feature tests pin the session-start/continuation/topic-shift/terse/curated-render/explicit-memory-recall contract end-to-end (zero mocks, real dispatcher + DB). Separately, Operational Principle 4 in the prompt mandates a current-turn tool call for any time-sensitive fact and declares prior-turn answers stale, to counter fabricated weather/news answers.
-
111 OpenAI-compatible provider presets from models.dev served via new GET /providers/catalog; new ‘Popular Providers’ tab in Add Provider dialog
-
Durable tool-call recording: ephemeral flag dropped via boot migration, ToolCallService deleted, 25k count purge replaced with 7-day time-based janitor in DecayEngineService
-
Live-broadcast rich-media fix: resolve_tool_call_transcript_ids replaces get_recent(‘user’) lookup that returned the assistant row after persistence
-
New fact-extraction worker step between consolidate and decay with 20-LLM-call budget, constrained ADD/UPDATE/DELETE/NOOP ops, and bi-temporal supersession (valid_to = new valid_from, single now_iso)
-
Turn-0 flashback gate recalibrated 0.82 → 0.55; _is_unresolvable_terse skips terse continuations without a Now section
-
Operational Principle 4: time-sensitive facts require a tool call this turn; prior-turn answers declared stale