June 6, 2026

Path-only uploads, vision pipeline, ability getter refactor

Killed the base64 act-trail bloat that was blowing the context window. document.upload now takes a path (~50 chars), never bytes — ingest_file is the single mechanical, path-only ingest shared by both upload surfaces, and upload is dispatched by code only (kept out of the LLM INPUT_SCHEMA). /chat became multipart/form-data with raw files staged to temp paths, the standalone /upload endpoint was deleted, and the composer now holds raw File objects and POSTs them multipart — no pre-upload round-trip, no spinner. The high-deliberation thinking pass was also moved to fire AFTER the upload barrier in _seed_turn_zero (a→b→c), so its snapshot already carries each upload’s act-trail row.
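As a rough illustration of the path-only idea, the sketch below stages raw bytes to a temp file and passes only the path onward. The function bodies, return shapes, and the `act_trail_row` helper are assumptions made for this example; only the names `ingest_file` and `document.upload` come from the notes above, and the real signatures may differ.

```python
import hashlib
import os
import tempfile


def stage_to_temp_path(raw: bytes, filename: str) -> str:
    """Write uploaded bytes to a temp file and return only its path."""
    suffix = os.path.splitext(filename)[1]
    fd, path = tempfile.mkstemp(suffix=suffix)
    with os.fdopen(fd, "wb") as fh:
        fh.write(raw)
    return path


def ingest_file(path: str) -> dict:
    """Hypothetical path-only ingest: reads from disk, never accepts bytes."""
    with open(path, "rb") as fh:
        digest = hashlib.sha256(fh.read()).hexdigest()
    return {"path": path, "hash": digest, "size": os.path.getsize(path)}


def act_trail_row(path: str) -> str:
    """Hypothetical act-trail row: records the short path, not the payload."""
    return f"document.upload path={path}"
```

However large the file is, the row stays roughly the length of the path, which is the property that keeps the act-trail out of the context budget.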

Chat attachments now persist across page refresh via a new transcript_docs(transcript_id, doc_id) M2M table with ON DELETE CASCADE. transcript_service.link_transcript_doc is an idempotent INSERT OR IGNORE, scoped to the user-attachment seed point only (a model-issued document.upload mid-turn does not render as a user attachment). get_recent_history batch-joins through transcript_docs→documents (soft-deleted rows filtered out) and serves msg["attachments"] with /documents//preview URLs.
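The link table and its idempotent insert might look roughly like the following, using the standard-library sqlite3 module. The `transcripts` and `documents` column layouts (including `deleted_at` as the soft-delete marker), the composite primary key, and the `attachments_for` helper are assumptions for illustration; only `transcript_docs(transcript_id, doc_id)`, ON DELETE CASCADE, and INSERT OR IGNORE come from the notes.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys (and cascades) when this is enabled.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript("""
        CREATE TABLE transcripts (id INTEGER PRIMARY KEY);
        CREATE TABLE documents (id INTEGER PRIMARY KEY, deleted_at TEXT);
        CREATE TABLE transcript_docs (
            transcript_id INTEGER NOT NULL
                REFERENCES transcripts(id) ON DELETE CASCADE,
            doc_id INTEGER NOT NULL
                REFERENCES documents(id) ON DELETE CASCADE,
            PRIMARY KEY (transcript_id, doc_id)
        );
    """)
    return conn


def link_transcript_doc(conn, transcript_id: int, doc_id: int) -> None:
    """Idempotent link: a repeated call hits the primary key and is ignored."""
    conn.execute(
        "INSERT OR IGNORE INTO transcript_docs (transcript_id, doc_id) "
        "VALUES (?, ?)",
        (transcript_id, doc_id),
    )


def attachments_for(conn, transcript_ids: list) -> dict:
    """Batch-join links to live documents, grouped by transcript id."""
    if not transcript_ids:
        return {}
    marks = ",".join("?" * len(transcript_ids))
    rows = conn.execute(
        "SELECT td.transcript_id, d.id FROM transcript_docs td "
        "JOIN documents d ON d.id = td.doc_id "
        f"WHERE td.transcript_id IN ({marks}) AND d.deleted_at IS NULL "
        "ORDER BY td.transcript_id, d.id",
        list(transcript_ids),
    ).fetchall()
    out = {}
    for transcript_id, doc_id in rows:
        out.setdefault(transcript_id, []).append(doc_id)
    return out
```

One query covers every message in the history window, so rendering attachments does not cost a lookup per message.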

Built out the vision delegate framework (TKT-838): abilities/vision.py with a get_image hook and _resolve vision branch, vision registered as a delegate tool with chat:vision in policy defaults, and turn-0 attachment uploads now fan out at a ThreadPoolExecutor barrier. Image extraction routes through describe_image() (vision description + OCR fallback); textless images with no vision provider persist 'ready', not 'failed'; upload results carry id=/hash=. A NULL file_path guard plus a dropped-OCR log line harden the vision path.
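The fan-out barrier can be sketched with concurrent.futures as below. `upload_one` and `upload_all` are hypothetical stand-ins, not the project's functions; the point is only that leaving the `with` block waits for every worker, so nothing downstream runs until all uploads have finished.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List


def upload_one(path: str) -> Dict[str, str]:
    """Hypothetical stand-in for a single path-only upload."""
    return {"path": path, "status": "ready"}


def upload_all(paths: List[str], max_workers: int = 4) -> List[Dict[str, str]]:
    """Run uploads in parallel and return only once all have completed."""
    if not paths:
        return []
    workers = min(max_workers, len(paths))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() yields results in input order; exiting the with-block
        # joins every worker, which is what makes this a barrier.
        return list(pool.map(upload_one, paths))
```

Because `map()` preserves input order, the results line up with the attachments as the user added them, regardless of which upload finished first.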

Refactored the Ability contract (TKT-837): the 5 metadata ClassVars (NAME/SUMMARY/EXAMPLES/SEARCH_TOOLTIP/INPUT_SCHEMA) became 5 zero-arg abstract getters, and @typing.final get_input_schema() is now the single assembler for the sealed framework fields (act_summary required, async iff SUPPORTS_ASYNC, on a deepcopy). mp is constructor-injected so getters can read self.mp and fall back to deterministic base text at build/index time; _MCPAbility implements the getters from its remote schema so MCP tools get framework-field parity with native abilities. The TKT-833 thinking pre-pass now mirrors the parent turn exactly — verbatim user prompt (Previous Messages history included) plus a snapshot of parent.active_tools, with the deliberation system prompt as the sole delta, parent-config-agnostic.
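A minimal sketch of the getter-based contract follows. The getter names (`get_name`, `get_base_input_schema`, and so on), the JSON-schema shape of the framework fields, and the `EchoAbility` subclass are all assumptions for illustration; the notes only establish five zero-arg abstract getters, a `@typing.final get_input_schema()`, a required `act_summary`, an `async` field present only when `SUPPORTS_ASYNC` is set, and assembly on a deepcopy.

```python
import copy
import typing
from abc import ABC, abstractmethod


class Ability(ABC):
    SUPPORTS_ASYNC: typing.ClassVar[bool] = False

    def __init__(self, mp=None):
        self.mp = mp  # constructor-injected; may be None at build/index time

    @abstractmethod
    def get_name(self) -> str: ...

    @abstractmethod
    def get_summary(self) -> str: ...

    @abstractmethod
    def get_examples(self) -> list: ...

    @abstractmethod
    def get_search_tooltip(self) -> str: ...

    @abstractmethod
    def get_base_input_schema(self) -> dict: ...  # hypothetical getter name

    @typing.final
    def get_input_schema(self) -> dict:
        """Single assembler: adds framework fields on a deepcopy."""
        schema = copy.deepcopy(self.get_base_input_schema())
        props = schema.setdefault("properties", {})
        required = schema.setdefault("required", [])
        props["act_summary"] = {"type": "string"}
        if "act_summary" not in required:
            required.append("act_summary")
        if self.SUPPORTS_ASYNC:
            props["async"] = {"type": "boolean"}
        else:
            props.pop("async", None)
        return schema


ECHO_SCHEMA = {
    "type": "object",
    "properties": {"text": {"type": "string"}},
    "required": ["text"],
}


class EchoAbility(Ability):
    """Hypothetical ability used only to exercise the assembler."""

    def get_name(self) -> str:
        return "echo"

    def get_summary(self) -> str:
        return "Echo the given text."

    def get_examples(self) -> list:
        return ["echo(text='hi')"]

    def get_search_tooltip(self) -> str:
        return "Echoing"

    def get_base_input_schema(self) -> dict:
        return ECHO_SCHEMA
```

Working on a deepcopy means a subclass can return a shared constant from its getter without the framework fields leaking into it across calls.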

Smaller fixes: the composer context indicator oscillated because usage_class='chat' also matched thinking and delegate sub-requests; it is now keyed off job_name='user:user' for stability. .dock-controls was widened to 100% at the ≤640px breakpoint to match the input box. read() now resolves source/url/path/file aliases and emits a diagnostic error echoing the received keys plus a hint when source is missing, so the model self-corrects instead of looping. Delegate ProcessorConfigs (VisionConfig, WebSearchConfig, WebBrowseConfig) moved out of their ability files into configs/channels/, imported directly by each ability.
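The alias handling in read() could be approximated as follows. The `resolve_source` helper, the precedence order among aliases, and the exact error wording are assumptions; the notes only say that source/url/path/file are accepted and that the error echoes the received keys with a hint.

```python
SOURCE_ALIASES = ("source", "url", "path", "file")


def resolve_source(args: dict) -> dict:
    """Return the first non-empty alias, or a diagnostic error."""
    for key in SOURCE_ALIASES:
        value = args.get(key)
        if isinstance(value, str) and value.strip():
            return {"ok": True, "source": value.strip()}
    return {
        "ok": False,
        "error": (
            "read() requires 'source'. "
            f"Received keys: {sorted(args)}. "
            "Hint: pass source=<url or file path>."
        ),
    }
```

Echoing the keys gives the model something concrete to correct on its next call, rather than a bare failure it may simply retry.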

  • document.upload is now path-only (no base64 in act-trail); /chat is multipart/form-data; /upload endpoint deleted

  • transcript_docs M2M link table persists chat attachments across refresh via /documents//preview URLs

  • Turn-0 attachment uploads fan out at a ThreadPoolExecutor barrier; vision delegate registered with chat:vision policy default

  • Thinking pre-pass moved to fire AFTER upload barrier; mirrors parent turn (verbatim user prompt + active_tools snapshot)

  • Ability metadata switched from 5 ClassVars to 5 zero-arg abstract getters with a sealed @typing.final get_input_schema() assembler

  • read() accepts source/url/path/file aliases and returns a diagnostic error listing received keys; composer context indicator keyed off job_name='user:user'