Zero-Config Voice, Cognitive Legibility, and a Testing Blitz

Zero-Config Local Voice

A major piece of work landed today: a built-in, local voice service that runs entirely within Docker. It replaces the old pattern of configuring external TTS/STT endpoints. Voice capabilities are now available whenever the container is running, with no settings and no API keys.

Getting this to run reliably took a few changes. The default Whisper model is now the smaller base model, which fits comfortably within a 4 GB Docker VM; users with more RAM can opt into a larger one. We also fixed the build to pull the correct dependencies, increased proxy timeouts to accommodate slower CPU-based transcription, and changed the default TTS voice to Jasper to better match the project’s brand.
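To illustrate the "small default, optional scale-up" idea, here is a minimal sketch in Python. The post does not say how the model size is actually configured, so the environment variable name VOICE_WHISPER_MODEL and the function name are assumptions made for this example; only the Whisper size names themselves are standard.

```python
import os

# Hypothetical names: the post does not describe the real configuration knob.
DEFAULT_MODEL = "base"  # the default size named in the post
ALLOWED_MODELS = ("tiny", "base", "small", "medium", "large")


def resolve_whisper_model(env=None):
    """Return the Whisper model size to load, falling back to the default.

    `env` is any mapping of settings; it defaults to the process environment.
    Unknown or empty values fall back to DEFAULT_MODEL instead of failing,
    so the service still starts with zero configuration.
    """
    env = os.environ if env is None else env
    requested = env.get("VOICE_WHISPER_MODEL", "").strip().lower()
    if requested in ALLOWED_MODELS:
        return requested
    return DEFAULT_MODEL
```

Falling back silently keeps the zero-config promise: a typo in an optional setting degrades to the default model rather than breaking voice entirely.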

Making Chalie’s Thinking Visible

A core theme today was “cognitive legibility”—making it obvious what Chalie is doing under the hood. The main expression of this is a new “persistent task strip” in the chat UI. When Chalie is working on a background task, a small status area appears showing the goal, progress, and a summary. This was a two-part effort: building the UI component and then fixing a bug in the event pipeline that was preventing the progress updates from reaching the frontend.
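As a rough picture of what a progress update for the task strip might carry, the sketch below models the three fields mentioned above (goal, progress, summary) and serializes them for delivery to the frontend. The class name, the task_id field, the "task_progress" event type, and the JSON shape are all assumptions for illustration, not Chalie’s actual event schema.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class TaskStripUpdate:
    """Hypothetical payload for one task strip refresh."""
    task_id: str
    goal: str
    progress: float  # fraction complete, expected in the range 0.0 to 1.0
    summary: str

    def to_event(self) -> str:
        """Serialize to a JSON event string, clamping progress into [0, 1]."""
        payload = asdict(self)
        payload["progress"] = min(1.0, max(0.0, self.progress))
        return json.dumps({"type": "task_progress", "data": payload})
```

Clamping at the serialization boundary means a miscounted step in a background worker cannot push the progress indicator past 100% in the UI.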

In the same spirit, the Brain dashboard got a major overhaul. The single “Cognition” tab has been expanded into six distinct views (Jobs, Thinking, Memory, Tools, Identity, Working On) that surface Chalie’s internal state in a human-readable format. This work, combined with smaller improvements to feedback clarity, aims to demystify the agent’s actions and decisions.

Smarter Tools with OAuth and Learning Loops

The tool system got two major upgrades. First, we’ve added OAuth support, which is a foundational piece for securely integrating with third-party tools that require user-delegated permissions.

Second, and more importantly, we closed the performance feedback loop. We fixed a couple of silent bugs that were preventing tool performance data from being recorded. Now, that data is wired all the way back into the tool selection process, allowing Chalie to learn from experience and evolve its strategies over time rather than relying on static logic.
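One simple way to picture a closed feedback loop is a selector that records each tool outcome and prefers tools with a better track record. The sketch below is an assumption-laden illustration, not Chalie’s selection logic: the class name, the success/failure signal, and the greedy smoothed-success-rate scoring are all invented for this example.

```python
from collections import defaultdict


class ToolStats:
    """Hypothetical record of tool outcomes used to rank candidates."""

    def __init__(self):
        self._successes = defaultdict(int)
        self._attempts = defaultdict(int)

    def record(self, tool, succeeded):
        """Store the outcome of one tool invocation."""
        self._attempts[tool] += 1
        if succeeded:
            self._successes[tool] += 1

    def score(self, tool):
        """Smoothed success rate; a tool with no history scores 0.5."""
        return (self._successes[tool] + 1) / (self._attempts[tool] + 2)

    def pick(self, candidates):
        """Greedily choose the candidate with the highest score."""
        return max(candidates, key=self.score)


stats = ToolStats()
stats.record("search", True)
stats.record("search", True)
stats.record("scrape", False)
best = stats.pick(["scrape", "calc", "search"])
```

The add-one smoothing gives untried tools a neutral score, so a single early failure does not permanently rank a tool below ones that have never been used. A production selector would likely also weigh latency, cost, or exploration, none of which the post specifies.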

Massive Test Suite Expansion

Finally, we invested heavily in stability and quality assurance. We methodically grew the test suite from a freshly stabilized baseline of 542 to a final count of 945 passing tests, a net gain of 403. This involved adding comprehensive coverage for workers, services, and API endpoints. As part of this, we also made a deliberate decision to remove over 100 tool-specific handler tests. The framework is tool-agnostic, so our tests should focus on the core infrastructure, not the implementation details of individual, encapsulated tools.
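To show what a tool-agnostic test can look like, the sketch below exercises a dispatch layer with a stub handler rather than any real tool. ToolRegistry and its methods are hypothetical stand-ins written for this example; they are not the project’s actual classes.

```python
class ToolRegistry:
    """Hypothetical minimal registry that routes calls to tool handlers."""

    def __init__(self):
        self._handlers = {}

    def register(self, name, handler):
        self._handlers[name] = handler

    def dispatch(self, name, payload):
        if name not in self._handlers:
            raise KeyError(f"unknown tool: {name}")
        return self._handlers[name](payload)


def test_dispatch_routes_payload_to_registered_handler():
    registry = ToolRegistry()
    calls = []
    # The stub records its input and returns a fixed value; no real tool is involved.
    registry.register("stub", lambda payload: calls.append(payload) or "ok")
    assert registry.dispatch("stub", {"q": 1}) == "ok"
    assert calls == [{"q": 1}]


def test_dispatch_rejects_unknown_tool():
    registry = ToolRegistry()
    try:
        registry.dispatch("missing", {})
    except KeyError:
        return
    raise AssertionError("expected KeyError for an unregistered tool")
```

Tests of this shape keep passing when an individual tool changes its internals, which is the property the removal of handler-specific tests is meant to preserve.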