February 27, 2026
Cognitive Reflexes and Structured Task Planning
A major update introducing learned ‘reflexes’ to bypass the main cognitive loop for common queries, alongside a new system for decomposing complex tasks into structured plans.
Building Automaticity and Deeper Planning
Today’s big push was on evolving the core cognitive architecture. The most exciting new feature is a ‘cognitive reflex’ system that mirrors human automaticity. For simple, repeated queries, the system now builds semantic clusters of user intent. Once a cluster has enough evidence of success (and evidence that the full cognitive pipeline was overkill), future similar queries take a learned ‘fast path,’ using a lightweight LLM call to respond instantly. This is a huge step towards making the system feel faster and more efficient, saving expensive processing for novel problems.
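The gating logic can be sketched roughly like this. This is a simplified illustration, not the shipped implementation: the class and method names are hypothetical, the thresholds are made up, and the hash-based cluster key stands in for real semantic clustering (a production system would embed the query and match it to a cluster centroid).

```python
import hashlib
from dataclasses import dataclass


@dataclass
class IntentCluster:
    """Accumulates evidence that a class of queries can skip the full pipeline."""
    successes: int = 0
    attempts: int = 0

    def record(self, success: bool) -> None:
        self.attempts += 1
        if success:
            self.successes += 1

    def is_reflex(self, min_attempts: int = 5, min_rate: float = 0.9) -> bool:
        # Illustrative thresholds: enough history, and a high success rate
        # on the fast path, before we trust the reflex.
        if self.attempts < min_attempts:
            return False
        return self.successes / self.attempts >= min_rate


class ReflexRouter:
    def __init__(self) -> None:
        self.clusters: dict[str, IntentCluster] = {}

    def _cluster_key(self, query: str) -> str:
        # Stand-in for semantic clustering; a real system would use embeddings.
        return hashlib.sha1(query.strip().lower().encode()).hexdigest()[:8]

    def route(self, query: str) -> str:
        cluster = self.clusters.setdefault(self._cluster_key(query), IntentCluster())
        return "fast_path" if cluster.is_reflex() else "full_pipeline"
```

A new query defaults to the full pipeline; once its cluster has accumulated enough successful fast-path evidence, similar queries get the lightweight LLM call instead.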
In parallel, we’ve overhauled how persistent tasks are managed. Instead of a simple, reactive action loop, tasks can now be decomposed by an LLM into a structured Directed Acyclic Graph (DAG) of steps. This allows for genuine, multi-step planning and execution, moving us closer to agents that can tackle more complex, long-term goals.
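Once the LLM has emitted a plan as a DAG, executing it reduces to a topological ordering of the steps. A minimal sketch, assuming the plan arrives as a mapping from each step to its prerequisite steps (the representation here is an assumption, not the actual plan schema):

```python
from collections import deque


def topological_order(steps: dict[str, list[str]]) -> list[str]:
    """Return an execution order for a plan mapping step -> prerequisites.

    Raises ValueError if the plan contains a cycle (i.e. is not a DAG).
    """
    indegree = {step: len(deps) for step, deps in steps.items()}
    dependents: dict[str, list[str]] = {step: [] for step in steps}
    for step, deps in steps.items():
        for dep in deps:
            dependents[dep].append(step)

    ready = deque(step for step, n in indegree.items() if n == 0)
    order: list[str] = []
    while ready:
        step = ready.popleft()
        order.append(step)
        for nxt in dependents[step]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                ready.append(nxt)

    if len(order) != len(steps):
        raise ValueError("cycle detected: plan is not a DAG")
    return order
```

Steps with no remaining prerequisites become eligible to run, which also makes it easy to execute independent branches of the plan in parallel.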
Finally, we refined the learning mechanism. The reward system is now more nuanced, distinguishing between a tool failing on its own and failing due to an external factor like a rate limit. External failures now incur a much smaller penalty, ensuring the system doesn’t unfairly punish a useful tool just because an upstream API was temporarily unavailable.
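In spirit, the penalty adjustment looks something like the following. The function name, penalty values, and discount factor are all illustrative assumptions; only the shape (external failures incur a fraction of the normal penalty) reflects the change described above.

```python
def tool_reward(
    success: bool,
    external_failure: bool = False,
    failure_penalty: float = -1.0,   # assumed baseline, not the shipped value
    external_discount: float = 0.2,  # assumed discount for e.g. rate limits
) -> float:
    """Reward signal for a tool invocation.

    A tool that fails because of an upstream issue (rate limit, outage)
    receives only a small fraction of the normal failure penalty.
    """
    if success:
        return 1.0
    if external_failure:
        return failure_penalty * external_discount
    return failure_penalty
```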
Routing and Input Hygiene
Work continued on the ‘front door’ of the system where incoming messages are triaged. We completed a significant refactor of the message routing service, deduplicating how we collect NLP signals and simplifying the logic for choosing a processing mode. This not only makes the code cleaner but also adds better traceability by tagging each routing decision with its source.
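The source-tagging idea can be sketched as a small decision record. Everything here is hypothetical (the mode names, signal keys, and thresholds are invented for illustration); the point is simply that every routing decision carries the name of the rule that produced it.

```python
from dataclasses import dataclass
from enum import Enum


class Mode(Enum):
    FAST = "fast"
    FULL = "full"


@dataclass(frozen=True)
class RoutingDecision:
    mode: Mode
    source: str  # which rule produced this decision, for traceability


def choose_mode(signals: dict) -> RoutingDecision:
    # Simplified single-pass rule chain; each branch tags its source.
    if signals.get("is_command"):
        return RoutingDecision(Mode.FAST, source="command_detector")
    if signals.get("intent_confidence", 0.0) >= 0.8:
        return RoutingDecision(Mode.FAST, source="intent_classifier")
    return RoutingDecision(Mode.FULL, source="default_fallback")
```

When a message is routed surprisingly, the `source` field says exactly which rule fired, rather than leaving you to reverse-engineer the branch logic from logs.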
We also squashed a bug where the LLM would occasionally respond with JSON-formatted text instead of natural language. The cause was simple: our internal tool markers in the action history ([TOOL:name]...[/TOOL]) were being passed to the model, which it then mimicked. Stripping these markers from the context before the LLM call immediately cleaned up the output.
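The fix amounts to a small sanitization pass over the action history before prompt assembly. A minimal sketch (the function name is invented; the marker syntax matches the one described above):

```python
import re

# Matches both opening markers like [TOOL:search] and the closing [/TOOL].
TOOL_MARKER = re.compile(r"\[/?TOOL(?::[^\]]+)?\]")


def strip_tool_markers(text: str) -> str:
    """Remove internal [TOOL:name]...[/TOOL] delimiters from LLM context.

    The tool output itself is kept; only the structured markers the model
    was mimicking are removed.
    """
    return TOOL_MARKER.sub("", text)
```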
Stability and Maintenance
We tracked down and fixed a critical bug that was causing HTTP 500 errors on all observability endpoints. A raw psycopg2 connection object was being used incorrectly throughout our system status modules. Correcting the database access pattern to use a proper cursor has restored full visibility into the system’s health and memory usage.
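The broken and corrected access patterns look roughly like this. psycopg2 follows the DB-API 2.0 cursor protocol, so the sketch below uses the stdlib `sqlite3` module (which implements the same protocol) to stay self-contained; the query and function name are illustrative.

```python
import sqlite3


def fetch_status(conn) -> list[tuple]:
    # Broken pattern (the cause of the 500s): treating the raw connection
    # as if it could run queries directly, e.g. `conn.execute(...)` on a
    # psycopg2 connection, which has no such method.
    #
    # Correct DB-API 2.0 pattern: open a cursor, execute, fetch, close.
    cur = conn.cursor()
    try:
        cur.execute("SELECT 1, 'ok'")
        return cur.fetchall()
    finally:
        cur.close()


conn = sqlite3.connect(":memory:")
print(fetch_status(conn))  # [(1, 'ok')]
```

Note that `sqlite3` connections happen to offer `conn.execute()` as a convenience shortcut, but raw psycopg2 connections do not, which is exactly why the incorrect pattern blew up at runtime.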
We also shipped fixes to handle API rate limits more gracefully and to prevent the system from scheduling duplicate background tasks. As a minor bit of housekeeping, we’ve deprecated gemini-2.0-flash.
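Duplicate-task prevention typically comes down to an atomic check-and-register step before scheduling. A minimal sketch under assumed semantics (the class and method names are hypothetical, and real systems would persist this state rather than hold it in memory):

```python
import threading


class DedupScheduler:
    """Schedules a background task only if it is not already pending."""

    def __init__(self) -> None:
        self._pending: set[str] = set()
        self._lock = threading.Lock()

    def schedule(self, task_id: str) -> bool:
        """Return True if the task was scheduled, False if it was a duplicate."""
        with self._lock:
            if task_id in self._pending:
                return False
            self._pending.add(task_id)
            return True

    def complete(self, task_id: str) -> None:
        """Mark a task finished so it can be scheduled again later."""
        with self._lock:
            self._pending.discard(task_id)
```

Holding the lock across both the membership check and the insert is what closes the race: two callers can no longer both observe the task as absent and schedule it twice.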