March 13, 2026

Major Refactor: Killing Memory Chunker

The memory chunker worker has been removed, reducing the codebase by approximately 3,000 lines.

Trait extraction is now handled by a dedicated asynchronous LLM call with a simplified prompt, served by a smaller 4B model.
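As a rough illustration of the new shape of this path, here is a minimal sketch of a single async LLM call with a simplified prompt. The function names, prompt wording, and JSON reply format are assumptions for illustration, not the project's actual code.

```python
import asyncio
import json

# Hypothetical simplified prompt; the real prompt is not shown in this post.
TRAITS_PROMPT = (
    "Extract up to 3 personality traits from the conversation below. "
    "Reply with a JSON list of strings.\n\n{conversation}"
)

async def extract_traits(llm_call, conversation: str) -> list:
    """Fire one async LLM call and parse the reply into a trait list."""
    raw = await llm_call(TRAITS_PROMPT.format(conversation=conversation))
    try:
        traits = json.loads(raw)
    except json.JSONDecodeError:
        return []  # a small 4B model can emit malformed JSON; fail soft
    return [t for t in traits if isinstance(t, str)][:3]

async def fake_llm(prompt: str) -> str:
    # Stand-in for the real model client, so the sketch is runnable.
    return '["curious", "direct"]'

traits = asyncio.run(extract_traits(fake_llm, "user: why does this work?"))
```

Because the call is fire-and-forget from the pipeline's point of view, a parse failure simply yields no traits for that pass rather than blocking the worker.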

Communication style metrics are now computed deterministically by the StyleMetricsService, with no LLM usage and a runtime of about 1 ms.
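The key point is that these metrics are pure string arithmetic. A minimal sketch of what such a deterministic service might compute (the specific metrics and names here are assumptions, not the actual StyleMetricsService API):

```python
import re
from dataclasses import dataclass

@dataclass
class StyleMetrics:
    avg_sentence_len: float  # words per sentence
    question_ratio: float    # fraction of sentences ending in "?"
    exclaim_ratio: float     # fraction of sentences ending in "!"

def compute_style_metrics(text: str) -> StyleMetrics:
    # Pure string processing: no model call, so it runs in roughly
    # microseconds to a millisecond even on long inputs.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    if not sentences:
        return StyleMetrics(0.0, 0.0, 0.0)
    n = len(sentences)
    words = sum(len(s.split()) for s in sentences)
    return StyleMetrics(
        avg_sentence_len=words / n,
        question_ratio=sum(s.endswith("?") for s in sentences) / n,
        exclaim_ratio=sum(s.endswith("!") for s in sentences) / n,
    )

m = compute_style_metrics("How are you? I am fine. Great!")
```

Swapping an LLM call for this kind of function is what turns a multi-second, token-billed step into a ~1 ms one.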

Gists and Facts have been eliminated from the pipeline; raw conversation turns in working memory now serve as the replacement.
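Conceptually, working memory becomes a bounded buffer of verbatim turns that downstream consumers read directly. A minimal sketch under that assumption (the class name and cap are illustrative, not the project's actual code):

```python
from collections import deque

class WorkingMemory:
    """Keeps the most recent raw turns verbatim; no gist/fact derivation step."""

    def __init__(self, max_turns: int = 50):
        self._turns = deque(maxlen=max_turns)  # oldest turns fall off the end

    def add_turn(self, role: str, text: str) -> None:
        self._turns.append({"role": role, "text": text})

    def context(self) -> list:
        # Consumers read the turns themselves instead of derived Gists/Facts.
        return list(self._turns)

wm = WorkingMemory(max_turns=3)
for i in range(5):
    wm.add_turn("user", f"turn {i}")
```

The trade-off is straightforward: raw turns cost more context tokens than summaries, but they delete an entire extraction stage and the bugs that came with it.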

Emotion and Scope extraction are simplified; emotion is extracted independently by the episode worker, and boundary detection plus idle signals replace scope-based consolidation triggers.
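The new trigger logic can be sketched as a single predicate over the last turn: consolidate on a detected boundary or after an idle gap. The threshold, marker list, and function name below are assumptions for illustration, not the project's actual heuristics.

```python
IDLE_SECONDS = 300  # assumed idle threshold, not the project's actual value
BOUNDARY_MARKERS = ("anyway", "moving on", "new topic")  # illustrative only

def should_consolidate(last_turn_text: str, last_turn_at: float, now: float) -> bool:
    """Trigger consolidation on a topic boundary or an idle gap,
    replacing the old scope-based trigger."""
    idle = (now - last_turn_at) >= IDLE_SECONDS
    boundary = any(m in last_turn_text.lower() for m in BOUNDARY_MARKERS)
    return idle or boundary
```

Both signals are cheap to evaluate on every turn, which is what lets them replace a heavier scope-extraction pass.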

The trait service has been streamlined, dropping parameters like source and is_literal, and collapsing trait categories from eight to three.

The episodic memory pipeline now triggers observation based on a minimum turn count of five, instead of relying on signal density from memory chunks.
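The trigger itself reduces to a threshold check on the turn counter. A sketch, with illustrative names (the real constant and function names are not shown in this post):

```python
MIN_TURNS_FOR_OBSERVATION = 5  # the minimum turn count described above

def should_observe(turn_count: int) -> bool:
    """Fire episode observation once enough raw turns have accumulated,
    instead of scoring signal density over memory chunks."""
    return turn_count >= MIN_TURNS_FOR_OBSERVATION
```

A plain counter is trivially testable and has no tuning surface, unlike a signal-density score.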

This refactor touched 71 files across the codebase, with all 1,549 tests passing.

  • Migration 002 fixes a schema mismatch on fresh installs that previously raised sqlite3.OperationalError.

  • Migration 007 drops the unused semantic_schemas and cognitive_reflexes tables.

  • Removed dead table definitions and unused columns from schema.sql.

  • Added bounds checking on label/logit array indexing in onnx_inference_service.py.

  • Added cognitive_drift:* glob pattern to ensure legacy MemoryStore keys are cleaned during data deletion.

  • Added embeddings mock to health check test for /system/ready endpoint.
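The bounds-checking fix above can be illustrated with a small guard around argmax indexing. This is a sketch of the general pattern, not the actual code in onnx_inference_service.py; the function name and return convention are assumptions.

```python
from typing import Optional

def safe_label(labels: list, logits: list) -> Optional[str]:
    """Pick the argmax label, guarding against label/logit length mismatches
    that would otherwise raise IndexError."""
    if not logits:
        return None
    best = max(range(len(logits)), key=logits.__getitem__)
    if best >= len(labels):
        # Model head wider than the label list: refuse rather than crash.
        return None
    return labels[best]
```

The guard turns a hard crash on malformed model output into a recoverable "no label" result.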