2026-05-11 · Risk note · Risk

OpenClaw’s newest risk cluster is not model quality — it is session hygiene

The most useful post-beta OpenClaw update is a cluster of session-integrity reports. Issue #48241 reports that quick successive messages can abort an in-flight run yet still write an assistant reply filled with unrelated stale buffer content, with token counts that do not match the visible text. PR #48283 adds filtering so that aborted assistant messages are discarded instead of delivered. A parallel compaction thread (issue #48238) warns that long-lived sessions near 94% context (256k/272k) can accumulate tool-failure loops, stale reminders, and weak user grounding. PR #48350 adds a post-compaction validator that checks goal retention, pending-item retention, stale-system promotion, and failure collapse, and that issues reset recommendations. Two adjacent requests make the pattern clearer: issue #48321 asks for timestamp and position metadata on image history so old screenshots do not override newer text, and issue #48159 asks for a secure chat mode that would let sensitive sessions opt out of transcript, summary, and memory persistence.
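The abort-filtering idea can be illustrated with a minimal sketch. It is not the code in PR #48283: the `Message` type, the `run_id` and `aborted` fields, and the `deliverable` function are hypothetical names chosen for illustration, and the sketch assumes the runtime can flag a message as coming from an aborted run.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Message:
    role: str               # "user", "assistant", or "system"
    text: str
    run_id: str             # hypothetical: id of the run that produced the message
    aborted: bool = False   # hypothetical: set when the run was cancelled mid-stream


def deliverable(messages: Iterable[Message]) -> List[Message]:
    """Drop assistant messages that came from an aborted run.

    User and system messages are kept even if flagged, because the abort
    only invalidates what the model was in the middle of producing.
    """
    return [m for m in messages if not (m.role == "assistant" and m.aborted)]


history = [
    Message("user", "Summarize the deploy log", run_id="r1"),
    Message("assistant", "stale buffer text", run_id="r1", aborted=True),
    Message("user", "Actually, just the errors", run_id="r2"),
    Message("assistant", "Two errors found.", run_id="r2"),
]
kept = deliverable(history)
```

In this sketch the filter runs before anything is written or delivered, so the stale reply from run `r1` never reaches the transcript, while both user messages and the reply from `r2` survive.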

Impact: Risk | Sources: 3 | Audience: operator · developer · team
Why it matters

For always-on agents, correctness is not only whether the model can answer. The runtime must keep the right conversation, the right user request, and the right memory boundary attached to each response. Stale buffer delivery, polluted compaction, old-image confusion, and unavoidable persistence are all different faces of the same product risk: a personal agent that sounds confident while losing session truth.
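Old-image confusion is the easiest of these to picture concretely. The sketch below shows one way a runtime might label images in history with a timestamp and thread position so the newest one is unambiguous. The `ImageEntry` type, its fields, and the caption wording are assumptions made for illustration; they are not taken from issue #48321.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import List


@dataclass(frozen=True)
class ImageEntry:
    name: str
    sent_at: datetime   # hypothetical: when the image was sent
    position: int       # hypothetical: index of the carrying message in the thread


def describe_images(images: List[ImageEntry], thread_length: int) -> List[str]:
    """Return one caption per image, oldest first, marking the newest one."""
    ordered = sorted(images, key=lambda img: img.position)
    newest = ordered[-1].position if ordered else None
    captions = []
    for img in ordered:
        age = thread_length - 1 - img.position  # messages since this image
        tag = "latest image" if img.position == newest else "older image"
        captions.append(
            f"[{tag}] {img.name} sent {img.sent_at.isoformat()} "
            f"at message {img.position}, {age} message(s) ago"
        )
    return captions


images = [
    ImageEntry("new.png", datetime(2026, 5, 11, 9, 30, tzinfo=timezone.utc), 7),
    ImageEntry("old.png", datetime(2026, 5, 10, 18, 0, tzinfo=timezone.utc), 2),
]
captions = describe_images(images, thread_length=8)
```

With captions like these attached, a model reading the history has an explicit signal that `old.png` predates five later messages rather than having to infer it from ordering alone.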

Evidence
  • Issue #48241 reports aborted runs writing unrelated assistant content, with upstream logs showing no matching model request and token counts inconsistent with the visible response
  • PR #48283 identifies stale streaming buffer content after abort and adds logic to strip aborted assistant messages before they are written or delivered
  • Issue #48238 describes a 256k/272k saturated session with repeated tool-failure loops, stale reminder/system reinjection, and recovery requiring transcript backup plus reset
  • PR #48350 adds a pure post-compaction validator with tests for goal retention, pending-item retention, stale-system promotion, failure collapse, compaction evidence, and conservative reset recommendation
  • Issue #48321 asks for timestamp and position metadata on images in threaded history so models can tell old screenshots from current state
  • Issue #48159 asks for per-session secure chat mode that avoids transcript, summary, compaction, and memory persistence for sensitive conversations
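The checks named for PR #48350 can be sketched as a pure function over a compaction summary. This is an illustration under stated assumptions, not the PR's implementation: `SessionFacts`, `Report`, `validate_compaction`, the `failure_marker` string, and the reset threshold are all hypothetical, and plain substring matching stands in for whatever comparison the real validator uses.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class SessionFacts:
    goal: str
    pending_items: List[str]
    stale_system_notes: List[str]   # reminders known to be out of date


@dataclass
class Report:
    problems: List[str] = field(default_factory=list)
    recommend_reset: bool = False


def validate_compaction(facts: SessionFacts, summary: str,
                        failure_marker: str = "tool error",
                        reset_threshold: int = 2) -> Report:
    """Check a summary against known session facts; no I/O, no side effects."""
    report = Report()
    text = summary.lower()
    if facts.goal.lower() not in text:                 # goal retention
        report.problems.append("goal missing")
    for item in facts.pending_items:                   # pending-item retention
        if item.lower() not in text:
            report.problems.append(f"pending item missing: {item}")
    for note in facts.stale_system_notes:              # stale-system promotion
        if note.lower() in text:
            report.problems.append(f"stale note promoted: {note}")
    if text.count(failure_marker.lower()) > 1:         # failure collapse
        report.problems.append("tool failures not collapsed")
    # Conservative: recommend a reset only when several checks fail together.
    report.recommend_reset = len(report.problems) >= reset_threshold
    return report


facts = SessionFacts(
    goal="migrate the billing database",
    pending_items=["rotate API keys"],
    stale_system_notes=["reminder: standup at 9"],
)
bad = validate_compaction(
    facts,
    "User wants to migrate the billing database. Reminder: standup at 9. "
    "Tool error. Tool error. Tool error.",
)
good = validate_compaction(
    facts,
    "Goal: migrate the billing database. Pending: rotate API keys. "
    "One tool error, retried.",
)
```

Keeping the validator pure makes it cheap to test, but it only reports problems. Deciding whether to back up the transcript, reset, or notify the user remains a separate operator workflow.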
Risk notes
  • Several fixes are still open PRs or feature requests rather than a tagged release
  • The failure mode is hard to spot because stale or contaminated text can look fluent and plausible
  • Compaction validators can detect quality problems, but operators still need clear reset, backup, and user-notification workflows
  • Secure chat mode is currently a request, so users should not assume sensitive conversations are non-persistent
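To make the secure chat request concrete, the sketch below shows what a per-session opt-out could look like at the persistence layer. No such mode exists in a release according to the notes above. The `Store` class, the `secure` flag, and the record kinds are hypothetical and only illustrate the idea of refusing writes for flagged sessions.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Record kinds the request wants sensitive sessions to skip (illustrative names).
PERSISTED_KINDS = ("transcript", "summary", "memory")


@dataclass
class Store:
    records: Dict[str, List[str]] = field(
        default_factory=lambda: {kind: [] for kind in PERSISTED_KINDS}
    )

    def write(self, kind: str, text: str, *, secure: bool) -> bool:
        """Persist text unless the session is secure; return True if stored."""
        if kind not in self.records:
            raise ValueError(f"unknown record kind: {kind}")
        if secure:
            return False  # secure sessions never reach storage
        self.records[kind].append(text)
        return True


store = Store()
stored_normal = store.write("transcript", "hello", secure=False)
stored_secure = store.write("memory", "account number", secure=True)
```

Placing the check in the single write path, rather than in each caller, means a new persistence feature cannot bypass the opt-out by accident.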