2026-05-11 · Risk note

OpenClaw’s newest risk cluster is not model quality — it is session hygiene

The most useful post-beta OpenClaw update is a cluster of session-integrity reports. Issue #48241 says quick successive messages can abort an in-flight run yet still write an assistant reply filled with unrelated stale buffer content, with token counts that do not match the visible text; PR #48283 adds filtering so aborted assistant messages are discarded instead of delivered. A parallel compaction thread, issue #48238, warns that long-lived sessions at roughly 94% of the context window (256k of 272k) can accumulate tool-failure loops, stale reminders, and weak user grounding; PR #48350 adds a post-compaction validator that checks goal retention, pending-item retention, stale-system promotion, and failure collapse, and issues reset recommendations. Two adjacent requests make the pattern clearer: issue #48321 asks for timestamp and position metadata on image history so old screenshots do not override newer text, and issue #48159 asks for a secure chat mode that would let sensitive sessions opt out of transcript, summary, and memory persistence.

Impact: Risk | Sources: 3 | Audience: operator · developer · team

Why it matters

For always-on agents, correctness is not only whether the model can answer. The runtime must keep the right conversation, the right user request, and the right memory boundary attached to each response. Stale buffer delivery, polluted compaction, old-image confusion, and unavoidable persistence are all different faces of the same product risk: a personal agent that sounds confident while losing session truth.

Evidence

Issue #48241 reports aborted runs writing unrelated assistant content, with upstream logs showing no matching model request and token counts inconsistent with the visible response
PR #48283 identifies stale streaming buffer content after abort and adds logic to strip aborted assistant messages before they are written or delivered
Issue #48238 describes a 256k/272k saturated session with repeated tool-failure loops, stale reminder/system reinjection, and recovery requiring transcript backup plus reset
PR #48350 adds a pure post-compaction validator with tests for goal retention, pending-item retention, stale-system promotion, failure collapse, compaction evidence, and conservative reset recommendation
Issue #48321 asks for timestamp and position metadata on images in threaded history so models can tell old screenshots from current state
Issue #48159 asks for per-session secure chat mode that avoids transcript, summary, compaction, and memory persistence for sensitive conversations
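
The abort handling described for PR #48283 can be pictured with a minimal sketch. This is not the code from that PR: the `Message` shape, the `aborted` flag, and the function name are hypothetical, chosen only to show the idea of dropping aborted assistant output before it is written or delivered.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    """Hypothetical transcript entry; field names are assumptions."""
    role: str              # "user", "assistant", or "system"
    text: str
    run_id: str            # the run that produced this message
    aborted: bool = False  # set when the producing run was cancelled


def drop_aborted_assistant_messages(messages: list[Message]) -> list[Message]:
    """Return the transcript without assistant messages from aborted runs.

    User and system messages are always kept, so the user's own input
    is never lost when a run is cancelled.
    """
    return [m for m in messages if not (m.role == "assistant" and m.aborted)]


transcript = [
    Message("user", "first question", "run-1"),
    Message("assistant", "stale buffer text", "run-1", aborted=True),
    Message("user", "second question", "run-2"),
    Message("assistant", "real answer", "run-2"),
]
clean = drop_aborted_assistant_messages(transcript)
print([m.text for m in clean])
```

A filter like this only addresses the write/delivery side. It assumes the runtime reliably marks a run as aborted, and it does not by itself clear whatever streaming buffer produced the stale text.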

Risk notes

Several fixes are still open PRs or feature requests rather than a tagged release
The failure mode is hard to spot because stale or contaminated text can look fluent and plausible
Compaction validators can detect quality problems, but operators still need clear reset, backup, and user-notification workflows
Secure chat mode is currently a request, so users should not assume sensitive conversations are non-persistent
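
To make the validator-plus-workflow point concrete, here is a rough sketch of a pure post-compaction check covering two of the properties listed for PR #48350 (goal retention and pending-item retention). Everything in it is an assumption for illustration: the function name, the report fields, the naive substring matching, and the reset rule are not taken from the PR.

```python
from dataclasses import dataclass, field


@dataclass
class CompactionReport:
    """Hypothetical result type; not the PR's actual schema."""
    goal_retained: bool
    missing_pending_items: list[str] = field(default_factory=list)
    recommend_reset: bool = False


def validate_compaction(goal: str, pending_items: list[str], summary: str) -> CompactionReport:
    """Check a compacted summary against pre-compaction state.

    Pure function: no I/O, no model calls. Matching is a case-insensitive
    substring test, which is deliberately crude.
    """
    text = summary.casefold()
    goal_retained = goal.casefold() in text
    missing = [item for item in pending_items if item.casefold() not in text]
    # Conservative rule (an assumption): only a lost goal triggers a reset
    # recommendation; missing pending items are reported but not fatal.
    return CompactionReport(
        goal_retained=goal_retained,
        missing_pending_items=missing,
        recommend_reset=not goal_retained,
    )


report = validate_compaction(
    goal="Migrate billing DB",
    pending_items=["rotate API key", "email Dana"],
    summary="User wants to migrate billing db. Pending: rotate API key.",
)
print(report)
```

The report only detects a problem. What happens next (backing up the transcript, resetting the session, telling the user) still has to be decided and wired up by the operator.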

Verdict

Risk

Next: If you run OpenClaw in Telegram, Discord, Slack, or long-lived browser sessions, treat rapid-fire user messages, abort/retry flows, and near-full transcripts as operational test cases. Watch for replies that look fluent but unrelated, compare visible output against token/log metadata, back up and reset contaminated sessions rather than repeatedly compacting them, and delay any sensitive workflow until ephemeral-session behavior is explicit. Developers building on OpenClaw should add regression tests for aborted streams, post-compaction state retention, and image recency in thread history.
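
Two of the checks above lend themselves to simple operator-side heuristics. The sketch below is illustrative only: it assumes the 256k/272k figures are token counts, and the function names, the 4-characters-per-token estimate, and the 3x tolerance are arbitrary assumptions rather than OpenClaw settings.

```python
def context_utilization(used_tokens: int, window_tokens: int) -> float:
    """Fraction of the context window currently in use."""
    if window_tokens <= 0:
        raise ValueError("window_tokens must be positive")
    return used_tokens / window_tokens


def looks_mismatched(
    visible_text: str,
    reported_output_tokens: int,
    chars_per_token: float = 4.0,  # rough rule of thumb, not a tokenizer
    tolerance: float = 3.0,        # flag when the two differ by more than 3x
) -> bool:
    """Flag replies whose reported token count is far from a length estimate."""
    estimated = max(1, round(len(visible_text) / chars_per_token))
    reported = max(1, reported_output_tokens)
    ratio = max(estimated, reported) / min(estimated, reported)
    return ratio > tolerance


# The saturated session in issue #48238: 256k used of a 272k window.
print(round(context_utilization(256_000, 272_000), 2))

# A 400-character reply logged with only 5 output tokens is suspicious.
print(looks_mismatched("x" * 400, reported_output_tokens=5))
```

A flag from the length check is a prompt to inspect logs, not proof of stale-buffer delivery, since real tokenizers vary widely by language and content.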

Products

OpenClaw

Sources

GitHub issues · GitHub pull requests

github.com/openclaw/openclaw/issues/48241
github.com/openclaw/openclaw/pull/48283
github.com/openclaw/openclaw/issues/48238
github.com/openclaw/openclaw/pull/48350
github.com/openclaw/openclaw/issues/48321
github.com/openclaw/openclaw/issues/48159
