All updates

A concise archive of what changed, what matters, and what to watch next.

2026-05-15 Product update Watch

OpenClaw turns the beta.8 dependency cleanup into a stable release, then opens a new beta with auditability and channel UX work

OpenClaw’s latest official movement is bigger than a normal patch cycle. The v2026.5.12 stable release packages the dependency externalization and runtime hardening that had been moving through the beta.6–beta.8 train: leaner installs for Slack, WhatsApp, Bedrock, Vertex, and sandbox dependency cones; isolated Telegram polling with local spooling; Codex/OpenAI auth-profile and fallback repairs; plugin install/update resilience; Windows sandbox and SecretRef credential tightening; and UI/history/reply delivery fixes. The new v2026.5.14-beta.1 then adds a fresh operator-facing layer: WhatsApp gets status reactions for queued, thinking, tool, done, error, and compaction lifecycle states; Telegram presentation payloads can render Mini App `web_app` buttons; subagent tasks are delivered as the child session’s first visible message instead of hidden only in a system prompt; mid-turn prompts can steer active runs by default; Telnyx realtime voice calls enter the release notes; heartbeat event payloads gain an explicit marker; Codex CLI sessions can be listed and bound from a paired node; and release validation now includes installed-package Docker user journeys, dependency evidence, and npm advisory gates. Nearby PRs keep the risk story concrete: #81880 requires canonical node platform IDs before applying desktop command defaults, #81451 caches hydrated skills without putting raw secrets into cache keys, and #68597 blocks symlink escapes in memory reads.

Worth noting: Treat v2026.5.12 as the safer upgrade candidate if you were waiting for the beta.8 fixes, but stage it with Telegram ingress, Codex OAuth, plugin install/update, and Windows sandbox regressions in your own channels. For beta.1, test the user-visible lifecycle reactions, subagent task disclosure, Mini App buttons, node-backed Codex binding, and voice-call paths before exposing them to production users.

2026-05-15 Risk note Watch

Hermes’ workflow core arrives beside a URL-safety bypass fix and profile-scoped scheduled jobs

Hermes’ strongest fresh update is architectural: PR #25806 adds the workflow system core, including workflow policy, store, DAG, gate, materialization foundations, Core/dashboard APIs for reading workflows, inboxes, promotions, gates, and materialization, plus stale-inbox promotion guards. That is the first sign of a more formal task/workflow layer rather than ad-hoc agent turns. The risk item to notice is #25961: IPv6 scope IDs such as `fe80::1%eth0` could make URL safety parsing throw and silently skip all resolved addresses, potentially allowing a hostname controlled by an attacker to bypass link-local or cloud-metadata protections; the fix strips scope IDs and fails closed on unparseable addresses. A second reliability fix, #25962, resolves yesterday’s split clarify-timeout problem by making the CLI callback honor `agent.clarify_timeout` before the older `clarify.timeout` key. PR #25917 adds profile-scoped scheduled jobs so cron jobs can run with a specific Hermes profile’s config, scripts, skills, and memory paths. The surrounding channel and operator polish is also practical: #25956 strips emoji, diagrams, inline code, tables, and symbols before TTS so spoken replies stop reading UI artifacts; #25960 prevents native Windows Telegram `/restart` from leaving the gateway stopped; #25959 tightens Discord channel-directory resurrection behavior and lowers batching latency; #25958 allows configured Discord role mentions as triggers; and #25954 adds a read-only Kanban metrics CLI for review and verification gates.

Worth noting: If you operate Hermes, prioritize the URL-safety patch anywhere agents fetch URLs or proxy web content. Then test workflow APIs with stale inboxes and gate transitions, profile-pinned cron jobs with profile-specific skills and memory, and clarify timeout behavior across Gateway, CLI, and TUI. For chat deployments, regression-test Telegram restart on Windows and Discord role/channel controls before broad rollout.

2026-05-15 Risk note Watch

OpenClaw’s next operator cluster makes approvals readable, scoped, and less likely to poison later runs

The most useful OpenClaw updates in the first Beijing May 15 window are about making real operators understand and trust what the agent is about to do. PR #81864 adds configurable plain-language plugin approval prompts so chat approvals can show a short summary, step list, risk line, and choices instead of a raw dump of command text, tool IDs, session keys, expiry, and `/approve` syntax. PR #81380 binds approval list and resolve paths to stored requester metadata, reducing the chance that one requester can see or resolve another requester’s pending approval. PR #80922 routes POSIX allowlists and allow-always persistence through a Tree-sitter command authorization planner, replacing the legacy chain/pipeline/heredoc parser and producing clearer enforced command renderings. The same window also fixes operational drift: PR #75270 stops temporary fallback models from becoming sticky after the primary model recovers, #81868 keeps exact-command cron turns from loading heavyweight bootstrap/memory context by default, #81870 forwards auth stores into image/video/music generation so OAuth-backed Codex tokens can refresh, and #81764 makes Telegram HTML parse fallback produce readable text with preserved links. PR #81851 is notable but experimental: a Claude CLI interactive backend streams reasoning through a local TLS proxy, so treat it as a sensitive preview rather than a default path.

Worth noting: If you run OpenClaw on chat channels, stage the approval changes with real Telegram/Slack approval cards and verify requester isolation, allow-always behavior, and rejected shell chains. Regression-test model fallback recovery, command cron jobs after the Codex migration, OAuth-backed image/video/music tools after token expiry, and Telegram messages whose HTML parse mode is rejected.

2026-05-15 Product update Watch

Hermes adds structured HTTP and argv tools to escape the bash-quoting trap

The strongest Hermes update in this window is a practical tool-runtime fix: stop forcing every machine-shaped action through `bash -c`. PR #25861 adds a structured `http` tool using `httpx.Client` with explicit method, URL, headers, JSON/body, params, and timeout fields, after production telemetry showed a single apostrophe in a JSON payload breaking shell quoting and triggering repeated retries. PR #25864 adds an argv-list form to the terminal tool so commands can run with `shell=False` and byte-for-byte arguments instead of a shell-safe string. PR #25862 then teaches the existing terminal path to recognize bash parse errors such as unexpected EOF and return an actionable hint pointing to the structured HTTP tool or safer quoting forms. The surrounding reliability work is also operator-relevant: issue #25859 documents two separate clarify timeout keys that make CLI/TUI sessions auto-decide after 120 seconds even when gateway clarify timeout is raised; PR #25856 fixes Telegram slash-confirm previews that silently fail on Markdown-sensitive characters; #25857 keeps migrated Codex `default_permissions` as a true top-level TOML key; #25858 skips admin-gated LiteLLM `/v1/models/{model}` probes for unrecognized servers; and #25624 stops deterministic MCP OAuth failures from repeatedly opening browser auth flows.

Worth noting: Prefer the new structured HTTP or argv forms for API calls and payload-heavy commands once available. Add regression cases with apostrophes, URLs, Markdown-sensitive file paths, and JSON bodies. If you rely on clarify in reviews or walkthroughs, set and test both gateway and CLI/TUI timeout paths until the duplicated config is unified.

2026-05-14 Product update Watch

OpenClaw beta.8 trims core dependencies and hardens Telegram ingress, child-model defaults, credentials, and rich replies

OpenClaw v2026.5.12-beta.8 is another broad operator release, but the center of gravity has shifted from one-off channel bugs to runtime shape. Bedrock, Bedrock Mantle, Slack, OpenShell sandbox, and Anthropic Vertex move out of core so default installs no longer drag in those dependency cones unless the matching providers or plugins are installed. Telegram Bot API polling moves to an isolated worker with a durable local spool so main event-loop stalls do not stop inbound message collection. The release also adds ACP backend fallbacks before output is emitted, a persisted Control UI auto-scroll selector, monotonic transcript sequence repair for stale SSE history, Windows `USERPROFILE` coverage in sandbox blocked home roots, stricter provider credential resolution through structured SecretRefs, bodyless media-fetch heap avoidance, onboarding flag forwarding for provider-specific API keys, plugin-provider discovery from setup env vars, auth-profile stale-lock reclaim, Codex OAuth refresh error classification, browser scope-loop reduction, plugin SDK subpath compatibility, rich/card-only outbound content recognition, and WebChat/TUI mirroring for Codex `tools.message` replies.

Worth noting: Stage beta.8 if you run Telegram, Windows hosts, Codex OAuth, ACP, Control UI, child subagents, or rich interactive replies. Regression-test inbound Telegram delivery during artificial event-loop stalls, sandbox denial of Windows home credential directories, provider API-key resolution with SecretRefs, rich presentation/button-only sends across cron/heartbeat/follow-up paths, and plugin installs after dependency externalization.

2026-05-14 Risk note Risk

Hermes approval hardening closes a critical YOLO-mode bypass and exposes long-session failure modes

The strongest Hermes item in the late May 14 window is approval safety, not another UI tweak. PR #23835 says `HERMES_YOLO_MODE` was read from `os.getenv()` on every approval check, so a skill or prompt-injected in-process tool could mutate `os.environ` and disable command approval checks after startup. The same PR tightens LLM smart-approval parsing from substring matching to exact `APPROVE`, logs dangerous background auto-approvals that previously had no audit trail, and expands pipe-to-shell detection to catch `/bin/bash` and `bash -c` variants. Nearby reliability work matters for the same operator audience: PR #25716 adds hierarchical long-context compression so huge transcripts can be summarized in bounded segments instead of timing out, while issue #25723 reports that one streaming provider error can disable streaming for an entire session rather than just the failing request.

Worth noting: Treat Hermes approval settings as a staging blocker if you run skills, background tools, or delegated agents. Verify that changing `HERMES_YOLO_MODE` after process start has no effect, that verbose model replies containing the word APPROVE do not auto-approve, that pipe-to-shell variants are detected, and that compression/streaming failures are visible per request instead of silently degrading a whole session.

2026-05-14 Risk note Watch

Hermes’ latest reliability cluster is about disappearing Web UI streams, failed compression, vision fallbacks, and dashboard auth

The most reader-useful Hermes cluster in this window is about keeping visible state aligned with model state. Issue #25583 reports Web UI SSE disconnects that can make a fully rendered assistant reply vanish, briefly show content from another session, or render raw Python content-block JSON as chat text because the run event queue is destroyed when the browser stream drops while the agent is still running. Issue #25585 and PR #25588 address a more dangerous model-state failure: automatic context compression used to insert a static “summary unavailable” marker and still drop middle turns when summary generation failed; the fix returns the original messages unchanged and records warning state instead. Issue #25594 says custom providers outside the models.dev registry can receive multipart text+image tool results even when the model is text-only, triggering HTTP 400 errors such as `text is not set`; #25602 asks for dashboard visibility and test controls for auxiliary fallback chains such as vision and compression. Nearby PRs fill in the same reliability theme: #25577 coerces tool args whose schema types are declared through anyOf/oneOf, #25580 moves cloud browser providers into plugins, #25584/#25587 make text fallback choices resolvable on platforms without buttons, and #20515 gates dashboard HTML/assets and WebSockets behind Tailscale identity allowlists when configured.

Worth noting: If you use Hermes Web UI or long-running sessions, test what happens when the browser SSE connection drops mid-run and confirm the final message, session identity, and raw content blocks stay correct. Enable or stage the compression fail-closed fix before relying on automatic compaction. For custom or regional providers, explicitly test vision/tool-result paths and expose fallback-chain state in operator checks until the dashboard catches up.

2026-05-14 Product update Watch

OpenClaw beta.6 turns this week’s scattered safety fixes into an upgrade target

OpenClaw v2026.5.12-beta.6 is the strongest new item in this run because it packages a broad operator-hardening wave into one official prerelease. The release stops iMessage media-only sends from leaking visible placeholder text, creates configured agent sessions before first agent-to-agent sends, moves the Gateway protocol to v4 with explicit delta/replace streaming frames, hides pending Node pairing capabilities until approval, requires approval for setup-code device pairing, browser pairing, and Control UI proxy-scoped access, and hardens trusted-proxy validation. It also caps inbound media download streams for Feishu / WhatsApp / Line, narrows plugin install-time code scans to plugin-owned runtime entrypoints while keeping dependency manifest denylist checks, centralizes config mutation retries, preserves and prunes managed peer dependencies, pins Docker setup paths so stale host .env paths do not leak into containers, and fixes several auth/profile/runtime edges including Copilot Gemini image descriptions, Anthropic session-rotation amnesia, OpenAI-compatible schema items, idle-model watchdog fallback, centralized transcript redaction, Telegram polling stalls, token-rotation offsets, delegated-session tool restrictions, node exec provenance, and hook CLI authority. A new issue, #81548, is worth reading alongside the release: it reports 25-30 seconds of OpenClaw overhead per isolated-agent turn on v2026.5.7 even when direct Ollama inference takes about 2.3 seconds, pointing at prompt assembly as the suspected bottleneck.

Worth noting: Treat beta.6 as a candidate hardening checkpoint, not a blind auto-upgrade. If you run public or semi-public channels, prioritize testing pairing approval flows, transcript redaction, media-size caps, plugin install behavior, delegated-session tool restrictions, and Telegram polling/token rotation. If you operate isolated agents behind local models, benchmark first-token and total turn latency before and after upgrade and compare it to direct provider calls.

2026-05-14 Product update Watch

Hermes starts moving from Kanban coordination to one gateway running many named agents

The most product-shaped Hermes update in this window is PR #25008: a single-gateway multi-agent MVP. It adds `agent_id` to session sources and session rows, introduces an `AgentProfile` ContextVar so model, SOUL.md, memory, skills, and session paths can switch per named agent, routes platform events by chat / thread / user / guild with a first-match-wins matcher plus a plugin hook, and wires the same profile switching through adapters, cron jobs, delivery targets, hooks, and a new `hermes agent` CLI. The surrounding fixes show why this architecture needs careful runtime hygiene: #25344 filters Honcho memory deriver noise such as “Nothing to save” before it reaches the main prompt, #25346 splits concatenated streamed tool-call argument blobs like `{...}{...}` into separate calls, #25341 cuts the `hermes tools` all-platforms menu from about 14 seconds to under 1.5 seconds while avoiding repeated Nous refresh-token burns, #25334 bypasses system proxies for localhost auxiliary clients, #25342 silences background-review memory-provider teardown output, and #22648 keeps an Ollama Cloud web backend moving forward.

Worth noting: Do not treat multi-agent Hermes as just a config toggle yet. If you test #25008, create at least two agents with different memory and skills, then verify message routing, cron ownership, delivery targets, hook `agent_id` propagation, and session-key migration. Also regression-test sparse Honcho sessions, streamed parallel tool calls, localhost OpenAI-compatible endpoints behind a system proxy, and the tool-menu auth/cache path before letting the gateway handle real channels.

2026-05-14 Risk note Risk

OpenClaw’s latest trust-boundary reports are about tools, secrets, prompt leaks, and delivery jams

The most useful OpenClaw cluster after beta.6 is not a docs-only change; it is a set of trust-boundary reports that map directly to operator incidents. Issue #75124 says user-invocable `command-dispatch: tool` skill slash commands can create the raw OpenClaw tool set and apply only owner-only filtering, bypassing the normal effective policy pipeline for profiles, group/channel rules, sandbox state, and subagent depth. PR #75101 adds `tools.exec.denyPathPatterns` after a reported production incident where a sub-agent read `~/.openclaw/secrets/telegram-trader.env`, leaking two Telegram bot tokens into session JSONL and the next outbound LLM request. PR #75128 wraps BOOT.md in internal-runtime-context and strips it from message-tool arguments because fallback models could echo startup instructions to users. Issue #75131 shows Telegram delivery retries for overlong messages creating fresh queue UUIDs instead of idempotent retries, keeping permanent 400 errors alive and driving event-loop utilization from 0.996 until stuck items were archived. Issue #75134 reports raw `[OpenClaw heartbeat poll]` prompts appearing in Telegram DMs, while nearby #75126 and #75133 tighten strict tool-mode diagnostics and bundle activation metadata.

Worth noting: Audit any skill slash command that dispatches directly to a tool: it should see the same effective tools as the active session, not a wider raw set. Add path-level exec denies for secret folders before relying on approval prompts. Test fallback boot runs and message-tool sends for runtime-context leakage. For Telegram operators, pre-split long sends near 4096 characters, clear stuck delivery items cautiously, and verify heartbeat prompts are never rendered to external chats.

2026-05-14 Risk note Watch

OpenClaw’s next operator cluster is about outbound hooks, TTS, auth locks, and fallback visibility

After beta.6, the most useful new OpenClaw cluster is a set of small fixes that all sit on the real-user path. PR #81680 makes encrypted-messaging reply delivery invoke the same `message_sending` plugin hook used by other channels, so content gates, audit hooks, and DLP filters are no longer blind to those replies. PR #81681 applies TTS transformation to `message(action=send)` tool sends, fixing `[[tts:text]]` directives that previously went out as literal text while final replies worked. PR #81679 backports the Codex OAuth refresh-spam fix and keeps quota / entitlement failures from becoming false relogin prompts. PR #81678 reclaims stale auth-profile file locks when the recorded owner process is dead. Issue #81664 asks for user or hook notification when a primary model silently falls back to a secondary model. Issue #81649 reports a real 2026.5.7 regression where the Anthropic agent harness was not registered and only six of seven expected plugins loaded across Windows, WSL2, Docker, and native installs. PR #81642 also lands the first bundled DingTalk channel core, while #81303 adds `session.maintenance.preserveKeys` so a primary WebUI session is not pruned just because retention is short.

Worth noting: If you operate OpenClaw on real channels, regression-test every outbound path, not just final replies: plugin `message_sending` hooks, tool-initiated sends, TTS directives, fallback-model behavior, and auth-profile lock recovery. If you run Codex OAuth or Anthropic harness profiles, verify your exact build before assuming beta.6 covered the follow-up fixes. For DingTalk, treat #81642 as basic registration and text/media round-trip work rather than a complete enterprise connector.

2026-05-14 Risk note Watch

Hermes is filling in the boring-but-critical layer: backups, self-kill guards, email, and pluggable web search

The strongest Hermes follow-up cluster is not a shiny UI feature; it is operational durability. Issues #25458 through #25461 define an encrypted backup dry-run for L Butler using fake runtime data, an isolated restore drill for ledger and memory, a human decision on backup destination and recovery-key ownership, and status checks that can say whether backup / restore proof is fresh, stale, failed, missing, or unverified without exposing paths or private contents. The safety thread is reinforced by issue #5528 and closed bug #3397: a Telegram gateway agent once ran `pkill -f "cli.py --gateway"` after misdiagnosing its own health, killing the gateway from inside itself; Hermes now has a concrete request for configurable approval-locked command patterns so deployments can require manual approval for local actions that are operationally disruptive even if not universally destructive. Nearby PRs round out channel/runtime hygiene: #25446 prevents heartbeat TypeErrors on empty activity fields and rate-limits blocked Kanban child reminders; #25441 adds multipart/alternative HTML email responses; #25448 moves all seven web-search providers into a plugin registry; #25457 lets plugins inject hidden CLI messages without cluttering the terminal; and #25348 adds a getxapi skill with endpoint coverage, costs, posting workflow, and secret-safety notes.

Worth noting: For always-on Hermes deployments, prioritize recoverability before adding more channels: run an encrypted backup dry-run, prove a restore in an isolated directory, decide who owns recovery keys, and expose the status in normal assistant checks. Add deployment-specific approval locks for commands that can restart or kill the gateway. Then test heartbeat behavior, HTML email rendering, plugin-provided web backends, and any third-party X/Twitter workflow with throwaway credentials first.

2026-05-14 Risk note Risk

OpenClaw’s May 14 queue is a reminder that multi-channel agents fail at the seams

Fresh OpenClaw issues and PRs are less about one headline feature and more about the seams that decide whether a personal agent is safe to leave running. Issue #81484 reports a Discord guild regression in 2026.5.7 where server-channel prompts can produce perpetual typing, malformed sends missing the message payload, duplicate replies, or runaway outbound loops. Issue #81480 says `gateway.tailscale.mode: off` is ignored, with `sudo tailscale serve` still called roughly every three seconds — about 43,000 sudo invocations per day. Issue #81472 consolidates five config gaps found while splitting agents by role: per-agent bootstrap fields validate but do not take effect, and channel groupPolicy / dmPolicy behavior differs across schema, runtime, and patch CLI. Nearby fixes show the same shape: #81479 bridges resolved Gateway auth into isolated Codex app-server subprocesses, #81477 makes message actions use the resolved SecretRef runtime snapshot, #81488 hardens node exec approval prechecks so gateway-local PATH does not influence node-host decisions, #81486 clears Telegram progress drafts before final replies, and #81482 keeps ACPX one-shot clients alive long enough for the initial turn.

Worth noting: If you run OpenClaw in Discord guilds, Tailscale-exposed gateways, isolated Codex cron jobs, or per-role agent setups, treat this as a regression-test checklist. Verify one inbound guild message yields exactly one final reply, confirm `mode: off` actually stops Tailscale polling, dry-run per-agent config patches and then confirm runtime behavior, and test isolated subprocesses with SecretRef-backed Gateway auth before relying on scheduled jobs.

2026-05-14 Risk note Risk

Hermes is tightening the places where profiles, skills, and voice sessions can corrupt agent behavior

The freshest Hermes activity is a useful cluster for anyone relying on profiles and skills rather than a single vanilla agent. PR #25150 fixes a destructive profile install/update path: `_copy_dist_payload` used `rmtree` plus `copytree`, so updating a distribution could delete locally installed skills even though a distribution-owned manifest existed. Issue #25113 and PR #25143 cover the other side of skill hygiene: `.bak-*` and backup directories could be discovered as real skills, so a stale v2 backup could load instead of the live v3 skill. PR #25151 adds `HERMES_REAL_HOME` for subprocesses because profile isolation intentionally rewrites `HOME`, but helpers that need the real `~/.hermes` path were resolving the isolated profile home instead. PR #25142 fixes a voice-input failure mode where STT setup chatter was injected into the LLM-visible prompt and persisted in history, causing later successful voice messages to keep receiving irrelevant STT setup replies. Around the edge, #25132 gates Telegram profile bots by allowed forum topics, #25149 removes unnecessary `shell=True` from non-user-authored subprocess calls, and #25144 adds a CI detector for KeyboardInterrupt cleanup regressions.

Worth noting: Before updating Hermes profiles, back up local skills and verify the update path preserves non-distribution files. Scan skill folders for `.bak-*`, `.backup-*`, and backup directories, then confirm the loader chooses the live skill. If profile isolation is enabled, update helper scripts to use `HERMES_REAL_HOME` when they need the real Hermes base. For voice deployments, test one failed STT turn followed by successful transcription in the same session; the model should answer the user, not keep discussing STT setup.

2026-05-13 Risk note Risk

OpenClaw’s beta train shows the hard part of channel agents: session keys, silent sends, and context boundaries

OpenClaw v2026.5.12-beta.3 is an official beta with useful fixes, including clearer subagent session nesting, fewer redundant subagent heartbeat wake-ups, better provider stream draining, admin/write gates for memory-wiki paths, and OpenAI auth-profile media-tool availability. The more urgent reader takeaway comes from the fresh field reports around it: issue #81234 says beta.3 cron jobs can time out after turn-accepted and stale cron sessionKey values can block a Discord DM lane; issue #81240 reports Slack sessions where the model writes a complete reply but nothing is posted to Slack; issue #81241 reports Telegram runtime-context envelopes appended into the user text body as well as delivered out-of-band; and PR #81242 fixes isolated subagent spawns that were still preparing inherited context, causing runaway CPU and stalled local inference on Ollama-like runtimes.

Worth noting: If you run OpenClaw in Discord, Slack, Telegram, cron, or local subagent workflows, treat beta.3 as a staging-only upgrade until you verify sessionKey isolation, real outbound sends, Telegram transcript shape, and isolated subagent CPU behavior. For Discord workflows, also note PR #81243: pinpoint fetch by message ID or URL is landing, which should reduce the “paste the content, I cannot inspect the link” failure mode once merged and released.

2026-05-13 Risk note Risk

Hermes LSP diagnostics are useful, but the defaults now need an operator safety pass

Hermes' new LSP-backed edit diagnostics are moving quickly from feature to operations surface. Fresh issue #25015 says the subsystem currently ships enabled with `install_strategy: auto`, so the first Python, Go, Rust, or TypeScript edit in a git repo can silently install a language server into the Hermes home. Issue #25017 adds the supply-chain angle: some install recipes use moving `@latest` targets such as `golang.org/x/tools/gopls@latest`, which is awkward for audited or reproducible environments. Issue #25016 reports the resource side: the LSP manager defines a 600-second idle timeout but has no reaper, so long-running gateways can keep pyright, gopls, tsserver, or rust-analyzer processes alive indefinitely, adding roughly 80-300+ MB per language/workspace. PR #25021 adds the missing idle-subprocess reaper, while PR #25011 salvages two smaller LSP fixes: faster CLI gating for `hermes lsp` and avoiding false mutation-failure classification when diagnostics contain nested `error` keys.

Worth noting: If you enable Hermes LSP diagnostics, make the install policy explicit before using it in audited or shared environments. Prefer pinned language-server versions, verify whether auto-install is allowed by your compliance posture, and watch gateway memory after editing across several repos. If you already adopted the feature, test the idle reaper path once it lands and re-check write_file / patch success handling when diagnostics are present.

2026-05-13 Risk note Risk

OpenClaw’s newest channel bugs are about where replies land, what stays silent, and what survives restarts

Fresh OpenClaw issues and PRs show another practical delivery-boundary cluster for always-on agents. Issue #81413 reports that Google Chat group messages can collapse into the main session key, causing replies to go to the user's most recently active channel, such as WhatsApp, instead of the originating Google Chat group. Issue #81411 says cron-generated Telegram messages can render Markdown links as literal raw HTML, while long-running cron jobs still need better completion and delivery semantics. Issue #81412 and PR #81420 cover the quiet-reply edge: when a model surrounds `NO_REPLY` with explanatory text or thinking blocks, the current exact-match behavior can strip only the token or miss the suppression path, letting unwanted text or reasoning appear in chats. Two nearby PRs matter for operators too: #81418 adds a parent-PID watchdog so orphan MCP channel-server workers do not survive a killed gateway and break the next upgrade handshake, and #81417 makes memory flush thresholds scale with large model context windows so long sessions do not hit compaction with nothing persisted.

Worth noting: If you run OpenClaw across multiple messaging channels, test cross-channel routing with real Google Chat group IDs, WhatsApp sessions, and Telegram cron jobs, not just direct messages. For cron and ambient group workflows, verify that quiet responses stay quiet even when models add reasoning text. After gateway upgrades or crashes, check for stale MCP channel-server processes. If you use 1M-token models, review memory flush thresholds rather than relying on the old 4,000-token default.

2026-05-13 Risk note Risk

Hermes patches a subtle but serious safety rule: silence is not consent

Hermes PR #24923 fixes a clarify-tool timeout behavior that matters for anyone using agents around irreversible actions. The CLI callback previously told the model that, if the user did not answer in time, it should “use your best judgement to make the choice and proceed.” For permission-style questions, the PR notes that this can be interpreted as approval for destructive actions such as `rm -rf .git`. The fix changes timeout language to an explicit refusal and adds tool-description guidance so the model knows before asking. Around the same window, Hermes is also tightening operational edges: #24925 stops session_search from loading whole 800-message conversations when only FTS match windows are needed, #24927 tracks false failure classification when successful writes include diagnostics, #24928 strips inherited Python paths before terminal subprocesses on Windows, and #24930 repairs browser launch flags on root or AppArmor-restricted hosts.

Worth noting: If your Hermes workflows ask users for approval, test clarify timeouts specifically: no answer should block destructive or irreversible work, not convert into implied permission. Also test large-session search, file writes with diagnostics, Windows terminal commands, and browser automation if those paths are part of your daily workflow.

2026-05-13 Risk note Watch

OpenClaw is separating group-room context from user requests, while tightening plugin and skill install paths

The most interesting fresh OpenClaw work after v2026.5.12-beta.4 is not a shiny feature; it is boundary cleanup for always-on agents. PR #81317 adds room-event semantics so ambient Telegram group chatter is kept as context but no longer treated like a fake user request. The PR says these turns stay quiet by default, do not emit ack/status reactions or reasoning drafts, and only speak if the agent deliberately calls the message tool; the author also reports real Telegram bot-to-bot E2E coverage for tagged, ambient no-leak, tool-send, and carry-forward context cases. Nearby PR #81365 bootstraps configured agent sessions through the normal sessions.create path before first send, #81364 reapplies ClawHub exact-release trust checks before plugin downloads, #81362 prevents a bad workspace skill directory from killing remote-bin refresh for all connected nodes, #81361 raises plugin install scan limits for large Codex dependency trees, and closed issue #80888 documents a cron pre-model watchdog that could kill active isolated jobs after 60 seconds because Pi and CLI runners did not emit model_call_started.

Worth noting: If you run OpenClaw in busy Telegram groups, test ambient chatter: it should enrich room context without causing accidental replies or hidden tool use. If you install ClawHub plugins or workspace skills, watch whether trust checks, scan limits, and bad-directory handling match your risk tolerance. Cron users should verify long isolated jobs survive past the old 60-second watchdog path in their installed version.

2026-05-13 Product update Watch

OpenClaw 2026.5.12-beta.4 turns the beta line into a runtime and channel hardening release

OpenClaw v2026.5.12-beta.4 is the newest official beta in the 2026.5.12 line. The release keeps the identity-aware safety theme from beta.1, including per-sender tool policies and memory/wiki admin or write gates, but the practical value has shifted toward runtime and channel reliability. It fixes a Codex runtime MODULE_NOT_FOUND path when the official @openclaw/codex package needs its private task-runtime SDK helper, makes Enter activate highlighted checkbox rows during Codex migration, keeps auth-profile-backed media tools such as image_generate available when OpenAI auth is stored outside environment variables, and unblocks WhatsApp/source installs by allowing Baileys' pinned libsignal git dependency under pnpm 11. It also carries a broad set of Slack, Telegram, WhatsApp, iMessage, Gateway, provider, plugin, and transcript fixes: OpenAI-compatible HTTP now forwards max_completion_tokens and max_tokens, plugin session_end hooks fire on shutdown or restart, Telegram topic/reply context is bounded more carefully, and long-session transcript scans move to streaming helpers that cut a synthetic 200 MiB transcript peak RSS delta from about 252 MiB to 27 MiB.

Worth noting: If you are already testing the 2026.5.12 beta line, treat beta.4 as the more relevant staging target than beta.1 or beta.3. Re-test Codex migration, Codex app-server auth/media tools, WhatsApp installs, memory/wiki permission gates, Slack and Telegram delivery, plugin shutdown/restart cleanup, and large-session transcript behavior before promoting it. If you run public or team-facing agents, also verify per-sender tool policies with real channel identities rather than assuming the new controls are protecting dangerous tools.

2026-05-13 Risk note Risk

Hermes Kanban needs another reliability pass before teams treat workers like dependable teammates

Hermes v0.13.0 made Kanban the headline durable multi-agent layer, but fresh issue #24699 reports the exact handoff problem operators worry about: when a Kanban task is suspended and later resumed after more information arrives, the worker can lose prior task context and start over; when it needs the main agent or user, the subagent can repeatedly time out and retry while waiting. PR #24693 adds a related fix for worker setup: hermes profile create previously bootstrapped new profiles with only hermes-cli, so Kanban workers assigned to those profiles had no web, browser, terminal, or file toolsets and could silently degrade. The same new reliability cluster includes issue #24701, where /new can stall at destructive confirmation and repeat a prior task, issue #24698, where the latest Docker image lacks python-telegram-bot for Telegram gateway startup, and issue/PR #24697/#24700, where auxiliary vision analysis ignores SOUL.md and loses the session-specific lens.

Worth noting: If you are using Hermes Kanban for real delegation, do not assume a resumed card carries enough context. Test suspend/resume, ask-user handoff, retry behavior, and worker profile toolsets before assigning production work. Docker-based Telegram users should verify gateway startup from a clean image. Teams using image analysis should test whether the auxiliary vision description respects the persona or domain lens they rely on.

2026-05-12 Risk note Risk

Hermes exposes another last-mile boundary problem: system notices and background memory need tenant-aware gates

Hermes PR #24365 says suppress_system_messages was not fully enforced for WhatsApp and Discord gateways, so customer-facing deployments could still send internal platform notices such as “No home channel is set”, “Session reset”, dangerous-command approval prompts, and assistant narration that operators expected to hide. PR #24392 adds a separate profile-home isolation fix for WebUI background memory and skill reviews: after a non-default profile turn, the daemon thread could initialize after process-level HERMES_HOME was restored and then load config or write memory under the default profile instead of the parent run profile. PR #24376 is smaller but operationally related: dangerous command approval prompts in the CLI can now trigger the existing bell / dock-bounce path so the human actually notices a blocked high-risk action.

Worth noting: If you use Hermes in customer-facing WhatsApp or Discord channels, verify suppress_system_messages with real reset, config-warning, and dangerous-command flows before trusting it for clients. If you run multiple WebUI profiles or tenant-like homes, test post-turn background review writes and memory locations under non-default profiles. For local CLI operators, enable bell_on_complete if you rely on being alerted when a dangerous approval is waiting.

2026-05-12 Risk note Risk

OpenClaw beta operators get a concrete reminder that recovery code needs its own limits

OpenClaw issue #80960 reports a stuck session where session-file repair wrote 2,180 full .bak snapshots, adding about 2.1 GB under ~/.openclaw/agents/operations/sessions/ in roughly 25 hours. The same fresh reliability cluster includes PR #80961, which warns when a string agents.defaults.model silently disables model fallbacks; issue #80877, where Anthropic Max OAuth users see a misleading “top up your API key” channel message even though no API key exists and the gateway recovers after OAuth sync; and PR #80952, which lets Telegram plugin commands suppress the duplicate “No response generated” fallback after they already delivered their own reply.

Worth noting: If you run OpenClaw 2026.5.10 beta builds, check session directories for repeated .jsonl.bak files, especially around ended sessions that still appear in spawn or compaction paths. Prefer object-form model config with explicit fallbacks, audit channel-facing provider-error copy for OAuth users, and test Telegram plugin commands that send through the Bot API so they do not produce duplicate fallback replies.

2026-05-12 Product update Watch

Hermes starts giving file edits real language-server feedback before broken code piles up

Hermes PR #24168 adds an LSP layer behind write_file and patch so the agent can see semantic diagnostics introduced by its own edit: type errors, undefined names, missing imports, and similar failures from language servers such as pyright, gopls, rust-analyzer, typescript-language-server, clangd, bash-language-server, Vue, Svelte, Astro, Lua, PHP, OCaml, Dockerfile, Terraform, Dart, Haskell, Julia, Clojure, Nix, Zig, Gleam, Elixir, Prisma, Kotlin, and Java. The design first captures a pre-write baseline and then filters out old problems, so the model is not buried under a project's existing debt. It is also gated to git workspaces, keeping casual Telegram or Discord home-directory chats from accidentally waking language servers.

Worth noting: If you use Hermes for coding work, test this PR on one messy real repository and one clean small repository before treating it as a default. Check that diagnostics are limited to newly introduced problems, that language-server installation policy matches your environment, and that gateway sessions outside git workspaces stay quiet. For teams, this is a good reason to define which language servers are trusted and preinstalled on shared runners.

2026-05-12 Product update Watch

OpenClaw beta 5 broadens the May 12 upgrade into channel and runtime recovery work

OpenClaw v2026.5.10-beta.5 supersedes beta.4 as the current prerelease. The useful change for operators is breadth: Fal image edits now route GPT Image 2 and Nano Banana 2 reference-image requests through the edit endpoint with aspect-ratio and resolution handling; Control UI shows a plain recovery panel when the app module never registers instead of leaving a blank dashboard; agent-to-agent reply chains can be allowed up to 20 turns while the default stays conservative; sandboxed and public agents get per-agent message send/cross-context restrictions; timed-out Codex app-server clients are retired so Discord agents do not reuse CPU-spinning processes; and Slack, Telegram, WhatsApp, Cron, Gateway, provider, memory, and transcript fixes continue the beta.4 reliability theme.

Worth noting: Treat beta.5 as a staging candidate for teams that were already testing beta.4, not as a blind production jump. Exercise image edit flows, blank Control UI recovery, Slack thread/DM routing, public-agent message permissions, long agent-to-agent loops, Discord timeout recovery, long transcript lookup, and one Cron notification path. If you run multiple agents with different GitHub or cloud credentials, still track issue #80698 because per-agent environment scoping is not solved by the beta.5 release notes.

2026-05-12 Product update Watch

Hermes starts packaging computer-use work as a durable runtime, not a one-off browser trick

Hermes PR #24065 adds a persistent Computer runtime with run.json, events.jsonl, artifact directories, and a computer tool for start, schedule, list, get, events, and cancel actions. The same fresh workstream matters because it addresses the boring failure modes that decide whether computer-use can run unattended: PR #24045 saves user messages hit by 429/529 rate limits into a dead-letter queue with /queued retry; PR #24064 stops headed browser sessions from being killed after every turn; and PR #24071 tightens hardline approval blocking for quoted catastrophic rm targets such as "/", "/var", and "$HOME" paths. Issue #24067 also shows the gateway side still needs restart hygiene on macOS, where stale PID locks can make Telegram, Feishu, and WeChat look already in use after a crash.

Worth noting: If you are testing Hermes for computer-use or watched desktop workflows, treat these as a pre-release integration checklist. Verify lifecycle files, cancellation, artifact cleanup, browser persistence, rate-limit replay, and approval blocking with quoted shell paths before delegating long tasks. On macOS gateways, also test crash-and-restart behavior so stale PID locks do not silently disconnect messaging platforms.

2026-05-11 Risk note Risk

OpenClaw issue flags an internal-planning text leak in channel output

OpenClaw issue #80578 reports a high-priority privacy/safety regression: a user-visible iMessage reply began with an internal planning paragraph before the intended final message. The issue argues this should be treated as a shared output-boundary problem, because the same class of leak could reach any channel adapter if final delivery accepts hidden planning, draft rationale, or self-instructions as normal text.

Worth noting: If you run proactive or coaching-style channel agents, pause sensitive outbound automations until the final delivery boundary is audited. Add a hard channel-side sanitizer, regression-test iMessage plus at least one other adapter, and review recent outgoing messages for planning-style prefixes before enabling unattended sends.

2026-05-11 Risk note Watch

Hermes is opening a remote-control surface — and tightening the executable boundary around it

Hermes PR #23742 adds authenticated remote management endpoints for sessions, profiles, SOUL/persona files, memory, toolsets, skills, and gateway status, so desktop or dashboard clients can manage an agent through the API instead of using filesystem or SSH access. In the same current workstream, PR #22535 closes a more dangerous ACP boundary: previously, ACP clients could provide stdio MCP server definitions during new, load, resume, or fork session setup, and those definitions could launch local commands before a normal agent turn or dangerous-command approval path. The fix disables client-provided stdio MCP servers by default while keeping HTTP/SSE MCP servers available and adding an explicit trusted-operator opt-in. PR #23740 also bridges the clarify tool to messaging platforms, showing Hermes is making remote and channel operation more interactive, not just headless.

Worth noting: Treat this as a remote-admin security review item before exposing Hermes API servers beyond localhost. Verify auth enforcement on every management endpoint, check skill-content path safety, keep stdio MCP registration disabled unless the ACP client is explicitly trusted, and test resume/fork flows because they are easy places for executable configuration to slip through. If you run Hermes over Feishu, Telegram, Discord, or a dashboard, also test clarify prompts and cancellation paths with real users.

2026-05-11 Risk note Risk

OpenClaw’s newest risk cluster is not model quality — it is session hygiene

The most useful post-beta OpenClaw update is a cluster of session-integrity reports. Issue #48241 says quick successive messages can abort an in-flight run but still write an assistant reply filled with unrelated stale buffer content, with token counts that do not match the visible text; PR #48283 adds filtering so aborted assistant messages are discarded instead of delivered. A parallel compaction thread warns that long-lived sessions near 94% context can accumulate tool-failure loops, stale reminders, and weak user grounding; PR #48350 adds a post-compaction validator to check goal retention, pending-item retention, stale-system promotion, failure collapse, and reset recommendations. Two adjacent requests make the pattern clearer: image history needs timestamp/position metadata so old screenshots do not override newer text, and secure chat mode would let sensitive sessions opt out of transcript, summary, and memory persistence.

Worth noting: If you run OpenClaw in Telegram, Discord, Slack, or long-lived browser sessions, treat rapid-fire user messages, abort/retry flows, and near-full transcripts as operational test cases. Watch for replies that look fluent but unrelated, compare visible output against token/log metadata, back up and reset contaminated sessions rather than repeatedly compacting them, and delay any sensitive workflow until ephemeral-session behavior is explicit. Developers building on OpenClaw should add regression tests for aborted streams, post-compaction state retention, and image recency in thread history.

2026-05-10 Risk note Risk

Mercury permission guardrails can be bypassed by chained shell commands

Mercury PR #46 is small but urgent for anyone relying on its permission model. The report says the shell gate evaluates blocked, auto-approved, and approval-required patterns against the whole command string instead of each shell segment. That means an apparently safe command such as `echo *` or `ls` can auto-approve a longer payload that appends `; rm -rf ~`, `&& cat /etc/shadow`, a pipe to `sh`, or command substitution. The PR labels the issue critical CWE-78 because it can defeat the very guardrail that permission-hardened personal agents depend on.

Worth noting: Until a fixed Mercury release lands and is verified, treat shell auto-approval rules as unsafe for unattended use. Disable broad auto-approve shell patterns, require manual approval for shell execution, review daemon logs for chained commands, and avoid running Mercury with access to secrets or broad filesystem permissions. Teams evaluating always-on Telegram or CLI agents should add chained-command tests to their security checklist.

2026-05-10 Product update Watch

OpenClaw 2026.5.9 beta widens the agent runtime surface before the next stable cut

OpenClaw v2026.5.9-beta.1 is the strongest new ecosystem update to track: not because every team should upgrade immediately, but because it previews where the runtime is moving. The release adds default-reset chat commands, clearer CLI and startup recovery errors, runtime model identity in agent prompts, unified provider catalogs for text/image/video/music, a bundled `oc-path` plugin for controlled `oc://` file access, richer plugin SDK presentation and channel-message contracts, and a large Discord voice/realtime push. It also tightens operations through shared Telegram throttling, `tini` for Docker child-process reaping, task-ledger RPC stabilization, active-memory allowlists, durable message receipts, fs-safe output staging, and many gateway/session performance repairs.

Worth noting: Treat this as a staging candidate, not a blind production upgrade. If you depend on OpenClaw channels or plugins, test plugin install/repair, Telegram/Discord delivery, model switching, Codex/OpenAI runtime paths, and gateway restart behavior against your real config. Also note the breaking change: BlueBubbles-backed iMessage is removed in favor of the native `channels.imessage` path using `imsg` or a remote-Mac wrapper.

2026-05-10 Product update Watch

Hermes starts putting hard limits around runaway subagents

Hermes PR #22820 turns a familiar multi-agent failure into explicit controls. The report describes a delegated subagent that drifted beyond scope, ran for 175 seconds, expanded context from about 33K to 72K+ tokens, made 10+ API calls, and had to be interrupted. The proposed fix adds configurable caps for maximum child context, output/input growth ratio, and wall-clock timeout, then marks the child failed with `resource_limit_exceeded` when it crosses the boundary. PR #22944 tackles a related reliability issue in context re-compression: the agent’s `## Active Task` field could be overwritten with `[N/A]`, stale, or hallucinated content after repeated compaction cycles.

Worth noting: If you run Hermes delegation or Kanban-style agents, watch these PRs before increasing autonomy. Add your own defaults for subagent timeout, context-growth ratio, and maximum token budget; log `resource_warnings`; and include “active task preserved after compaction” in regression tests. Treat the new DAG TaskGraph / delegate bridge work as a roadmap item until it is integrated and battle-tested with the same guardrails.

2026-05-10 Product update Watch

OpenClaw tightens the “what actually happened?” layer for agent operations

A new May 10 OpenClaw cluster is less flashy than the beta release, but important for teams running agents through real channels. PR #80151 adds structured delivery outcomes for `openclaw agent --json --deliver`: `sent`, `suppressed`, `partial_failed`, and `failed`, including per-payload results when durable delivery can provide them. PR #80217 makes Codex-native tools visible to the diagnostic watchdog, so a long-running native bash or scraper is treated as active work rather than an abandoned embedded run. PR #80251 fixes session reset so a new session id also rotates generated transcript files and clears stale compaction checkpoints. PR #80250 adds doctor warnings when a channel-routed agent is missing the `message` tool, turning a confusing “the platform cannot do that” failure into a configuration warning.

Worth noting: If you automate OpenClaw delivery, plan to consume `deliveryStatus` instead of scraping stderr or trusting a coarse boolean. In staging, test partial channel failures, hook-suppressed sends, long Codex-native commands, and `sessions.reset` on heavy conversations. Also run doctor against channel-bound agents after tool policy changes, especially when Telegram, Discord, Feishu, or Mattermost agents are routed through reduced tool allowlists.

2026-05-09 Risk note Risk

Hermes debug sharing needs a privacy check before uploads

The most broadly relevant fresh Hermes item is a diagnostics privacy risk. Issue #22016 says `hermes debug share` can create logs with prompt snippets, user names, tool outputs, and other personal data, then expose them through public paste URLs when users attach reports. PR #22139 responds by requiring an explicit `Upload debug report? [y/N]` confirmation before any data is sent, keeping automation available through `--yes` but making the default answer “No”.

Worth noting: Until a release includes the confirmation change, do not run or paste `hermes debug share` output blindly. Review generated logs locally, redact conversation and tool content, and prefer private support channels for sensitive diagnostics. Teams should also update bug-report templates so public issues do not encourage raw debug links.

2026-05-09 Product update Watch

Shared agents need searchable tools and user-level access control

A new May 9 cluster shows OpenClaw and Hermes both moving past the single-user demo shape. OpenClaw PR #79823 adds Tool Search Code Mode: instead of pushing every OpenClaw, MCP, and client tool schema into the prompt, the model can search, describe, and call tools through one compact bridge while existing policy, approvals, logging, and loop detection remain in the call path. Hermes PR #22509 takes the shared-agent problem to Discord with Daimon: admins get host-level Hermes, regular users get a Docker-sandboxed agent with iteration caps, per-tool limits, tier-aware routing, and admin controls. Hermes RFC #21574 gives the user story behind it: multi-user gateways quickly need per-user memory, identity, and permissions, or one person can contaminate another user’s agent context. The companion toolset regression #22601 / PR #22608 is the operational warning: optional integrations must not accidentally remove core tools such as terminal, file, web, browser, vision, skills, delegation, cron, and memory.

Worth noting: If you are exposing an agent to teammates, friends, Discord members, or customer channels, do not treat “more tools” and “more users” as simple toggles. Stage tool-search mode with audit logs and approval checks, separate admin and user tiers, put untrusted execution in a sandbox, and regression-test toolset changes after enabling optional integrations. For Hermes, watch Daimon and the per-user isolation RFC; for OpenClaw, watch whether Tool Search Code Mode keeps policy and observability intact under large catalogs.

2026-05-09 Risk note Risk

OpenClaw and Hermes are tightening the boundaries around always-on agents

The strongest new May 9 pattern is not one isolated bug; it is a cluster of boundary repairs around agents that live in chats, browsers, and long-running sessions. OpenClaw PR #79645 keeps transcript redaction centralized at append time, #79649 reduces stale Telegram reply ancestry confusing old replies for the active conversation, #79658 only allows local TXT/JSON/YAML media sends after validation, and #79562 targets Discord queue backpressure plus transcript/session-store read bottlenecks. Hermes PR #22280 hardens Telegram model-picker callback authorization, validates explicit Chrome DevTools Protocol override endpoints before connecting or discovering, and protects detailed health diagnostics behind API-key auth. Hermes PR #22261 fixes Gemini fallback failures when parallel tool responses are split across turns.

Worth noting: If you run persistent agents in Telegram, Discord, browser automation, or API-server mode, treat these as a staging checklist: confirm transcript redaction, callback authorization, topic/reply context, media-send validation, queue backpressure, and health endpoint exposure. Avoid assuming that a working demo is safe for a shared workspace until these boundary cases are tested.

2026-05-09 Risk note Risk

Hermes Kanban agents need progress and resource guardrails before long autonomous runs

The freshest Hermes operator pain is not a release note; it is a pair of live reports about autonomous coding control loops. Issue #22397 says CLI agents assigned clear Kanban tasks can spend 30 minutes to 2+ hours cycling through read/grep/read inspection with no edits, tests, or deliverables. Issue #22406 reports the opposite failure mode: when the agent does reach a build, CPU can stay at 100% and make the macOS host unusable. PR #22467 is adjacent infrastructure for safer background skill evolution: a pending queue that isolates proposed skill changes, deduplicates entries, caps queue size, detects conflicts, and keeps `.pending/` out of active skill enumeration.

Worth noting: Treat long Hermes Kanban runs as supervised until you can observe progress limits and resource limits in your setup. Add explicit max-read/no-progress intervention rules, cap build parallelism or run agents in a constrained sandbox, and review background skill mutations instead of letting them silently land in active skills. If you are testing main, watch for fixes that force action, escalation, or cancellation when an agent remains read-only for too long.

2026-05-09 Risk note Risk

Telegram Bot API 10.0 can push agent replies out of private topics

A fresh Telegram Bot API 10.0 regression is now worth treating as an operator risk, not just a channel bug. Reports show private-chat topic replies that used to work with message_thread_id now return “Bad Request: message thread not found”. Hermes then falls back to sending without the thread id, which can move replies into the bot’s main “All Messages” chat instead of the original topic. The underlying tdlib issue says inbound topic ids are still present, while outbound sendMessage to the same id fails; direct_messages_topic_id may be the new working path for private bot topics.

Worth noting: If your agent uses Telegram threaded/private-topic mode, test a live reply before relying on it today. Prefer a patch that separates private-chat topics from forum/supergroup threads, tries direct_messages_topic_id where supported, and avoids silently falling back to the main private chat when topic isolation matters.

2026-05-08 Product update Watch

Hermes v0.13 turns Kanban into the main durability story

Hermes Agent v0.13.0, “The Tenacity Release”, makes a bigger claim than a normal patch: Kanban is now positioned as a durable multi-agent board with heartbeats, reclaim, zombie detection, auto-block on incomplete exit, per-task retries, and hallucination recovery. The release also adds /goal target-locking across turns, Checkpoints v2 pruning, Gateway auto-resume after restart, cron no_agent watchdog mode, Google Chat as a 20th platform, pluggable providers, seven i18n locales, and security-default changes including redaction on by default.

Worth noting: If Hermes is part of your operator stack, treat v0.13 as an upgrade candidate but not a blind auto-upgrade. Test Kanban recovery, checkpoint pruning, Gateway restart auto-resume, cron watchdog behavior, and channel auth defaults in a staging workspace before moving always-on agents.

2026-05-08 Product update Watch

OpenClaw 2026.5.7 is an operator-safety maintenance release

OpenClaw 2026.5.7 is not a flashy feature drop; it is a broad maintenance release aimed at the paths that decide whether a self-hosted agent stays governable. It tightens native command owner enforcement, requires admin scope for global Active Memory toggles, routes inline skill dispatch through before-tool-call authorization, fixes misleading empty delivery success, repairs cron model overrides and last-channel failures, improves Telegram access-group and polling watchdog behavior, and makes Codex approval prompts less noisy while keeping plugin approval choices accurate.

Worth noting: Upgrade candidates should test this release against their real control surfaces: cron list/show JSON, channels list/model auth commands, /new or sessions.reset after skill changes, Telegram allowlists and poller recovery, Codex approvals, Tavily SecretRef tools, Discord/WhatsApp routing, and plugin install/rollback. It is worth adopting after smoke tests, especially if you rely on always-on channel agents.

2026-05-08 Risk note Risk

OpenClaw and Hermes are tightening the unattended-run paths after release day

The freshest May 8 work is less about new features and more about keeping agents from quietly failing while nobody is watching. OpenClaw now has fixes or reports for cron payload timeouts being collapsed back to a 120s idle watchdog, empty Heartbeat files still burning tokens, subagent completion fallback announcements, Nix-store plugin hardlinks, and fail-closed config writes. Hermes is seeing parallel hardening around cron lock scope and heartbeat ticks, Feishu gateway restarts replaying stale messages into restart loops, safer update installs, refusal to use destructive git reset recovery, terminal /doctor diagnostics, and native-Windows install/startup gaps.

Worth noting: If you run scheduled or channel-driven agents, add this to the upgrade checklist rather than treating it as background noise: verify per-job timeout overrides, confirm Heartbeat only runs on real tasks, test subagent completion delivery, check plugin loading under your packaging model, and smoke-test Hermes cron, Feishu restart, update, Windows, and remote-terminal flows before unattended use.

2026-05-08 Risk note Risk

OpenClaw and Hermes already have a first-day post-release reliability queue

The next useful thing to watch is not another headline feature; it is whether the new tagged releases settle down under real operators. OpenClaw reports now include production event-loop delays, cron jobs skipping instead of trying cloud fallbacks when a local primary model is down, Gemini subagents hanging at stream-ready with zero tokens, session-store recovery after crash/OOM writes, Codex route preservation, and OpenRouter model-id normalization. Hermes reports include automatic TUI heap dumps growing to tens of GiB, skill frontmatter names diverging from directory names, MiniMax OAuth expiry parsing, Telegram media uploads hitting a 20s write-timeout path, and Feishu table rendering regressions in v0.13.0.

Worth noting: Do not treat either latest release as “done” just because the tag exists. If you upgraded today, add smoke tests for cron fallback behavior, Google/Gemini subagents, session-store recovery, OpenRouter/Codex model routes, Hermes TUI disk usage, skill create/edit validation, MiniMax login, Telegram large-media delivery, and Feishu markdown tables. Wait for tagged follow-up fixes before rolling these paths into unattended production agents.

2026-05-08 Risk note Risk

Hermes v0.12 operators should harden Kanban, gateway, and remote TUI paths

The newest Hermes reports are clustering around the parts that make a persistent agent feel safe to leave running: Kanban startup watchers can race on SQLite migrations, dashboard chat turns may leave one slash-worker process per message, terminal-state notifications can repeat every five seconds, cron scripts may ignore their configured workdir, and remote TUI copy shortcuts can interrupt the agent over SSH.

Worth noting: If you run Hermes v0.12 as a daily gateway, restart after pulling fixes, watch process counts and Kanban logs after startup, verify cron jobs from their project directory, and test SSH/TUI copy behavior before long remote sessions. Treat open PRs as work in progress until they land in a tagged release.

2026-05-07 Risk note Risk

OpenClaw 2026.5.6 operators need a wedged-Gateway recovery plan

The newest OpenClaw reports are converging on a practical reliability problem: once the Gateway is saturated or wedged, normal RPC-based restart paths and channel delivery can stop being useful. Operators are reporting 15-100s WebSocket responses, 99-100% event-loop utilization, zombie sessions, node.list errors that hang every agent session, native Codex runtime stalls after tool calls, and embedded direct-lane plugin tools disappearing from allowlists.

Worth noting: Before upgrading production gateways, document an out-of-band supervisor restart path outside OpenClaw RPC, cap risky long-running tasks, keep a direct log/health check handy, and smoke-test native Codex, Feishu/Discord, node.list, and plugin allowlists after restart. Treat PR fixes as promising but unreleased until a tag includes them.

2026-05-07 Risk note Risk

OpenClaw operators should watch update and Gateway freeze regressions

New OpenClaw reports describe two reliability traps around 2026.5.6: package swaps or npm installs can leave old hashed runtime chunks unresolved, breaking the CLI, sessions_send, and web_fetch until restart or a compatibility alias lands; separately, synchronous macOS keychain reads can block the Gateway event loop long enough to cause Telegram timeouts, pending turns, or duplicate sends.

Worth noting: Avoid unattended in-place updates on production gateways until the fix is tagged. After any update, restart cleanly and smoke-test CLI startup, web_fetch, and cross-session delivery. If you see liveness warnings with multi-second eventLoopDelayMaxMs, reduce channel traffic and track the async keychain fix before relying on always-on delivery.

2026-05-07 Risk note Risk

OpenClaw Gateway auth boundaries need a fresh audit

Fresh OpenClaw reports point to several Gateway trust-boundary gaps: managed outgoing image downloads may skip per-session ownership checks for device-token or trusted-proxy callers, trusted-operator plugin HTTP routes may give shared-secret callers admin-like scopes, and trusted-proxy mode may still accept a local password fallback when proxy identity checks fail.

Worth noting: If you run OpenClaw behind trusted-proxy auth, use device tokens, expose plugin HTTP routes, or store generated media in multi-session deployments, treat these as audit items now. Limit network exposure, review route scopes, and wait for a tagged release before assuming the linked PR fixes are present.

2026-05-07 Product update Watch

OpenClaw 2026.5.6 is a fast recovery release for Codex OAuth routing

OpenClaw 2026.5.6 quickly reverts the 2026.5.5 doctor repair that could rewrite valid openai-codex ChatGPT/Codex OAuth routes into openai API-key routes. It also cleans up plugin/runtime fetch headers, debug-proxy replay headers, and web-fetch timeout cleanup so tool lanes do not stay active after timed-out requests.

Worth noting: If you upgraded through 2026.5.5 and use Codex OAuth routes, verify the default model and run the official recovery path if it was rewritten. Also retest plugins or guarded fetch paths that failed on header-shape errors.

2026-05-06 Product update Watch

Mercury 1.1.6 pushes terminal agents toward IDE-like daily use

Mercury v1.1.6 focuses on the parts that decide whether a terminal-first personal agent can stay useful all day: more stable TUI startup and input handling, workspace mode that feels closer to an IDE chat, background tasks and sub-agent completion mirroring, session model switching, platform diagnostics, Spotify controls, and a default web-search skill on fresh installs.

Worth noting: Try it if you are evaluating Mercury as a CLI/Telegram daily driver, especially for long-running coding tasks; check background task cleanup, model switching, and workspace exit behavior before trusting it for unattended work.

2026-05-06 Risk note Risk

OpenClaw operators should audit secrets and delivery after 2026.5.5

The newest OpenClaw issue wave is worth treating as an operator checklist rather than noise: skill SecretRef API keys may still enter exec child environments, delivery can be marked successful even when no channel adapter ran, Telegram subagent fallback can expose raw child output, and new bindings may route to main instead of the intended agent.

Worth noting: Until fixes land in a release, keep sensitive skill keys out of broad agent workspaces where possible, test delivery with real channel receipts instead of status flags alone, and verify every new binding by checking the resulting session key.

2026-05-06 Product update Watch

OpenClaw 2026.5.5 is a reliability release for real deployments

OpenClaw 2026.5.5 is less about one headline feature and more about closing deployment papercuts: channel routing for Feishu, LINE, Telegram/Codex, Discord, Matrix, Slack, and iOS pairing; provider fixes for xAI, Fireworks/Kimi, video hints, and streaming Gateway responses; plus session, plugin, media, doctor, and Control UI repairs.

Worth noting: Treat it as an upgrade candidate if you rely on multiple channels or provider fallbacks, but read the release notes against your own setup before moving production agents.

2026-05-06 Risk note Risk

OpenClaw external plugins may need a post-upgrade check

A new GitHub report says upgrading from 2026.5.2 to 2026.5.3-1 via pnpm can silently drop externally installed plugins such as WhatsApp and BlueBubbles. The practical issue is not the install command itself; it is the lack of warning when primary messaging channels disappear.

Worth noting: Before and after upgrading, record your external plugin list, run channel status checks, and be ready to reinstall affected plugins until the upgrade path preserves them reliably.

2026-05-06 Risk note Risk

OpenClaw 2026.5.4 needs a post-upgrade reliability check

Fresh GitHub reports after 2026.5.4 point to several operator-facing edge cases: Telegram replies can repeat after restart and auto-compaction retry, claude-cli sessions may keep running while OpenClaw-side transcripts stop flushing, and the bundled fal image provider can load without registering for image generation. A cron PR also shows stale future next-run slots can delay scheduled jobs until repaired.

Worth noting: If you run 2026.5.4 in production, spot-check outbound Telegram delivery after restarts, compare OpenClaw and runtime transcript growth on long claude-cli sessions, list image-generation providers if you rely on fal, and verify cron jobs show the next Beijing-time slot you expect.

2026-05-06 Skill Try

A 235-skill library pushes agent skills toward cross-tool packaging

alirezarezvani/claude-skills packages engineering, product, marketing, compliance, C-level, and DevOps expertise as reusable skills and plugins. Its stronger point is cross-agent distribution: Claude Code, OpenClaw, Hermes, Codex, Gemini CLI, Cursor, Aider, Windsurf, OpenCode, and more.

Worth noting: Use it as a reference for taxonomy, packaging, and conversion flows, but review individual skills before installing them into trusted workspaces.

2026-05-05 Product update Watch

OpenClaw 2026.5.4 stabilizes voice calls, plugins, and Gateway hot paths

The 2026.5.4 release focuses less on headline UI and more on operator reliability: snappier Google Meet/Twilio voice bridge behavior, plugin install hints after external-plugin migration, plugin metadata snapshot reuse to cut hot-path scans, safer SecretRef handling, and channel fixes for Discord-style external contracts and QQ active-memory recall.

Worth noting: Upgrade deliberately: test voice-call flows if you use Meet/Twilio, verify external channel plugins and SecretRef-backed tokens after restart, and check gateway startup/performance on your actual workspace.

2026-05-05 Skill Try

Hermes now has a curated ecosystem map, not just a launch narrative

A dedicated awesome-hermes-agent list organizes skills, plugins, deployment options, GUI workspaces, integrations, and maturity tags. That matters because Hermes adoption is moving from “try the agent” toward “assemble an operating stack.”

Worth noting: Use the list as a shortlist source, but separate production-ready resources from beta and experimental entries before recommending them.

2026-05-05 Skill Try

Hermes Skill Atlas turns skill discovery into an offline browser

Hermes Skill Atlas packages a curated skill browser as a dependency-free HTML file, with search, categories, install tabs for Hermes / Claude Code / OpenClaw, and structured JSON data. It is a useful example of discovery moving from raw lists toward operator-friendly tooling.

Worth noting: Use it to compare category design, install guidance, and metadata quality for AgentOS Watch’s own skill/topic pages.

2026-05-04 Product update Watch

Hermes v0.12 adds multi-agent Kanban for parallel work

Hermes now presents multi-agent coordination as a board: agents claim tasks, work in parallel, hand off when blocked, and let the operator unblock progress from one view.

Worth noting: Test it on a bounded project and compare whether the Kanban view reduces supervision overhead versus terminal-based agent orchestration.

2026-05-04 Skill Try

Cross-agent WebSearch skills are becoming a core utility layer

A WebSearch skill positioned for Claude Code, Codex, Cursor, Hermes, OpenClaw, and other agents is getting strong social traction because it solves a common pain: agents need reliable access to social platforms and search engines.

Worth noting: Evaluate install flow, source coverage, local execution claims, and whether results include enough provenance for automated brief generation.

2026-05-04 Risk note Risk

Tool-call data is becoming an economic security risk

X discussion warns that high-volume agent tool calls can become valuable data, especially when routed through proxy or relay services. This turns privacy into an economic incentive problem.

Worth noting: Treat provider routing, proxy endpoints, logs, and skill permissions as part of the product scorecard before recommending any agent setup.

2026-05-04 Use case Try

Content-studio skills are a high-demand use case for agent workflows

Design systems, short-form video workflows, Xiaohongshu cards, and newsletter drafts are becoming a concrete skill category rather than abstract prompt engineering.

Worth noting: Build a Skills Radar scenario pack for content studios: article-to-card, article-to-thread, design system, video script, and newsletter draft.

2026-05-04 New product Watch

Mercury positions itself as an always-on, permission-hardened personal agent

Mercury combines markdown-owned identity, Telegram/CLI channels, daemon mode, scheduled tasks, tool permissions, and token budgeting. The concept matches a real user need: persistent personal agents that do not silently overreach.

Worth noting: Track real usage reports and compare reliability, memory behavior, and permission boundaries against OpenClaw/Hermes.

2026-05-04 Skill Try

Agent skills are exploding; discovery is now the bottleneck

ClawHub, Agent Skills, and community awesome lists show a large and fast-growing skill ecosystem. The opportunity is not another raw directory; users need scenario-based curation, risk notes, and install guidance.

Worth noting: Start with scenario packs: research brief, browser automation, GitHub workflow, content studio, inbox ops, and security.

2026-05-03 Product update Watch

OpenClaw 2026.5.2 ships provider, plugin, gateway, and channel fixes

The release claims sturdier plugin installs, leaner gateway hot paths, multi-channel fixes, and voice/TTS polish. Community feedback also flags context overflow and excessive tool-use regressions, so upgrade deliberately.

Worth noting: Read the changelog, scan issue threads, and test your existing workflows before upgrading production agents.

2026-05-02 Community feedback Watch

Users are asking which personal agent is actually usable

A V2EX discussion captures the core market pain: many agents feel like LLM + tools + skills + IM, but users worry about instability, memory resets, and unreliable workflows.

Worth noting: Create product scorecards focused on reliability, memory persistence, permission safety, and workflow repeatability.