2026-05-14 · Risk note

Hermes approval hardening closes a critical YOLO-mode bypass and exposes long-session failure modes

The most important Hermes change in the late May 14 window concerns approval safety rather than another UI tweak. According to PR #23835, `HERMES_YOLO_MODE` was read with `os.getenv()` on every approval check, so a skill or a prompt-injected in-process tool could mutate `os.environ` after startup and switch off command approval. The same PR tightens LLM smart-approval parsing from substring matching to an exact `APPROVE` match, logs dangerous background auto-approvals that previously left no audit trail, and expands pipe-to-shell detection to catch `/bin/bash` and `bash -c` variants. Nearby reliability work matters to the same operator audience: PR #25716 adds hierarchical long-context compression so that huge transcripts are summarized in bounded segments instead of timing out, while issue #25723 reports that a single streaming provider error can disable streaming for the entire session rather than only for the failing request.
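The three approval changes are easier to compare side by side in code. The following is a minimal Python sketch of the general patterns only, not the Hermes source: `ApprovalPolicy`, `live_yolo_check`, `parse_smart_approval`, `looks_like_pipe_to_shell`, the set of truthy flag values, and the regular expression are all hypothetical names and choices made for illustration. Only `HERMES_YOLO_MODE`, `os.getenv()`/`os.environ`, and the literal `APPROVE` come from the PR description.

```python
import os
import re

# Hypothetical names throughout; this illustrates the patterns, not Hermes code.
_TRUTHY = {"1", "true", "yes", "on"}  # assumed set of "enabled" values


def _read_flag(env):
    return env.get("HERMES_YOLO_MODE", "").strip().lower() in _TRUTHY


def live_yolo_check(env=None):
    """Vulnerable pattern: consult the environment on every approval check."""
    env = os.environ if env is None else env
    return _read_flag(env)


class ApprovalPolicy:
    """Hardened pattern: capture the flag once, when the policy is built."""

    def __init__(self, env=None):
        env = os.environ if env is None else env
        self._yolo = _read_flag(env)  # snapshot; later env edits are ignored

    @property
    def yolo(self):
        return self._yolo


def parse_smart_approval_substring(reply):
    """Loose pattern: any reply that merely contains APPROVE is accepted."""
    return "APPROVE" in reply


def parse_smart_approval(reply):
    """Strict pattern: the reply must be exactly APPROVE (whitespace trimmed,
    which is an assumption of this sketch)."""
    return reply.strip() == "APPROVE"


# Assumed detector: a pipe followed by an optional sudo, an optional path
# prefix such as /bin/, and a common shell name.
_PIPE_TO_SHELL = re.compile(r"\|\s*(?:sudo\s+)?(?:\S*/)?(?:ba|z|da)?sh\b")


def looks_like_pipe_to_shell(command):
    return _PIPE_TO_SHELL.search(command) is not None
```

In this sketch, a policy built while the flag is off keeps reporting `yolo` as off even if in-process code later sets `HERMES_YOLO_MODE=true`, whereas `live_yolo_check` flips immediately. Likewise, a model reply of `DO NOT APPROVE` passes the substring check but fails the exact match, and `curl ... | /bin/bash` is flagged along with the plain `| sh` form.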

Impact: Risk · Sources: 3 · Audience: operator, developer, team

Why it matters

Permission gates are only useful if untrusted in-process code cannot turn them off. The combination of approval parsing, audit logs, context compression, and streaming fallback controls is what makes long-running agent sessions inspectable after failures.

Evidence

PR #23835 labels the YOLO-mode environment re-read as critical because in-process code could set `HERMES_YOLO_MODE=true` after startup
The same PR changes smart approval from substring matching to exact `APPROVE`, adds warning logs for non-interactive dangerous auto-approvals, and expands pipe-to-shell detection
PR #25716 adds hierarchical map-reduce compression for very large transcripts and rehydrates persisted handoff summaries before recompression
Issue #25723 reports streaming being disabled for the whole session after one provider streaming error
Issue #25710 notes Telegram streaming can skip final MarkdownV2 formatting when raw text is unchanged
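The hierarchical map-reduce compression attributed to PR #25716 can be pictured with a small sketch. This is an assumption-laden illustration rather than the PR's code: `hierarchical_compress`, `chunk`, the `fan_in` parameter, and the injected `summarize` callable are hypothetical, and the rehydration of persisted handoff summaries is not modeled. The point it shows is that no single summarization call ever receives more than `fan_in` items, however long the transcript is.

```python
from typing import Callable, List


def chunk(items: List[str], size: int) -> List[List[str]]:
    """Split items into consecutive groups of at most `size` elements."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def hierarchical_compress(
    messages: List[str],
    summarize: Callable[[List[str]], str],
    fan_in: int = 4,
) -> str:
    """Reduce a transcript to one summary, level by level.

    Map step: summarize each group of at most `fan_in` items.
    Reduce step: repeat on the resulting summaries until one remains.
    `summarize` stands in for an LLM call and is assumed to return text
    shorter than its combined input.
    """
    if fan_in < 2:
        raise ValueError("fan_in must be at least 2 so each level shrinks")
    if not messages:
        return ""
    level = list(messages)
    while len(level) > 1:
        level = [summarize(group) for group in chunk(level, fan_in)]
    return level[0]  # a single-message transcript is returned unchanged
```

With ten messages and `fan_in=4`, this sketch makes three calls at the first level (group sizes 4, 4, and 2) and one call at the second level over the three intermediate summaries. Whether bounded calls are enough to avoid timeouts in practice depends on segment sizing and provider latency, which the sketch does not address.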

Risk notes

The approval hardening PR was still open when reviewed, so production builds may not include the fixes yet
YOLO/auto-approval modes remain high-risk even with better parsing and logging
Compression and streaming fixes need real long-session/provider-failure tests, not only unit tests
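A provider-failure test of the kind called for above could start from a fake provider that fails on a chosen request. The sketch below is hypothetical end to end: `FlakyProvider`, `SessionScopedClient`, and `RequestScopedClient` are invented for illustration and do not reflect Hermes internals. It only contrasts the behavior reported in issue #25723 (one error disables streaming for the session) with a fallback that is scoped to the failing request.

```python
class FlakyProvider:
    """Fake provider whose streaming call fails on chosen call numbers."""

    def __init__(self, fail_on):
        self.fail_on = set(fail_on)
        self.stream_calls = 0

    def stream(self, prompt):
        self.stream_calls += 1
        if self.stream_calls in self.fail_on:
            raise RuntimeError("simulated streaming error")
        return f"stream:{prompt}"

    def complete(self, prompt):
        # Non-streaming fallback path.
        return f"batch:{prompt}"


class SessionScopedClient:
    """Reported failure mode: one error turns streaming off for good."""

    def __init__(self, provider):
        self.provider = provider
        self.streaming = True

    def ask(self, prompt):
        if self.streaming:
            try:
                return self.provider.stream(prompt)
            except RuntimeError:
                self.streaming = False  # sticky for the rest of the session
        return self.provider.complete(prompt)


class RequestScopedClient:
    """Alternative: fall back for the failing request only."""

    def __init__(self, provider):
        self.provider = provider

    def ask(self, prompt):
        try:
            return self.provider.stream(prompt)
        except RuntimeError:
            return self.provider.complete(prompt)
```

Running three prompts through each client with a provider that fails on its second streaming call shows the difference: the session-scoped client answers the third prompt through the non-streaming path, while the request-scoped client returns to streaming. A real test would also need actual provider errors and long sessions, which a fake like this cannot reproduce.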
