2026-05-14 · Risk note

Hermes approval hardening closes a critical YOLO-mode bypass and exposes long-session failure modes

The strongest Hermes item in the late May 14 window is approval safety, not another UI tweak. PR #23835 says `HERMES_YOLO_MODE` was read from `os.getenv()` on every approval check, so a skill or prompt-injected in-process tool could mutate `os.environ` and disable command approval checks after startup. The same PR tightens LLM smart-approval parsing from substring matching to exact `APPROVE`, logs dangerous background auto-approvals that previously had no audit trail, and expands pipe-to-shell detection to catch `/bin/bash` and `bash -c` variants. Nearby reliability work matters for the same operator audience: PR #25716 adds hierarchical long-context compression so huge transcripts can be summarized in bounded segments instead of timing out, while issue #25723 reports that one streaming provider error can disable streaming for an entire session rather than just the failing request.
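The two approval fixes described above can be illustrated with a minimal sketch. This is not the code from PR #23835: `ApprovalConfig`, `load_approval_config`, `is_llm_approval`, and `needs_human_approval` are hypothetical names, and the set of values treated as "true" is an assumption. It shows only the general idea of reading `HERMES_YOLO_MODE` once at startup and accepting nothing but an exact `APPROVE` reply.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class ApprovalConfig:
    """Approval settings captured once at startup (hypothetical name)."""
    yolo_mode: bool


def load_approval_config(env=None) -> ApprovalConfig:
    # Read the flag a single time. Later mutations of the environment
    # mapping do not affect the frozen config object.
    env = os.environ if env is None else env
    raw = env.get("HERMES_YOLO_MODE", "")
    # Assumption: which strings count as "enabled" is illustrative only.
    return ApprovalConfig(yolo_mode=raw.strip().lower() in {"1", "true", "yes"})


def is_llm_approval(reply: str) -> bool:
    # Exact match after trimming whitespace. A substring check would also
    # accept replies such as "I would not APPROVE this".
    return reply.strip() == "APPROVE"


def needs_human_approval(config: ApprovalConfig, llm_reply: str) -> bool:
    # YOLO mode skips approval entirely, which is why it must not be
    # switchable by in-process code after startup.
    if config.yolo_mode:
        return False
    return not is_llm_approval(llm_reply)
```

With this shape, a tool that sets `HERMES_YOLO_MODE=true` in `os.environ` mid-session changes nothing, because every approval check consults the frozen `ApprovalConfig` rather than the environment.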

Impact: Risk | Sources: 3 | Audience: operator · developer · team
Why it matters

Permission gates are only useful if untrusted in-process code cannot turn them off. The combination of approval parsing, audit logs, context compression, and streaming fallback controls is what makes long-running agent sessions inspectable after failures.

Evidence
  • PR #23835 labels the YOLO-mode environment re-read as critical because in-process code could set `HERMES_YOLO_MODE=true` after startup
  • The same PR changes smart approval from substring matching to exact `APPROVE`, adds warning logs for non-interactive dangerous auto-approvals, and expands pipe-to-shell detection
  • PR #25716 adds hierarchical map-reduce compression for very large transcripts and rehydrates persisted handoff summaries before recompression
  • Issue #25723 reports streaming being disabled for the whole session after one provider streaming error
  • Issue #25710 notes Telegram streaming can skip final MarkdownV2 formatting when raw text is unchanged
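The hierarchical map-reduce compression mentioned for PR #25716 can be sketched roughly as follows. This is an illustration of the general technique, not the PR's implementation: `compress_hierarchically`, `chunk`, and the `summarize` callback are hypothetical, and the rehydration of persisted handoff summaries is not modeled here.

```python
from typing import Callable, List


def chunk(items: List[str], size: int) -> List[List[str]]:
    # Split a list into consecutive segments of at most `size` items.
    return [items[i:i + size] for i in range(0, len(items), size)]


def compress_hierarchically(
    messages: List[str],
    summarize: Callable[[List[str]], str],
    segment_size: int = 4,
) -> str:
    """Reduce a transcript to one summary using bounded summarizer calls.

    `summarize` stands in for an LLM call (an assumption). It never
    receives more than `segment_size` items at once.
    """
    if segment_size < 2:
        raise ValueError("segment_size must be at least 2")
    level = list(messages)
    if not level:
        return ""
    # Map: summarize each segment. Reduce: repeat on the summaries
    # until a single item remains.
    while len(level) > 1:
        level = [summarize(segment) for segment in chunk(level, segment_size)]
    return level[0]
```

Because each call sees a bounded segment, no single summarization request grows with the total transcript length, which is the property that avoids the timeouts described above.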
Risk notes
  • The approval hardening PR was still open when reviewed, so production builds may not include the fixes yet
  • YOLO/auto-approval modes remain high-risk even with better parsing and logging
  • Compression and streaming fixes need real long-session/provider-failure tests, not only unit tests
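As a starting point for the provider-failure tests called for above, the following sketch shows per-request streaming fallback, the behavior issue #25723 implies is expected. All names (`StreamingError`, `complete`, `stream_call`, `plain_call`) are hypothetical and do not come from the Hermes codebase. Any text streamed before the failure is discarded in this simplified version.

```python
from typing import Callable, Iterable, Tuple


class StreamingError(Exception):
    """Hypothetical stand-in for a provider streaming failure."""


def complete(
    prompt: str,
    stream_call: Callable[[str], Iterable[str]],
    plain_call: Callable[[str], str],
) -> Tuple[str, bool]:
    """Return (text, streamed). Falls back for this request only.

    No session-level flag is stored, so the next request tries
    streaming again.
    """
    try:
        return "".join(stream_call(prompt)), True
    except StreamingError:
        return plain_call(prompt), False
```

A test can then inject a provider that fails once and assert that the following request streams normally, which is a regression check a unit test alone may not cover against a real provider.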