Recurring pattern
False Success in incident response
The agent marked the work resolved while verification evidence still showed the underlying problem was active.
Affected agent
Incident Agent
Occurrences
12
Trend
increasing
Converted to evals
0 / 12
Recommended action
Require recovery verification to pass before incident status can be set to resolved. Treat restart success as an intermediate action rather than the customer outcome.
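One way to picture this recommendation is as a gate on the status transition. The sketch below is illustrative only: `Incident`, `Status`, `record_restart`, `resolve`, and the `recovery_verified` flag are hypothetical names, not part of any existing incident tooling described here.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    OPEN = "open"
    MITIGATING = "mitigating"
    RESOLVED = "resolved"


@dataclass
class Incident:
    status: Status = Status.OPEN
    restart_succeeded: bool = False   # intermediate action only (hypothetical field)
    recovery_verified: bool = False   # customer-facing check passed (hypothetical field)


class ResolutionBlocked(Exception):
    """Raised when resolution is requested before verification has passed."""


def record_restart(incident: Incident, succeeded: bool) -> None:
    # A successful restart moves the incident forward but never closes it.
    incident.restart_succeeded = succeeded
    if succeeded:
        incident.status = Status.MITIGATING


def resolve(incident: Incident) -> None:
    # The only path to RESOLVED goes through recovery verification.
    if not incident.recovery_verified:
        raise ResolutionBlocked("recovery verification has not passed")
    incident.status = Status.RESOLVED
```

With a gate like this, a restart that succeeds leaves the incident in `MITIGATING`, and calling `resolve` raises until the verification flag has been set by whatever check confirms customer recovery.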
Why this is a pattern, not a one-off
Increasing: this failure recurred across 12 traces and is accelerating. Treat it as a single workflow, prompt, or policy fix, not 12 separate tickets.
Internal status says done, but customers or operators still experience the failure.
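Since none of the 12 occurrences has been converted to an eval yet, a regression check for this pattern might look roughly like the following. It is a minimal sketch that assumes each trace is a dictionary; `agent_status` mirrors the evidence shown in this report, while `trace_id` and `verification_passed` are hypothetical field names chosen for illustration.

```python
def is_false_success(trace: dict) -> bool:
    """True when the agent reported resolved but verification explicitly failed."""
    resolved = trace.get("agent_status") == "resolved"
    # Only an explicit False counts; a missing result is not treated as a failure here.
    verification_failed = trace.get("verification_passed") is False
    return resolved and verification_failed


def run_regression_eval(traces: list[dict]) -> list[str]:
    """Return the ids of traces that exhibit the False Success pattern."""
    return [t["trace_id"] for t in traces if is_false_success(t)]
```

Whether a missing verification result should also fail the eval is a policy choice; the sketch takes the lenient reading and flags only explicit failures.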
Occurrences in this pattern (12)
| Severity | Failure mode | Evidence | Trace |
|---|---|---|---|
| critical | False Success | agent_status: resolved | Read trace |
| critical | False Success | agent_status: resolved | Read trace |
| critical | False Success | agent_status: resolved | Read trace |
| critical | False Success | agent_status: resolved | Read trace |
| critical | False Success | agent_status: resolved | Read trace |
| critical | False Success | agent_status: resolved | Read trace |
| critical | False Success | agent_status: resolved | Read trace |
| critical | False Success | agent_status: resolved | Read trace |
| critical | False Success | agent_status: resolved | Read trace |
| critical | False Success | agent_status: resolved | Read trace |
| critical | False Success | agent_status: resolved | Read trace |
| critical | False Success | agent_status: resolved | Read trace |
Time window
2026-06-10T14:30:32.601Z/2026-06-11T05:47:29.662Z
Failure modes (1)
Shared evidence signals