Recurring pattern
False Success in incident response
The agent marked the work resolved while verification evidence still showed the underlying problem was active.
Affected agent
Incident Agent
Occurrences
9
Trend
increasing
Converted to evals
0 / 9
Recommended action
Require a metric-recovery verification artifact with status 'passed' before the incident status can be set to 'resolved'. Block ticket close when the health check returns 'unknown'.
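The recommended action amounts to a gate in front of the status transition. A minimal sketch of such a gate follows, assuming a hypothetical `VerificationArtifact` record and a hypothetical `can_resolve` helper; neither name comes from the Incident Agent itself, and the artifact kind string `"metric-recovery"` is an illustrative choice.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class VerificationArtifact:
    """Hypothetical record of a verification step (not the agent's real schema)."""
    kind: str    # assumed label, e.g. "metric-recovery"
    status: str  # e.g. "passed" or "failed"


def can_resolve(
    artifact: Optional[VerificationArtifact],
    health_status: str,
) -> Tuple[bool, str]:
    """Return (allowed, reason) for setting an incident to 'resolved'."""
    # Block ticket close when the health check is inconclusive.
    if health_status == "unknown":
        return False, "health check returned 'unknown'"
    # Require a metric-recovery artifact to exist at all.
    if artifact is None or artifact.kind != "metric-recovery":
        return False, "missing metric-recovery verification artifact"
    # Require that the artifact actually passed.
    if artifact.status != "passed":
        return False, "verification artifact status is not 'passed'"
    return True, "ok"
```

Returning a reason alongside the boolean lets the caller record why a close was refused, which keeps the blocked attempt visible in the trace.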
Why this is a pattern, not a one-off
Increasing: this failure recurred across 9 traces and is accelerating. Treat it as one workflow, prompt, or policy fix, not 9 separate tickets.
Internal status says done, but customers or operators still experience the failure.
Occurrences in this pattern (9)
| Severity | Failure mode | Evidence | Trace |
|---|---|---|---|
| critical | False Success | agent_status is 'resolved' but tool 'check-service-health' returned status 'unknown' — no stable metric baseline was established. | Read trace |
| critical | False Success | agent_status is 'resolved' but tool 'check-service-health' returned status 'unknown' — no stable metric baseline was established. | Read trace |
| critical | False Success | agent_status is 'resolved' but tool 'check-service-health' returned status 'unknown' — no stable metric baseline was established. | Read trace |
| critical | False Success | agent_status is 'resolved' but tool 'check-service-health' returned status 'unknown' — no stable metric baseline was established. | Read trace |
| critical | False Success | agent_status is 'resolved' but tool 'check-service-health' returned status 'unknown' — no stable metric baseline was established. | Read trace |
| critical | False Success | agent_status is 'resolved' but tool 'check-service-health' returned status 'unknown' — no stable metric baseline was established. | Read trace |
| critical | False Success | agent_status is 'resolved' but tool 'check-service-health' returned status 'unknown' — no stable metric baseline was established. | Read trace |
| critical | False Success | agent_status is 'resolved' but tool 'check-service-health' returned status 'unknown' — no stable metric baseline was established. | Read trace |
| critical | False Success | agent_status is 'resolved' but tool 'check-service-health' returned status 'unknown' — no stable metric baseline was established. | Read trace |
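
All nine rows share one evidence signal, so a single regression check could cover them. The sketch below assumes a hypothetical trace shape (a dict with an `agent_status` field and a `tool_calls` list of dicts with `tool` and `status` keys). The real trace schema is not shown in this report, so the field names would need adapting.

```python
from typing import Any, Dict


def is_false_success(trace: Dict[str, Any]) -> bool:
    """Flag traces where the agent reported 'resolved' while
    'check-service-health' returned 'unknown'.

    The trace layout is an assumption made for illustration.
    """
    if trace.get("agent_status") != "resolved":
        return False
    for call in trace.get("tool_calls", []):
        if (
            call.get("tool") == "check-service-health"
            and call.get("status") == "unknown"
        ):
            return True
    return False


# Example with a made-up trace matching the evidence column above.
example = {
    "agent_status": "resolved",
    "tool_calls": [{"tool": "check-service-health", "status": "unknown"}],
}
flagged = is_false_success(example)
```

A regression eval built on this check would fail whenever `is_false_success` returns `True` for a replayed trace.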
Time window
2026-06-11T05:47:08.198Z/2026-06-11T05:47:15.783Z
Failure modes (1)
Shared evidence signals