# Failure Categories

| Key | Value |
|-----|-------|
| Status | Active |
| Owner | QA Automation |
| Updated | 2026-03-26 |
| Scope | Symptom categories, root-cause labels, and how to interpret failures without overreacting |

One of the easiest ways to make test operations worse is to conflate symptoms with causes. A timeout is not a root cause, and a selector miss is not always a test bug. This page separates the categories the system uses so people can interpret failures more accurately.

## Two Different Category Layers

| Layer | Purpose |
|-------|---------|
| symptom category | what the failure looked like technically |
| root-cause label | what we believe actually caused it |

Example:

- symptom category: `TIMEOUT_ELEMENT`
- root-cause label: `PRODUCT.SITE_REDESIGN`

That distinction matters because the action is different.
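The separation can be sketched in code. This is a minimal illustration, not the system's actual implementation; the `Failure` dataclass, `next_action` function, and the action strings are all hypothetical. The point is that routing keys off the root-cause domain, not the symptom:

```python
# Hypothetical sketch: keep symptom and root cause as separate fields,
# and route the follow-up action on the root-cause domain.
from dataclasses import dataclass

@dataclass(frozen=True)
class Failure:
    symptom: str      # what the failure looked like technically
    root_cause: str   # what we believe actually caused it

def next_action(failure: Failure) -> str:
    """Route on the root-cause domain, never on the raw symptom."""
    domain = failure.root_cause.split(".", 1)[0]
    return {
        "PRODUCT": "update the test to match the redesigned site",
        "TEST": "fix the selector or assumption in the test",
        "ENVIRONMENT": "rerun and check CI/network health",
    }.get(domain, "investigate before acting")

failure = Failure(symptom="TIMEOUT_ELEMENT",
                  root_cause="PRODUCT.SITE_REDESIGN")
```

Here the same `TIMEOUT_ELEMENT` symptom would lead to a different action if the root-cause label were, say, `ENVIRONMENT.CI_FLAKE`.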

## Common Symptom Categories

| Symptom Category | What It Usually Means |
|------------------|-----------------------|
| `SELECTOR_NOT_IN_DOM` | expected element is gone or renamed |
| `TIMEOUT_ELEMENT` | element or action did not complete in time |
| `ELEMENT_INTERCEPTED` | something blocked an interaction |
| `NETWORK_FAILED` | navigation or resource request failed |
| `HTTP_5XX` | backend or edge returned server error |
| `HTTP_4XX` | page or endpoint not found or not allowed |
| `PAGE_CRASHED` | browser page crashed |
| `CONTENT_MISMATCH` | expected content shape does not match live output |
| consent-related categories | overlay or consent handling blocked progress |

## Common Root-Cause Themes

The incident system records root-cause labels in human-meaningful domains rather than echoing raw failure text.

| Root-Cause Domain | Typical Meaning |
|-------------------|-----------------|
| `PRODUCT.*` | the site changed in a real way |
| `TEST.*` | the test or selector assumption is wrong |
| `ENVIRONMENT.*` | CI, timing, or network instability |
| `CONTENT_DATA.*` | content is missing, rotated, or structurally different |
| `THIRD_PARTY.*` | outside dependency drift |
| `RELEASE_CHANGE.*` | deploy or release-specific side effect |

## How Operators Should Read A Failure

| If You See... | Ask Next... |
|---------------|-------------|
| isolated timeout | is this a flaky one-off or a known weak spot? |
| repeated selector miss on one site | did the site redesign or did our selector age out? |
| same failure across unrelated tests | is this really infra or shared environment noise? |
| post-fix repeat | is the fix incomplete, or is this a new symptom of an old issue? |

## Confidence Labels In Slack

These are not raw categories. They are operator-facing verdicts built from categories plus context.

| Label | What It Tells The Reader |
|-------|--------------------------|
| `Confirmed regression` | strong evidence of a real product or test regression |
| `Needs confirmation` | not enough signal yet |
| `Likely flaky` | behavior looks intermittent rather than systemic |
| `Infra/CI suspicion` | shared environment or runner problem is more likely |
| `Post-fix, watching` | issue matches something recently fixed and is being monitored |
| `Likely same infra event` | secondary failure probably belongs to the same environment issue |

## Why Historical Context Matters

A single failure run can be misleading. History changes interpretation:

- one timeout after a fix may be watch-only
- ten repeats over seven days is a pattern
- a failure with a matching resolved incident is very different from a brand-new unknown

This is why failure history, incident store data, and recovery tracking now matter so much.

| Pattern | Good First Response |
|---------|---------------------|
| one failing test, no history, selectors look healthy | rerun and watch |
| several unrelated tests on one site all hit `NETWORK_FAILED` in one run | treat as infra suspicion first |
| same PDT test repeats after a visible redesign | likely product or test update needed |
| visual suite says snapshot missing | configuration issue, not UI regression |
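The first-response table above can be expressed as a small decision function. Everything here is a sketch under assumed inputs; the parameter names are hypothetical and the real system may weigh more signals:

```python
# Hypothetical sketch of the history-aware first responses above.
# Inputs are boolean/context signals an operator would already have.
def first_response(failures_7d: int, site_wide_network: bool,
                   redesign_seen: bool, missing_baseline: bool) -> str:
    if missing_baseline:
        # Missing snapshot: the suite is blocked, not regressing.
        return "configuration issue, not UI regression"
    if site_wide_network:
        return "treat as infra suspicion first"
    if redesign_seen:
        return "likely product or test update needed"
    if failures_7d == 0:
        # No history and healthy selectors: cheapest move is a rerun.
        return "rerun and watch"
    return "investigate the pattern before posting a verdict"
```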

## What Not To Do

- do not equate timeout with product bug automatically
- do not treat every selector miss as test negligence
- do not post “regression detected” when the suite is actually blocked by missing baselines or infra
- do not bury the real verdict under raw classifier vocabulary

## Related Pages

| Need | Go To |
|------|-------|
| human-facing alert behavior | [Reporting](./reporting.md) |
| incident matching and AI workflows | [AI Processing](./ai-processing.md) |
| troubleshooting specific failures | [Troubleshooting](./troubleshooting.md) |
