# Monitoring

| Key | Value |
|-----|-------|
| Status | Active |
| Owner | QA Automation |
| Updated | 2026-03-26 |
| Scope | Proactive monitors that run alongside or outside the main test suites |

Monitoring in this repo is broader than scheduled Playwright suites. Some risks are better caught by focused monitors than by full regression runs. The job of a monitor is not to prove the whole site works. It is to catch a narrow class of breakage early and cheaply.

## Monitoring Versus Testing

| Monitoring | Testing |
|------------|---------|
| narrow and focused | broader behavioral coverage |
| often runs more frequently because each check is cheap | usually runs on a schedule tied to risk or cost |
| catches drift early | confirms user-facing behavior more deeply |
| optimized for signal-to-noise | optimized for confidence |

## Current Monitors

| Monitor | What It Watches |
|---------|-----------------|
| consent monitor | consent dialog presence and selector health |
| selector monitor | known key selectors across sites |
| robots.txt and sitemap monitor | crawl policy changes and sitemap health |
| URL monitor | broken links, redirects, bad targets, crawl drift |
| Grafana visual monitor | empty or broken dashboard panels in a real browser |

## Consent Monitor

The consent monitor exists because consent breakage can make many unrelated tests look broken.

| What It Checks | Why It Matters |
|----------------|----------------|
| dialog presence | legal and access path |
| accept interaction | users can get past the wall |
| selector health | prevents false failures in many suites |

Main command: `npm run monitor:consent`
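
The three checks in the table can be collapsed into a single status for alerting. The sketch below is illustrative only; the field names (`dialogFound`, `acceptClicked`, `selectorsHealthy`) and status labels are assumptions, not the monitor's actual schema.

```javascript
// Hypothetical sketch: classify one consent-monitor run into a single status.
// All field and status names are illustrative, not the repo's real schema.
function classifyConsentCheck({ dialogFound, acceptClicked, selectorsHealthy }) {
  if (!dialogFound) return 'no-dialog';           // legal and access path at risk
  if (!acceptClicked) return 'blocked';           // users cannot get past the wall
  if (!selectorsHealthy) return 'selector-drift'; // suites may report false failures
  return 'healthy';
}
```

Ordering matters here: a missing dialog makes the other two checks meaningless, so it wins.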

## Selector Monitor

The selector monitor is a lightweight “are our assumptions still true?” check for important surfaces.

| What It Catches | Example |
|-----------------|---------|
| removed selectors | redesign or CMS changes |
| weakened fallbacks | config drift |
| site-specific breakage | one site changes while others remain healthy |

Main command: `npm run monitor:selectors`
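
Separating broad breakage (a selector gone everywhere, likely a redesign) from site-specific drift (one site fails while others remain healthy) could look like the following sketch. The result shape `{ site, selector, found }` is an assumption for illustration.

```javascript
// Hypothetical sketch: group per-site selector results and decide whether a
// failure is broad (all sites) or site-specific. Input shape is illustrative.
function summarizeSelectorResults(results) {
  const bySelector = new Map();
  for (const { site, selector, found } of results) {
    if (!bySelector.has(selector)) bySelector.set(selector, []);
    bySelector.get(selector).push({ site, found });
  }
  const broad = [];        // selector missing on every site checked
  const siteSpecific = []; // selector missing on only some sites
  for (const [selector, sites] of bySelector) {
    const failing = sites.filter((s) => !s.found);
    if (failing.length === 0) continue;
    if (failing.length === sites.length) broad.push(selector);
    else siteSpecific.push({ selector, sites: failing.map((s) => s.site) });
  }
  return { broad, siteSpecific };
}
```

The split matters for routing: broad breakage usually means one upstream change, while site-specific breakage points at one site's config.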

## robots.txt And Sitemap Monitor

This is one of the most useful narrow monitors in the repo because it watches a high-stakes surface that product teams can change without obviously breaking the homepage.

| Monitor Part | What It Detects |
|--------------|-----------------|
| robots.txt change detection | crawl-policy drift and accidental blocking |
| sitemap health | missing or broken sitemap endpoints |

Current operating model:

- daily check
- all 16 CNC sites
- deduplicated alerting
- stored state for change detection

```mermaid
%%{init: {'theme':'base', 'themeVariables': {'primaryColor': '#4a90d9', 'primaryTextColor': '#fff', 'primaryBorderColor': '#2c6fad', 'lineColor': '#555', 'fontFamily': 'sans-serif'}}}%%
flowchart TD
    S["For each of 16 sites"] --> A["Fetch robots.txt"]
    A --> B["SHA-256 hash"]
    B --> C{Changed?}
    C -- Yes --> D["Slack alert\nwith line diff\n@Martin"]
    C -- No --> E["Update lastChecked"]

    A --> F["Parse Sitemap: lines\n(auto-discovered)"]
    F --> G["Fetch each sitemapindex\n(GET)"]
    G --> H["Extract nested loc URLs"]
    H --> I["Check all URLs\nconcurrently"]
    I --> J{Any non-200?}
    J -- Yes --> K["Deduplicate\n(7-day window)"]
    K --> L["Slack alert\nfor new failures\n@Martin"]
    J -- No --> M["All OK — silent"]
```

Main command: `npm run monitor:robots-txt`

## URL Monitor

The URL monitor is the link-quality system. It crawls, classifies, deduplicates, and reports broken or risky links.

| Strength | Why It Matters |
|----------|----------------|
| crawl depth variants | teams can choose speed versus coverage |
| source tracking | broken links can be traced back to where they were found |
| deduplication | operators get one useful incident instead of spam |
| history | recurring URL issues are easier to spot |
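
One way to classify crawl results into the buckets above is by HTTP status. The category names below mirror the table but are assumptions, not the monitor's real labels.

```javascript
// Illustrative status classification for crawled URLs. Category names are
// assumptions chosen to echo the table above, not the monitor's actual labels.
function classifyUrl(status) {
  if (status >= 200 && status < 300) return 'ok';
  if (status >= 300 && status < 400) return 'redirect';
  if (status === 404 || status === 410) return 'broken';
  if (status >= 400 && status < 500) return 'bad-target';
  return 'server-error';
}
```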

Main commands:

| Goal | Command |
|------|---------|
| standard crawl | `npm run monitor:urls:crawl` |
| nav-focused crawl | `npm run monitor:urls:crawl:nav` |
| full crawl | `npm run monitor:urls:crawl:full` |
| broken-link report | `npm run monitor:urls:broken` |
| Slack summary | `npm run monitor:urls:slack` |

## Grafana Visual Monitor

This monitor checks dashboards in a browser instead of trusting API-level health alone.

| It Looks For | Why It Matters |
|--------------|----------------|
| “No data” panels | query or datasource drift |
| broken panel rendering | a dashboard can be deployed yet still be unusable |
| missing expected panels | accidental dashboard regressions |
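
Once panel titles and text have been scraped out of the rendered page, the table's three failure modes reduce to a comparison against an expected panel list. The panel shape and the "No data" marker below are assumptions about what the browser check extracts.

```javascript
// Hypothetical sketch: flag unhealthy panels from text scraped out of a
// rendered dashboard. Input shapes are assumptions, not the monitor's schema.
function findUnhealthyPanels(expectedTitles, renderedPanels) {
  const byTitle = new Map(renderedPanels.map((p) => [p.title, p]));
  const problems = [];
  for (const title of expectedTitles) {
    const panel = byTitle.get(title);
    if (!panel) problems.push({ title, issue: 'missing' });          // regression
    else if (/no data/i.test(panel.text)) problems.push({ title, issue: 'no-data' }); // query drift
  }
  return problems;
}
```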

Main commands:

| Goal | Command |
|------|---------|
| run monitor | `npm run monitor:grafana` |
| run and alert | `npm run monitor:grafana:slack` |

## How To Think About Frequency

| Monitor Type | Good Frequency |
|--------------|----------------|
| consent and selector health | frequent |
| robots.txt and sitemap health | daily |
| URL monitoring | scheduled depth depending on cost |
| Grafana visual quality | regular, but not as often as low-cost health checks |

## Good Operational Practice

- use focused monitors to catch narrow but expensive problems early
- do not expect a single monitor to replace a full suite
- route monitor alerts differently from regression alerts when the urgency differs
- prefer one well-clustered alert over many noisy duplicates
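
The "one well-clustered alert" rule can be implemented with a suppression window keyed on the failure. The sketch below assumes a 7-day window like the robots.txt monitor's deduplication; the key format is illustrative.

```javascript
// Sketch of windowed alert deduplication. A 7-day window is assumed here to
// match the robots.txt monitor; `seen` maps an alert key to its last post time.
const WINDOW_MS = 7 * 24 * 60 * 60 * 1000;

function shouldAlert(seen, key, now) {
  const last = seen.get(key);
  if (last !== undefined && now - last < WINDOW_MS) return false; // still suppressed
  seen.set(key, now); // post, then start a fresh window
  return true;
}
```

A persistent failure therefore produces one alert per window rather than one per run.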

One practical pattern looks like this:

1. monitors catch drift early
2. Slack posts a low-noise alert
3. operator checks whether the issue is isolated or broad
4. if needed, a deeper suite runs to confirm user impact

## Related Pages

| Need | Go To |
|------|-------|
| full suite purposes | [Test Types](./test-types.md) |
| alert behavior | [Reporting](./reporting.md) |
| URL monitoring deep dive | [URL Monitor System](../confluence/URL-MONITOR-CONFLUENCE.md) |
| robots.txt monitor deep dive | [robots.txt & Sitemap Monitor](../confluence/ROBOTS-TXT-MONITOR-CONFLUENCE.md) |
