# robots.txt & Sitemap Monitor

| Key | Value |
|-----|-------|
| Status | Active |
| Owner | QA Automation |
| Updated | 2026-03-26 |
| Scope | Daily crawl-policy and sitemap health monitoring across CNC sites |

This monitor watches a part of the stack that is easy to overlook and expensive to get wrong. A homepage can look fine while `robots.txt` silently blocks crawling or a sitemap starts returning errors.

## What The Monitor Checks

| Check | Why It Matters |
|-------|----------------|
| robots.txt change detection | catches accidental crawl-policy changes |
| sitemap health | catches broken discovery endpoints |
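A minimal sketch of how the robots.txt change-detection check could work, assuming the monitor keeps the last accepted copy of each file as a baseline. The function names `fingerprint` and `changed_lines` are hypothetical and not taken from the monitor's code.

```python
import difflib
import hashlib


def fingerprint(robots_txt: str) -> str:
    """Hash a normalized copy so whitespace-only edits do not register as changes."""
    stripped = (line.strip() for line in robots_txt.splitlines())
    normalized = "\n".join(line for line in stripped if line)
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def changed_lines(baseline: str, current: str) -> list[str]:
    """Return '- old' / '+ new' lines, or an empty list when nothing changed."""
    if fingerprint(baseline) == fingerprint(current):
        return []
    diff = difflib.ndiff(baseline.splitlines(), current.splitlines())
    # ndiff also emits unchanged ('  ') and hint ('? ') lines; drop those.
    return [line for line in diff if line.startswith(("- ", "+ "))]
```

Comparing fingerprints first keeps the common "nothing changed" path cheap, and the line-level diff gives the alert something concrete to show when a rule such as `Disallow: /` appears.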

## Current Operating Model

| Aspect | Current Setup |
|--------|------------------|
| monitored sites | 16 CNC sites |
| frequency | daily |
| alert style | deduplicated Slack alerts |
| state tracking | stored baseline/history file |
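The stored baseline/history file could look something like the sketch below. The JSON layout, the file name, and the helpers `load_state`, `save_state`, and `record_run` are assumptions made for illustration. The real state format is not documented on this page.

```python
import json
import os
import tempfile
from pathlib import Path


def load_state(path: Path) -> dict:
    """Return the stored state, or an empty dict on the first run."""
    if not path.exists():
        return {}
    return json.loads(path.read_text(encoding="utf-8"))


def save_state(path: Path, state: dict) -> None:
    """Write to a temp file first so a crash cannot leave a half-written baseline."""
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        json.dump(state, handle, indent=2, sort_keys=True)
    os.replace(tmp_name, path)


def record_run(state: dict, site: str, robots_hash: str, checked_on: str) -> bool:
    """Append to history and report whether robots.txt differs from the baseline."""
    # On the first run for a site, the observed hash becomes the baseline.
    entry = state.setdefault(site, {"baseline": robots_hash, "history": []})
    entry["history"].append({"date": checked_on, "robots_hash": robots_hash})
    return entry["baseline"] != robots_hash
```

Keeping the baseline separate from the history means an intentional change can be accepted by updating one field, while the daily record stays intact for later review.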

## Why This Is High-Stakes

| Problem | Real Risk |
|---------|-----------|
| bad robots.txt change | search engines stop crawling important surfaces |
| broken sitemap endpoint | new content discovery degrades |
| silent removal of sitemap URLs | SEO and content freshness suffer before anyone notices |
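Silent URL removal is the hardest of these risks to spot by eye, so here is a rough sketch of how it might be detected: parse the sitemap, collect its `<loc>` entries, and compare them with the previous run. The helper names are hypothetical. The XML namespace is the one defined by the sitemaps.org protocol.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_urls(xml_text: str) -> set[str]:
    """Collect every <loc> value from a urlset or sitemap index document."""
    root = ET.fromstring(xml_text)  # raises ET.ParseError on malformed XML
    return {
        loc.text.strip()
        for loc in root.findall(".//sm:loc", SITEMAP_NS)
        if loc.text
    }


def sitemap_findings(baseline: set[str], current: set[str]) -> list[str]:
    """Describe problems in plain language; an empty list means healthy."""
    findings = []
    if not current:
        findings.append("sitemap is empty")
    removed = sorted(baseline - current)
    if removed:
        findings.append(f"{len(removed)} URL(s) removed, e.g. {removed[0]}")
    return findings
```

A production check would likely tolerate some routine churn, for example by alerting only when removals exceed a threshold. That threshold is a policy decision this sketch leaves out.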

## Why A Dedicated Monitor Is Better Than A Generic Test

This is not a classic user-flow problem. It is a publishing and discoverability problem. The focused monitor is better because it:

- runs cheaply
- tracks change over time
- alerts only when something meaningful changes
- speaks to SEO and platform health, not just frontend rendering

## Slack Behavior

The monitor is designed to avoid alert spam: it posts only when a check changes, and only once per issue.

| Reporting Principle | What It Means |
|---------------------|---------------|
| deduplication | the same issue is reported once, not on every run |
| change-aware alerts | no alert when nothing changed |
| operator-readable output | the alert should explain what changed, not just say “failed” |
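The three principles above can be sketched as a small filter that sits in front of the Slack post. This is an illustration under assumptions, not the monitor's implementation: the `Finding` type, `alert_key`, and `new_alerts` are hypothetical names, and the actual Slack call is deliberately left out.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    site: str
    check: str
    detail: str


def alert_key(finding: Finding) -> str:
    """Stable identifier: the same site, check, and detail always map to one key."""
    raw = f"{finding.site}|{finding.check}|{finding.detail}"
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()[:16]


def new_alerts(findings: list[Finding], sent_keys: set[str]) -> list[str]:
    """Return messages for findings not yet reported; remember them in sent_keys."""
    messages = []
    for finding in findings:
        key = alert_key(finding)
        if key in sent_keys:
            continue  # already reported, so stay quiet
        sent_keys.add(key)
        # Say what changed, not just that something failed.
        messages.append(f"[{finding.site}] {finding.check}: {finding.detail}")
    return messages
```

In practice `sent_keys` would be persisted between runs alongside the baseline, and a key would be cleared once the check is healthy again so that a later recurrence alerts afresh.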

## Good Use Cases

| Use Case | Why The Monitor Helps |
|----------|-----------------------|
| accidental deploy change | catches policy drift before it becomes a long-running SEO issue |
| backend issue in sitemap generation | catches failures even when the homepage still works |
| silent SEO regressions | surfaces issues that UI tests will not catch |

## What To Do When It Alerts

1. confirm whether the change was intentional
2. check whether the impact is one site or many
3. decide whether the issue belongs to SEO, platform, or release engineering
4. keep the thread open until the next healthy run confirms recovery
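Steps 2 and 4 lend themselves to a small helper. The sketch below assumes findings are available as `(site, check)` pairs and that each run records a healthy/unhealthy flag. Both shapes are hypothetical.

```python
from collections import defaultdict


def impact_by_check(findings: list[tuple[str, str]]) -> dict[str, list[str]]:
    """Group (site, check) pairs so each check lists the sites it affects."""
    sites = defaultdict(set)
    for site, check in findings:
        sites[check].add(site)
    return {check: sorted(affected) for check, affected in sites.items()}


def is_recovered(recent_runs: list[bool]) -> bool:
    """Treat the issue as recovered once the latest run was healthy."""
    return bool(recent_runs) and recent_runs[-1]
```

A check that fails on many sites at once usually points at shared infrastructure or a release, while a single-site failure is more likely a site-level configuration change.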

## Related Pages

| Need | Go To |
|------|-------|
| general monitoring overview | [Monitoring](../wiki/monitoring.md) |
| command catalog | [CLI Reference](../wiki/cli-reference.md) |
