Web monitoring pipeline — track page changes, capture visual diffs, and guard against monitoring pitfalls. Built from QuartzUnit libraries.
flowchart LR
A["🔗 diffgrab\nchange detection"] --> B["📄 markgrab\ncontent extraction"]
A --> C["📸 snapgrab\nvisual capture"]
B --> D["🛡️ llm-degen-guard\noutput quality"]
B --> E["🔄 agent-loop-guard\nloop detection"]
B --> F["📋 agent-action-policy\naction safety"]
pip install watchdeck
# Add URLs to monitor
watchdeck add https://example.com
watchdeck add https://news.ycombinator.com --interval 12
# Check for changes
watchdeck check
# View history
watchdeck history https://example.com
# See diff between snapshots
watchdeck diff https://example.com

- Detect: tracks page changes via diffgrab (content hashing + structured diffs)
- Extract: pulls full content via markgrab for quality validation
- Screenshot: captures visual snapshots via snapgrab on change (optional)
- Guard: applies three safety layers:
  - agent-action-policy: blocks monitoring of internal/private URLs
  - agent-loop-guard: detects stale monitoring patterns
  - llm-degen-guard: flags garbage content (CAPTCHA, bot detection pages)
No cloud services, no API keys. Everything runs locally.
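To make the "content hashing + structured diffs" idea concrete, here is a minimal standard-library sketch. It is not diffgrab's implementation; the function names `content_hash` and `detect_change` are hypothetical, and the choice of SHA-256 plus a unified diff is an assumption made for illustration.

```python
import difflib
import hashlib


def content_hash(text: str) -> str:
    """Return a stable SHA-256 fingerprint of the page text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def detect_change(old_text: str, new_text: str) -> list[str]:
    """Return unified-diff lines, or an empty list when the hashes match."""
    if content_hash(old_text) == content_hash(new_text):
        return []  # cheap path: identical content, so no diff is computed
    return list(
        difflib.unified_diff(
            old_text.splitlines(),
            new_text.splitlines(),
            fromfile="before",
            tofile="after",
            lineterm="",
        )
    )


old = "Title\nPrice: 10\nIn stock"
new = "Title\nPrice: 12\nIn stock"
diff_lines = detect_change(old, new)
print("\n".join(diff_lines))
```

Comparing hashes first keeps the common "nothing changed" case cheap; the line-level diff is only built when the fingerprints differ.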
pip install watchdeck

Requirements: Python 3.11+, Playwright (for screenshots: playwright install chromium)
Add a URL to monitor. Blocked URLs (localhost, private IPs, file://) are automatically rejected.
watchdeck add https://example.com # default: check every 24h
watchdeck add https://news.ycombinator.com -i 12 # check every 12h

| Option | Short | Default | Description |
|---|---|---|---|
| --interval | -i | 24 | Check interval in hours |
Check all monitored URLs for changes.
watchdeck check # check all
watchdeck check -u https://example.com # specific URL
watchdeck check --screenshots # capture screenshots on change

Output:
Monitor Check (3 URLs, 1240ms)
┌──────────────────────────┬───────────┬─────────┬──────────┐
│ URL │ Status │ Changes │ Warnings │
├──────────────────────────┼───────────┼─────────┼──────────┤
│ https://example.com │ CHANGED │ +5/-2 │ │
│ https://news.ycombinator │ unchanged │ │ │
│ https://old-page.com │ unchanged │ │ stale │
└──────────────────────────┴───────────┴─────────┴──────────┘
1 changes detected
1 stale URLs (consider reducing frequency)
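The Changes column reads like a count of added and removed lines. Assuming that interpretation (it is not confirmed here), a summary in the same `+N/-M` shape could be derived with the standard library as follows; `change_summary` is a hypothetical helper, not part of watchdeck or diffgrab.

```python
import difflib


def change_summary(old_text: str, new_text: str) -> str:
    """Count added and removed lines between two snapshots as '+N/-M'."""
    added = removed = 0
    for line in difflib.ndiff(old_text.splitlines(), new_text.splitlines()):
        if line.startswith("+ "):
            added += 1
        elif line.startswith("- "):
            removed += 1
        # lines starting with "  " (unchanged) or "? " (hints) are ignored
    return f"+{added}/-{removed}"


summary = change_summary("a\nb\nc", "a\nx\ny\nc")
print(summary)
```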
Stop monitoring a URL.
Show snapshot history.
watchdeck history https://example.com -n 10

Show diff between snapshots.
watchdeck diff https://example.com
watchdeck diff https://example.com --before 1 --after 3

import asyncio
from watchdeck import WatchDeck

async def main():
    deck = WatchDeck()

    # Add URLs (safety policy auto-applied)
    await deck.add("https://example.com", interval_hours=12)
    await deck.add("http://localhost:8080")  # → blocked by policy

    # Check for changes
    report = await deck.check()
    for result in report.results:
        if result.changed:
            print(f"{result.url}: {result.summary}")
        if result.stale_warning:
            print(f"  ⚠ {result.stale_warning}")
        if result.content_warning:
            print(f"  ⚠ {result.content_warning}")

    # History and diffs
    snapshots = await deck.history("https://example.com")
    diff = await deck.diff("https://example.com")

    await deck.close()

asyncio.run(main())

watchdeck integrates three QuartzUnit guard libraries to prevent common monitoring pitfalls:
Automatically blocks monitoring of:
- localhost, 127.0.0.1
- Private networks (192.168.*, 10.*, 172.16-31.*)
- file:// URLs
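The blocking rules listed above can be approximated with the standard library, as in the sketch below. This is not agent-action-policy's implementation; `is_blocked` is a hypothetical name, rejecting URLs without a host is an added assumption, and the sketch does not resolve hostnames, so a public-looking name that points at a private address would pass.

```python
import ipaddress
from urllib.parse import urlsplit


def is_blocked(url: str) -> bool:
    """Return True for file:// URLs, localhost, and loopback/private IP hosts."""
    parts = urlsplit(url)
    if parts.scheme == "file":
        return True
    host = parts.hostname
    if not host:
        return True  # assumption: a URL without a host cannot be monitored
    if host == "localhost":
        return True
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        return False  # an ordinary hostname; DNS is not checked in this sketch
    return addr.is_loopback or addr.is_private


for candidate in ("http://192.168.1.1/admin", "https://example.com"):
    print(candidate, is_blocked(candidate))
```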
# Run inside an async function
deck = WatchDeck()
success, msg = await deck.add("http://192.168.1.1/admin")
# success=False, msg="Cannot monitor internal/private network URLs"

Detects when a URL hasn't changed for N consecutive checks and suggests reducing frequency:
⚠ URL unchanged for 5 consecutive checks — consider reducing frequency
Flags pages that return garbage content (CAPTCHA pages, bot detection, repetitive filler):
⚠ Content appears degenerate (score=0.78) — possible CAPTCHA or anti-bot page
Data is stored in ~/.watchdeck/ by default:
~/.watchdeck/
├── tracker.db # diffgrab snapshots + change history
Custom location:
deck = WatchDeck(db_dir="/path/to/data")

flowchart TD
A["watchdeck add URL"] --> B["Initial snapshot\n(diffgrab + markgrab + snapgrab)"]
B --> C["watchdeck check"]
C --> D{"Content\nchanged?"}
D -->|"yes"| E["Compute diff\n+ screenshot\n+ guard checks"]
D -->|"no"| F["Check stale threshold"]
E --> G["📊 Report"]
F --> G
| Library | Role in watchdeck | PyPI |
|---|---|---|
| diffgrab | Page change detection + structured diffs | pip install diffgrab |
| markgrab | Content extraction for quality checks | pip install markgrab |
| snapgrab | Visual screenshot capture on change | pip install snapgrab |
| agent-action-policy | URL safety policy (block internal IPs) | pip install agent-action-policy |
| agent-loop-guard | Stale monitoring detection | pip install agent-loop-guard |
| llm-degen-guard | Garbage content detection | pip install llm-degen-guard |
See also: newswatch — news monitoring pipeline (feedkit + markgrab + embgrep + diffgrab)
Part of the QuartzUnit ecosystem — composable Python libraries for data collection, extraction, search, and AI agent safety.