docs: Documentation audit (#187): outdated, incorrect, and missing content

## Summary

A comprehensive audit of `docs/` compared against the current state of the codebase (primarily smartem-decisions, smartem-frontend, smartem-devtools). The docs are **structurally sound** with good organisation and useful ADRs, but they largely reflect the state of the codebase from mid-to-late 2025. Significant additions since then are undocumented, and several claims are now factually wrong.

Findings are grouped by severity.

---

## Incorrect (will mislead readers)

These will cause confusion or errors if followed.

| Doc | Claim | Reality |
|-----|-------|---------|
| `backend/api-server.md` | `python -m smartem_backend.simulate_msg` and `./tools/simulate-messages.sh` exist | Neither exists. The tool is `tools/external_message_simulator.py` |
| `backend/api-server.md` | `./scripts/k8s/dev-k8s.sh up` (implies it's in smartem-decisions) | Script lives in **smartem-devtools** `scripts/k8s/`, not in smartem-decisions |
| `backend/database.md` | Lists 3 migrations (baseline, indexes, prediction models) | There are **7 migrations** — missing: quality metrics, schema drift fixes, suggested acquisition index, agent log table |
| `backend/database.md` | Baseline migration ID `6e6302f1ccb6` | Actual baseline file is `2025_09_18_1042-001_create_core_smartem_schema_baseline.py` — ID likely differs |
| `operations/kubernetes.md` | `./scripts/k8s/dev-k8s.sh` (implies in smartem-decisions) | Lives in **smartem-devtools** `scripts/k8s/` |
| `development/generate-docs.md` | `tox -e docs` to generate documentation | No `tox.ini` exists in smartem-decisions. Docs generation has moved elsewhere |
| `development/e2e-simulation.md` | `./tests/e2e/run-e2e-test.sh` (implies in smartem-decisions) | Lives in **smartem-devtools** `tests/e2e/` |
| `glossary.md` | ARIA = "Automated Real-time Image Analysis" | ARIA is a central metadata repository for structural biology data from multiple facilities (INSTRUCT-ERIC) — not real-time image analysis |
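
Most of the rows above are script paths that no longer resolve in the repository the doc implies. A check along these lines could catch that class of error automatically. This is a minimal sketch, not an existing tool: the function names, the regular expression, and the `.py`/`.sh` extension list are all assumptions made for illustration.

```python
import re
from pathlib import Path

# Backticked tokens that look like relative script paths, e.g. `./tools/foo.sh up`
# or `tools/foo.py`. Restricting to .py/.sh is an assumption of this sketch.
PATH_PATTERN = re.compile(r"`(?:\./)?([\w.-]+(?:/[\w.-]+)+\.(?:py|sh))\b")


def referenced_paths(markdown: str) -> list[str]:
    """Return relative script paths mentioned in backticks, in order of appearance."""
    return PATH_PATTERN.findall(markdown)


def missing_paths(markdown: str, repo_root: Path) -> list[str]:
    """Return the referenced paths that do not exist under repo_root."""
    return [p for p in referenced_paths(markdown) if not (repo_root / p).exists()]
```

Running `missing_paths(doc_text, Path("path/to/checkout"))` once per repository checkout would show which repo, if any, actually contains each referenced script.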

---

## Severely Incomplete (missing major functionality)

These documents exist but cover only a fraction of current functionality.

| Doc | What's documented | What actually exists |
|-----|-------------------|---------------------|
| `backend/api-documentation.md` | ~8 API endpoints | **60+ endpoints**, including atlas tiles, quality predictions, agent sessions/logs, debug endpoints, the frontend SSE stream, ordered foilholes, and latent representations |
| `backend/database.md` | Implies ~5 core tables | **22 tables**, including quality prediction models/weights/parameters, quality metrics/statistics, agent log/connection/session/instruction/acknowledgement, atlas tiles, and overall quality predictions |
| `agent/cli-reference.md` | Documents parse/validate/watch | Missing the installed `smartem-agent` CLI entry point (via pyproject.toml `[project.scripts]`), and 5 other CLI entry points (`smartem.agent-cleanup`, `smartem.register-prediction-model`, `smartem.init-model-weight`, `smartem.random-model-predictions`, `smartem.random-prior-updates`) |
| `operations/environment-variables.md` | Lists DB and RabbitMQ vars | Missing `CORS_ALLOWED_ORIGINS`, `SMARTEM_BACKEND_CONFIG`, and `appconfig.yml` YAML-based configuration (database pool settings, `particle_select_batch_size`, etc.) |
| `development/tools.md` | Lists 4 tools | Missing `db_table_totals.py`, `check_schema_drift.py`, `generate_api_docs.py`, `makeiso.sh` |

---

## Stale / Outdated (not wrong per se, but no longer reflects current state)

| Doc | Issue |
|-----|-------|
| smartem-decisions `README.md` | Still calls itself "proof of concept" — the system is production-grade with 60+ endpoints, 22 DB tables, CI/CD, K8s deployment, ML pipeline |
| `backend/api-server.md` | Only covers basic API/consumer startup. Doesn't mention `appconfig.yml`, `frontend_stream.py` (frontend SSE), agent log submission, or the ML prediction pipeline |
| `operations/containerization.md` | Documents multi-stage `developer/build/runtime` stages — this matches `Dockerfile.dev`, but the production `Dockerfile` is simpler (installs from PyPI). Docs don't distinguish between the two |
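| `operations/containerization.md` | Image name `ghcr.io/diamondlightsource/smartem-decisions`. CI actually uses `ghcr.io/${{ github.repository }}`, which expands with the repository's original casing (`DiamondLightSource/smartem-decisions`), so the documented lowercase name is not what the workflow literally specifies |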
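
On the image-name casing: container registries expect lowercase repository names, while `github.repository` keeps the owner's capitalisation. Whether the CI workflow lowercases the value before tagging is not established in this audit. The sketch below only shows the normalisation that would reconcile the two names; the function name is hypothetical.

```python
def ghcr_image_name(github_repository: str) -> str:
    """Build a GHCR image name from an `owner/repo` string, lowercased for registry use."""
    return f"ghcr.io/{github_repository.lower()}"


# The mixed-case repository slug maps onto the name used in the docs.
print(ghcr_image_name("DiamondLightSource/smartem-decisions"))
```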

---

## Entirely Missing Documentation

No docs exist for these areas:

- **ML prediction pipeline** — quality prediction models, model weights, training tables, metrics aggregation
- **Frontend SSE stream** — `GET /frontend/events/stream` for real-time UI updates
- **Agent logging** — `POST /agent/{agent_id}/session/{session_id}/logs` and the `agentlog` table (added March 2026)
- **Debug endpoints** — agent session management, test instruction creation
- **Image serving** — atlas and gridsquare image endpoints (`GET /grids/{grid_uuid}/atlas_image`, `GET /gridsquares/{gridsquare_uuid}/gridsquare_image`)
- **appconfig.yml** — YAML-based application configuration (DB pool tuning, batch sizes, log file path)
- **smartem-frontend** — no documentation at all. Frontend is now a **monorepo** with npm workspaces (`apps/legacy`, `apps/smartem`, `packages/api`, `packages/ui`), React 19, MUI 7, TanStack Router, Tailwind CSS 4, Orval API client generation
- **smartem-devtools webui** — the developer dashboard (React 19, Vite, MDX) has no documentation
- **smartem-workspace upgrade path** — `uvx` caches tools, so `uvx smartem-workspace` keeps serving the previously-installed version after a new PyPI release. Users must run `uvx --refresh smartem-workspace ...`, `uvx smartem-workspace@latest ...`, or `uv cache clean smartem-workspace` to pick up a new version. Release notes and the smartem-workspace README should call this out whenever a new version ships.
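
For the frontend SSE stream, new docs would benefit from a small client-side example. The sketch below parses generic `text/event-stream` framing only. The event names and payloads actually emitted by `GET /frontend/events/stream` are not covered by this audit, so the sample lines are invented, and `parse_sse` is a hypothetical helper rather than part of the codebase.

```python
from collections.abc import Iterable, Iterator


def parse_sse(lines: Iterable[str]) -> Iterator[dict[str, str]]:
    """Yield one dict per server-sent event from an iterable of text lines."""
    fields: dict[str, str] = {}
    data: list[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line:
            # A blank line terminates the current event; events without data are dropped.
            if data:
                yield {"event": fields.get("event", "message"), "data": "\n".join(data)}
            fields, data = {}, []
            continue
        if line.startswith(":"):
            continue  # comment or keep-alive line
        name, _, value = line.partition(":")
        value = value.removeprefix(" ")  # a single leading space is not part of the value
        if name == "data":
            data.append(value)  # multiple data lines are joined with newlines
        else:
            fields[name] = value


# Invented sample stream: one named event, a keep-alive comment, one multi-line event.
sample = ["event: update", 'data: {"x": 1}', "", ": ping", "data: a", "data: b", ""]
for event in parse_sse(sample):
    print(event)
```

The same generator can be fed the line iterator of any streaming HTTP client, so the parsing logic stays independent of the chosen library.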

---

## Suggested Priority

1. **Fix incorrect claims** (wrong script paths, phantom modules, wrong ARIA definition) — these actively mislead
2. **Update database.md** — migration list and table inventory are significantly behind
3. **Update API documentation** — endpoint coverage is ~13% of reality
4. **Add frontend docs** — entire subsystem undocumented
5. **Fill in missing topics** (ML pipeline, appconfig, image serving, agent logging)
6. **Refresh stale content** (README "proof of concept", containerization docs)
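
The ~13% figure in item 3 is simply 8 documented endpoints out of 60. To keep that number from drifting, the endpoint count could be derived from the API schema rather than counted by hand. The sketch below assumes an OpenAPI document is available as a parsed dict, which this audit does not confirm; both function names are hypothetical.

```python
HTTP_METHODS = {"get", "put", "post", "delete", "patch", "options", "head", "trace"}


def count_operations(openapi: dict) -> int:
    """Count (path, method) pairs in an OpenAPI document's `paths` object."""
    return sum(
        1
        for path_item in openapi.get("paths", {}).values()
        for key in path_item
        if key.lower() in HTTP_METHODS  # skip non-method keys such as "parameters"
    )


def coverage_percent(documented: int, actual: int) -> float:
    """Share of endpoints that are documented, as a percentage rounded to one decimal."""
    return round(100 * documented / actual, 1)


print(coverage_percent(8, 60))  # 13.3
```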

---

## Recycle e2e test notes (from #152)

PR #152 landed a comprehensive `tests/e2e/README.md` in smartem-devtools (+346 lines covering prerequisites, env config, recordings, automated and manual execution, replay speeds, multi-microscope setup). That content is now the freshest source of truth on how to run the e2e suite and should be recycled into the wider docs/skills surface so it does not drift:

- [ ] **Update `repos/DiamondLightSource/smartem-decisions/docs/development/e2e-simulation.md`** — script path is wrong (script lives in smartem-devtools `tests/e2e/`, not smartem-decisions); align procedures with the new README instead of duplicating them, and link out to it as the authoritative source
- [ ] **Update the Claude skill(s) related to e2e testing** — fold in the latest prerequisites (k8s NodePorts, `.env.local-test-run`, recordings table, `uv sync --all-extras`) and the single-/multi-microscope invocation forms so Claude stops referencing stale paths/procedures
- [ ] **Cross-link from smartem-devtools top-level `README` / docs index** so the e2e README is discoverable without knowing to look under `tests/e2e/`
- [ ] **Decide on single source of truth** — either keep the long-form guide in `tests/e2e/README.md` and have `docs/development/e2e-simulation.md` be a thin pointer, or move the content into `docs/` and keep `tests/e2e/README.md` as the thin pointer. Avoid maintaining both in parallel.

