Skip to content

docs: design for decoupling snapshot generation from publishing#107

Merged
bdchatham merged 3 commits intomainfrom
docs/design-snapshot-publish-config
Apr 20, 2026
Merged

docs: design for decoupling snapshot generation from publishing#107
bdchatham merged 3 commits intomainfrom
docs/design-snapshot-publish-config

Conversation

@bdchatham
Copy link
Copy Markdown
Collaborator

@bdchatham bdchatham commented Apr 20, 2026

Summary

  • Design doc proposing a nested TendermintSnapshotGenerationConfig under SnapshotGenerationConfig so we can grow sibling modes (seidb, app-layer, …) without renaming fields.
  • Adds an optional publish: {} sub-struct whose presence enables S3 upload — absence means "generate + retain locally only." Closes the gap where a full-node can't produce N on-disk snapshots without also pushing to S3.
  • Rejects enablePersistence / enabled: bool in favor of present/absent struct to match existing CRD precedent (StateSyncSource struct{} at api/v1alpha1/common_types.go:107, and the snapshot: { stateSync: {} | s3: {...} } shape already used for bootstrap sources).
  • No launched product → no migration / backward-compat work; sample manifests and tests are updated alongside the type change in the implementation PR.

Key shape

spec:
  fullNode:
    snapshotGeneration:
      tendermint:
        keepRecent: 5
        publish: {}          # omit to keep snapshots local-only

Open questions deferred to LLD

  1. Sidecar signal mechanism — explicit flag in rendered config vs. keying on bucket-var absence.
  2. Where cross-field validation (publish set ⇒ keepRecent >= 2) lives — Condition on reconcile vs. OpenAPI oneOf vs. new webhook.
  3. Operational note for any already-running dev archives that previously uploaded and would stop once publish is omitted.

Test plan

  • Review the shape and naming (publish: {} vs. enabled: bool) — biggest design decision in the doc
  • Confirm the outer wrapper (SnapshotGenerationConfig { Tendermint: ... }) is worth the nesting level vs. collapsing to tendermintSnapshotGeneration: directly on the spec
  • Validate the assumption that the sidecar is the component doing the S3 upload (§2 of the design)
  • Decide which of the three open questions need LLD-level answers before implementation vs. can be decided in the implementation PR

🤖 Generated with Claude Code

bdchatham and others added 3 commits April 20, 2026 12:18
Introduces a nested TendermintSnapshotGenerationConfig under
SnapshotGenerationConfig and a present/absent publish struct so operators
can generate local Tendermint snapshots without uploading to S3. Matches
the existing StateSyncSource / SnapshotSource presence-as-toggle pattern
in common_types.go. No launched product, so no backward-compat concern.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Replace implicit sidecar signal with a dedicated configure-snapshot-publish
  plan task, mirroring ConfigureStateSync. Publishing becomes self-describing
  in .status.plan rather than keyed on env-var absence.
- Pin cross-field validation (publish set -> keepRecent >= 2, empty
  snapshotGeneration) to the existing fullNodePlanner.Validate /
  archiveNodePlanner.Validate hooks. No admission webhook. Note why
  config-validate is the wrong layer for this rule.
- Update rollout steps and related-work pointers accordingly.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The sidecar task (TaskTypeSnapshotUpload / SnapshotUploadTask) already
exists in seictl v0.0.30. The controller-side wiring for it was deleted
in #89 along with the monitor-task subsystem. This design re-introduces
that wiring as a plan task rather than inventing a new sidecar task.

Updates:
- Rewrite sec. 2 to reference the existing seictl types, not a new task
- Adjust rollout step to describe re-wiring rather than defining
- Related work now cites the seictl task and the deletion commit

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@bdchatham bdchatham merged commit 4804001 into main Apr 20, 2026
2 checks passed
bdchatham added a commit that referenced this pull request Apr 21, 2026
* feat: decouple snapshot generation from publishing

Refactors SnapshotGenerationConfig to nest a TendermintSnapshotGenerationConfig
sub-struct (extensible for future snapshot modes) with an optional Publish
struct whose presence toggles S3 upload. Absence of Publish means snapshots
are kept on disk only.

Re-wires the existing seictl TaskTypeSnapshotUpload sidecar task as a
fire-and-forget plan task, inserted into the init progression immediately
before mark-ready only when Publish is set. Restores controller-side
plumbing that was removed in #89 with the monitor-task subsystem, now as
a plan task rather than a monitor task.

Cross-field validation lives in each per-mode planner's Validate(node) hook,
wrapping errors with its own mode prefix (fullNode: / archive:) — no shared
helper takes a mode parameter. Rules: empty snapshotGeneration is rejected;
keepRecent must be >= 2 when Publish is set (upload algorithm requires the
second-to-latest snapshot).

Implements the design in docs/design-snapshot-publish-config.md (merged
via #107). No backward-compat concerns: product has not launched.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore: address PR review feedback

- Inline the snapshot-upload gate into buildBasePlan and
  buildPostBootstrapProgression; remove the maybeInsertSnapshotUpload
  helper (called in only two places, "maybe" prefix was unclear).
- Trim validateSnapshotGeneration doc comment to a single line — the
  rules are self-evident from the code; only the caller-wrap convention
  is worth calling out.

Tracking issue for the sei-config rename suggestion:
sei-protocol/sei-config#6

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant