
Speed up historical MVCC reads #3266

Open
Kbhat1 wants to merge 16 commits into main from kartik/perf-mvcc

Conversation

Contributor

@Kbhat1 Kbhat1 commented Apr 16, 2026

Describe your changes and provide context

  • Faster historical reads: flip the PebbleDB version encoding to descending, so the newest version sorts first
  • IMPORTANT: open-time compatibility gate: pebbledb stamps an s/_mvcc_descending sentinel on fresh DBs and refuses to open legacy ascending-version databases
  • Adds an MVCC ascending flag
  • Read session for traces: one Pebble snapshot plus reusable iterators per request

Roughly another ~2x speedup on debug_trace

Testing performed to validate your change

  • Verified end-to-end on a node

MVCC PebbleDB changes:
- Invert version ordering so newest versions sort first per key, making
  historical lookups a direct SeekGE instead of an unbounded reverse seek.
- Introduce a per-request historical read session that holds one Pebble
  snapshot plus reusable iterators per store prefix, eliminating the
  per-read iterator create/destroy overhead.
- Add a request-scoped (module, key, version) read cache so repeated
  hot-key reads inside a single trace skip Pebble entirely.
- Add a sentinel-versioned latest-value fast path (math.MaxInt64 per
  key) usable as a bloom-filter accelerated db.Get when tracing at or
  past the latest committed height.
- Harden batch writes: sort MVCC batches before commit, guard against
  empty store keys/keys to keep state-sync imports safe, and commit the
  latest-version metadata key separately so it never violates Pebble's
  strictly-increasing batch invariant.
- Extend iterator navigation to drive MVCC seeks by logical key under
  the new descending-version encoding, with forward/reverse helpers and
  defensive skips for the sentinel pointer.

Tracing hooks:
- Add ReadTraceCollector / ReadTraceEvent / TraceableStateStore types in
  db_engine/types so callers can optionally observe low-level reads. The
  MVCC Database implements TraceableStateStore as a no-op when no
  collector is attached, so this change is zero-cost in the default
  configuration; the actual consumers live in the follow-up profiling
  PR.

evmrpc:
- Add a block-level replay cache keyed by block hash + tx index so a
  debug trace request that replays txs [0..N) against the historical
  state reuses the post-tx context from earlier calls in the same block
  instead of replaying from scratch.

No migration is required: the new on-disk version encoding is applied
on fresh state-synced data, and the latest-value sentinel is additive.

Made-with: Cursor

github-actions bot commented Apr 16, 2026

The latest Buf updates on your PR. Results from workflow Buf / buf (pull_request).

Build     Format    Lint      Breaking  Updated (UTC)
✅ passed  ✅ passed  ✅ passed  ✅ passed  Apr 17, 2026, 8:08 PM


codecov bot commented Apr 16, 2026

Codecov Report

❌ Patch coverage is 38.68195% with 428 lines in your changes missing coverage. Please review.
✅ Project coverage is 59.23%. Comparing base (b57aeb8) to head (39ae166).
⚠️ Report is 1 commit behind head on main.

Files with missing lines | Patch % | Lines
...i-db/db_engine/pebbledb/mvcc/iterator_ascending.go 0.00% 180 Missing ⚠️
sei-db/db_engine/pebbledb/mvcc/db_ascending.go 18.23% 122 Missing and 8 partials ⚠️
sei-db/db_engine/pebbledb/mvcc/db.go 68.15% 38 Missing and 19 partials ⚠️
sei-db/db_engine/pebbledb/mvcc/batch.go 53.12% 26 Missing and 4 partials ⚠️
sei-db/db_engine/pebbledb/mvcc/iterator.go 69.76% 21 Missing and 5 partials ⚠️
sei-db/db_engine/pebbledb/mvcc/comparator.go 83.33% 3 Missing and 2 partials ⚠️
Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #3266      +/-   ##
==========================================
- Coverage   59.30%   59.23%   -0.08%     
==========================================
  Files        2071     2073       +2     
  Lines      169811   170250     +439     
==========================================
+ Hits       100713   100843     +130     
- Misses      60327    60623     +296     
- Partials     8771     8784      +13     
Flag | Coverage Δ
sei-chain-pr 48.64% <38.68%> (?)
sei-db 70.41% <ø> (ø)

Flags with carried forward coverage won't be shown.

Files with missing lines | Coverage Δ
sei-db/db_engine/pebbledb/mvcc/comparator.go 68.42% <83.33%> (+4.61%) ⬆️
sei-db/db_engine/pebbledb/mvcc/iterator.go 66.36% <69.76%> (+10.34%) ⬆️
sei-db/db_engine/pebbledb/mvcc/batch.go 52.63% <53.12%> (+10.77%) ⬆️
sei-db/db_engine/pebbledb/mvcc/db.go 66.08% <68.15%> (-0.13%) ⬇️
sei-db/db_engine/pebbledb/mvcc/db_ascending.go 18.23% <18.23%> (ø)
...i-db/db_engine/pebbledb/mvcc/iterator_ascending.go 0.00% <0.00%> (ø)

... and 1 file with indirect coverage changes


@Kbhat1 Kbhat1 changed the title perf: speed up historical MVCC reads and trace replays Speed up historical MVCC reads and trace replays Apr 17, 2026
Kbhat1 added 2 commits April 16, 2026 20:24
ReplayTransactionTillIndex was storing sdk.Context values whose MultiStore
field still pointed at the live replay state. The -1 checkpoint was cached
before replay started but then advanced in place as transactions were
applied, and later cache hits reused the stored checkpoint directly.

Store a branched CacheMultiStore snapshot when caching a checkpoint, and
branch again when resuming from one, so each replay starts from an
immutable saved state instead of mutating the cached checkpoint itself.

Made-with: Cursor
sei-db prune: the existing NextPrefix guard was written for ascending
version order. Under the new descending encoding the newest version of
a key is seen first, so whenever the newest version was above the prune
height the iterator skipped to the next logical key and every older
version (including ones below the prune height) was leaked on disk.
Rewrite the loop around per-logical-key state: on entering a new key
with newest > prune, seek straight to the first version <= prune, then
keep one version below the prune height (when KeepLastVersion=true) and
delete the rest. The fast-path seek preserves the old optimization of
skipping above-prune versions.

evmrpc replay cache: replayStateCache grew one entry per (block, tx)
forever with no eviction, each holding a branched CacheMultiStore. Wrap
it in an expirable LRU bounded by block count (32) and TTL (10m); keep
per-block mutexes so inner-map updates stay safe under concurrent
traces.

Regression tests:
- sei-db/.../prune_test.go exercises the descending bug directly by
  reading raw pebble entries after Prune, covering keepLast true/false,
  mixed above/below-prune keys, and sentinel preservation. They fail on
  the pre-fix code.
- evmrpc/simulate_cache_test.go covers get/put, LRU eviction past the
  cap, and concurrent access under -race.
Kbhat1 and others added 10 commits April 17, 2026 00:08
- Change replayStateCacheMu to *sync.Mutex so Backend copies stay safe
  under copylocks (Backend has value-receiver methods and tracers.go
  dereferences the pointer into a local).
- Annotate two int64->uint64 conversions on non-negative block
  heights with nolint:gosec, matching existing conventions in this
  package.
- Wrap recordReadTrace defer calls in closures so time.Since is
  evaluated at defer fire time, not at defer registration.
- Drop redundant Database selector from tracedDatabase iterator calls.
Drops the reserved math.MaxInt64 latest-pointer entry and returns the
historical read path to the inverted-order SeekGE lookup plus the
request-scoped read cache and reusable iterator session.

Benchmarks showed the sentinel pointer made traces slower, not faster:
the extra db.Get per read plus the write amplification from emitting
a pointer entry per set outweighed the bloom-filter-accelerated hit.
This keeps the other MVCC wins while removing the sentinel.

Also drops the sentinel-skip branches in Prune/RawIterate/iterator and
the corresponding sentinel-specific prune test.
- Drop unused batchOp.order field; sortBatchOps sorts by key only.
- Drop unused NewReadTraceEvent helper and its time import.
- Fix misleading Reverse:true trace events in getMVCCSlice and
  getMVCCSliceWithSession; these paths are forward SeekGE + First.
- Drop stale "reverse iteration is NOT supported" comment on iterator.
- Rename shadowed builtin cap to size in replay-cache eviction test.
- Move latestVersionKey Set into the pebble batch so data and
  version metadata commit atomically, restoring the invariant that
  crash recovery observes a consistent version pointer.
- Extract decodeMVCCEntry to share the post-seek validate/decode/clone
  logic between getMVCCSlice and getMVCCSliceWithSession.
- Drop the fine-grained per-step trace events on the Get hot path and
  route the remaining top-level timing through traceGetMVCCSlice, which
  short-circuits when no collector is attached (no slices.Clone in the
  default zero-collector path).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Drops the ReadTrace types, per-op trace hooks, tracedDatabase wrapper,
and historicalReadSession read-cache. This branch is scoped to MVCC
perf changes; the tracing plumbing belongs with the debug trace-profile
endpoint PR and can be re-introduced there once it lands.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Reject legacy ascending-version databases on open. On a populated DB
lacking the s/_mvcc_descending marker, return an error that points
operators at state sync; on an empty DB, write the marker so future
opens fast-path. Protects against silent version mis-decoding after
an in-place upgrade to the new encoding.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Archive nodes that cannot state-sync need to keep reading and writing
their existing ascending-version MVCC databases. Rather than forcing a
migration, detect the on-disk encoding at open time and dispatch to the
matching code path.

- OpenDB sets Database.descending from the s/_mvcc_descending sentinel:
  present or empty DB -> descending (fast path, marker written on
  fresh DBs); marker absent on a populated DB -> ascending legacy mode,
  no error, no marker write.
- Public Get/Has/Iterator/ReverseIterator/Prune dispatch to *Descending
  or *Ascending implementations. Batch threads the mode through writes
  (NewBatchWithMode) so ApplyChangeset / Import / DeleteKeysAtVersion
  persist in the DB's native encoding.
- File layout makes the split obvious for review:
  - db_descending.go, iterator_descending.go -- perf fast path (this
    branch's rewrite)
  - db_ascending.go, iterator_ascending.go   -- verbatim port of main's
    legacy implementation, only adjusted to use MVCCEncodeAscending and
    the ascendingIterator type
  - comparator.go                            -- both encode/decode
    variants plus a parameterized MVCCEncode(key, v, descending) for
    shared call sites
- Tests cover all three open paths (fresh-writes-marker, legacy-opens-
  ascending, marked-roundtrip) including a write+read+reopen cycle on a
  seeded legacy DB to confirm the ascending path is end-to-end correct.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
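The three open paths can be sketched as a small decision function. The sentinel name mirrors the s/_mvcc_descending marker described above, but the map-backed storage here is a hypothetical stand-in for pebble:

```go
package main

import "fmt"

// descendingSentinel marks databases written with the new encoding.
const descendingSentinel = "s/_mvcc_descending"

// detectMode sketches the three open paths the commit describes:
// marker present -> descending; empty DB -> descending, stamping the
// marker; populated DB without marker -> legacy ascending, no write.
func detectMode(db map[string][]byte) (descending bool) {
	if _, ok := db[descendingSentinel]; ok {
		return true // already marked: fast path
	}
	if len(db) == 0 {
		db[descendingSentinel] = []byte{1} // stamp fresh DBs
		return true
	}
	return false // populated legacy DB: ascending mode, no marker write
}

func main() {
	fresh := map[string][]byte{}
	legacy := map[string][]byte{"s/k1": []byte("v")}
	marked := map[string][]byte{descendingSentinel: {1}}
	fmt.Println(detectMode(fresh), detectMode(legacy), detectMode(marked)) // true false true
}
```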
Drop three ornamental wrappers flagged by review:

- Batch.appendSet / Batch.appendDelete / RawBatch.appendSet were each
  five-line clone-and-append helpers called 1-3 times; inline at the
  call sites.
- NewBatch / NewRawBatch were zero-argument wrappers that always
  forwarded with descending=true to NewBatchWithMode. Drop the wrappers,
  rename NewBatchWithMode -> NewBatch and NewRawBatchWithMode ->
  NewRawBatch so the mode is required at the one kind of call site that
  exists.
- snapshotReplayState was a one-liner ctx.WithMultiStore(ctx.MultiStore()
  .CacheMultiStore()) wrapper with two call sites. Inline it.

No behavior change.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The block-level replay-state LRU only paid off when trace requests
clustered on the same block in quick succession. evmrpc handles one
call at a time and can't observe whether callers are clustering;
workload-aware prefix reuse belongs above this layer (in a block
explorer, indexer, or tracing harness that actually knows its access
pattern). Baking it into the RPC added stateful fields on Backend, an
LRU size / TTL we picked without data, and a double-lock put path that
existed only to serve a workload the RPC can't verify.

Drop the cache entirely:
- blockReplayState, replayStateCache, replayStateCacheMu,
  replayStateCacheBlocks, replayStateCacheTTL from evmrpc/simulate.go
- getReplayState, putReplayState methods
- the cache check + two putReplayState calls in
  ReplayTransactionTillIndex; unconditionally replay from tx 0
- evmrpc/simulate_cache_test.go (no longer meaningful)
- the golang-lru/v2/expirable import (still used elsewhere in evmrpc)

Per-call perf is unaffected: the descending-version MVCC encoding
still makes each replayed read cheap, which is the right layer for a
per-call optimization.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@Kbhat1 Kbhat1 changed the title Speed up historical MVCC reads and trace replays Speed up historical MVCC reads Apr 17, 2026
The three-way file split (db.go / db_descending.go / db_ascending.go)
implied two equal modes when the intent is one canonical path plus a
backstop. Fold the descending implementation back into db.go so the
canonical Database lives in one place, and keep db_ascending.go and
iterator_ascending.go as the clearly-labeled legacy carveouts. Rename
iterator_descending.go -> iterator.go for the same reason: it is the
iterator; iterator_ascending.go is the compat variant.

File layout after:
- db.go            - Database, OpenDB, dispatchers, writes, descending
                     mode implementation, shared helpers, metrics
- db_ascending.go  - legacy ascending-mode carveout for pre-migration DBs
- iterator.go      - descending iterator (the iterator)
- iterator_ascending.go - legacy iterator carveout

No behavior change.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@Kbhat1 Kbhat1 requested a review from yzang2019 April 17, 2026 18:28
Kbhat1 and others added 2 commits April 17, 2026 15:27
Two identical code paths were waiting for a single kernel:

- retrieveLatestVersion and retrieveEarliestVersion differed only in
  the metadata key name. Fold both into retrieveVersionKey(db, key).
- Batch.Write and RawBatch.Write share the otel metrics defer, the
  pebble batch open/close, sortBatchOps, and the ops-application loop.
  The only delta is the latestVersionKey stamp in Batch.Write. Extract
  writeBatchOps(storage, ops, beforeCommit) and pass the stamp as a
  hook from Batch.Write; RawBatch.Write passes nil.

No behavior change.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Leftover from the earlier replay-cache revert — the one-line whitespace
delta was the only difference from main in this file. Match main exactly
so simulate.go drops out of the PR's diff stat.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
