fix: use cumulative prefix sets for incremental trie state root #445
Conversation
Force-pushed from b74a166 to baa5064
akundaz left a comment
The benchmark and tests run against replicated state root calculations instead of the actual production code path. Can you extract the code that runs this calculation and update the benchmark and tests to call it directly? Otherwise we can't really draw conclusions about real-world performance from these results. And if the code drifts, we should be able to catch it with our tests and the benchmark.
Force-pushed from c866328 to 56da125
I would also like to see better encapsulation of the state root computation. Right now it's spread across state_root.rs and payload.rs, but we should try to get it all into state_root.rs (the unit test should go there as well). I would suggest that instead of …
This change was just illustrative, to prove that the issue is reproducible and that the new code fixes it. I can remove it if you're satisfied this fix works with reverts.
That's fine! Just make sure to have a test with reverts since it triggers the bug |
akundaz left a comment
Just adding a few comments. I think once they are resolved and CI passes, this will probably be good.
The incremental trie cache produces wrong state roots when a storage slot modified in flashblock N reverts in flashblock N+1. The reverted slot disappears from the cumulative HashedPostState, so its nibble path is missing from the prefix set. The trie walker skips the subtree and uses the stale cached hash from the previous flashblock's branch node.

Fix: track cumulative TriePrefixSetsMut across all flashblocks. Before building TrieInput, extend the current prefix sets with all prior flashblocks' prefix sets. This forces the walker to re-visit every path modified in earlier flashblocks.

The fix is also ~30% faster than the unfixed incremental path because descending into cached in-memory nodes is faster than DB cursor seeks for skipped branches.

Benchmarks (10 flashblocks, 50k accounts):
- Without cache: ~2,200ms (baseline)
- Incremental (no fix): ~844ms (2.6x faster, incorrect on reverts)
- Incremental (with fix): ~650ms (3.4x faster, correct)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
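The bug and fix described above can be sketched with simplified stand-ins for reth's prefix-set types (the `PrefixSet` struct and string paths here are hypothetical placeholders, not the real `TriePrefixSetsMut` API): a slot reverted in flashblock 2 is absent from that flashblock's own prefix set, but extending with the prior flashblock's set restores its path so the walker re-visits it.

```rust
use std::collections::BTreeSet;

// Hypothetical, simplified stand-in for reth's TriePrefixSetsMut:
// each flashblock records the nibble paths it touched.
#[derive(Default, Clone)]
struct PrefixSet(BTreeSet<String>);

impl PrefixSet {
    fn insert(&mut self, path: &str) {
        self.0.insert(path.to_string());
    }
    // Merge another flashblock's paths into this cumulative set.
    fn extend(&mut self, other: &PrefixSet) {
        self.0.extend(other.0.iter().cloned());
    }
    fn contains(&self, path: &str) -> bool {
        self.0.contains(path)
    }
}

fn main() {
    // Flashblock 1 modifies slot path "0xab"; flashblock 2 reverts it,
    // so the slot no longer appears in flashblock 2's own prefix set.
    let mut fb1 = PrefixSet::default();
    fb1.insert("0xab");
    let fb2 = PrefixSet::default(); // revert: slot absent

    // Buggy behavior: walking with only fb2's set skips "0xab" and
    // reuses the stale cached branch hash from flashblock 1.
    assert!(!fb2.contains("0xab"));

    // Fix: extend the current set with all prior flashblocks' sets
    // before building the trie input, forcing a re-visit of "0xab".
    let mut cumulative = fb2.clone();
    cumulative.extend(&fb1);
    assert!(cumulative.contains("0xab"));
}
```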
Extract the incremental state root calculation logic from build_flashblock_payload_inner into builder::state_root::compute_state_root. Tests and benchmarks now call the same function as production code, ensuring they exercise the actual code path rather than replicated logic that could drift. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replace free function and manual state threading with a StateRootCalculator struct that owns cached trie updates and cumulative prefix sets internally. FlashblocksState always holds a calculator; incremental vs full mode is configured at construction. Move tests into state_root module and add e2e test. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
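A minimal sketch of the calculator shape described in this commit, using placeholder types (the field names and `String` stand-ins here are illustrative, not the real reth `TrieUpdates`/`TriePrefixSetsMut` types): the calculator owns its cached trie updates and cumulative prefix sets, and the incremental-vs-full choice is fixed at construction, so callers no longer thread that state by hand.

```rust
use std::collections::BTreeSet;

// Hypothetical sketch of a StateRootCalculator that encapsulates
// the state previously threaded manually through payload building.
#[derive(Default)]
struct StateRootCalculator {
    incremental: bool,
    cached_updates: Option<String>,        // stand-in for cached trie updates
    cumulative_prefixes: BTreeSet<String>, // stand-in for cumulative prefix sets
}

impl StateRootCalculator {
    fn new(incremental: bool) -> Self {
        Self { incremental, ..Default::default() }
    }

    // Callers pass only the paths changed in the current flashblock;
    // the calculator accumulates prefix sets across flashblocks itself.
    fn compute(&mut self, changed_paths: &[&str]) -> String {
        if !self.incremental {
            return format!("full-root:{}", changed_paths.len());
        }
        self.cumulative_prefixes
            .extend(changed_paths.iter().map(|p| p.to_string()));
        let root = format!("incremental-root:{}", self.cumulative_prefixes.len());
        self.cached_updates = Some(root.clone());
        root
    }
}

fn main() {
    let mut calc = StateRootCalculator::new(true);
    calc.compute(&["0xab"]);
    // A revert in the next flashblock contributes no new paths,
    // but the cumulative set still covers "0xab".
    calc.compute(&[]);
    assert!(calc.cumulative_prefixes.contains("0xab"));
}
```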
- Reorder `pub use` statements in builder/mod.rs to satisfy nightly rustfmt (unblocks the failed lint job).
- Update the test_incremental_state_root expected count from 15 to 18: the flashblocks listener captures the base/fallback flashblock (index 0) in addition to the 5 incremental flashblocks built per block, so 3 blocks yield 18 messages.
- Tighten the StateRootCalculator usage in payload.rs by binding the default calculator to a `let` so the `&mut` borrow lives long enough, drop the unused `is_incremental` helper, and consolidate the two `simulate_flashblocks` test helpers behind a `populate_trie` flag.
- Switch the bench to import StateRootCalculator from the public re-export.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
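The new expected count follows directly from the message accounting in the commit message: each block yields one base/fallback flashblock plus five incremental ones.

```rust
fn main() {
    // Per the commit message: 3 blocks, each emitting 1 base/fallback
    // flashblock (index 0) plus 5 incremental flashblocks.
    let blocks = 3;
    let per_block = 1 + 5;
    assert_eq!(blocks * per_block, 18);
}
```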
…ompute

Hoist the `!self.incremental` case out of compute() so the rest of the function can assume incremental mode. This drops both `if self.incremental` checks (one inside the else arm that gated seeding the cumulative prefix sets, one after the match that gated writing back to `self`) and removes the three-tuple return from the match.

The two paths now read as: non-incremental returns immediately; incremental builds cumulative prefix sets, runs the cached or full computation, and writes the cache unconditionally.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
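The control-flow shape after this refactor can be sketched as a free function with placeholder logic (the signature and return strings here are illustrative, not the real compute() API): the non-incremental case returns early, and everything after it assumes incremental mode.

```rust
// Hypothetical sketch of the refactored control flow: the
// non-incremental branch is hoisted to an early return.
fn compute(incremental: bool, prior_paths: &[&str], current_paths: &[&str]) -> String {
    if !incremental {
        // Non-incremental: compute the full root and return immediately;
        // no cumulative prefix sets, no cache write-back.
        return format!("full-root:{}", current_paths.len());
    }
    // Incremental: build cumulative prefix sets from prior flashblocks,
    // run the cached or full computation, and write the cache
    // unconditionally (no trailing `if self.incremental` guard).
    let total = prior_paths.len() + current_paths.len();
    format!("incremental-root:{}", total)
}

fn main() {
    assert_eq!(compute(false, &["0xab"], &["0xcd"]), "full-root:1");
    assert_eq!(compute(true, &["0xab"], &["0xcd"]), "incremental-root:2");
}
```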
Force-pushed from 8f7d3e9 to 735e2cb