Add debug_traceTransactionProfile endpoint #3267

Open
Kbhat1 wants to merge 10 commits into main from kartik/trace-profile-endpoint

@Kbhat1 Kbhat1 commented Apr 16, 2026

Describe your changes and provide context

  • Adds debug_traceTransactionProfile, a trace endpoint that returns the normal tx trace plus a granular timing/profile breakdown
  • Captures store-level access data during tracing, including module stats, iterator activity, and low-level read events when supported
  • Cleans up trace-scoped resources after each request to avoid leaking snapshots or iterators.

Testing performed to validate your change

  • Verified end to end on a node

Adds a granular tracing layer on top of the existing debug_trace path so
operators can see where time is spent during a historical trace.

New RPC endpoint:
- debug_traceTransactionProfile returns the normal trace result plus a
  profile payload that breaks the request into high-level phases
  (block lookup, historical-state replay, tx prep, execution, tracer
  result), and dumps per-module store access stats, iterator stats, and
  any low-level DB read events collected during the trace.
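
For orientation, here is a minimal sketch of how a client might build the JSON-RPC request for the new endpoint. It assumes the method takes the transaction hash as its first parameter, mirroring debug_traceTransaction; the PR text does not spell out the parameter list, and `rpcRequest` and `buildProfileRequest` are hypothetical names used only for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// rpcRequest is a standard JSON-RPC 2.0 request envelope.
type rpcRequest struct {
	JSONRPC string        `json:"jsonrpc"`
	ID      int           `json:"id"`
	Method  string        `json:"method"`
	Params  []interface{} `json:"params"`
}

// buildProfileRequest serializes a request for debug_traceTransactionProfile.
// Assumption: the tx hash is the only required parameter.
func buildProfileRequest(id int, txHash string) ([]byte, error) {
	return json.Marshal(rpcRequest{
		JSONRPC: "2.0",
		ID:      id,
		Method:  "debug_traceTransactionProfile",
		Params:  []interface{}{txHash},
	})
}

func main() {
	body, err := buildProfileRequest(1, "0xabc")
	if err != nil {
		panic(err)
	}
	// The body can be POSTed to the node's EVM RPC endpoint.
	fmt.Println(string(body))
}
```

The response would carry the normal trace result plus the profile payload described above; its exact field names are not shown in this conversation, so the sketch stops at the request.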

Store tracer plumbing:
- Introduce a StoreTracer on sdk.Context that records per-module
  accesses, iterator lifecycle events, and low-level ReadTraceEvents.
- Thread the tracer through gaskv, cachekv, and tracekv so every store
  operation is timed; cachekv/tracekv forward the collector to their
  parents so the tracer reaches the state store underneath.
- Wire storev2/state.Store to optionally swap its underlying StateStore
  for a traceable wrapper via SetReadTraceCollector, and add forwarding
  WithReadTraceCollector shims on the composite/cosmos/evm SS-layer
  stores so a collector injected from the top reaches the MVCC layer
  when it implements TraceableStateStore.
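
The forwarding described in the last bullet relies on an optional-interface check. The sketch below illustrates that pattern with simplified stand-in types; the interface shapes and the `withCollector` helper are assumptions for illustration, not the signatures used in sei-db.

```go
package main

import "fmt"

// ReadTraceCollector is a simplified stand-in for the collector interface.
type ReadTraceCollector interface {
	Record(op string)
}

// StateStore is a minimal stand-in for the SS-layer store interface.
type StateStore interface {
	Get(key string) string
}

// TraceableStateStore is implemented only by stores that can emit
// low-level read events.
type TraceableStateStore interface {
	StateStore
	WithReadTraceCollector(c ReadTraceCollector) StateStore
}

// plainStore does not support tracing.
type plainStore struct{}

func (plainStore) Get(key string) string { return "v:" + key }

// tracedStore wraps a store and reports every read to the collector.
type tracedStore struct {
	inner StateStore
	c     ReadTraceCollector
}

func (t tracedStore) Get(key string) string {
	t.c.Record("get")
	return t.inner.Get(key)
}

// mvccStore opts in to tracing by implementing TraceableStateStore.
type mvccStore struct{ plainStore }

func (m mvccStore) WithReadTraceCollector(c ReadTraceCollector) StateStore {
	return tracedStore{inner: m, c: c}
}

// withCollector returns a traced wrapper when the store supports it and
// otherwise returns the store unchanged.
func withCollector(s StateStore, c ReadTraceCollector) StateStore {
	if ts, ok := s.(TraceableStateStore); ok {
		return ts.WithReadTraceCollector(c)
	}
	return s
}

type countingCollector struct{ n int }

func (c *countingCollector) Record(string) { c.n++ }

func main() {
	c := &countingCollector{}
	withCollector(plainStore{}, c).Get("a") // assertion fails, nothing recorded
	withCollector(mvccStore{}, c).Get("a")  // assertion succeeds, one event
	fmt.Println(c.n)
}
```

When no backend implements the traceable interface, the type assertion fails at the leaf and the collector never receives an event.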

Lifecycle:
- TraceTransactionProfile (and TraceStateAccess) defer a cleanup that
  closes any trace-scoped resources (Pebble snapshots, reusable
  iterators) attached during the request.
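
A rough sketch of the deferred-cleanup pattern this bullet describes, using hypothetical names (`traceResources`, `fakeSnapshot`, `traceWithCleanup`) rather than the PR's actual types. It only shows that resources registered during a request are closed on both the success and the error path.

```go
package main

import (
	"errors"
	"fmt"
	"io"
)

// traceResources collects closers attached while a trace request runs.
type traceResources struct{ closers []io.Closer }

func (r *traceResources) add(c io.Closer) { r.closers = append(r.closers, c) }

// closeAll closes resources in reverse order of registration and
// returns the first error encountered, if any.
func (r *traceResources) closeAll() error {
	var first error
	for i := len(r.closers) - 1; i >= 0; i-- {
		if err := r.closers[i].Close(); err != nil && first == nil {
			first = err
		}
	}
	r.closers = nil
	return first
}

// fakeSnapshot stands in for a snapshot or reusable iterator.
type fakeSnapshot struct{ closed *int }

func (f fakeSnapshot) Close() error { *f.closed++; return nil }

// traceWithCleanup registers two resources and always releases them.
func traceWithCleanup(closed *int, fail bool) (err error) {
	res := &traceResources{}
	defer func() {
		// Keep the trace error if there is one; otherwise surface the close error.
		if cerr := res.closeAll(); err == nil {
			err = cerr
		}
	}()
	res.add(fakeSnapshot{closed})
	res.add(fakeSnapshot{closed})
	if fail {
		return errors.New("trace failed")
	}
	return nil
}

func main() {
	closed := 0
	err := traceWithCleanup(&closed, true)
	fmt.Println(closed, err)
}
```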

Reporting CLI:
- Add seidb trace-profile-report which takes an RPC endpoint and a
  block range, calls debug_traceTransactionProfile for every tx,
  persists the raw responses, and emits a summary + interactive HTML
  report (phase breakdown, hot modules, hot low-level ops, per-tx
  table).
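
To make the summary step concrete, here is a small sketch of how per-phase totals could be aggregated from persisted responses stored one JSON object per line. The `txProfile` shape (a `phases` map of phase name to nanoseconds) is an assumption for illustration; the real response fields may differ.

```go
package main

import (
	"bufio"
	"encoding/json"
	"fmt"
	"strings"
)

// txProfile is a hypothetical, simplified shape of one persisted response.
type txProfile struct {
	TxHash string           `json:"txHash"`
	Phases map[string]int64 `json:"phases"` // phase name -> nanoseconds
}

// summarize sums each phase across all profiles and counts the txs seen.
// Note: bufio.Scanner's default line limit is 64 KiB; real profiles may
// need a larger buffer via Scanner.Buffer.
func summarize(jsonl string) (map[string]int64, int, error) {
	totals := map[string]int64{}
	n := 0
	sc := bufio.NewScanner(strings.NewReader(jsonl))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if line == "" {
			continue
		}
		var p txProfile
		if err := json.Unmarshal([]byte(line), &p); err != nil {
			return nil, 0, fmt.Errorf("profile %d: %w", n+1, err)
		}
		for name, ns := range p.Phases {
			totals[name] += ns
		}
		n++
	}
	return totals, n, sc.Err()
}

func main() {
	raw := `{"txHash":"0x1","phases":{"execution":300,"replay":700}}
{"txHash":"0x2","phases":{"execution":100,"replay":900}}`
	totals, n, err := summarize(raw)
	if err != nil {
		panic(err)
	}
	fmt.Println(n, totals["execution"], totals["replay"])
}
```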

The low-level Pebble read events populate when the MVCC layer
implements TraceableStateStore (shipped in the companion perf PR); if
that PR is not merged, the profile still shows phase timings, module
access stats, and iterator stats, but the low-level MVCC breakdown will
be empty.

Made-with: Cursor

github-actions bot commented Apr 16, 2026

The latest Buf updates on your PR. Results from workflow Buf / buf (pull_request).

| Build | Format | Lint | Breaking | Updated (UTC) |
| --- | --- | --- | --- | --- |
| ✅ passed | ✅ passed | ✅ passed | ✅ passed | Apr 17, 2026, 9:10 PM |

@Kbhat1 Kbhat1 changed the title from "feat: add debug_traceTransactionProfile endpoint and reporting tool" to "Add debug_traceTransactionProfile endpoint and reporting tool" Apr 17, 2026

outputDir is a CLI flag on this offline seidb reporting tool, not a
request-level input; suppress the file-inclusion warning with a
nolint:gosec comment matching conventions elsewhere in sei-db/tools.

@Kbhat1 Kbhat1 changed the title from "Add debug_traceTransactionProfile endpoint and reporting tool" to "Add debug_traceTransactionProfile endpoint" Apr 17, 2026

Audit found the closer plumbing on StoreTracer is dormant: nothing in the
repo calls AddReadTraceCloser, so readClosers is always empty and both
CloseReadTraceResources and the closeStoreTraceResources wrapper in
evmrpc are no-ops. Remove all of it. When low-level (pebbledb) read
tracing gets wired in via TraceableStateStore in a follow-up, the closer
side can return with the add side in one change.

Also add doc comments to the newly-introduced exported surface that the
profile endpoint depends on: StoreTracer and its types
(ModuleTrace/IteratorTrace/Access/OpType), record methods
(Get/Set/Has/Delete/StartIterator/RecordIteratorValue/RecordIteratorNext),
Dump/DerivePrestateToJson/RecordReadTrace, and the sei-db types
(ReadTraceEvent/ReadTraceCollector/TraceableStateStore/NewReadTraceEvent).
Also annotate the per-tx sample caps with the reason they exist.

No behavior change.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

codecov bot commented Apr 17, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 59.24%. Comparing base (9225538) to head (e37d869).

@@            Coverage Diff             @@
##             main    #3267      +/-   ##
==========================================
- Coverage   59.30%   59.24%   -0.07%     
==========================================
  Files        2071     2070       -1     
  Lines      169814   169318     -496     
==========================================
- Hits       100707   100304     -403     
+ Misses      60333    60264      -69     
+ Partials     8774     8750      -24     
| Flag | Coverage Δ |
| --- | --- |
| sei-db | 70.41% <ø> (ø) |

| Files with missing lines | Coverage Δ |
| --- | --- |
| evmrpc/block_trace_profiled.go | 0.70% <ø> (+<0.01%) ⬆️ |
| sei-cosmos/store/gaskv/store.go | 92.39% <ø> (-0.88%) ⬇️ |
| sei-cosmos/types/tracer.go | 83.33% <100.00%> (-6.67%) ⬇️ |
| sei-db/tools/cmd/seidb/main.go | 0.00% <ø> (ø) |

... and 67 files with indirect coverage changes

The CLI already emits raw_profiles.jsonl and summary.json; the third
artifact (report.html) was a 335-line embedded template with picked-
from-a-hat styling, no tests, and no durable callers. Anyone who wants
a visual from summary.json can render it with any tool. Dropping the
template shrinks the CLI file from 855 to 507 lines and removes the
html/template dependency.

- Remove writeTraceHTMLReport + its template block
- Remove traceReportData wrapper type (unused after)
- Remove the report.html write + fprintln in runTraceProfileReport
- Tighten --output-dir help to name the two remaining artifacts

No API or endpoint change.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@Kbhat1 Kbhat1 requested a review from yzang2019 April 17, 2026 18:28
Kbhat1 and others added 2 commits April 17, 2026 14:36
Nothing on this branch emits ReadTraceEvents: no state store backend
implements TraceableStateStore, so every WithReadTraceCollector
forwarder hits a type-assertion that fails at the leaf, RecordReadTrace
has zero callers, and LowLevelStats/LowLevelSamples on the response are
always empty. The test already gates its assertions with
`if foundAnyLowLevelStats` and passes vacuously.

Same pattern as the earlier AddReadTraceCloser cut: ship the plumbing
with the emitter, not ahead of it. A follow-up PR (on top of perf-mvcc
once its MVCC layer grows a TraceableStateStore implementation) can
restore the interfaces, the forwarders, the RecordReadTrace handler,
the LowLevelStats wire fields, and the CLI aggregation in one change
that's testable end-to-end.

Removed:
- ReadTraceEvent / ReadTraceCollector / TraceableStateStore /
  NewReadTraceEvent from sei-db/db_engine/types/types.go
- WithReadTraceCollector on composite/cosmos/evm SS stores
- SetReadTraceCollector passthroughs on cachekv/tracekv/storev2
- injectReadTraceCollector + its call sites in context.go KVStore /
  TransientStore
- activeStore / readTraceCollector fields on storev2 state.Store
- LowLevelReads field on ModuleTrace, LowLevelStats / LowLevelSamples
  fields on ModuleTraceDump, ReadTraceEventDump type,
  maxLowLevelReadSamples const, RecordReadTrace method and the
  low-level branch in StoreTracer.dumpLocked
- LowLevelStats / LowLevelTotals aggregation in seidb
  trace-profile-report CLI
- low-level assertions in evmrpc TestTraceTransactionProfile

The KVStore-level tracer (Get/Has/Set/Delete + iterator tracking) that
powers Stats and iterators on the profile response is untouched.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The one-line comments I added on Get/Set/Has/Delete/OpType.String
just restated the function names without adding information. The
non-trivial docstrings (StartIterator's iteratorID contract,
RecordIteratorValue's Truncated semantics, Dump's prestate-exclusion
rule, DerivePrestateToJson's consumer, the cap constants) stay.

No behavior change.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Comment thread sei-cosmos/store/gaskv/store.go
Comment thread sei-cosmos/store/gaskv/store.go Outdated
Comment thread sei-cosmos/store/gaskv/store.go Outdated
Kbhat1 and others added 2 commits April 17, 2026 16:23
gs.parent.Has(key) does the same underlying lookup whether the key
exists or not. The tracer's whole point is to surface where time was
spent, so filtering out the miss case hides exactly the pattern we
most want to see in a profile ("tx spent X ms on 500 Has() calls
that all missed"). Drop the && res guard so Has is traced on every
call, matching Get/Set/Delete.
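
A simplified sketch of the fix, using stand-in types rather than the real gaskv store: the tracer is notified on every Has call, whether or not the key exists. The `tracer`, `parentStore`, and `store` types here are hypothetical, and the unconditional `time.Now()` is kept only for brevity.

```go
package main

import (
	"fmt"
	"time"
)

// tracer is a minimal stand-in that counts Has calls and their total time.
type tracer struct {
	hasCalls int
	hasTime  time.Duration
}

func (t *tracer) Has(key []byte, d time.Duration) {
	t.hasCalls++
	t.hasTime += d
}

// parentStore stands in for the underlying KV store.
type parentStore map[string][]byte

func (p parentStore) Has(key []byte) bool {
	_, ok := p[string(key)]
	return ok
}

type store struct {
	parent parentStore
	tracer *tracer
}

// Has records the lookup regardless of the result. A condition such as
// `s.tracer != nil && res` would hide misses, which cost the same lookup.
func (s *store) Has(key []byte) bool {
	start := time.Now()
	res := s.parent.Has(key)
	if s.tracer != nil {
		s.tracer.Has(key, time.Since(start))
	}
	return res
}

func main() {
	tr := &tracer{}
	s := &store{parent: parentStore{"present": nil}, tracer: tr}
	s.Has([]byte("present"))
	s.Has([]byte("missing"))
	fmt.Println(tr.hasCalls) // both the hit and the miss are counted
}
```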

Spotted by @yzang2019 in review.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Kbhat1 and others added 2 commits April 17, 2026 16:58
Every Get/Set/Has/Delete/Iterator/Next call was unconditionally
calling time.Now() to stage a duration for the tracer, even when no
tracer is attached (the common case in production). Guard each
time.Now() on gs.tracer != nil so the non-traced hot path is
unaffected by the tracing machinery.

Spotted by @yzang2019 in review.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…guards

The 'var start time.Time; if tracer != nil { start = time.Now() }' dance
was repeated at every Get/Set/Has/Delete/iterator/Next call site. Fold
it into a tiny helper so each call site becomes 'start := traceStart(
tracer)'. Behavior is identical: time.Now() only fires when a tracer is
attached, same as before; every tracer call site still nil-checks
before invoking. Addresses @yzang2019's review comment consistently
across the 6 sites.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
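
A minimal sketch of the helper described in this commit message. Only the name `traceStart` and its call shape come from the message; the `tracer` and `store` types and the `recordGet` method are simplified stand-ins.

```go
package main

import (
	"fmt"
	"time"
)

// tracer is a stand-in that counts Get calls and accumulates their duration.
type tracer struct {
	gets  int
	total time.Duration
}

func (t *tracer) recordGet(d time.Duration) {
	t.gets++
	t.total += d
}

// traceStart returns the current time only when a tracer is attached, so
// the untraced hot path never calls time.Now().
func traceStart(t *tracer) time.Time {
	if t == nil {
		return time.Time{}
	}
	return time.Now()
}

type store struct {
	data   map[string][]byte
	tracer *tracer
}

// Get shows the call-site shape: one traceStart line, then a nil check
// before reporting to the tracer.
func (s *store) Get(key []byte) []byte {
	start := traceStart(s.tracer)
	v := s.data[string(key)]
	if s.tracer != nil {
		s.tracer.recordGet(time.Since(start))
	}
	return v
}

func main() {
	tr := &tracer{}
	s := &store{data: map[string][]byte{"k": []byte("v")}, tracer: tr}
	fmt.Println(string(s.Get([]byte("k"))), tr.gets)
}
```

The zero `time.Time` returned on the untraced path is never read, because every tracer call site still nil-checks before using `start`.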