feat: forest-dev export-state-tree [skip ci] #6885
Walkthrough
This PR introduces a new
Changes
Sequence Diagram(s)
sequenceDiagram
participant CLI as CLI Handler
participant ChainStore
participant DB as Database<br/>(ManyCar)
participant IPLDStream as IpldStream
participant CarEncoder as CAR Encoder
participant FS as File System
CLI->>ChainStore: Resolve tipsets in range [to, from]
ChainStore->>DB: Query blocks by height
DB-->>ChainStore: Return tipsets
CLI->>ChainStore: Iterate backward over tipsets
loop For each tipset in range
ChainStore-->>CLI: Parent state root, receipts root, headers
CLI->>CLI: Collect IPLD roots
end
CLI->>IPLDStream: Create stream with collected roots
loop Poll stream for blocks
IPLDStream->>DB: Load block by CID
DB-->>IPLDStream: CarBlock data
IPLDStream->>IPLDStream: Extract child CIDs (DAG_CBOR)
IPLDStream-->>CLI: Yield CarBlock
CLI->>CarEncoder: Add block to CAR
end
CLI->>CarEncoder: Compress frames
CarEncoder->>FS: Write to temporary file
CarEncoder->>FS: Persist to output path
FS-->>CLI: Success
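The polling loop in the diagram (load a block, extract child CIDs from DAG-CBOR blocks, yield the block) can be sketched with toy types. This is a minimal illustration, not the code from this PR: `ToyCid`, `ToyBlock`, `ToyIpldStream`, `sample_db` and `export_cids` are hypothetical names, the traversal order is whatever a `Vec` used as a stack produces, and returning an error for a missing block is an assumption made for the sketch.

```rust
use std::collections::{HashMap, HashSet};

/// Toy stand-in for a CID (hypothetical; the real code works with `Cid`).
type ToyCid = u64;

/// Toy block: an identifier, a codec flag, and the links a decoder would find.
#[derive(Clone, Debug, PartialEq)]
struct ToyBlock {
    cid: ToyCid,
    is_dag_cbor: bool,
    links: Vec<ToyCid>,
}

/// Walks everything reachable from the roots, yielding each block at most once.
struct ToyIpldStream<'a> {
    db: &'a HashMap<ToyCid, ToyBlock>,
    pending: Vec<ToyCid>,
    seen: HashSet<ToyCid>,
}

impl<'a> ToyIpldStream<'a> {
    fn new(db: &'a HashMap<ToyCid, ToyBlock>, roots: Vec<ToyCid>) -> Self {
        Self { db, pending: roots, seen: HashSet::new() }
    }
}

impl<'a> Iterator for ToyIpldStream<'a> {
    type Item = Result<ToyBlock, String>;

    fn next(&mut self) -> Option<Self::Item> {
        while let Some(cid) = self.pending.pop() {
            // Skip anything that has already been yielded.
            if !self.seen.insert(cid) {
                continue;
            }
            let Some(block) = self.db.get(&cid) else {
                // Assumption for this sketch: a missing block is an error.
                return Some(Err(format!("missing block {cid}")));
            };
            // Only DAG-CBOR blocks are scanned for child links.
            if block.is_dag_cbor {
                self.pending.extend(block.links.iter().copied());
            }
            return Some(Ok(block.clone()));
        }
        None
    }
}

/// Builds a small DAG: 1 -> {2, 3}, 2 -> {3}, and 3 is a raw leaf.
fn sample_db() -> HashMap<ToyCid, ToyBlock> {
    let blocks = vec![
        ToyBlock { cid: 1, is_dag_cbor: true, links: vec![2, 3] },
        ToyBlock { cid: 2, is_dag_cbor: true, links: vec![3] },
        ToyBlock { cid: 3, is_dag_cbor: false, links: vec![] },
    ];
    blocks.into_iter().map(|b| (b.cid, b)).collect()
}

/// Collects the CIDs of all exported blocks, stopping at the first error.
fn export_cids(db: &HashMap<ToyCid, ToyBlock>, roots: Vec<ToyCid>) -> Result<Vec<ToyCid>, String> {
    ToyIpldStream::new(db, roots).map(|r| r.map(|b| b.cid)).collect()
}
```

Calling `export_cids(&sample_db(), vec![1])` visits blocks 1, 2 and 3 exactly once, even though block 3 is linked from two parents; the `seen` set is what prevents the second visit.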
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~50 minutes
Actionable comments posted: 1
🧹 Nitpick comments (2)
src/dev/subcommands/export_state_tree_cmd.rs (1)
48-49: Add rustdoc for `run`.
`ExportStateTreeCommand::run` is public and newly introduced, so it should have a brief doc comment like the struct does.
As per coding guidelines, "Document public functions and structs with doc comments"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/dev/subcommands/export_state_tree_cmd.rs` around lines 48 - 49, Add a brief rustdoc comment for the public async method ExportStateTreeCommand::run describing its purpose and behavior (e.g., what running the command does and any important side effects or return behavior). Place the doc comment immediately above the fn signature using ///, mirroring the style used for the ExportStateTreeCommand struct and keeping it concise and informative for public API consumers.
src/ipld/util.rs (1)
425-440: Document `IpldStream` and `IpldStream::new`.
Both are new public APIs, but neither has rustdoc yet. Please add short docs covering traversal order and missing-block behavior.
As per coding guidelines, "Document public functions and structs with doc comments"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/ipld/util.rs` around lines 425 - 440, Add rustdoc comments for the public struct IpldStream and its constructor IpldStream::new: document that IpldStream traverses IPLD nodes in the order provided by the cid_vec/roots (FIFO or DFS/BFS—state actual traversal used by the implementation), explain how seen: CidHashSet prevents revisiting nodes, and describe missing-block behavior (e.g., whether missing CIDs cause the stream to yield an error, skip, or terminate). Place the docs directly above the pub struct IpldStream<DB> and above pub fn new(db: DB, roots: Vec<Cid>) so users know traversal order, dedup semantics, and how the stream reacts to unavailable blocks.
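As an illustration of the rustdoc shape these two nitpicks ask for, here is a minimal sketch on a hypothetical `RootWalker` type. It is not the PR's `IpldStream`, and the traversal order and duplicate handling it documents are choices made for this example only, not claims about the real implementation.

```rust
use std::collections::HashSet;

/// Walks a list of roots and yields each distinct root once.
///
/// # Traversal order
/// Roots are visited last-in, first-out (a choice made for this sketch).
///
/// # Duplicates
/// A root that was already yielded is skipped silently.
pub struct RootWalker {
    pending: Vec<u32>,
    seen: HashSet<u32>,
}

impl RootWalker {
    /// Creates a walker over `roots`.
    ///
    /// Nothing is visited until [`RootWalker::next_root`] is called.
    pub fn new(roots: Vec<u32>) -> Self {
        Self { pending: roots, seen: HashSet::new() }
    }

    /// Returns the next unvisited root, or `None` once every root was yielded.
    pub fn next_root(&mut self) -> Option<u32> {
        while let Some(root) = self.pending.pop() {
            // `insert` returns `false` for values that were seen before.
            if self.seen.insert(root) {
                return Some(root);
            }
        }
        None
    }
}
```

The section headings under `///` are ordinary Markdown, so `cargo doc` renders them as subsections of the item's documentation.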
📒 Files selected for processing (4)
- docs/docs/users/reference/cli.sh
- src/dev/subcommands/export_state_tree_cmd.rs
- src/dev/subcommands/mod.rs
- src/ipld/util.rs
pub struct ExportStateTreeCommand {
    /// Filecoin network chain (e.g., calibnet, mainnet)
    #[arg(long, required = true)]
    chain: NetworkChain,
    /// Optional path to the database folder
    #[arg(long)]
    db: Option<PathBuf>,
    /// The maximum tipset epoch to export state tree from (Exclusive)
    #[arg(long)]
    from: ChainEpoch,
    /// The minimum tipset epoch to export state tree from (Inclusive)
    #[arg(long)]
    to: ChainEpoch,
    /// The path to the output `ForestCAR` file
    #[arg(short, long)]
    output: Option<PathBuf>,
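The doc comments on `from` (exclusive maximum) and `to` (inclusive minimum) describe a half-open epoch range that the command walks backward. A minimal sketch of that reading follows; `epochs_backward` is a hypothetical helper, and treating an epoch as a plain `i64` is an assumption made here.

```rust
/// Epochs visited when walking backward over the half-open range `[to, from)`.
/// Hypothetical helper: `from` is excluded and `to` is included.
fn epochs_backward(from: i64, to: i64) -> Vec<i64> {
    (to..from).rev().collect()
}
```

Under this reading, `--from 5 --to 2` would cover epochs 4, 3 and 2, and an empty range results when `from` equals `to`.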
Make receipts and events opt-in instead of unconditional.
This command always pulls message receipts and event roots into the export, but there is no flag to keep them out. That makes the default output far heavier than Forest’s other export paths and turns export-state-tree into a much broader snapshot than the name implies.
Based on learnings, "enable message_receipts and events (message_receipts: true, events: true) only for GC snapshots as defined in src/db/gc/snapshot.rs, since these are internal snapshots created during garbage collection. For user-facing export commands such as src/tool/subcommands/archive_cmd.rs, disable receipts and events by default (message_receipts: false, events: false) to keep user-facing snapshots smaller, unless explicitly requested."
Also applies to: 98-109
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/dev/subcommands/export_state_tree_cmd.rs` around lines 30 - 45, Add two
boolean flags to ExportStateTreeCommand named message_receipts and events (both
#[arg(long)] with default false) so receipts and event roots are opt-in rather
than always included; update the export invocation code that reads
ExportStateTreeCommand to pass these flags into the exporter/export_state_tree
routine so it only includes message_receipts and events when those flags are
true; keep GC snapshot code that currently requires receipts/events unchanged
but explicitly set message_receipts = true and events = true where snapshots are
created for gc (the GC snapshot creator symbol), and ensure user-facing callers
(e.g., the archive/export command symbol) continue to use the default false
values unless the flags are passed.
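The opt-in behavior requested in this comment can be sketched without any CLI library. Everything below is hypothetical (`ExportOptions`, `ToyTipset`, `collect_roots`, and string labels standing in for CIDs); it only shows the idea of defaulting both extras to `false` and gating root collection on them.

```rust
/// Hypothetical export options; both extras default to `false`.
#[derive(Clone, Copy, Debug, Default, PartialEq)]
struct ExportOptions {
    message_receipts: bool,
    events: bool,
}

/// Toy tipset summary using labels in place of real CIDs.
struct ToyTipset {
    state_root: &'static str,
    receipts_root: &'static str,
    events_roots: Vec<&'static str>,
}

/// Collects the roots to export; the state root is always included,
/// receipts and events only when explicitly requested.
fn collect_roots(ts: &ToyTipset, opts: ExportOptions) -> Vec<&'static str> {
    let mut roots = vec![ts.state_root];
    if opts.message_receipts {
        roots.push(ts.receipts_root);
    }
    if opts.events {
        roots.extend(ts.events_roots.iter().copied());
    }
    roots
}
```

With `ExportOptions::default()` only the state root is collected, which matches the smaller default output the review asks for; a caller that needs everything would set both fields to `true`.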
if cid.codec() == fvm_ipld_encoding::DAG_CBOR {
    let new_cids = extract_cids(&data)?;
    if !new_cids.is_empty() {
        this.cid_vec.reserve(new_cids.len());
this.cid_vec.reserve(new_cids.len());
Is this useful in any way? I would assume `extend` does at most one allocation to accommodate `new_cids`, so there is no need to reserve manually.
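A small sketch of the point being made, using plain `u64` values in place of CIDs. Both helper names are hypothetical. The reasoning rests on `Vec::extend` consulting the source iterator's `size_hint` to grow the buffer up front, which is how current standard library implementations behave for sources with a known length such as another `Vec`; treat that as an implementation detail rather than a documented guarantee.

```rust
/// Appends `new_items` relying on `extend` alone to grow the buffer.
fn extend_without_reserve(mut queue: Vec<u64>, new_items: Vec<u64>) -> Vec<u64> {
    let needed = queue.len() + new_items.len();
    queue.extend(new_items);
    // After `extend`, the buffer is large enough for everything appended.
    debug_assert!(queue.capacity() >= needed);
    queue
}

/// Appends `new_items` with an explicit `reserve` first, as in the reviewed code.
fn extend_with_reserve(mut queue: Vec<u64>, new_items: Vec<u64>) -> Vec<u64> {
    queue.reserve(new_items.len());
    queue.extend(new_items);
    queue
}
```

Both helpers produce the same contents; the explicit `reserve` only matters when the source iterator cannot report its length accurately.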
Summary of changes
This PR adds a dev tool for exporting state trees, together with messages, message receipts and events, for a tipset range.
Summary by CodeRabbit
`export-state-tree` CLI command enabling users to export parent state trees for specified tipset ranges, with customizable database and output paths.