
Use bump allocation in DRC free list and other improvements#12969

Merged
fitzgen merged 2 commits into bytecodealliance:main from
fitzgen:free-list-improvements
Apr 7, 2026

Conversation

@fitzgen
Member

@fitzgen fitzgen commented Apr 6, 2026

  • Avoid using a `BTreeMap` and use a cache-friendly `Vec` instead.
  • When merging blocks in the free list, use linear search, falling back to binary search only when the free list is large.
  • Add fast-path entry points that take a `u32` size directly, already rounded up to the free list's alignment.

Altogether, this shaves ~309B instructions retired (a 48% reduction) off the benchmark in #11141
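The bullet points above can be sketched roughly as follows. This is a hypothetical simplification, not Wasmtime's actual `FreeList`: the `Block` type, function names, and the linear-search cutoff are all invented for illustration.

```rust
// Keep free blocks sorted by offset in a Vec, and find a freed block's
// neighbors with a linear scan for short lists, falling back to binary
// search once the list grows large.

const LINEAR_SEARCH_THRESHOLD: usize = 16; // illustrative cutoff

#[derive(Clone, Copy, Debug, PartialEq)]
struct Block {
    offset: u32,
    len: u32,
}

/// Index at which a block with `offset` would be inserted to keep
/// `blocks` sorted by offset.
fn insertion_index(blocks: &[Block], offset: u32) -> usize {
    if blocks.len() <= LINEAR_SEARCH_THRESHOLD {
        // Linear search is cache-friendly for short lists.
        blocks
            .iter()
            .position(|b| b.offset >= offset)
            .unwrap_or(blocks.len())
    } else {
        // Fall back to binary search for large lists.
        blocks.partition_point(|b| b.offset < offset)
    }
}

/// Insert a freed block, merging it with adjacent blocks when they abut.
/// Note: `size` is assumed to already be rounded to the list's alignment.
fn insert_free_block(blocks: &mut Vec<Block>, mut block: Block) {
    let i = insertion_index(blocks, block.offset);
    // Merge with the next block if it immediately follows us.
    if i < blocks.len() && block.offset + block.len == blocks[i].offset {
        block.len += blocks[i].len;
        blocks.remove(i);
    }
    // Merge with the previous block if we immediately follow it.
    if i > 0 && blocks[i - 1].offset + blocks[i - 1].len == block.offset {
        blocks[i - 1].len += block.len;
    } else {
        blocks.insert(i, block);
    }
}

fn main() {
    let mut blocks = vec![
        Block { offset: 0, len: 16 },
        Block { offset: 64, len: 16 },
    ];
    // Freeing [16, 48) merges with [0, 16) into one block [0, 48).
    insert_free_block(&mut blocks, Block { offset: 16, len: 32 });
    assert_eq!(
        blocks,
        vec![Block { offset: 0, len: 48 }, Block { offset: 64, len: 16 }]
    );
    println!("{blocks:?}");
}
```

The sorted `Vec` keeps neighbor lookups contiguous in memory, which is the cache-friendliness the first bullet is after.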

@fitzgen fitzgen requested a review from a team as a code owner April 6, 2026 16:20
@fitzgen fitzgen requested review from pchickey and removed request for a team April 6, 2026 16:20
@github-actions github-actions bot added the wasmtime:api (Related to the API of the `wasmtime` crate itself) and wasmtime:ref-types (Issues related to reference types and GC in Wasmtime) labels Apr 6, 2026
@github-actions

github-actions bot commented Apr 6, 2026

Subscribe to Label Action

cc @fitzgen

This issue or pull request has been labeled: "wasmtime:api", "wasmtime:ref-types"

Thus the following users have been cc'd because of the following labels:

  • fitzgen: wasmtime:ref-types

To subscribe or unsubscribe from this label, edit the .github/subscribe-to-label.json configuration file.

Learn more.

@alexcrichton
Member

With this approach, can't deallocation be quadratic w.r.t. the number of blocks? Removal/insertion in a sorted vector is O(n), so deallocating N items could result in quadratic runtime.
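The worst case behind this concern is easy to demonstrate. In the sketch below (illustrative only; the helper function is invented), blocks are freed in descending order, so every sorted-`Vec` insert lands at the front and shifts all existing elements, giving 0 + 1 + ... + (n-1) = n(n-1)/2 element moves in total:

```rust
// Count how many element shifts Vec::insert performs when inserting
// n, n-1, ..., 1 into a Vec kept sorted ascending. Each insert lands
// at index 0, so each one shifts every existing element: O(n) per
// insert, O(n^2) total.
fn count_shifts_on_sorted_insert(n: usize) -> usize {
    let mut v: Vec<usize> = Vec::new();
    let mut shifts = 0;
    for x in (1..=n).rev() {
        let i = v.partition_point(|&y| y < x);
        // Everything at or after `i` must shift right by one slot.
        shifts += v.len() - i;
        v.insert(i, x);
    }
    shifts
}

fn main() {
    // Total shifts grow quadratically: n * (n - 1) / 2.
    assert_eq!(count_shifts_on_sorted_insert(10), 45);
    assert_eq!(count_shifts_on_sorted_insert(100), 4950);
    println!("ok");
}
```

A balanced tree such as `BTreeMap` avoids this by paying O(log n) per removal/insertion instead of O(n).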

@fitzgen
Member Author

fitzgen commented Apr 6, 2026

Hm, yeah, I suppose so. I can re-add the `BTreeMap` for the non-bump portion of the free list, which should avoid that behavior.
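A hybrid along those lines might look roughly like this. This is a hypothetical sketch, not the actual Wasmtime code: the `FreeList` struct, its fields, and the first-fit policy are invented for illustration; only the bump-pointer-plus-`BTreeMap` shape reflects the discussion.

```rust
use std::collections::BTreeMap;

/// Bump allocation for never-allocated space, plus a BTreeMap of freed
/// blocks keyed by offset so that coalescing a freed block with its
/// neighbors costs O(log n) per deallocation rather than O(n).
struct FreeList {
    bump: u32,                // start of the never-allocated region
    capacity: u32,            // total managed bytes
    free: BTreeMap<u32, u32>, // offset -> len of freed blocks
}

impl FreeList {
    fn new(capacity: u32) -> Self {
        FreeList { bump: 0, capacity, free: BTreeMap::new() }
    }

    /// Fast path: bump-allocate; otherwise first-fit from freed blocks.
    /// (`size` is assumed to already be rounded to the list's alignment;
    /// the first-fit scan here is still linear.)
    fn alloc(&mut self, size: u32) -> Option<u32> {
        if self.capacity - self.bump >= size {
            let offset = self.bump;
            self.bump += size;
            return Some(offset);
        }
        let (&offset, &len) = self.free.iter().find(|(_, len)| **len >= size)?;
        self.free.remove(&offset);
        if len > size {
            // Return the unused tail to the free map.
            self.free.insert(offset + size, len - size);
        }
        Some(offset)
    }

    /// Free a block, merging with adjacent freed blocks via O(log n)
    /// neighbor lookups in the map.
    fn dealloc(&mut self, mut offset: u32, mut len: u32) {
        // Copy out the preceding neighbor before mutating the map.
        let prev = self.free.range(..offset).next_back().map(|(&o, &l)| (o, l));
        if let Some((prev_off, prev_len)) = prev {
            if prev_off + prev_len == offset {
                self.free.remove(&prev_off);
                offset = prev_off;
                len += prev_len;
            }
        }
        // Merge with a following block that starts exactly at our end.
        let next = self.free.get(&(offset + len)).copied();
        if let Some(next_len) = next {
            self.free.remove(&(offset + len));
            len += next_len;
        }
        self.free.insert(offset, len);
    }
}

fn main() {
    let mut fl = FreeList::new(64);
    let a = fl.alloc(16).unwrap();
    let b = fl.alloc(16).unwrap();
    fl.dealloc(a, 16);
    fl.dealloc(b, 16); // merges with the previous free block
    assert_eq!(fl.free.get(&0), Some(&32));
    println!("{:?}", fl.free);
}
```

The key operation is `range(..offset).next_back()`, which finds the predecessor block in O(log n); a sorted `Vec` can find it just as fast with binary search, but then pays O(n) to shift elements on the subsequent insert/remove.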

@alexcrichton
Member

For the bump part, though, is this just a tradeoff of RSS vs. throughput (or something like that)? I'm not sure how big GC heaps are by default, but if they're big enough, this seems not great: it will unconditionally fill up the entire linear memory before reusing anything that has been deallocated. Or is the initial size of linear memories typically smaller?

@fitzgen
Member Author

fitzgen commented Apr 6, 2026

> Or is the initial size of linear memories typically smaller?

The initial size is one Wasm page.

After that, right now, we always prefer growing the GC heap to collecting when possible, but that is changing in #12942

@fitzgen fitzgen force-pushed the free-list-improvements branch from 14c182c to fc82c63 Compare April 6, 2026 20:16
@fitzgen
Member Author

fitzgen commented Apr 6, 2026

Latest commit switched back to `BTreeMap` for the non-bump portion of the free list.

@fitzgen fitzgen requested review from alexcrichton and removed request for pchickey April 7, 2026 15:15

@alexcrichton alexcrichton left a comment


Personally, I still sort of feel like the true way forward here will be to compile the allocation algorithm to wasm itself. We talked about this historically, and I realize it's far out of scope for this PR, but otherwise I feel like we're going to endlessly golf various forms of allocators here and there with heuristics and such, whereas putting it in wasm with a strict API would make it much easier to experiment and play with, at least.

Comment on lines +263 to +266
```rust
let (&block_index, &block_len) = self
    .free_block_index_to_len
    .iter()
    .find(|(_, len)| **len >= alloc_size)?;
```
Member

Oh, well, uh, I guess this is quadratic as well... I realize this is preexisting, though, so sorry I didn't see this earlier :(

Mind tagging this with a FIXME and an issue? We presumably don't want to make this tier 1 with a quadratic-complexity allocator...

Member Author

@fitzgen fitzgen Apr 7, 2026


Edit: never mind, with enough extra btrees we can make this work.

@fitzgen
Member Author

fitzgen commented Apr 7, 2026

> Personally, I still sort of feel like the true way forward here will be to compile the allocation algorithm to wasm itself. We talked about this historically, and I realize it's far out of scope for this PR, but otherwise I feel like we're going to endlessly golf various forms of allocators here and there with heuristics and such, whereas putting it in wasm with a strict API would make it much easier to experiment and play with, at least.

I agree. I don't think we need to touch the free list again after this PR (plus the follow-up FIXME) until if/when we self-host the free list.

@fitzgen fitzgen enabled auto-merge April 7, 2026 19:55
@fitzgen fitzgen added this pull request to the merge queue Apr 7, 2026
Merged via the queue into bytecodealliance:main with commit 0096013 Apr 7, 2026
48 checks passed
@fitzgen fitzgen deleted the free-list-improvements branch April 7, 2026 20:25