Separate chunker from batcher #625

Open
antiguru wants to merge 33 commits into TimelyDataflow:master-next from antiguru:explicit_chunker

antiguru commented Jul 15, 2025

The chunker was part of the batcher and responsible for transforming input data into the batcher's chain format. Hence, the batcher needed to be aware of its input types, although it would not otherwise use this information.

This change drops the `Input` and `C` type parameters from `MergeBatcher`, and the `Input` associated type plus `push_container` method from the `Batcher` trait. Batchers now accept chunks via `PushInto<Self::Output>`. Chunking moves into `arrange_core`, which gains a `Chu: ContainerBuilder` type parameter so callers can supply a chunker that maps the stream's input container into the batcher's output container.
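To make the separation concrete, here is a minimal standalone sketch of the idea, not the crate's code. The `PushInto` and `Chunker` traits, `CountingChunker`, `ToyBatcher`, and `feed` are all hypothetical stand-ins defined locally; the real change uses timely's `ContainerBuilder` for the chunker role, whose interface is not reproduced here. The point is only that the batcher is written against its chunk type, while the chunker alone knows the input type.

```rust
// Hypothetical stand-ins for illustration; NOT the timely / differential-dataflow definitions.

/// Local stand-in for a "push an owned item" trait.
trait PushInto<T> {
    fn push_into(&mut self, item: T);
}

/// Toy chunker: turns input containers of type `In` into chunks of type `Self::Chunk`.
trait Chunker<In> {
    type Chunk;
    fn chunk(&mut self, input: In) -> Vec<Self::Chunk>;
}

/// Maps raw keys (`Vec<u64>`) to sorted `(key, count)` chunks of bounded size.
struct CountingChunker {
    capacity: usize,
}

impl Chunker<Vec<u64>> for CountingChunker {
    type Chunk = Vec<(u64, i64)>;

    fn chunk(&mut self, mut input: Vec<u64>) -> Vec<Self::Chunk> {
        input.sort_unstable();
        // Collapse runs of equal keys into a single (key, count) entry.
        let mut counted: Vec<(u64, i64)> = Vec::new();
        for key in input {
            if let Some(last) = counted.last_mut() {
                if last.0 == key {
                    last.1 += 1;
                    continue;
                }
            }
            counted.push((key, 1));
        }
        // `max(1)` guards against a zero capacity, which `chunks` would reject.
        counted
            .chunks(self.capacity.max(1))
            .map(|c| c.to_vec())
            .collect()
    }
}

/// Toy batcher: only ever sees its own chunk type, never `Vec<u64>`.
#[derive(Default)]
struct ToyBatcher {
    chunks: Vec<Vec<(u64, i64)>>,
}

impl PushInto<Vec<(u64, i64)>> for ToyBatcher {
    fn push_into(&mut self, chunk: Vec<(u64, i64)>) {
        self.chunks.push(chunk);
    }
}

/// Driver playing the role the description assigns to `arrange_core`:
/// it owns the chunking step and hands finished chunks to the batcher.
fn feed<In, Chu, Ba>(inputs: Vec<In>, chunker: &mut Chu, batcher: &mut Ba)
where
    Chu: Chunker<In>,
    Ba: PushInto<Chu::Chunk>,
{
    for input in inputs {
        for chunk in chunker.chunk(input) {
            batcher.push_into(chunk);
        }
    }
}
```

In this sketch, swapping `CountingChunker` for a chunker over a different input type requires no change to `ToyBatcher`, which is the decoupling the description is after.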

The `Arrange` trait constrains `Ba::Output = C` (same-type chunker) and hardcodes `ContainerChunker<C>` internally, so `.arrange::<Ba, Bu, Tr>()` callsites for `Vec`-based collections are unchanged. Callers that need a cross-container chunker (columnar layouts, interactive) drop to `arrange_core` directly.
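The relationship between the convenience entry point and the core one can be sketched as follows. This is a standalone illustration under stated assumptions: `arrange_sketch`, `arrange_core_sketch`, `SameTypeChunker`, `Chunker`, and `VecBatcher` are hypothetical names defined here, and the builder and trace parameters of the real `arrange` are omitted. It shows only how a wrapper can pin the chunker to a same-type default while the core function leaves the chunker as a type parameter.

```rust
use std::marker::PhantomData;

// Hypothetical stand-ins for illustration; NOT the crate's real traits or signatures.

trait PushInto<T> {
    fn push_into(&mut self, item: T);
}

/// Toy chunker interface; `Default` lets the driver construct it internally.
trait Chunker<In>: Default {
    type Chunk;
    fn chunk(&mut self, input: In) -> Vec<Self::Chunk>;
}

/// Same-type chunker: its chunk type equals its input container type `C`.
struct SameTypeChunker<C> {
    _marker: PhantomData<C>,
}

impl<C> Default for SameTypeChunker<C> {
    fn default() -> Self {
        SameTypeChunker { _marker: PhantomData }
    }
}

impl<T: Ord> Chunker<Vec<T>> for SameTypeChunker<Vec<T>> {
    type Chunk = Vec<T>;

    fn chunk(&mut self, mut input: Vec<T>) -> Vec<Vec<T>> {
        // Sort each input container and pass it on as a single chunk.
        input.sort();
        vec![input]
    }
}

/// Core entry point: the caller chooses the chunker via the `Chu` type parameter.
fn arrange_core_sketch<In, Chu, Ba>(inputs: Vec<In>, mut batcher: Ba) -> Ba
where
    Chu: Chunker<In>,
    Ba: PushInto<Chu::Chunk>,
{
    let mut chunker = Chu::default();
    for input in inputs {
        for chunk in chunker.chunk(input) {
            batcher.push_into(chunk);
        }
    }
    batcher
}

/// Convenience entry point: the batcher's chunk type must equal the input
/// container type `C`, and the chunker is fixed, so callers never name it.
fn arrange_sketch<C, Ba>(inputs: Vec<C>, batcher: Ba) -> Ba
where
    SameTypeChunker<C>: Chunker<C, Chunk = C>,
    Ba: PushInto<C>,
{
    arrange_core_sketch::<C, SameTypeChunker<C>, Ba>(inputs, batcher)
}

/// Toy batcher over `Vec<u32>` chunks.
#[derive(Default)]
struct VecBatcher {
    chunks: Vec<Vec<u32>>,
}

impl PushInto<Vec<u32>> for VecBatcher {
    fn push_into(&mut self, chunk: Vec<u32>) {
        self.chunks.push(chunk);
    }
}
```

A call such as `arrange_sketch(inputs, VecBatcher::default())` never mentions a chunker, while a caller whose input container differs from the batcher's chunk type would call `arrange_core_sketch` with its own `Chu`.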

Also updates `chainless_batcher::Batcher` to the new `Batcher` trait shape.

frankmcsherry and others added 27 commits March 25, 2026 13:22
* Use containers for interesting (keys, time)

* Remove use of owned keys in reduce.rs

* Remove all uses of KeyOwn

* Remove KeyOwn

* Improve documentation

* Walk back overly prescriptive dogs^3 constraints
Remove TimelyStack, TStack layout, ColumnationChunker, ColInternalMerger,
and all Col* type aliases (ColValSpine, ColKeySpine, ColValBuilder, etc.)
from the codebase. The columnation crate dependency is retained.

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…taflow#714)

* Track timely's Child changes

* Tidy bounds

* Absorb next wave of changes

* Remove unused, and Arranged impls

* Remove timestamp generic from Arranged

* Standardize T for timestamps, Tr for traces

* Correct local timely reference

* Convert G generics to T

* Remove as Timestamp prompts

* Remove T generics bound by Tr

* Remove Scope::clone calls

* Further tightening of traits

* Correct docs
frankmcsherry and others added 6 commits April 11, 2026 15:34
* Bring core DD into line

* Clean up examples and tests
* Example DDIR with SCC

* Parse from concrete syntax

* Further examples and improvements

* Further improvements to track columnar

* Move DDIR to its own crate

Extract the DD IR interpreter into a `ddir/` crate, separating shared
infrastructure (parse, IR, lower) from backend-specific rendering.

- Split parse into `parse/applicative.rs` (.ddir) and `parse/pipe.rs` (.ddp)
- Extract `ir.rs` (Node, Program, RowLike) from `lower.rs`
- Move programs and examples from differential-dataflow/examples/ to ddir/
- Remove DDIR_DEBUG, DDIR_REACHABILITY, DDIR_PRINT env vars (use INSPECT)
- Fix off-by-one in columnar harness timestamping
- Unify inspect output format between vec and col backends
- Update programs to output compact counts via arrange+inspect

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* Optimized IR

* Relocate to interactive/

* Further clean-up

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The chunker was part of the batcher and responsible for transforming input
data into the batcher's chain format. Hence, the batcher needed to be aware
of its input types, although it would not otherwise use this information.

Drop the `Input` and `C` type parameters from `MergeBatcher`, and the
`Input` associated type plus `push_container` method from the `Batcher`
trait. Batchers now accept chunks via `PushInto<Self::Output>`. Chunking
moves into `arrange_core`, which gains a `Chu: ContainerBuilder` type
parameter so callers can supply a chunker that maps the stream's input
container into the batcher's output container.

The `Arrange` trait constrains `Ba::Output = C` (same-type chunker) and
hardcodes `ContainerChunker<C>` internally, so `.arrange::<Ba, Bu, Tr>()`
callsites for `Vec`-based collections are unchanged. Callers that need a
cross-container chunker (columnar layouts, interactive) drop to
`arrange_core` directly.

Also updates `chainless_batcher::Batcher` to the new `Batcher` trait
shape, and replaces `batcher.push_container(&mut vec![..])` with
`batcher.push_into(vec![..])` in the trace test.
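
The call-site change in the trace test can be illustrated with a small standalone sketch. Everything here is a hypothetical stand-in (`PushInto`, `ToyBatcher`, `both_call_shapes`), and the body of the old-style `push_container` is an assumption about its behavior; only the two call shapes, `push_container(&mut vec![..])` and `push_into(vec![..])`, come from the commit message.

```rust
// Hypothetical stand-ins for illustration; NOT the crate's real definitions.

trait PushInto<T> {
    fn push_into(&mut self, item: T);
}

#[derive(Default)]
struct ToyBatcher {
    chunks: Vec<Vec<(u64, i64)>>,
}

impl ToyBatcher {
    /// Old call shape. Assumed behavior: borrow the caller's container
    /// mutably and take its contents, leaving it empty.
    fn push_container(&mut self, container: &mut Vec<(u64, i64)>) {
        self.chunks.push(std::mem::take(container));
    }
}

impl PushInto<Vec<(u64, i64)>> for ToyBatcher {
    /// New call shape: the chunk is moved into the batcher.
    fn push_into(&mut self, chunk: Vec<(u64, i64)>) {
        self.chunks.push(chunk);
    }
}

/// Exercises both call shapes side by side.
fn both_call_shapes() -> ToyBatcher {
    let mut batcher = ToyBatcher::default();
    batcher.push_container(&mut vec![(1, 1), (2, 1)]); // before
    batcher.push_into(vec![(3, 1)]); // after
    batcher
}
```

With the owned form, the caller hands over a finished chunk and no method specific to the input type remains on the batcher.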

Signed-off-by: Moritz Hoffmann <antiguru@gmail.com>
antiguru changed the base branch from master to master-next April 17, 2026 14:45