Introduce _inter and _intra arrangement variants #723

Draft
frankmcsherry wants to merge 3 commits into TimelyDataflow:master-next from frankmcsherry:trace_intra_inter_split

frankmcsherry (Member) commented Apr 18, 2026

This is a rework of #687, intended to demonstrate how one can use timely's FrontierInterest to reduce the volume of scheduling that hits various operators. It splits TraceAgent into two types, TraceIntra and TraceInter: the former is cheaper to schedule but cannot be shared across dataflows, while the latter is busier to schedule but can be. The distinction is that TraceInter must explicitly mirror its frontier into the other dataflow, because progress tracking does not span dataflows, and to do this it needs to be rescheduled continually as its frontiers evolve.
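To make the scheduling difference concrete, here is a minimal toy model in plain Rust. It does not use timely's API: the `Interest` enum, the `Event` type, and `count_activations` are hypothetical stand-ins, and the model assumes that IfCapability means "react to frontier changes only while the operator holds a capability". Under that assumption, an operator that must mirror every frontier change (as TraceInter does) corresponds to `Always`, and one that can sit idle between inputs (as TraceIntra can) corresponds to `IfCapability`.

```rust
/// Toy stand-in for a frontier-interest setting; NOT timely's actual type.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Interest {
    /// Schedule the operator on every input frontier change.
    Always,
    /// Schedule on frontier changes only while the operator holds a capability.
    IfCapability,
}

/// One step of a simulated execution (hypothetical, for illustration only).
#[derive(Clone, Copy, Debug)]
pub enum Event {
    /// New input data arrives; the operator retains a capability for pending work.
    Data,
    /// The input frontier advances.
    Frontier,
}

/// Counts how many times an operator with the given interest gets scheduled.
pub fn count_activations(interest: Interest, events: &[Event]) -> usize {
    let mut holds_capability = false;
    let mut activations = 0;
    for event in events {
        match event {
            Event::Data => {
                // Data always prompts scheduling, and leaves a capability held.
                holds_capability = true;
                activations += 1;
            }
            Event::Frontier => {
                let scheduled = match interest {
                    Interest::Always => true,
                    Interest::IfCapability => holds_capability,
                };
                if scheduled {
                    activations += 1;
                    // Assume the operator finishes its pending work and
                    // drops its capability once it sees the frontier move.
                    holds_capability = false;
                }
            }
        }
    }
    activations
}
```

In this model, one batch of data followed by four frontier advances schedules the `Always` operator five times and the `IfCapability` operator twice: once for the data and once for the frontier advance that lets it drop its capability.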

At the moment the defaults are mixed. The arrangement functions produce TraceIntra by default, with _inter variants that produce TraceInter. reduce_trace, on the other hand, produces TraceInter, and probably wants the same treatment as the arrangement functions.

There is an additional question about which other operators should gain FrontierInterest::IfCapability.

  1. reduce_trace seems like a good candidate. It is very analogous to arrange_core.
  2. join_traces would be great, but there is a catch: it uses its frontiers to relax logical compaction in the other inputs. This is a non-problem for reduce_trace, because in the absence of further inputs (which prompt scheduling) there is nothing to do. But join_traces could have one frontier advance and the other not, and it could hypothetically unlock further compaction. I'm not certain it is a problem, by analogy to reduce_trace: if there's no data from either of them, what compaction is there to perform? But it is a bit tenuous.
  3. half_join also needs some thought. It also relaxes frontiers, and it does receive batch inputs from the trace (for scheduling), even though it ignores them. So potentially fine for the same reasons.

Each of these is subtle though, and trace compaction is something I've gotten wrong before.
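The join_traces concern in item 2 can be sketched with a toy model, again in plain Rust rather than the actual operator code. `JoinModel` and its methods are hypothetical, frontiers are simplified to single integers rather than antichains, and the model assumes that when the operator runs it permits each trace to compact up to the other input's frontier.

```rust
/// Toy model of a two-input join holding a handle to each input's trace.
/// Hypothetical types; real frontiers are antichains, not integers.
#[derive(Debug, Default)]
pub struct JoinModel {
    /// Current frontier of each input.
    pub frontier: [u64; 2],
    /// Logical compaction bound the operator has permitted for each trace.
    pub compaction: [u64; 2],
}

impl JoinModel {
    /// Advance input `i`'s frontier. `scheduled` says whether the operator
    /// is run in response to the change.
    pub fn advance_frontier(&mut self, i: usize, to: u64, scheduled: bool) {
        self.frontier[i] = self.frontier[i].max(to);
        if scheduled {
            self.run();
        }
    }

    /// What the operator does when scheduled: let each trace compact up to
    /// the frontier of the *other* input.
    pub fn run(&mut self) {
        self.compaction[0] = self.compaction[0].max(self.frontier[1]);
        self.compaction[1] = self.compaction[1].max(self.frontier[0]);
    }
}
```

If input 0's frontier advances to 5 while the operator is not scheduled, `compaction[1]` stays at 0 until something else prompts a call to `run`. Whether that delay matters in practice is exactly the open question above.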

frankmcsherry changed the base branch from master to master-next April 18, 2026 21:10
frankmcsherry force-pushed the trace_intra_inter_split branch 2 times, most recently from 98da6cc to 1a0b84c April 18, 2026 21:16
frankmcsherry force-pushed the trace_intra_inter_split branch from 1a0b84c to 02cbc8c April 18, 2026 23:48