WIP Cursor pagination task instances #64845

Draft
pierrejeambrun wants to merge 4 commits into apache:main from astronomer:feature/cursor-pagination-task-instances

@pierrejeambrun (Member) commented Apr 7, 2026

related: #62027

Performance improvement: add API support for cursor-based pagination on the get_task_instances endpoint.

At scale, just counting the rows takes too long: about 60s, because the joins prevent index use and trigger parallel full scans. Even a bare count on the table takes 2-18s, which is still too long. Switching to cursor-based pagination lets the UI use it and avoid this delay. On my local setup with 40M TIs, fetching a page now takes <100ms, down from 60s.
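
To illustrate the difference between the two styles, here is a minimal sketch using an in-memory SQLite table rather than Airflow's real schema. The table layout and the names `make_demo_db`, `offset_page`, and `keyset_page` are hypothetical and chosen only for illustration; the queries this PR actually issues are not shown in the description.

```python
import sqlite3


def make_demo_db(n_rows=10):
    """Build a tiny stand-in table; this is not Airflow's real schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE task_instance (id INTEGER PRIMARY KEY, task_id TEXT)")
    conn.executemany(
        "INSERT INTO task_instance (id, task_id) VALUES (?, ?)",
        [(i, f"task_{i}") for i in range(1, n_rows + 1)],
    )
    return conn


def offset_page(conn, limit, offset):
    """Offset style: rows before `offset` are skipped, and the total needs a COUNT."""
    rows = conn.execute(
        "SELECT id, task_id FROM task_instance ORDER BY id LIMIT ? OFFSET ?",
        (limit, offset),
    ).fetchall()
    total = conn.execute("SELECT COUNT(*) FROM task_instance").fetchone()[0]
    return rows, total


def keyset_page(conn, limit, after_id=None):
    """Keyset style: seek past the last seen key; no COUNT is issued."""
    if after_id is None:
        query = "SELECT id, task_id FROM task_instance ORDER BY id LIMIT ?"
        params = (limit,)
    else:
        query = "SELECT id, task_id FROM task_instance WHERE id > ? ORDER BY id LIMIT ?"
        params = (after_id, limit)
    rows = conn.execute(query, params).fetchall()
    # A full page suggests more rows may follow; a short page means the end was reached.
    next_after = rows[-1][0] if rows and len(rows) == limit else None
    return rows, next_after
```

With an index on the sort key, the keyset query can seek directly to the next page, whereas the offset variant pays for both the skipped rows and the count.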

This keeps backward compatibility: the list endpoint defaults to offset-based pagination, while passing a cursor switches it to cursor-based pagination.
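
The backward-compatible behaviour can be sketched roughly as follows. This is an assumption-laden illustration over a plain Python list, not the endpoint code: `OffsetPage`, `CursorPage`, and `list_items` are hypothetical names, the cursor here is just the last seen value as a string, and backward navigation is left out.

```python
from dataclasses import dataclass
from typing import List, Optional, Union


@dataclass
class OffsetPage:
    items: List[int]
    total_entries: int  # only offset responses carry a total


@dataclass
class CursorPage:
    items: List[int]
    next_cursor: Optional[str]
    previous_cursor: Optional[str]


def list_items(
    data: List[int],
    limit: int,
    offset: int = 0,
    cursor: Optional[str] = None,
) -> Union[OffsetPage, CursorPage]:
    """Hypothetical dispatcher: no cursor means offset mode (the default)."""
    ordered = sorted(data)
    if cursor is None:
        return OffsetPage(items=ordered[offset:offset + limit], total_entries=len(ordered))

    # Cursor mode: an empty string requests the first page (assumption for this sketch:
    # any other cursor is simply the last seen value rendered as a string).
    after = int(cursor) if cursor else None
    remaining = [v for v in ordered if after is None or v > after]
    page = remaining[:limit]
    has_more = len(remaining) > limit
    return CursorPage(
        items=page,
        next_cursor=str(page[-1]) if has_more else None,
        previous_cursor=None,  # backward navigation is omitted in this sketch
    )
```

Existing clients that never send a cursor keep receiving the offset shape with `total_entries`, so nothing changes for them.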


Was generative AI tooling used to co-author this PR?
  • Yes (please specify the tool below)

Generated-by: [Cursor] following the guidelines


  • Read the Pull Request Guidelines for more information. Note: commit author/co-author name and email in commits become permanently public when merged.
  • For fundamental code changes, an Airflow Improvement Proposal (AIP) is needed.
  • When adding dependency, check compliance with the ASF 3rd Party License Policy.
  • For significant user-facing changes create newsfragment: {pr_number}.significant.rst, in airflow-core/newsfragments. You can add this file in a follow-up commit after the PR is created so you know the PR number.

- Add cursor-based (keyset) pagination as an alternative to offset-based
  pagination on the get_task_instances endpoint. Offset pagination remains
  the default and is not deprecated globally.
- Response uses a discriminated union: offset responses include
  total_entries, cursor responses include next_cursor and previous_cursor.
- Refactor SortParam to lazily cache column resolution instead of
  mutating state in to_orm.
- Move cursor helpers (encode/decode/apply) to dedicated
  common/db/cursors.py module.
- Cleanly separate cursor vs offset code paths in the endpoint handler.
- Remove order_by from the cursor token (it is now just a list of values).
- Support an empty string cursor for the first page (no fake sentinel needed).
- Drop the order_by consistency check between the cursor and the query param.
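
As a rough illustration of a cursor token that is "just a list of values", the sketch below encodes the sort-key values of the last row as URL-safe base64 JSON. The encoding, the function names `encode_cursor`, `decode_cursor`, and `row_is_after`, and the error handling are all assumptions made for this example; the helpers in common/db/cursors.py may work differently.

```python
import base64
import json
from typing import Any, List, Optional


def encode_cursor(values: List[Any]) -> str:
    """Serialize the sort-key values of the last row into an opaque token."""
    raw = json.dumps(values, separators=(",", ":")).encode("utf-8")
    return base64.urlsafe_b64encode(raw).decode("ascii")


def decode_cursor(token: str) -> Optional[List[Any]]:
    """Return the list of values, or None when the empty string requests the first page."""
    if token == "":
        return None
    try:
        values = json.loads(base64.urlsafe_b64decode(token.encode("ascii")))
    except ValueError as exc:  # covers bad base64, bad JSON, and bad encodings
        raise ValueError("invalid cursor token") from exc
    if not isinstance(values, list):
        raise ValueError("cursor token must decode to a list of values")
    return values


def row_is_after(row_values: List[Any], cursor_values: Optional[List[Any]]) -> bool:
    """Keyset comparison for ascending sorts: compare the sort keys as tuples."""
    if cursor_values is None:
        return True  # first page: every row qualifies
    return tuple(row_values) > tuple(cursor_values)
```

Because the token carries only values, the sort order comes from the request's order_by parameter alone, which matches the note above about dropping the consistency check.
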
pierrejeambrun added this to the Airflow 3.2.1 milestone Apr 7, 2026
pierrejeambrun self-assigned this Apr 7, 2026
pierrejeambrun marked this pull request as draft April 7, 2026 14:54
boring-cyborg bot added the area:API (Airflow's REST/HTTP API) and area:UI (Related to UI/UX. For Frontend Developers.) labels Apr 7, 2026
@pierrejeambrun (Member, Author) commented:

WIP, needs some cleaning and a few iterations.


Labels

area:airflow-ctl area:API Airflow's REST/HTTP API area:UI Related to UI/UX. For Frontend Developers.
