WIP Cursor pagination task instances #64845
Draft
pierrejeambrun wants to merge 4 commits into apache:main from
- Add cursor-based (keyset) pagination as an alternative to offset-based pagination on the get_task_instances endpoint. Offset pagination remains the default and is not deprecated globally.
- Response uses a discriminated union: offset responses include total_entries, cursor responses include next_cursor and previous_cursor.
- Refactor SortParam to lazily cache column resolution instead of mutating state in to_orm.
- Move cursor helpers (encode/decode/apply) to dedicated common/db/cursors.py module.
- Cleanly separate cursor vs offset code paths in the endpoint handler.
- Remove order_by from cursor token (now just a list of values)
- Support empty string cursor for first page (no fake sentinel needed)
- Drop order_by consistency check between cursor and query param
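To make the token shape described above concrete, here is a minimal sketch of an opaque cursor that carries only a list of sort-key values, with the empty string standing for the first page. The function names `encode_cursor` and `decode_cursor`, and the choice of URL-safe base64 over JSON, are assumptions for illustration; they are not taken from the PR's common/db/cursors.py.

```python
import base64
import json
from typing import Any, Optional


def encode_cursor(values: list[Any]) -> str:
    """Serialize the sort-key values of a boundary row into an opaque token.

    Hypothetical helper: the token holds only the values, no order_by.
    """
    raw = json.dumps(values, separators=(",", ":")).encode("utf-8")
    return base64.urlsafe_b64encode(raw).decode("ascii")


def decode_cursor(token: str) -> Optional[list[Any]]:
    """Return the sort-key values, or None when the token is empty.

    An empty string means "first page", so no fake sentinel row is needed.
    """
    if token == "":
        return None
    try:
        values = json.loads(base64.urlsafe_b64decode(token.encode("ascii")))
    except (ValueError, UnicodeError) as exc:
        # binascii.Error and json.JSONDecodeError are both ValueError subclasses.
        raise ValueError("malformed cursor token") from exc
    if not isinstance(values, list):
        raise ValueError("cursor token must decode to a list")
    return values
```

Because the token no longer records order_by, the caller has to interpret the values against the sort order of the current request; a client that changes the sort between pages gets whatever that pairing produces.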
WIP, needs some cleaning and a few iterations.
related: #62027
Performance improvement: add API support for 'cursor'-based pagination on the get_task_instances endpoint.

At scale, just counting the rows takes too long: about 60s, because the joins prevent index use and trigger parallel full scans, and even a bare count on the table takes 2-18s. Switching to cursor-based pagination lets us plug the UI into it and cut this delay. On my local setup with 40M TIs, getting a page now takes <100ms, down from 60s.
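The reason a keyset page can stay fast is that it needs no row count: it seeks past the last row seen and reads `limit + 1` rows. The following is a rough sketch of that idea using the standard library `sqlite3` module. The table `ti`, its columns, the index name, and the functions `build_demo_db` and `fetch_page` are all hypothetical stand-ins, not Airflow's schema or the PR's query.

```python
import sqlite3
from typing import Optional


def build_demo_db() -> sqlite3.Connection:
    """Create a tiny in-memory table with a composite index on the sort key."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE ti (id INTEGER PRIMARY KEY, start_date TEXT NOT NULL)")
    conn.execute("CREATE INDEX idx_ti_start_id ON ti (start_date, id)")
    conn.executemany(
        "INSERT INTO ti (id, start_date) VALUES (?, ?)",
        [(i, f"2024-01-{1 + (i // 2):02d}") for i in range(1, 8)],
    )
    return conn


def fetch_page(conn: sqlite3.Connection, limit: int, after: Optional[tuple] = None):
    """Return (rows, next_after) ordered by (start_date, id).

    `after` is the (start_date, id) pair of the last row already seen,
    or None for the first page. No COUNT(*) is issued.
    """
    if after is None:
        rows = conn.execute(
            "SELECT start_date, id FROM ti ORDER BY start_date, id LIMIT ?",
            (limit + 1,),
        ).fetchall()
    else:
        start_date, row_id = after
        # Expanded form of the row comparison (start_date, id) > (?, ?).
        rows = conn.execute(
            "SELECT start_date, id FROM ti "
            "WHERE start_date > ? OR (start_date = ? AND id > ?) "
            "ORDER BY start_date, id LIMIT ?",
            (start_date, start_date, row_id, limit + 1),
        ).fetchall()
    has_more = len(rows) > limit  # the extra row only signals another page
    page = rows[:limit]
    next_after = page[-1] if has_more and page else None
    return page, next_after
```

The unique `id` is included as a tie-breaker so rows sharing a `start_date` are neither skipped nor repeated across page boundaries.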
This keeps backward compatibility: the list endpoint defaults to offset-based pagination, while passing a cursor switches it to cursor-based pagination.
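As an illustration of that opt-in behaviour, the sketch below dispatches on whether a cursor was supplied and returns one of two response shapes: one with total_entries, the other with next_cursor and previous_cursor. The class names, the `list_items` function, and the use of a sorted in-memory list of unique integers are assumptions made to keep the example self-contained. previous_cursor is left as None here because backward traversal is outside the scope of this sketch.

```python
from bisect import bisect_right
from dataclasses import dataclass
from typing import Optional, Union


@dataclass
class OffsetPage:
    items: list[int]
    total_entries: int  # requires counting every matching row


@dataclass
class CursorPage:
    items: list[int]
    next_cursor: Optional[str]
    previous_cursor: Optional[str]  # always None in this simplified sketch


def list_items(
    data: list[int],
    limit: int,
    offset: int = 0,
    cursor: Optional[str] = None,
) -> Union[OffsetPage, CursorPage]:
    """Default to offset pagination; switch to cursor mode when a cursor is passed.

    `data` must be sorted and free of duplicates. In this toy version the
    cursor is simply the last value seen, as a decimal string; "" means first page.
    """
    if cursor is None:
        return OffsetPage(items=data[offset:offset + limit], total_entries=len(data))

    start = 0 if cursor == "" else bisect_right(data, int(cursor))
    items = data[start:start + limit]
    has_more = start + limit < len(data)
    next_cursor = str(items[-1]) if has_more and items else None
    return CursorPage(items=items, next_cursor=next_cursor, previous_cursor=None)
```

Existing callers that never send a cursor keep getting the offset shape, so the change is additive from their point of view.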
Was generative AI tooling used to co-author this PR?
Generated-by: [Cursor] following the guidelines
{pr_number}.significant.rst, in airflow-core/newsfragments. You can add this file in a follow-up commit after the PR is created so you know the PR number.