MDEV-38970 Streaming Window Functions#4835
Draft
EslamAhmed171 wants to merge 1 commit into MariaDB:main from
Member
We mark work-in-progress PRs as "drafts". Please move to "open" when you're ready for review.
Author
Thanks for the comment!
Overview
This PR does two things at once: it implements Step 1 of MDEV-38970 and serves as my GSoC 2026 proposal. The idea is to show I understand the codebase well enough to implement the minimal subset now, with a clear plan to extend it over the summer.
Problem
Right now MariaDB creates a temp table for every query that has a window function, even when the answer could be computed one row at a time. This means `LIMIT` does nothing useful: all rows get processed before the first result is sent back.
Demo
- `Handler_read_next = 2`: only 3 rows read via index scan
- `Handler_read_rnd_next = 0`: zero temp table reads
- `Created_tmp_tables = 0`: no temp table created
- `EXPLAIN` shows `type: index`, `key: PRIMARY`, `Extra: (empty)`, so no `Using temporary`
How It Works
This implements idea #2 from MDEV-38970:
Instead of buffering everything into a temp table and running `compute_window_func()` over all rows, eligible queries now stream results directly to the client.
The key insight is that functions like `RANK()` and `ROW_NUMBER()` only need to remember a counter and the previous row's key value. There is no reason to buffer anything when the rows already arrive in the right order via an index.
Supported Cases
- `ROW_NUMBER() OVER ()` (no ORDER BY)
- `ROW_NUMBER() OVER (ORDER BY indexed_col)`
- `RANK() OVER (ORDER BY indexed_col)`
- `DENSE_RANK() OVER (ORDER BY indexed_col)`
- `1 + rank() OVER (ORDER BY id)`
Not Yet Supported
- `PARTITION BY`: streamable with a covering index on (partition_col, order_col)
- `ORDER BY` via filesort: saves the compute pass but not the sort itself
- `ORDER BY` alongside streaming window functions
- `SUM/COUNT/AVG OVER (ROWS UNBOUNDED PRECEDING)`
- `SUM OVER (ROWS BETWEEN N PRECEDING AND CURRENT ROW)`
- `FIRST_VALUE()`, `LAST_VALUE()`, `NTH_VALUE()` etc.
Streaming Eligibility Conditions (Step 1)
A query takes the streaming path when all of these hold:
- No query-level `ORDER BY` or `GROUP BY`
- No `PARTITION BY`
- Only `ROW_NUMBER`, `RANK`, or `DENSE_RANK`
- No other functions (`SUM`, `COUNT` etc.) in the same query
- `OVER()` with no ORDER BY, or ORDER BY is covered by an existing index
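
The "counter plus previous row's key" idea described under How It Works can be sketched in a few lines. This is a standalone illustration under simplifying assumptions, not code from the patch: `StreamingRankState`, `OutputRow`, and `stream_with_limit` are hypothetical names, the ORDER BY key is reduced to a single `int`, and a pre-sorted vector stands in for rows arriving from an index scan.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical per-query state (not a MariaDB class): three counters
// plus the ORDER BY key of the previously seen row.
struct StreamingRankState {
    std::uint64_t row_number = 0;
    std::uint64_t rank = 0;
    std::uint64_t dense_rank = 0;
    std::optional<int> prev_key;  // empty until the first row is seen

    // Feed the next row's key. Rows must already arrive sorted by key.
    void advance(int key) {
        ++row_number;
        if (!prev_key || *prev_key != key) {
            rank = row_number;  // a new peer group starts at this position
            ++dense_rank;       // dense rank counts distinct keys, no gaps
            prev_key = key;
        }
        // Same key as the previous row: rank and dense_rank are unchanged.
    }
};

struct OutputRow {
    int key;
    std::uint64_t row_number;
    std::uint64_t rank;
    std::uint64_t dense_rank;
};

// Emit at most `limit` rows from an ordered source without buffering input.
// `rows_read` reports how many input rows were actually consumed.
std::vector<OutputRow> stream_with_limit(const std::vector<int>& ordered_keys,
                                         std::size_t limit,
                                         std::size_t& rows_read) {
    StreamingRankState state;
    std::vector<OutputRow> out;
    rows_read = 0;
    for (int key : ordered_keys) {
        if (out.size() == limit) break;  // LIMIT reached: stop reading early
        ++rows_read;
        state.advance(key);
        out.push_back({key, state.row_number, state.rank, state.dense_rank});
    }
    return out;
}
```

Calling `stream_with_limit` on six ordered keys with a limit of 3 consumes only three input rows, which mirrors why `LIMIT` becomes effective once no temp table sits between the scan and the client.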