MDEV-38970 Streaming Window Functions #4835

Draft
EslamAhmed171 wants to merge 1 commit into MariaDB:main from EslamAhmed171:MDEV-38970-streaming-window-functions

EslamAhmed171 commented Mar 22, 2026

Overview

This PR does two things at once: it implements Step 1 of MDEV-38970 and serves as my GSoC 2026 proposal. The idea is to show I understand the codebase well enough to implement the minimal subset now, with a clear plan to extend it over the summer.

Problem

Right now MariaDB creates a temp table for every query that has a window function, even when the answer could be computed one row at a time. As a result, LIMIT cannot stop the scan early, and all rows are processed before the first result is sent back:

-- Before: reads ALL 10 rows into temp table, LIMIT applied too late
SELECT rank() OVER (ORDER BY id) FROM t1 LIMIT 3;
Handler_read_rnd_next = 22  (10 writes + 10 reads + 2 EOF)

-- After: reads exactly 3 rows, LIMIT stops the storage engine scan
SELECT rank() OVER (ORDER BY id) FROM t1 LIMIT 3;
Handler_read_rnd_next = 0   (index scan, 3 rows only)

Demo

DROP DATABASE IF EXISTS mydatabase;
CREATE DATABASE mydatabase;
USE mydatabase;

CREATE TABLE t1 (id INT PRIMARY KEY AUTO_INCREMENT, val INT);
INSERT INTO t1 (val) VALUES (10),(20),(30),(40),(50),(60),(70),(80),(90),(100);

FLUSH STATUS;
SELECT rank() OVER (ORDER BY id) AS rnk, id FROM t1 LIMIT 3;
SHOW STATUS LIKE 'Handler_read%';
SHOW STATUS LIKE 'Created_tmp_tables';
EXPLAIN SELECT rank() OVER (ORDER BY id) AS rnk FROM t1 LIMIT 3;
  • Handler_read_next = 2: only 3 rows are read via the index scan (the first row is presumably counted under Handler_read_first rather than Handler_read_next)
  • Handler_read_rnd_next = 0: zero temp table reads
  • Created_tmp_tables = 0: no temp table created
  • EXPLAIN shows type: index, key: PRIMARY, Extra: (empty), with no Using temporary

How It Works

This implements idea #2 from MDEV-38970:

Item_window_func::val_int() can invoke window_func()->val_int() directly without a temporary table.

Instead of buffering everything into a temp table and running compute_window_func() over all rows, eligible queries now stream results directly to the client:

Old: scan → write to T_tmp → compute_window_func() → read T_tmp → send
New: scan → advance state O(1) → send directly → LIMIT stops scan

The key insight is that functions like RANK() and ROW_NUMBER() only need to remember a counter and the previous row's key value — there is no reason to buffer anything when the rows already arrive in the right order via an index.
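
To illustrate the counter-plus-previous-key idea, here is a minimal standalone sketch in C++. It is not the code in this PR: `StreamingRankState`, `RankedRow`, and `stream_ranked` are hypothetical names, the ORDER BY key is simplified to a single `int`, and the input vector stands in for rows arriving in index order.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical per-query state; not a MariaDB class.
struct StreamingRankState {
    std::uint64_t row_number = 0;
    std::uint64_t rank = 0;
    std::uint64_t dense_rank = 0;
    std::optional<int> prev_key;  // ORDER BY value of the previous row

    // Advance by one row in O(1). Rows must arrive sorted by key.
    void advance(int key) {
        ++row_number;
        if (!prev_key || *prev_key != key) {
            rank = row_number;  // new peer group: rank jumps to the row position
            ++dense_rank;       // dense rank grows by exactly one
        }
        prev_key = key;
    }
};

struct RankedRow {
    int key;
    std::uint64_t row_number, rank, dense_rank;
};

// Consume rows in sorted order and stop as soon as `limit` rows were produced.
// `rows_read` reports how many input rows were actually touched.
std::vector<RankedRow> stream_ranked(const std::vector<int>& sorted_keys,
                                     std::size_t limit,
                                     std::size_t& rows_read) {
    std::vector<RankedRow> out;
    StreamingRankState state;
    rows_read = 0;
    for (int key : sorted_keys) {
        if (out.size() == limit) break;  // LIMIT reached: stop the scan
        ++rows_read;
        state.advance(key);
        out.push_back({key, state.row_number, state.rank, state.dense_rank});
    }
    return out;
}
```

With seven input keys and a limit of 4, the loop touches only four rows, which mirrors the "LIMIT stops scan" step in the new pipeline.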

Supported Cases

  • ROW_NUMBER() OVER () — no ORDER BY
  • ROW_NUMBER() OVER (ORDER BY indexed_col)
  • RANK() OVER (ORDER BY indexed_col)
  • DENSE_RANK() OVER (ORDER BY indexed_col)
  • Multiple window functions with the same ORDER BY
  • Multiple window functions where one ORDER BY is a prefix of another
  • Expressions containing streaming window functions, e.g. 1 + rank() OVER (ORDER BY id)
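
The prefix case can be sketched as follows, assuming rows arrive sorted by the widest ORDER BY list. Each function then only compares its own leading columns against the previous row. `Key` and `PrefixRank` are hypothetical names used for illustration, not identifiers from the server code, and keys are simplified to vectors of `int`.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Column values of one row, in the order of the widest ORDER BY list.
using Key = std::vector<int>;

// Hypothetical RANK() state that compares only the first `prefix_len` columns.
class PrefixRank {
public:
    explicit PrefixRank(std::size_t prefix_len) : prefix_len_(prefix_len) {}

    // Assumes key.size() >= prefix_len_ and rows sorted by the widest key.
    std::uint64_t advance(const Key& key) {
        ++rows_seen_;
        const auto end = key.begin() + static_cast<std::ptrdiff_t>(prefix_len_);
        const bool same_peer_group =
            has_prev_ && std::equal(key.begin(), end, prev_.begin());
        if (!same_peer_group) rank_ = rows_seen_;  // new peer group for this prefix
        prev_.assign(key.begin(), end);            // remember only the prefix
        has_prev_ = true;
        return rank_;
    }

private:
    std::size_t prefix_len_;
    std::uint64_t rows_seen_ = 0;
    std::uint64_t rank_ = 0;
    bool has_prev_ = false;
    Key prev_;
};
```

For rows (1,1), (1,1), (1,2), (2,1), a `PrefixRank(1)` standing in for RANK() OVER (ORDER BY a) yields 1, 1, 1, 4, while a `PrefixRank(2)` standing in for RANK() OVER (ORDER BY a, b) yields 1, 1, 3, 4, both from the same single pass.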

Not Yet Supported

  • PARTITION BY: would be streamable given an index on (partition_col, order_col)
  • ORDER BY that needs a filesort: streaming would save the compute pass but not the sort itself
  • Global ORDER BY alongside streaming window functions
  • Running aggregates SUM/COUNT/AVG OVER (ROWS UNBOUNDED PRECEDING)
  • Sliding window SUM OVER (ROWS BETWEEN N PRECEDING AND CURRENT ROW)
  • Other streamable window functions FIRST_VALUE(), LAST_VALUE(), NTH_VALUE() etc.
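
As an illustration of why the sliding-window case is a candidate for later steps, here is a sketch of a bounded-state SUM over ROWS BETWEEN N PRECEDING AND CURRENT ROW. This is not part of this PR: `SlidingSum` is a hypothetical name, values are plain 64-bit integers, and NULL handling and overflow are ignored.

```cpp
#include <cstddef>
#include <cstdint>
#include <deque>

// Hypothetical state for SUM(x) OVER (ROWS BETWEEN n PRECEDING AND CURRENT ROW).
// Memory is bounded by n + 1 values, independent of the table size.
class SlidingSum {
public:
    explicit SlidingSum(std::size_t n_preceding) : capacity_(n_preceding + 1) {}

    // Add the current row's value, evict the row that left the frame,
    // and return the frame sum for the current row.
    std::int64_t advance(std::int64_t value) {
        window_.push_back(value);
        sum_ += value;
        if (window_.size() > capacity_) {
            sum_ -= window_.front();
            window_.pop_front();
        }
        return sum_;
    }

private:
    std::size_t capacity_;
    std::deque<std::int64_t> window_;
    std::int64_t sum_ = 0;
};
```

The ROWS UNBOUNDED PRECEDING running aggregate is the simpler variant of the same idea: nothing is ever evicted, so a single accumulator is enough.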

Streaming Eligibility Conditions (Step 1)

A query takes the streaming path when all of these hold:

  1. No global ORDER BY or GROUP BY
  2. No PARTITION BY
  3. Function is one of ROW_NUMBER, RANK, or DENSE_RANK
  4. No aggregate functions (SUM, COUNT etc.) in the same query
  5. No filesort needed — either OVER() with no ORDER BY, or ORDER BY is covered by an existing index
  6. When multiple window functions are present: all ORDER BY specs must be prefixes of the widest one, all covered by a single index
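
The six conditions can be read as a single predicate. The sketch below models them on a simplified query description. All types and names (`QueryShape`, `WindowCall`, `can_stream`) are hypothetical and do not correspond to server structures. "Covered by an index" is approximated as "the widest ORDER BY list is a leading prefix of some index", which ignores sort direction and whether the optimizer actually picks that index.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

enum class WinFunc { RowNumber, Rank, DenseRank, Other };

using Columns = std::vector<std::string>;

struct WindowCall {
    WinFunc func = WinFunc::Other;
    bool has_partition_by = false;
    Columns order_by;  // ORDER BY columns of the OVER clause
};

struct QueryShape {
    bool has_global_order_by = false;
    bool has_group_by = false;
    bool has_aggregates = false;
    std::vector<WindowCall> windows;
    std::vector<Columns> indexes;  // each index as an ordered column list
};

static bool is_prefix(const Columns& p, const Columns& full) {
    return p.size() <= full.size() && std::equal(p.begin(), p.end(), full.begin());
}

bool can_stream(const QueryShape& q) {
    if (q.has_global_order_by || q.has_group_by) return false;  // condition 1
    if (q.has_aggregates) return false;                         // condition 4
    const Columns* widest = nullptr;
    for (const WindowCall& w : q.windows) {
        if (w.has_partition_by) return false;                   // condition 2
        if (w.func == WinFunc::Other) return false;             // condition 3
        if (!widest || w.order_by.size() > widest->size()) widest = &w.order_by;
    }
    if (!widest) return false;  // no window functions at all
    for (const WindowCall& w : q.windows)
        if (!is_prefix(w.order_by, *widest)) return false;      // condition 6
    if (widest->empty()) return true;  // condition 5: OVER () needs no ordering
    return std::any_of(q.indexes.begin(), q.indexes.end(),      // condition 5
                       [&](const Columns& idx) { return is_prefix(*widest, idx); });
}
```

Under this model, RANK() OVER (ORDER BY a) next to ROW_NUMBER() OVER (ORDER BY a, b) is eligible when an index on (a, b) exists, and becomes ineligible as soon as a GROUP BY or PARTITION BY is added.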

CLAassistant commented Mar 22, 2026

All committers have signed the CLA.

EslamAhmed171 force-pushed the MDEV-38970-streaming-window-functions branch from 1811300 to 2d9fe39 on March 22, 2026 21:44
gkodinov added the External Contribution label (All PRs from entities outside of MariaDB Foundation, Corporation, Codership agreements.) on Mar 23, 2026
gkodinov changed the title from "[WIP] MDEV-38970-streaming-window-functions" to "MDEV-38970-streaming-window-functions" on Mar 23, 2026
gkodinov marked this pull request as draft on March 23, 2026 12:15
@gkodinov (Member)

We mark work in progress PRs as "drafts". Please move to "open" when you're ready for review.

EslamAhmed171 (Author) commented Mar 23, 2026

> We mark work in progress PRs as "drafts". Please move to "open" when you're ready for review.

Thanks for the comment!
I just wanted to see how CI behaves on my code changes (I thought CI only runs on open PRs).
It already caught a non-deterministic test case, which was my own mistake rather than Copilot's.

EslamAhmed171 force-pushed the MDEV-38970-streaming-window-functions branch from 2d9fe39 to 0e6f9fb on March 23, 2026 15:29
EslamAhmed171 changed the title from "MDEV-38970-streaming-window-functions" to "MDEV-38970 Streaming Window Functions" on Mar 23, 2026
vuvova requested a review from spetrunia on March 23, 2026 17:15