DuckDB as a storage engine #4830

Open
drrtuy wants to merge 3 commits into MariaDB:12.3 from drrtuy:duckdb-engine-2

drrtuy commented Mar 20, 2026

DuckDB Storage Engine for MariaDB

Overview

This PR introduces DuckDB 1.3.2 as an embedded analytical (OLAP) storage engine plugin for MariaDB 12.3. The engine is ported from AliSQL's DuckDB integration and fully adapted to MariaDB's handler API, plugin system, build infrastructure, and packaging conventions. It ships as a loadable module (ha_duckdb.so) for x86_64 and ARM64 architectures.

Architecture

The engine lives in a git submodule at storage/duckdb/duckdb (repo: drrtuy/duckdb-engine), which in turn contains a nested submodule (third_parties/duckdb) pointing to a DuckDB 1.3.2 fork with MariaDB-specific patches (e.g. octet_length VARCHAR overload). DuckDB is built statically with all built-in extensions (ICU, JSON, Parquet, jemalloc, etc.) and linked into a single ha_duckdb.so plugin. In debug builds, a debug-STL ABI mismatch guard (-U_GLIBCXX_DEBUG -U_GLIBCXX_ASSERTIONS) is applied to prevent a SIGSEGV caused by sizeof(std::vector) diverging between the plugin and the server.
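
To illustrate why the guard matters: with libstdc++, defining _GLIBCXX_DEBUG swaps the standard containers for debug variants whose layout can differ, so two binaries that pass a std::vector across their boundary need to agree on that macro. The sketch below is not taken from the PR; debug_stl_active and vector_layout_size are hypothetical helpers that only report what the current translation unit was compiled with.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of the PR): reports whether the libstdc++
// debug macros named in the PR description are defined for this
// translation unit.
constexpr bool debug_stl_active() {
#if defined(_GLIBCXX_DEBUG) || defined(_GLIBCXX_ASSERTIONS)
  return true;
#else
  return false;
#endif
}

// Hypothetical helper: the size of std::vector<int> as seen by this
// translation unit. If the plugin and the server report different values,
// passing vectors between them is unsafe.
constexpr std::size_t vector_layout_size() {
  return sizeof(std::vector<int>);
}
```

Logging both values at plugin load time, on each side of the boundary, would be one way to spot a mismatch early; the PR itself avoids the problem by undefining the macros at build time.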

Supported Operations

  • DDL: CREATE TABLE, DROP TABLE, ALTER TABLE, RENAME TABLE — SQL is translated from MariaDB's internal structures to DuckDB-dialect DDL via ddl_convertor.cc, including expression defaults, column types, and engine conversions (ALTER TABLE ... ENGINE=DuckDB).
  • DML — INSERT: Batch ingestion through DuckDB's Appender API with delta temp tables (delta_appender.cc), supporting all major MariaDB column types.
  • DML — SELECT: Full query pushdown to DuckDB via MariaDB's select_handler interface (ha_duckdb_pushdown.cc), with result-set conversion back to MariaDB row format (duckdb_select.cc).
  • DML — UPDATE/DELETE: Direct UPDATE and DELETE translated to DuckDB SQL (dml_convertor.cc).
  • Configuration: DuckDB-specific server variables — memory limit, thread count, operating mode, etc. (duckdb_config.cc).
  • Timezone & Charset: Mapping between MariaDB and DuckDB timezone names and charset/collation conventions.
  • UDFs: Registration of DuckDB-side user-defined functions accessible from MariaDB.
  • Per-thread context: DuckdbThdContext managed via thd_get_ha_data/thd_set_ha_data (replacing AliSQL's THD::get_duckdb_context()).
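
The DDL translation step can be pictured with a minimal sketch. It is not the code in ddl_convertor.cc: ColumnDef, to_duckdb_type, quote_ident and build_create_table are hypothetical names, and the type table is an assumed subset chosen for illustration. The real engine reads column metadata from MariaDB's internal structures and handles defaults, expressions and many more types.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical column description; the real engine derives this from
// MariaDB's internal table structures rather than from strings.
struct ColumnDef {
  std::string name;
  std::string mariadb_type;  // base type name, upper case, no length
  bool nullable;
};

// Assumed type mapping, for illustration only. Returns an empty string
// for types this sketch does not know about.
inline std::string to_duckdb_type(const std::string& mariadb_type) {
  static const std::unordered_map<std::string, std::string> kTypes = {
      {"TINYINT", "TINYINT"}, {"INT", "INTEGER"},  {"BIGINT", "BIGINT"},
      {"DOUBLE", "DOUBLE"},   {"VARCHAR", "VARCHAR"}, {"TEXT", "VARCHAR"},
      {"DATETIME", "TIMESTAMP"}, {"BLOB", "BLOB"}};
  auto it = kTypes.find(mariadb_type);
  return it == kTypes.end() ? std::string() : it->second;
}

// Quote an identifier with double quotes, doubling any embedded quote.
inline std::string quote_ident(const std::string& ident) {
  std::string out = "\"";
  for (char c : ident) {
    if (c == '"') out += '"';
    out += c;
  }
  out += '"';
  return out;
}

// Build a DuckDB-dialect CREATE TABLE statement. Returns an empty string
// if any column has a type the sketch cannot map.
inline std::string build_create_table(const std::string& schema,
                                      const std::string& table,
                                      const std::vector<ColumnDef>& cols) {
  std::string sql =
      "CREATE TABLE " + quote_ident(schema) + "." + quote_ident(table) + " (";
  for (std::size_t i = 0; i < cols.size(); ++i) {
    const std::string type = to_duckdb_type(cols[i].mariadb_type);
    if (type.empty()) return std::string();
    if (i > 0) sql += ", ";
    sql += quote_ident(cols[i].name) + " " + type;
    if (!cols[i].nullable) sql += " NOT NULL";
  }
  sql += ")";
  return sql;
}
```

For a table t1 in schema test with a non-nullable INT column id and a nullable TEXT column note, this sketch yields CREATE TABLE "test"."t1" ("id" INTEGER NOT NULL, "note" VARCHAR).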

Packaging

  • New Debian package mariadb-plugin-duckdb for amd64 and arm64.
  • postinst / prerm hooks automatically run install.sql / uninstall.sql to register or unregister the plugin when the server is running.
  • PLUGIN_DUCKDB=NO by default in native Debian builds (debian/rules); enabled automatically by autobake-deb.sh for MariaDB.org release builds when the submodule is present.
  • RPM packaging scaffolding included in the submodule.

Testing

An MTR test suite is included under mysql-test/plugin/duckdb/ covering DDL, DML, type conversions, ALTER operations, and basic query pushdown scenarios. A disable.def file documents tests that are not yet passing or are intentionally skipped.

Key Differences from AliSQL Port

  • Removed MySQL 8 Data Dictionary (dd::Table, dd::Schema) dependency — table metadata comes from TABLE* and .frm / TABLE_SHARE.
  • Replaced mysql_declare_plugin with maria_declare_plugin.
  • Adapted all handler method signatures (no dd::Table* parameters).
  • Removed partition support, replication batch mode (GTID batch, multi-trx), and Relay_log_info extensions (not applicable to MariaDB).
  • Uses DB_TYPE_AUTOASSIGN instead of the AliSQL-specific DB_TYPE_DUCKDB enum.
