Skip to content

Early instant DDL attempt#1648

Draft
hasan-dot wants to merge 3 commits intomasterfrom
hasan-dot/early-instant-ddl
Draft

Early instant DDL attempt#1648
hasan-dot wants to merge 3 commits intomasterfrom
hasan-dot/early-instant-ddl

Conversation

@hasan-dot
Copy link
Member

@hasan-dot hasan-dot commented Mar 17, 2026

Related issue: https://github.com/github/issues-platform/issues/1128

Description

This PR moves the --attempt-instant-ddl check to run before ghost table and binlog streaming setup.

  • contributed code is using same conventions as original code
  • script/cibuild returns with no formatting errors, build errors or unit test errors.

Attempt instant DDL early, before ghost table setup

Problem

When --attempt-instant-ddl is enabled, gh-ost currently performs all the heavy setup work before even trying instant DDL:

  1. Start binlog streaming
  2. Create changelog table
  3. Create ghost table + ALTER it
  4. Then attempt ALGORITHM=INSTANT on the original table
  5. If it succeeds → clean up the ghost/changelog tables that were just created

For large tables this wastes significant time and resources on setup that gets thrown away when instant DDL succeeds. On extremely large tables (billions of rows), the unnecessary ghost table creation and binlog streaming initialization adds meaningful overhead.

Solution

Move the instant DDL attempt to right after initiateInspector() (which discovers the master connection config) but before initiateStreaming() and initiateApplier(). If instant DDL succeeds, the migration completes immediately without ever creating ghost tables, changelog tables, or starting binlog streaming.

Benchmark results

Benchmarked on M1 Max with MySQL 8.0.42 in Docker (3 iterations each):

Scenario Instant DDL (early) Full Row Copy Speedup
Empty table 25ms 1,123ms 45x
10k rows 27ms 2,659ms 99x
100k rows 38ms 3,622ms 95x

Instant DDL is ~25-38ms regardless of table size — it's a metadata-only operation. The speedup grows with data size, and on production tables with millions of rows this would be the difference between milliseconds and hours.

Benchmark gist: https://gist.github.com/hasan-dot/23697ec67401348f7c4d64f60212374b

CLI flag

Flag Description
--attempt-instant-ddl (existing, behavior changed) Try instant DDL early before ghost table setup, fall back to row copy

Migration flow comparison

Before (current):

graph LR
    A[Inspector] --> B[Binlog Streaming]
    B --> C[Ghost Table + Changelog]
    C --> D{Attempt instant DDL}
    D -->|success| E[Done]
    D -->|fail| F[Row Copy]
Loading

After (this PR):

graph LR
    A[Inspector] --> B{Attempt instant DDL}
    B -->|success| C[Done]
    B -->|fail| D[Binlog Streaming]
    D --> E[Ghost Table + Changelog]
    E --> F[Row Copy]
Loading

Instant DDL operations (MySQL 8.0+)

These operations complete in milliseconds regardless of table size (source):

  • Adding a column (at end: 8.0.12+; any position: 8.0.29+)
  • Dropping a column
  • Renaming a column
  • Adding/dropping virtual generated columns
  • Setting or dropping a column default value
  • Modifying the definition of an ENUM or SET column
  • Changing the index type
  • Renaming a table

Testing

  • go build ./... passes
  • go vet ./... passes
  • New integration tests pass (testcontainers + MySQL 8.0)
  • Benchmarks confirm expected speedup

Move the --attempt-instant-ddl check to run before ghost table and binlog
streaming setup. If instant DDL succeeds, the migration completes immediately
without creating ghost tables, changelog tables, or starting binlog streaming.

Add --force-instant-ddl flag that aborts the migration if ALGORITHM=INSTANT
is not supported, preventing accidental multi-hour row-copy migrations when
the intent was an instant metadata change.
Remove the now-unused AttemptInstantDDL() method from Applier since instant
DDL is handled by attemptInstantDDLEarly() in the Migrator.

Update command-line-flags.md to document the new early execution behavior of
--attempt-instant-ddl and add documentation for --force-instant-ddl.

Add localtests/force-instant-ddl with an instant-compatible ALTER to exercise
the --force-instant-ddl success path.
@hasan-dot hasan-dot force-pushed the hasan-dot/early-instant-ddl branch from e9f414b to 4531aaf Compare March 17, 2026 20:59
@hasan-dot hasan-dot changed the title Early instant DDL attempt + --force-instant-ddl flag Early instant DDL attempt Mar 17, 2026
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR optimizes the --attempt-instant-ddl flow by attempting ALGORITHM=INSTANT immediately after inspector initialization (master discovery) and before starting binlog streaming or creating ghost/changelog tables, reducing wasted setup work when instant DDL is possible.

Changes:

  • Move the instant DDL attempt earlier in Migrator.Migrate() and add a new attemptInstantDDLEarly() helper.
  • Remove Applier.AttemptInstantDDL() (instant attempt no longer depends on applier initialization).
  • Add an integration test to assert instant DDL completes without creating ghost/changelog tables, and update flag documentation to reflect the new behavior.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.

File Description
go/logic/migrator.go Attempts instant DDL before streaming/applier setup; adds attemptInstantDDLEarly() and removes the old late attempt block.
go/logic/applier.go Removes the previous AttemptInstantDDL() method from the applier.
go/logic/migrator_test.go Adds an integration test verifying early instant DDL skips ghost/changelog creation.
doc/command-line-flags.md Documents the updated “attempt instant DDL early” behavior and motivation.
Comments suppressed due to low confidence (1)

go/logic/migrator_test.go:430

  • This SHOW TABLES ... LIKE pattern uses '_' which is a single-character wildcard in SQL LIKE, so it doesn’t strictly mean “table name starts with underscore”. To make the assertion precise and avoid accidental matches, escape underscores in the pattern (or add an explicit ESCAPE clause).
	//nolint:execinquery
	err = suite.db.QueryRow("SHOW TABLES IN test LIKE '_testing_ghc'").Scan(&tableName)
	suite.Require().Error(err, "changelog table should not exist after instant DDL")
	suite.Require().Equal(gosql.ErrNoRows, err)

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +1038 to +1046
// Open a temporary connection to the master for the instant DDL attempt.
// This avoids initializing the full Applier (ghost table, changelog, etc.).
connConfig := this.migrationContext.ApplierConnectionConfig
uri := connConfig.GetDBUri(this.migrationContext.DatabaseName)
db, _, err := mysql.GetDB(this.migrationContext.Uuid, uri)
if err != nil {
this.migrationContext.Log.Infof("Could not open connection for instant DDL attempt: %s", err)
return err
}
if err := this.attemptInstantDDLEarly(); err == nil {
return nil
} else {
this.migrationContext.Log.Infof("ALGORITHM=INSTANT not supported for this operation, proceeding with original algorithm")
Comment on lines +422 to +428
err = suite.db.QueryRow("SHOW TABLES IN test LIKE '_testing_gho'").Scan(&tableName)
suite.Require().Error(err, "ghost table should not exist after instant DDL")
suite.Require().Equal(gosql.ErrNoRows, err)

// Verify that NO changelog table was created
//nolint:execinquery
err = suite.db.QueryRow("SHOW TABLES IN test LIKE '_testing_ghc'").Scan(&tableName)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants