Add retry logic for instant DDL on lock wait timeout #1651
Merged
meiji163 merged 2 commits into github:master, Mar 18, 2026
When attempting instant DDL, a lock wait timeout (errno 1205) may occur if a long-running transaction holds a metadata lock. Rather than failing immediately, retry the operation up to 5 times with linear backoff. Non-timeout errors (e.g. ALGORITHM=INSTANT not supported) still return immediately without retrying.
meiji163
approved these changes
Mar 18, 2026
Contributor
Thanks for the PR!
A Pull Request should be associated with an Issue.
Related issue: #1650
Description
This PR adds retry logic for instant DDL when a lock wait timeout (MySQL errno 1205) occurs. When a long-running transaction holds a metadata lock, the instant DDL attempt may fail with a lock wait timeout. Instead of failing immediately, the operation is now attempted up to 5 times with linear backoff (waits of 5s, 10s, 15s, and 20s between attempts). Non-timeout errors, such as "ALGORITHM=INSTANT is not supported", still return immediately without retrying.
A new retryOnLockWaitTimeout function is introduced separately from the existing retryOperation in the migrator, because it requires error-discriminating behavior (only retry on errno 1205) and should not trigger PanicAbort on failure.
Unit tests cover all branches: success on first attempt, retry then succeed, non-retryable MySQL error, non-MySQL error, and retry exhaustion.
script/cibuild returns with no formatting errors, build errors, or unit test errors.