feat: daemonset maxsurge to prevent unavailability on config changes #819

Open
dervoeti wants to merge 5 commits into main from feat/daemonset-maxsurge

dervoeti (Member) commented Mar 23, 2026

Description

Solves the problem reported in #817.

Set maxSurge=1 and maxUnavailable=0 on the OPA DaemonSet so that during rolling updates, the new OPA pod is created and becomes ready before the old pod is terminated. This eliminates the availability gap that causes errors in products like Trino when OPA config changes trigger a DaemonSet restart.

DaemonSet maxSurge graduated to GA in Kubernetes 1.25, and our current minimum supported Kubernetes version is 1.31, so the field is available on every supported cluster.
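
For illustration, the update strategy described above would appear roughly as follows in the rendered DaemonSet manifest. This is a minimal sketch, not the operator's actual output: the `metadata.name` value is a hypothetical placeholder, and the `selector` and pod `template` are omitted, so the fragment is not applicable as-is.

```yaml
# Sketch only: shows the rolling update settings, not a complete DaemonSet.
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: opa-example  # hypothetical name, not taken from this PR
spec:
  updateStrategy:
    type: RollingUpdate
    rollingUpdate:
      # Allow one extra pod per node during the update, so the
      # replacement can start while the old pod is still serving.
      maxSurge: 1
      # Do not remove the old pod until its replacement is ready.
      maxUnavailable: 0
  # selector and template omitted for brevity
```

On a running cluster, the applied values can be checked with `kubectl get daemonset <name> -o jsonpath='{.spec.updateStrategy}'`, where `<name>` is the name of the OPA DaemonSet.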

Definition of Done Checklist

  • Not all of these items are applicable to all PRs; the author should update this template to leave only the relevant boxes
  • Please make sure all these things are done and tick the boxes

Author

  • Changes are OpenShift compatible
  • CRD changes approved
  • CRD documentation for all fields, following the style guide.
  • Helm chart can be installed and deployed operator works
  • Integration tests passed (for non trivial changes)
  • Changes need to be "offline" compatible
  • Links to generated (nightly) docs added
  • Release note snippet added

Reviewer

  • Code contains useful comments
  • Code contains useful logging statements
  • (Integration-)Test cases added
  • Documentation added or updated. Follows the style guide.
  • Changelog updated
  • Cargo.toml only contains references to git tags (not specific commits or branches)

Acceptance

  • Feature Tracker has been updated
  • Proper release label has been added
  • Links to generated (nightly) docs added
  • Release note snippet added
  • Add type/deprecation label & add to the deprecation schedule
  • Add type/experimental label & add to the experimental features tracker

dervoeti (Member, Author) commented

Release note

The OPA DaemonSet now uses maxSurge=1 and maxUnavailable=0 for its rolling update strategy. During rolling updates (e.g. triggered by OPA config changes), a new OPA pod is created and must become ready before the old pod is terminated. This eliminates the brief availability gap that previously caused errors in products like Trino when the local OPA pod was unavailable during restarts.

dervoeti self-assigned this Mar 23, 2026
dervoeti moved this to "Development: Waiting for Review" in Stackable Engineering Mar 23, 2026
dervoeti changed the title from "Feat/daemonset maxsurge" to "feat: daemonset maxsurge to prevent unavailability on config changes" Mar 23, 2026

Labels

None yet

Projects

Status: Development: Waiting for Review
