
feat: Add generic database connection mechanism #1163

Open
sbernauer wants to merge 24 commits into main from spike/generic-databases


Member

@sbernauer sbernauer commented Mar 2, 2026

Description

Part of stackabletech/issues#238

Definition of Done Checklist

  • Not all of these items are applicable to all PRs; the author should update this template to leave in only the boxes that are relevant
  • Please make sure all these things are done and tick the boxes

Author

  • Changes are OpenShift compatible
  • CRD changes approved
  • CRD documentation for all fields, following the style guide.
  • Integration tests passed (for non trivial changes)
  • Changes need to be "offline" compatible

Reviewer

  • Code contains useful comments
  • Code contains useful logging statements
  • (Integration-)Test cases added
  • Documentation added or updated. Follows the style guide.
  • Changelog updated
  • Cargo.toml only contains references to git tags (not specific commits or branches)

Acceptance

  • Feature Tracker has been updated
  • Proper release label has been added

@sbernauer sbernauer marked this pull request as ready for review March 4, 2026 14:23
#[derive(Clone, Debug, Deserialize, JsonSchema, PartialEq, Serialize)]
#[serde(rename_all = "camelCase")]
pub struct GenericCeleryDatabaseConnection {
    /// The name of the Secret that contains an `uri` key with the complete SQLAlchemy URI.
Member Author


Copied over from https://github.com/stackabletech/airflow-operator/pull/743/changes#r2786318061:

I feel URI is very generic. I think connectionString is a better description?

@sbernauer sbernauer requested review from Techassi and removed request for Techassi March 19, 2026 11:45
@sbernauer sbernauer requested review from a team and maltesander March 19, 2026 11:45
@sbernauer sbernauer self-assigned this Mar 19, 2026
@sbernauer sbernauer moved this to Development: Waiting for Review in Stackable Engineering Mar 19, 2026
@Techassi Techassi self-requested a review March 19, 2026 11:45
@maltesander maltesander moved this from Development: Waiting for Review to Development: In Review in Stackable Engineering Mar 23, 2026

@Techassi Techassi left a comment


Only a partial review, but I left a bunch of comments with my concerns and ideas/suggestions.

{
    self.env
        .get_or_insert_with(Vec::new)
        .extend(env_vars.into_iter().map(|e| e.borrow().clone()));

note: This makes zero sense. We make it look like we take an iterator of items which can be borrowed as EnvVar, but they are actually not only borrowed, but also cloned under the hood.

I would go as far as to call this an anti-pattern.
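
One possible alternative, as a minimal sketch: take ownership of the items so that any clone happens visibly at the call site. The `EnvVar` and `ContainerBuilder` types below are simplified stand-ins (assumptions, not the real `k8s_openapi` or framework types), and the `env()` accessor is hypothetical.

```rust
// Simplified stand-in for k8s_openapi's EnvVar (assumption: only two fields modelled).
#[derive(Clone, Debug, PartialEq)]
pub struct EnvVar {
    pub name: String,
    pub value: Option<String>,
}

// Hypothetical builder that mirrors the optional `env` list seen in the diff.
#[derive(Debug, Default)]
pub struct ContainerBuilder {
    env: Option<Vec<EnvVar>>,
}

impl ContainerBuilder {
    /// Takes owned env vars, so no hidden clone happens inside this function.
    /// Callers that only hold references decide themselves whether to clone.
    pub fn add_env_vars(&mut self, env_vars: impl IntoIterator<Item = EnvVar>) -> &mut Self {
        self.env.get_or_insert_with(Vec::new).extend(env_vars);
        self
    }

    /// Hypothetical accessor, added only so the sketch can be inspected.
    pub fn env(&self) -> &[EnvVar] {
        self.env.as_deref().unwrap_or(&[])
    }
}
```

With this shape, a caller holding `&EnvVar` values would write something like `cb.add_env_vars(vars.iter().cloned())`, which makes the cost explicit in the signature and at the call site.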

@@ -0,0 +1,20 @@
use k8s_openapi::api::core::v1::{EnvVar, EnvVarSource, SecretKeySelector};

pub fn env_var_from_secret(

note: I think this is not an appropriate name. We are not constructing an env var from a Kubernetes secret (which would need to be looked up), but we instead prepare the env var in such a way that it will source its value from a Secret (which the Kubernetes apiserver or the kubelet is responsible for).
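
To illustrate the distinction, here is a rough sketch of a helper that only builds the reference and never reads a Secret. The function name `env_var_referencing_secret_key` is hypothetical, and the three structs are simplified stand-ins for the `k8s_openapi` types of the same names (assumption: only the fields needed here are modelled).

```rust
// Simplified stand-ins for the k8s_openapi types (assumption, not the real definitions).
#[derive(Clone, Debug, Default, PartialEq)]
pub struct SecretKeySelector {
    pub name: String,
    pub key: String,
    pub optional: Option<bool>,
}

#[derive(Clone, Debug, Default, PartialEq)]
pub struct EnvVarSource {
    pub secret_key_ref: Option<SecretKeySelector>,
}

#[derive(Clone, Debug, Default, PartialEq)]
pub struct EnvVar {
    pub name: String,
    pub value: Option<String>,
    pub value_from: Option<EnvVarSource>,
}

/// Hypothetical name: builds an env var that *references* a key in a Secret.
/// Nothing is looked up here; Kubernetes resolves the value later.
pub fn env_var_referencing_secret_key(
    env_var_name: impl Into<String>,
    secret_name: impl Into<String>,
    secret_key: impl Into<String>,
) -> EnvVar {
    EnvVar {
        name: env_var_name.into(),
        // No literal value is set, the value comes from the Secret reference.
        value: None,
        value_from: Some(EnvVarSource {
            secret_key_ref: Some(SecretKeySelector {
                name: secret_name.into(),
                key: secret_key.into(),
                optional: None,
            }),
        }),
    }
}
```

The returned struct contains only names and keys, which is the point of the naming concern: the function prepares a reference rather than deriving anything from the Secret's contents.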

/// against the MySQL server.
pub credentials_secret: String,

/// Additional map of JDBC connection parameters to append to the connection URL. The given

note: I would avoid mentioning JDBC explicitly, because these parameters are not unique to JDBC. Most database connection URI schemes support these extra parameters in some shape or form.


/// Name of a Secret containing the `username` and `password` keys used to authenticate
/// against the MySQL server.
pub credentials_secret: String,

note: I would like to follow the Kubernetes convention here and call this credentials_secret_name/credentialsSecretName instead.

pub location: Option<String>,
}

impl JDBCDatabaseConnection for DerbyConnection {

note: This should be named JdbcDatabaseConnection instead.

Comment on lines +58 to +61
pub fn add_to_container(&self, cb: &mut ContainerBuilder) {
    let env_vars = self.username_env.iter().chain(self.password_env.iter());
    cb.add_env_vars(env_vars);
}

note: This function name doesn't clearly convey what is being added to the container.

Putting this very bluntly: Add the connection details? What does this even mean? Do we write a file, do we spawn a command which does something, do we add env vars?

Of course I can see what the function does when I look at the body, but just looking at the name doesn't tell me anything. A better name could be add_env_vars_to_container.

)
}

/// Returns

question: Returns... what? Seems like half of the sentence is missing here.


#[derive(Debug, Snafu)]
pub enum Error {
#[snafu(context(false), display("PostgreSQL error"))]

note: I don't see a reason why we would want/need context(false). If there is a good reason, it should be explained as a dev comment.

note: The error messages seem a little bland. We could at least say what happened, like "failed to construct PostgreSQL connection string/uri".

}

#[derive(Copy, Clone, Debug, Default)]
pub enum TemplatingMechanism {

question: Do we really need both mechanisms? If so, we should describe that in a dev comment.

Comment on lines +126 to +134
pub enum DummyDatabaseConnection {
    Postgresql(PostgresqlConnection),
    Mysql(MysqlConnection),
    Derby(DerbyConnection),
    Redis(RedisConnection),
    GenericJDBC(GenericJDBCDatabaseConnection),
    GenericSQLAlchemy(GenericSQLAlchemyDatabaseConnection),
    GenericCelery(GenericCeleryDatabaseConnection),
}

note: Again, I have a couple of issues with this:

  1. I think this enum should be provided by the framework, just with the right variants: postgresql, mysql, redis (new), and generic.
  2. All the Generic* variants are misplaced in my opinion. As per ADR, they should be covered by a single generic variant.
  3. The enum itself could provide helper functions to downstream users to construct/get the connection details.
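
The three points above could look roughly like the following sketch. All type names, fields, and helper methods here are hypothetical and heavily simplified; in particular, the shape of the single generic variant is an assumption and is not taken from the ADR.

```rust
/// Hypothetical, simplified connection details shared by the typed variants.
/// The real structs in the PR carry more fields.
#[derive(Clone, Debug, PartialEq)]
pub struct ServerConnection {
    pub host: String,
    pub port: u16,
    pub credentials_secret_name: String,
}

/// Assumption: the generic variant only points at a Secret holding the full
/// connection string.
#[derive(Clone, Debug, PartialEq)]
pub struct GenericConnection {
    pub connection_string_secret_name: String,
}

/// A framework-provided enum with one generic variant instead of several.
#[derive(Clone, Debug, PartialEq)]
pub enum DatabaseConnection {
    Postgresql(ServerConnection),
    Mysql(ServerConnection),
    Redis(ServerConnection),
    Generic(GenericConnection),
}

impl DatabaseConnection {
    /// Helper for downstream users: the Secret this connection sources its
    /// sensitive values from.
    pub fn secret_name(&self) -> &str {
        match self {
            Self::Postgresql(c) | Self::Mysql(c) | Self::Redis(c) => &c.credentials_secret_name,
            Self::Generic(c) => &c.connection_string_secret_name,
        }
    }

    /// URI scheme for the typed variants; `None` for the generic variant,
    /// where the user supplies the complete connection string.
    pub fn scheme(&self) -> Option<&'static str> {
        match self {
            Self::Postgresql(_) => Some("postgresql"),
            Self::Mysql(_) => Some("mysql"),
            Self::Redis(_) => Some("redis"),
            Self::Generic(_) => None,
        }
    }
}
```

Putting the helpers on the enum means downstream operators can match once on the framework type instead of each defining their own wrapper enum.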



Projects

Status: Development: In Review


3 participants