
feat: Add KMS signer backend #3171

Open
alpe wants to merge 13 commits into main from alex/3163_kms

Conversation

@alpe
Contributor

@alpe alpe commented Mar 17, 2026

Resolves #3163

Overview

Includes AWS and GCP remote signers

Manual test:

See:

Example
Setup AWS permission:

{
	"Version": "2012-10-17",
	"Statement": [
		{
			"Sid": "AllowKeyCreation",
			"Effect": "Allow",
			"Action": [
				"kms:CreateKey",
				"kms:TagResource",
				"kms:EnableKey",
				"kms:PutKeyPolicy",
				"kms:GetPublicKey",
				"kms:Sign",
				"kms:ListKeys",
				"kms:ListAliases"
			],
			"Resource": "*"
		}
	]
}

Create KMS key:

aws kms create-key \
  --description "ev-node signer" \
  --key-usage SIGN_VERIFY \
  --key-spec ECC_NIST_EDWARDS25519

Copy KeyId from response.

export EVNODE_E2E_AWS_KMS_KEY_ID=<keyId>
export EVNODE_E2E_AWS_KMS_REGION=
cd test/e2e
go test -v -tags e2e,evm -run TestEvmSequencerWithAWSKMSSignerE2E -count=1 --evm-binary=$(pwd)/../../build/testapp

Summary by CodeRabbit

  • New Features

    • AWS and GCP KMS signer backends — run ev-node with signer_type=kms and provider-specific options; init/start support for KMS signers.
  • Configuration

    • New CLI/config flags for KMS: provider, AWS key/region/profile, GCP key/credentials, per-request timeout and max-retries.
  • Documentation

    • Added detailed how-to guides for AWS and GCP KMS signer setup, expected startup logs, configuration constraints, and troubleshooting.

@coderabbitai
Contributor

coderabbitai bot commented Mar 17, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Walkthrough

Added KMS-backed signer support (AWS & GCP), made Signer.Sign context-aware, introduced a signer factory, wired KMS signers into init/run and block/DA signing flows, added docs/tests/E2E, and updated dependencies and many call sites to propagate context.
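To illustrate what a context-aware Sign looks like for a local key, here is a minimal sketch. The one-method `Signer` interface, `localSigner`, and the helper functions are simplified, hypothetical stand-ins and not the PR's actual types; the real interface also exposes the public key and address.

```go
package main

import (
	"context"
	"crypto/ed25519"
	"errors"
	"fmt"
)

// Signer is a trimmed-down, hypothetical version of a context-aware signer
// interface; the interface in the PR has more methods.
type Signer interface {
	Sign(ctx context.Context, message []byte) ([]byte, error)
}

// localSigner signs in-process with an Ed25519 private key.
type localSigner struct {
	priv ed25519.PrivateKey
}

// newLocalSigner generates a fresh key pair (illustrative helper).
func newLocalSigner() (*localSigner, ed25519.PublicKey, error) {
	pub, priv, err := ed25519.GenerateKey(nil) // nil selects crypto/rand
	if err != nil {
		return nil, nil, err
	}
	return &localSigner{priv: priv}, pub, nil
}

// Sign refuses to work on a cancelled or expired context, then signs locally.
func (s *localSigner) Sign(ctx context.Context, message []byte) ([]byte, error) {
	if err := ctx.Err(); err != nil {
		return nil, fmt.Errorf("sign aborted: %w", err)
	}
	return ed25519.Sign(s.priv, message), nil
}

// signAndVerify signs message with a live context and checks the signature.
func signAndVerify(s Signer, pub ed25519.PublicKey, message []byte) bool {
	sig, err := s.Sign(context.Background(), message)
	return err == nil && ed25519.Verify(pub, message, sig)
}

// cancelledSignFails reports whether Sign surfaces context.Canceled.
func cancelledSignFails(s Signer) bool {
	ctx, cancel := context.WithCancel(context.Background())
	cancel()
	_, err := s.Sign(ctx, []byte("header"))
	return errors.Is(err, context.Canceled)
}

func main() {
	s, pub, err := newLocalSigner()
	if err != nil {
		panic(err)
	}
	fmt.Println("valid signature:", signAndVerify(s, pub, []byte("header")))
	fmt.Println("cancelled context rejected:", cancelledSignFails(s))
}
```

A remote signer would pass the same `ctx` on to its network call, so cancelling block production also cancels an in-flight KMS request.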

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Config & Flags: `pkg/config/config.go`, `pkg/config/defaults.go`, `pkg/config/config_test.go` | Add KMS signer config structs, CLI flags, defaults, and validation for AWS/GCP (provider, key id/name, timeout>0, max_retries>=0); expand flag tests. |
| Signer API & Local Implementations: `pkg/signer/signer.go`, `pkg/signer/file/...`, `pkg/signer/noop/...`, `pkg/signer/file/*_test.go`, `pkg/signer/file/doc.go`, `pkg/signer/file/README.md` | Change Signer.Sign to Sign(ctx context.Context, ...); update file/noop implementations, constructors, examples, and tests to accept and respect context. |
| KMS Signer Implementations: `pkg/signer/aws/signer.go`, `pkg/signer/gcp/signer.go`, `pkg/signer/aws/*_test.go`, `pkg/signer/gcp/*_test.go`, `pkg/signer/{aws,gcp}/README.md` | Add AWS and GCP KMS signer packages: clients, constructors, Sign/GetPublic/GetAddress, per-call timeout, retry/backoff, DER/CRC checks, caching, and comprehensive unit tests and docs. |
| Signer Factory & Init/Run Integration: `pkg/signer/factory.go`, `pkg/signer/factory_test.go`, `pkg/cmd/init.go`, `pkg/cmd/init_test.go`, `pkg/cmd/run_node.go`, `apps/*/cmd/init.go` | Introduce factory functions NewSigner/NewSignerForInit, wire factory into init/run flows, adjust signer-path/passphrase handling, and propagate contexts during signer creation. |
| Block & DA Signing Context Propagation: `block/internal/executing/executor.go`, `block/internal/submitting/da_submitter.go`, `block/internal/submitting/*_test.go` | Thread context.Context through ProduceBlock and DA submitter signing paths; update internal helpers and tests to call signer.Sign with ctx. |
| Types & Test Helpers: `types/utils.go`, `types/*_test.go`, `pkg/sync/*`, `pkg/store/*`, `pkg/sync/sync_service_test.go`, `types/signed_header_test.go` | Add ctx parameters to signed-header and signature helpers; propagate ctx in helpers and many tests, updating call sites to t.Context() or context.Background() where appropriate. |
| Docs, Guides & Changelog: `docs/guides/operations/aws-kms-signer.md`, `docs/guides/operations/gcp-kms-signer.md`, `docs/.vitepress/config.ts`, `CHANGELOG.md` | Add AWS/GCP KMS operation guides, docs sidebar entry, and changelog entry for KMS signer. |
| Dependencies & E2E: `go.mod`, `apps/*/go.mod`, `test/e2e/go.mod`, `execution/evm/test/go.mod`, `test/e2e/evm_kms_e2e_test.go` | Add direct/indirect deps for AWS SDK v2, GCP KMS, Google auth libs, OpenTelemetry contrib, bump transitive versions, and add E2E tests for AWS/GCP KMS signers. |
| Miscellaneous Tests & Small Fixes: `pkg/cmd/init.go` (apps), `pkg/store/store_adapter_test.go`, many tests | Small call-site updates, linter annotations, and test adjustments to reflect context-aware Sign and new factory behavior. |

Sequence Diagram(s)

sequenceDiagram
    participant App as App (init/run)
    participant Ctx as Cmd Context
    participant Factory as Signer Factory
    participant Config as Config Validator
    participant KMS as KMS Service
    participant Signer as Signer Impl

    App->>Ctx: cmd.Context()
    App->>Factory: NewSignerForInit(ctx, config, passphrase)
    Factory->>Config: validate signer type & KMS config
    alt validation error
        Config-->>Factory: error
        Factory-->>App: error
    else file signer
        Factory->>Signer: Create/load FileSystemSigner(...)
        Signer-->>Factory: FileSystemSigner
    else KMS signer
        Factory->>KMS: NewKmsSigner(ctx, key, opts)
        KMS->>KMS: load creds & client
        KMS->>KMS: GetPublicKey(ctx) -> parse & cache
        KMS-->>Factory: KmsSigner
    end
    Factory-->>App: signer
sequenceDiagram
    participant Executor as Block Executor
    participant Signer as Signer (File/AWS/GCP)
    participant KMS as KMS Service
    participant Cache as PubKey Cache

    Executor->>Executor: ProduceBlock(ctx)
    Executor->>Signer: Sign(ctx, headerBytes)
    alt File signer
        Signer->>Signer: local ed25519.Sign(...)
        Signer-->>Executor: signature
    else KMS signer
        Signer->>Signer: check ctx.Err()
        Signer->>KMS: Sign(ctx, message) with timeout
        alt retryable error
            KMS->>KMS: exponential backoff & retry
        else non-retryable
            KMS-->>Signer: error
        end
        KMS-->>Signer: signature
        Signer->>Cache: return cached pubKey
        Signer-->>Executor: signature
    end
    Executor->>Executor: continue block flow

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

Suggested reviewers

  • tac0turtle
  • yarikbratashchuk
  • chatton

Poem

🐰 Up popped a rabbit, keys in the cloud,

No files to guard, just KMS proud,
Context in paws, retries in tune,
AWS and GCP hum a signing tune,
Hoppity hop — the node starts soon!

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 34.65% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title 'feat: Add KMS signer backend' clearly and accurately summarizes the main change in the PR, which is implementing cloud KMS signers for AWS and GCP. |
| Description check | ✅ Passed | The PR description includes issue reference and overview sections covering AWS/GCP implementations, but lacks comprehensive detail on design decisions, test coverage, and implementation specifics expected in a complex feature. |
| Linked Issues check | ✅ Passed | The PR implements all primary objectives from #3163: KMS signer interface implementations for AWS and GCP, remote signing API integration, public key caching, configuration via signer_type, and Ed25519 support validation. |
| Out of Scope Changes check | ✅ Passed | All code changes directly support the KMS signer feature: context threading for signing operations, KMS implementations (AWS/GCP), configuration extensions, factory pattern for signer construction, and comprehensive tests. |

@github-actions
Contributor

github-actions bot commented Mar 17, 2026

The latest Buf updates on your PR. Results from workflow CI / buf-check (pull_request).

| Build | Format | Lint | Breaking | Updated (UTC) |
| --- | --- | --- | --- | --- |
| ✅ passed | ⏩ skipped | ✅ passed | ✅ passed | Mar 23, 2026, 4:33 PM |

@claude
Contributor

claude bot commented Mar 17, 2026

Claude finished @alpe's task in 2m 33s.


Code Review: feat: Add KMS signer backend

  • Read CLAUDE.md / AGENTS.md for repo guidelines
  • Read core changed files (signer interface, AWS signer, GCP signer, factory, config)
  • Read supporting files (tests, docs, cmd integration)
  • Analyze code quality, architecture, error handling, security, performance, testing
  • Post comprehensive review

This is a solid, well-structured PR. The architecture is clean, context propagation is thorough, and the fail-fast public key loading is a nice pattern. Several issues remain from previous review rounds and a few new ones need attention.

Critical / Must-Fix

1. Default timeout mismatch: 1s vs 10s (both AWS and GCP)

pkg/signer/aws/signer.go:33 documents Timeout default as 10s, but kmsSignerFromClient (line 87) falls back to 1s. Same issue in pkg/signer/gcp/signer.go:58-108. One second is aggressive for a remote KMS call and will cause spurious timeouts in normal operation.

// aws/signer.go:87 — should be 10s to match docs
o := Options{Timeout: 1 * time.Second, MaxRetries: 3}

2. GCP fetchPublicKey missing response integrity checks

pkg/signer/gcp/signer.go:133-173 — The GCP fetchPublicKey doesn't validate:

  • out.GetName() matches s.keyName (detects request/response mismatch in-transit)
  • out.GetPemCrc32C() checksum over the returned PEM (detects corruption)

Per GCP KMS data integrity guidelines, both checks are mandatory. The GCP Sign path has good CRC32C coverage; GetPublicKey should match it. A corrupted or mismatched response would be cached as the node's identity.

// Missing after GetPublicKey call:
if out.GetName() != s.keyName {
    return fmt.Errorf("KMS GetPublicKey: name mismatch, got %q want %q", out.GetName(), s.keyName)
}
if out.GetPemCrc32C() == nil {
    return fmt.Errorf("KMS GetPublicKey: pem_crc32c missing")
}
if int64(crc32.Checksum([]byte(out.GetPem()), castagnoliTable)) != out.GetPemCrc32C().GetValue() {
    return fmt.Errorf("KMS GetPublicKey: pem_crc32c mismatch, response corrupted in transit")
}
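The checks above can be exercised in isolation with the standard library's Castagnoli CRC. This is a sketch only: `publicKeyResponse` is a hypothetical stand-in for the KMS response type (the real client exposes these fields through getters), and the PEM body is a placeholder.

```go
package main

import (
	"errors"
	"fmt"
	"hash/crc32"
)

// castagnoliTable selects the CRC32C polynomial.
var castagnoliTable = crc32.MakeTable(crc32.Castagnoli)

// publicKeyResponse is a hypothetical stand-in for a KMS GetPublicKey response.
type publicKeyResponse struct {
	Name      string
	Pem       string
	PemCrc32C *int64 // nil models a response without a checksum
}

// crc32c returns the CRC32C of data widened to int64 for comparison.
func crc32c(data []byte) int64 {
	return int64(crc32.Checksum(data, castagnoliTable))
}

// verifyPublicKeyResponse rejects mismatched names and missing or wrong checksums.
func verifyPublicKeyResponse(resp publicKeyResponse, wantName string) error {
	if resp.Name != wantName {
		return fmt.Errorf("name mismatch: got %q want %q", resp.Name, wantName)
	}
	if resp.PemCrc32C == nil {
		return errors.New("pem_crc32c missing")
	}
	if crc32c([]byte(resp.Pem)) != *resp.PemCrc32C {
		return errors.New("pem_crc32c mismatch")
	}
	return nil
}

func main() {
	pem := "-----BEGIN PUBLIC KEY-----\n(placeholder)\n-----END PUBLIC KEY-----\n"
	sum := crc32c([]byte(pem))
	resp := publicKeyResponse{Name: "example-key", Pem: pem, PemCrc32C: &sum}
	fmt.Println("intact:", verifyPublicKeyResponse(resp, "example-key"))

	resp.Pem += "x" // simulate corruption after the checksum was computed
	fmt.Println("tampered:", verifyPublicKeyResponse(resp, "example-key"))
}
```

Running the checks before caching means a corrupted or misrouted response fails construction instead of becoming the node's identity.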

3. GCP verifySignResponse missing key name check

pkg/signer/gcp/signer.go:245-261: verifySignResponse checks CRC32C on the signature but doesn't verify out.GetName() == s.keyName. A response from a different key version would pass validation.


Major Issues

4. AWS unbounded exponential backoff — potential overflow

pkg/signer/aws/signer.go:166:

backoff := time.Duration(100<<uint(attempt-1)) * time.Millisecond

This grows unbounded. At attempt=23, 100<<22 = 419430400ms ≈ 4.8 days. The GCP signer correctly caps at maxRetryBackoff = 5 * time.Second with a proper doubling loop — the AWS signer should do the same.
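A capped variant might look like the sketch below. The 100ms base comes from the AWS line quoted above and the 5s cap from the GCP signer's maxRetryBackoff; the function and constant names are illustrative, not the PR's.

```go
package main

import (
	"fmt"
	"time"
)

const (
	baseBackoff = 100 * time.Millisecond // first retry delay
	maxBackoff  = 5 * time.Second        // upper bound for any delay
)

// cappedBackoff returns the delay before retry number attempt (1-based).
// It doubles from baseBackoff and stops as soon as the cap is reached, so
// the value can never overflow no matter how large attempt is.
func cappedBackoff(attempt int) time.Duration {
	backoff := baseBackoff
	for i := 1; i < attempt; i++ {
		backoff *= 2
		if backoff >= maxBackoff {
			return maxBackoff
		}
	}
	return backoff
}

func main() {
	for _, attempt := range []int{1, 2, 6, 7, 23} {
		fmt.Printf("attempt %d: %v\n", attempt, cappedBackoff(attempt))
	}
}
```

Because the loop returns at the cap, the cost is bounded by a handful of iterations even when MaxRetries is configured very high.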

5. GCP KMS client has no Close() / cleanup path

pkg/signer/gcp/signer.go:90: kms.NewKeyManagementClient returns a client with a Close() method. NewKmsSigner owns the client, but neither the KMSClient interface nor KmsSigner exposes shutdown semantics, so the underlying gRPC connections stay open until the process exits. For long-running nodes this matters.

The KMSClient interface should include Close() error, and cloudKMSClient / KmsSigner should implement it so the factory or node shutdown path can clean up.
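One possible shape for that cleanup path is sketched below, with the interface trimmed to the one method under discussion and a fake client standing in for the real gRPC client. The `sync.Once` guard is an assumption of this sketch, added so that calling Close from more than one shutdown path stays safe.

```go
package main

import (
	"fmt"
	"sync"
)

// KMSClient is trimmed to the shutdown method for this sketch; the real
// interface would also carry the signing and public-key calls.
type KMSClient interface {
	Close() error
}

// fakeClient counts Close calls in place of a real gRPC client.
type fakeClient struct{ closeCalls int }

func (c *fakeClient) Close() error {
	c.closeCalls++
	return nil
}

// KmsSigner owns its client and is responsible for releasing it.
type KmsSigner struct {
	client    KMSClient
	closeOnce sync.Once
	closeErr  error
}

// Close releases the underlying client exactly once and returns the same
// result on every subsequent call.
func (s *KmsSigner) Close() error {
	s.closeOnce.Do(func() { s.closeErr = s.client.Close() })
	return s.closeErr
}

func main() {
	client := &fakeClient{}
	signer := &KmsSigner{client: client}
	_ = signer.Close()
	_ = signer.Close() // second call is a no-op
	fmt.Println("underlying Close calls:", client.closeCalls)
}
```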


Minor Issues

6. Config struct comment inconsistency (persists from prior review)

pkg/config/config.go:315 has comment:"Type of remote signer to use (file, kms)" but the CLI flag at line 622 says "type of signer to use (file, grpc, kms)". If grpc is not supported, remove it from the flag description. If it is (or will be), add it to the struct comment.

7. GCP GetPublic() inconsistent mutex pattern

pkg/signer/gcp/signer.go:265-275: GetPublic manually releases the lock before the nil check, while GetAddress uses defer. Both approaches are correct but inconsistent. Using defer in both is safer:

func (s *KmsSigner) GetPublic() (crypto.PubKey, error) {
    s.mu.RLock()
    defer s.mu.RUnlock()  // consistent with GetAddress
    pubKey := s.pubKey
    if pubKey == nil {
        return nil, fmt.Errorf("public key not loaded")
    }
    return pubKey, nil
}

8. cancel() in Sign should be defer cancel()

pkg/signer/aws/signer.go:181 and pkg/signer/gcp/signer.go:204 — Calling cancel() immediately after the API call works in practice but leaks context on panic. Using defer cancel() is the idiomatic, safe pattern.
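Inside a retry loop, a plain defer would pile up until the whole function returns, so one way to get the defer semantics per attempt is to move each call into a small helper. The following is a sketch under that assumption; `callWithTimeout` and the two fake calls are hypothetical names, not code from the PR.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// callWithTimeout runs one call under its own deadline. The deferred cancel
// releases the timer when the call returns and also when it panics.
func callWithTimeout(ctx context.Context, timeout time.Duration,
	call func(context.Context) ([]byte, error)) ([]byte, error) {
	callCtx, cancel := context.WithTimeout(ctx, timeout)
	defer cancel()
	return call(callCtx)
}

// slowCall simulates a remote call that takes longer than the deadline.
func slowCall(ctx context.Context) ([]byte, error) {
	select {
	case <-ctx.Done():
		return nil, ctx.Err()
	case <-time.After(time.Second):
		return []byte("late"), nil
	}
}

// fastCall simulates a remote call that returns immediately.
func fastCall(ctx context.Context) ([]byte, error) {
	return []byte("sig"), nil
}

// slowCallTimesOut reports whether the deadline is enforced.
func slowCallTimesOut() bool {
	_, err := callWithTimeout(context.Background(), 10*time.Millisecond, slowCall)
	return errors.Is(err, context.DeadlineExceeded)
}

// fastCallSucceeds reports whether a quick call passes through untouched.
func fastCallSucceeds() bool {
	sig, err := callWithTimeout(context.Background(), time.Second, fastCall)
	return err == nil && string(sig) == "sig"
}

func main() {
	fmt.Println("slow call timed out:", slowCallTimesOut())
	fmt.Println("fast call succeeded:", fastCallSucceeds())
}
```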

9. Redundant flag.Parse() calls in E2E tests

test/e2e/evm_kms_e2e_test.go — The flag.Parse() calls inside test functions are redundant; the Go test harness handles this before tests run. Calling it multiple times can cause issues when both tests run in the same binary execution.

10. GCP credentials API usage

pkg/signer/gcp/signer.go:87:

option.WithAuthCredentialsFile(option.ServiceAccount, opts.CredentialsFile)

The first argument option.ServiceAccount is an auth type constant from a specific API version. Verify this is the intended API — the more common pattern for service account files is option.WithCredentialsFile(opts.CredentialsFile) which auto-detects the key type from the JSON file. Using the typed variant may reject workload identity federation credentials (which don't have type: service_account).


Positive Observations

  • Fail-fast initialization: Eagerly fetching the public key at construction is excellent — it surfaces credential and permission errors at startup rather than during block production.
  • GCP CRC32C integrity: The verifySignResponse function with both VerifiedDataCrc32C and signature checksum is thorough and follows GCP best practices.
  • Address copy in GetAddress: r := make([]byte, len(s.address)); copy(r, s.address) is correct and prevents callers from mutating cached state.
  • Context cancellation in retry loop: Checking ctx.Err() at the top of each iteration and using select on ctx.Done() during backoff is correct.
  • Non-retryable error fast-fail: isRetryableKMSError is comprehensive and the immediate return on non-retryable errors avoids unnecessary delays.
  • Factory pattern: Clean separation between NewSigner (for run) and NewSignerForInit (for init, which may create keys) is a good design.
  • E2E tests: Env-var gating with t.Skip and realistic EVM sequencer setup is solid.
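The address-copy observation in the list above can be sketched as follows. `cachedIdentity` is a hypothetical reduction of the signer to its cached address field; only the copy-under-read-lock pattern is taken from the review.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// cachedIdentity holds an address that is loaded once and then only read.
type cachedIdentity struct {
	mu      sync.RWMutex
	address []byte
}

// GetAddress returns a copy made while the read lock is held, so callers
// cannot mutate the cached slice through the returned value.
func (c *cachedIdentity) GetAddress() ([]byte, error) {
	c.mu.RLock()
	defer c.mu.RUnlock()
	if c.address == nil {
		return nil, errors.New("address not loaded")
	}
	out := make([]byte, len(c.address))
	copy(out, c.address)
	return out, nil
}

func main() {
	id := &cachedIdentity{address: []byte{0x01, 0x02, 0x03}}
	first, _ := id.GetAddress()
	first[0] = 0xFF // mutate the caller's copy only
	second, _ := id.GetAddress()
	fmt.Printf("caller copy: %x, cached value: %x\n", first, second)
}
```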

Summary

The implementation is production-quality in structure. The blocking items before merge are: (1) the 1s default timeout needs correcting to match docs in both signers, (2) GCP fetchPublicKey needs response name + PEM CRC32C validation per GCP's data integrity guidelines, and (3) the AWS backoff needs a cap to prevent potential overflow with high MaxRetries. The GCP client Close() gap should also be addressed for long-running node deployments.

@codecov

codecov bot commented Mar 17, 2026

Codecov Report

❌ Patch coverage is 73.38129% with 111 lines in your changes missing coverage. Please review.
✅ Project coverage is 61.45%. Comparing base (f9c8717) to head (2a7a93f).

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| pkg/signer/gcp/signer.go | 68.79% | 30 Missing and 14 partials ⚠️ |
| pkg/signer/aws/signer.go | 67.82% | 26 Missing and 11 partials ⚠️ |
| pkg/cmd/run_node.go | 38.46% | 12 Missing and 4 partials ⚠️ |
| pkg/signer/factory.go | 86.36% | 3 Missing and 3 partials ⚠️ |
| pkg/cmd/init.go | 73.33% | 2 Missing and 2 partials ⚠️ |
| block/internal/submitting/da_submitter.go | 80.00% | 2 Missing ⚠️ |
| pkg/signer/noop/signer.go | 33.33% | 1 Missing and 1 partial ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3171      +/-   ##
==========================================
+ Coverage   61.14%   61.45%   +0.31%     
==========================================
  Files         117      120       +3     
  Lines       12082    12431     +349     
==========================================
+ Hits         7387     7640     +253     
- Misses       3868     3938      +70     
- Partials      827      853      +26     
Flag Coverage Δ
combined 61.45% <73.38%> (+0.31%) ⬆️


@alpe alpe changed the title feat: Add KMS signer backend feat: Add AWS KMS signer backend Mar 19, 2026
@alpe alpe marked this pull request as ready for review March 19, 2026 13:35
Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 6

🧹 Nitpick comments (3)
types/utils_test.go (1)

82-82: Consider using t.Context() for consistency.

While context.Background() works correctly here, other test files in this PR use t.Context() which is more idiomatic for tests. It automatically cancels when the test completes, providing better cleanup semantics.

🔧 Suggested change
-			firstSignedHeader, err := types.GetFirstSignedHeader(context.Background(), noopSigner, tc.chainID)
+			firstSignedHeader, err := types.GetFirstSignedHeader(t.Context(), noopSigner, tc.chainID)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@types/utils_test.go` at line 82, Replace the call to context.Background()
with the test's cancellable context by using t.Context() when invoking
types.GetFirstSignedHeader; update the specific invocation firstSignedHeader,
err := types.GetFirstSignedHeader(context.Background(), noopSigner, tc.chainID)
to pass t.Context() instead so the test uses the per-test context and cancels
automatically when the test finishes.
pkg/cmd/init_test.go (1)

49-50: Line 49 now overstates the scenario.

This subtest is specifically exercising an unknown signer type (remote), not non-file signers in general. Renaming the case description will keep the test intent aligned with what it actually covers.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/cmd/init_test.go` around lines 49 - 50, Update the test case description
to accurately reflect that this subtest exercises an unknown "remote" signer
type: change the comment and/or header text that currently reads "Case 3:
Non-File signer, Aggregator -> Error (unknown signer type)" to something like
"Case 3: Remote signer, Aggregator -> Error (unknown signer type)" so the
t.Run("RemoteSigner_Aggregator", ...) intent matches the comment.
pkg/cmd/run_node.go (1)

140-142: Consider enhancing AWS KMS log with key ID for observability.

Including the key ID (or a truncated version) in the log message would help operators verify which KMS key is in use during startup.

🔧 Optional enhancement
 		if nodeConfig.Signer.SignerType == "awskms" {
-			logger.Info().Msg("initialized AWS KMS signer via factory")
+			logger.Info().
+				Str("kms_key_id", nodeConfig.Signer.KmsKeyID).
+				Msg("initialized AWS KMS signer via factory")
 		}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/cmd/run_node.go` around lines 140 - 142, The AWS KMS startup log only
notes the signer type; update the block that checks nodeConfig.Signer.SignerType
== "awskms" to also read the KMS key identifier from the signer config (e.g.,
nodeConfig.Signer.KeyID or nodeConfig.Signer.AwsKmsKeyID), truncate it for
safety (for example to the first 8 characters) and include that truncated key id
in the logger.Info() message so operators can see which KMS key was used while
avoiding full secret exposure.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@pkg/config/config.go`:
- Around line 305-306: The SignerType struct tag comment currently lists "(file,
awskms)" but the CLI flag description includes "(file, grpc, awskms)"; make them
consistent by either adding "grpc" to the SignerType field comment (SignerType
string `mapstructure:"signer_type" yaml:"signer_type" comment:"Type of remote
signer to use (file, grpc, awskms)"`) or by removing "grpc" from the CLI flag
help text if grpc is not supported—update the comment associated with SignerType
(and SignerPath documentation if needed) or the flag description in the CLI
registration so both list the same valid signer types.

In `@pkg/signer/aws/README.md`:
- Around line 3-15: Update the README for the awskms signer to add explicit IAM
permissions and KMS key configuration required: state that the IAM principal
needs kms:GetPublicKey and kms:Sign on the configured key, and that the KMS key
must be created with KeyUsage=SIGN_VERIFY and KeySpec=ECC_NIST_EDWARDS25519;
mention that the implementation of signer.Signer calls GetPublicKey eagerly
during initialization (and fails fast on misconfiguration) and uses Sign for
signing operations so those permissions/settings are mandatory for successful
initialization and runtime signing.

In `@pkg/signer/aws/signer.go`:
- Around line 31-37: The documented default Timeout in the Options struct (10s)
is inconsistent with the fallback used in kmsSignerFromClient (1s); update the
implementation to use a single default (e.g., defaultTimeout = 10 * time.Second)
and ensure kmsSignerFromClient and any other fallback logic use that constant
when opts is nil or opts.Timeout <= 0; modify the fallback for MaxRetries
similarly if needed and verify all uses (notably kmsSignerFromClient and the
related sign/constructor paths) reference the same default constants instead of
hardcoded values.
- Around line 207-216: GetAddress currently returns the backing slice s.address
from KmsSigner allowing callers to mutate cached state after the RLock is
released; while holding the read lock (in KmsSigner.GetAddress) allocate a new
byte slice, copy the contents of s.address into it, and return that copy instead
of the original slice (keep the existing nil check and error path, but ensure
the copy is created before releasing s.mu.RUnlock so callers cannot corrupt the
signer's cached identity).

In `@pkg/signer/factory.go`:
- Around line 33-35: Reject empty signer_path in the signer factory before
calling filepath.Abs: retrieve the raw value used (config.Signer.SignerPath or
the trimmed value from strings.TrimSuffix(config.Signer.SignerPath,
"signer.json")), check if it is empty (after trimming) and return a clear error
instead of calling filepath.Abs; only call filepath.Abs when the trimmed signer
path is non-empty and then continue with existing resolution logic that assigns
to signerPath. Ensure the validation happens in the same initialization area
where signerPath is computed (the code using filepath.Abs and signerPath) so
misconfiguration cannot fall back to the process CWD.

In `@types/utils.go`:
- Around line 192-198: GetRandomNextSignedHeader is building
newSignedHeader.Header via GetRandomNextHeader but then calls GetSignature with
the old signedHeader.Header, producing a signature that won't verify; change the
call to GetSignature(ctx, newSignedHeader.Header, signer), assign the returned
signature to newSignedHeader.Signature (or the appropriate field on
SignedHeader), and preserve the existing error check/return so the function
returns the newly built SignedHeader with a signature over its own Header.

---

Nitpick comments:
In `@pkg/cmd/init_test.go`:
- Around line 49-50: Update the test case description to accurately reflect that
this subtest exercises an unknown "remote" signer type: change the comment
and/or header text that currently reads "Case 3: Non-File signer, Aggregator ->
Error (unknown signer type)" to something like "Case 3: Remote signer,
Aggregator -> Error (unknown signer type)" so the
t.Run("RemoteSigner_Aggregator", ...) intent matches the comment.

In `@pkg/cmd/run_node.go`:
- Around line 140-142: The AWS KMS startup log only notes the signer type;
update the block that checks nodeConfig.Signer.SignerType == "awskms" to also
read the KMS key identifier from the signer config (e.g.,
nodeConfig.Signer.KeyID or nodeConfig.Signer.AwsKmsKeyID), truncate it for
safety (for example to the first 8 characters) and include that truncated key id
in the logger.Info() message so operators can see which KMS key was used while
avoiding full secret exposure.

In `@types/utils_test.go`:
- Line 82: Replace the call to context.Background() with the test's cancellable
context by using t.Context() when invoking types.GetFirstSignedHeader; update
the specific invocation firstSignedHeader, err :=
types.GetFirstSignedHeader(context.Background(), noopSigner, tc.chainID) to pass
t.Context() instead so the test uses the per-test context and cancels
automatically when the test finishes.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: a074a34e-fd54-4bb5-acf1-846e6f9e5871

📥 Commits

Reviewing files that changed from the base of the PR and between d27d099 and 5363cdb.

⛔ Files ignored due to path filters (6)
  • apps/evm/go.sum is excluded by !**/*.sum
  • apps/grpc/go.sum is excluded by !**/*.sum
  • apps/testapp/go.sum is excluded by !**/*.sum
  • execution/evm/test/go.sum is excluded by !**/*.sum
  • go.sum is excluded by !**/*.sum
  • test/e2e/go.sum is excluded by !**/*.sum
📒 Files selected for processing (45)
  • CHANGELOG.md
  • apps/evm/cmd/init.go
  • apps/evm/go.mod
  • apps/grpc/cmd/init.go
  • apps/grpc/go.mod
  • apps/testapp/cmd/init.go
  • apps/testapp/go.mod
  • block/internal/executing/executor.go
  • block/internal/submitting/da_submitter.go
  • block/internal/submitting/da_submitter_integration_test.go
  • block/internal/submitting/da_submitter_test.go
  • block/internal/submitting/submitter_test.go
  • block/internal/syncing/da_retriever_strict_test.go
  • block/internal/syncing/da_retriever_test.go
  • block/internal/syncing/p2p_handler_test.go
  • block/internal/syncing/syncer_test.go
  • docs/.vitepress/config.ts
  • docs/guides/operations/aws-kms-signer.md
  • go.mod
  • pkg/cmd/init.go
  • pkg/cmd/init_test.go
  • pkg/cmd/run_node.go
  • pkg/config/config.go
  • pkg/config/config_test.go
  • pkg/config/defaults.go
  • pkg/signer/aws/README.md
  • pkg/signer/aws/signer.go
  • pkg/signer/aws/signer_test.go
  • pkg/signer/factory.go
  • pkg/signer/factory_test.go
  • pkg/signer/file/README.md
  • pkg/signer/file/doc.go
  • pkg/signer/file/example_test.go
  • pkg/signer/file/file_signer_test.go
  • pkg/signer/file/local.go
  • pkg/signer/noop/signer.go
  • pkg/signer/noop/signer_test.go
  • pkg/signer/signer.go
  • pkg/store/store_adapter_test.go
  • pkg/sync/sync_service_test.go
  • test/e2e/evm_aws_kms_e2e_test.go
  • test/e2e/go.mod
  • types/signed_header_test.go
  • types/utils.go
  • types/utils_test.go

Comment on lines +3 to +15
This package implements `signer.Signer` using AWS KMS.

It uses KMS for `Sign` operations and caches the public key/address in memory after initialization.

## Requirements

- AWS credentials must be available via the standard AWS SDK credential chain.
- The configured KMS key must be an asymmetric **Ed25519** key.

## Configuration

Set `evnode.signer.signer_type` to `awskms` and provide at least `kms_key_id`.

Contributor

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

cd pkg/signer/aws && find . -type f -name "*.md" -o -name "*.go" | head -20

Repository: evstack/ev-node

Length of output: 99


🏁 Script executed:

cat -n pkg/signer/aws/README.md

Repository: evstack/ev-node

Length of output: 1086


🏁 Script executed:

cd pkg/signer/aws && ls -la

Repository: evstack/ev-node

Length of output: 352


🏁 Script executed:

cd pkg/signer/aws && wc -l *.go | tail -1

Repository: evstack/ev-node

Length of output: 70


🏁 Script executed:

cat -n pkg/signer/aws/signer.go

Repository: evstack/ev-node

Length of output: 8302


🏁 Script executed:

cat -n pkg/signer/aws/signer_test.go | head -100

Repository: evstack/ev-node

Length of output: 3771


🌐 Web query:

AWS KMS Ed25519 key creation KeyUsage SIGN_VERIFY KeySpec ECC_NIST_EDWARDS25519

💡 Result:

Yes, AWS KMS supports creating an asymmetric key with KeySpec 'ECC_NIST_EDWARDS25519' (Ed25519 on Edwards25519 curve) and KeyUsage 'SIGN_VERIFY'. This feature was announced in November 2025 and is available as of 2026-03-19.

To create such a key, use the CreateKey API or AWS CLI command:

aws kms create-key \
  --key-spec ECC_NIST_EDWARDS25519 \
  --key-usage SIGN_VERIFY

This creates a customer-managed KMS key for signing and verification using EdDSA (Ed25519). The key supports signing algorithms like ED25519_SHA_512 (requires MessageType: RAW) and ED25519_PH_SHA_512 (requires MessageType: DIGEST). It cannot be used for encryption/decryption or key agreement. KeySpec 'ECC_NIST_EDWARDS25519' is listed in official AWS KMS documentation for asymmetric NIST-standard elliptic curve keys, valid only for SIGN_VERIFY usage.

Citations:


Add KMS permission and key configuration requirements to the README.

The README should explicitly document the IAM permissions and KMS key settings required for this signer to function. Without these details, users following the setup instructions will encounter failures during signer initialization.

Add to the Requirements section:

  • IAM permissions: kms:GetPublicKey and kms:Sign on the specified KMS key
  • KMS key configuration: created with KeyUsage=SIGN_VERIFY and KeySpec=ECC_NIST_EDWARDS25519

The code calls GetPublicKey eagerly during initialization (failing fast on misconfiguration) and uses Sign for all signing operations, both of which require these permissions and key settings to succeed. Users without these configured will see initialization errors.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/signer/aws/README.md` around lines 3 - 15, Update the README for the
awskms signer to add explicit IAM permissions and KMS key configuration
required: state that the IAM principal needs kms:GetPublicKey and kms:Sign on
the configured key, and that the KMS key must be created with
KeyUsage=SIGN_VERIFY and KeySpec=ECC_NIST_EDWARDS25519; mention that the
implementation of signer.Signer calls GetPublicKey eagerly during initialization
(and fails fast on misconfiguration) and uses Sign for signing operations so
those permissions/settings are mandatory for successful initialization and
runtime signing.

Comment on lines +31 to +37
// Options configures optional KmsSigner behaviour.
type Options struct {
	// Timeout for individual KMS Sign API calls. Default: 10s.
	Timeout time.Duration
	// MaxRetries for transient KMS failures during Sign. Default: 3.
	MaxRetries int
}

⚠️ Potential issue | 🟡 Minor

Align the implementation with the documented default timeout.

Options says the default sign timeout is 10s, but kmsSignerFromClient falls back to 1s when opts is nil or Timeout <= 0. That makes the exported constructor much more aggressive than advertised.

Proposed fix
-	o := Options{Timeout: 1 * time.Second, MaxRetries: 3}
+	o := Options{Timeout: 10 * time.Second, MaxRetries: 3}

Also applies to: 87-95

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/signer/aws/signer.go` around lines 31 - 37, The documented default
Timeout in the Options struct (10s) is inconsistent with the fallback used in
kmsSignerFromClient (1s); update the implementation to use a single default
(e.g., defaultTimeout = 10 * time.Second) and ensure kmsSignerFromClient and any
other fallback logic use that constant when opts is nil or opts.Timeout <= 0;
modify the fallback for MaxRetries similarly if needed and verify all uses
(notably kmsSignerFromClient and the related sign/constructor paths) reference
the same default constants instead of hardcoded values.
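
A minimal sketch of the single-default approach described above, assuming a hypothetical `withDefaults` helper and hypothetical constant names (neither is confirmed to exist in the PR). It also assumes that a non-positive `MaxRetries` means "unset"; the real code may treat zero differently.

```go
package main

import (
	"fmt"
	"time"
)

// Hypothetical constant names; the values follow the documented defaults.
const (
	defaultTimeout    = 10 * time.Second
	defaultMaxRetries = 3
)

// Options mirrors the struct quoted in the review comment.
type Options struct {
	Timeout    time.Duration
	MaxRetries int
}

// withDefaults is a hypothetical helper that returns a fully populated copy
// of opts. Nil options and non-positive fields fall back to the constants,
// so no call site needs its own hardcoded fallback.
func withDefaults(opts *Options) Options {
	o := Options{Timeout: defaultTimeout, MaxRetries: defaultMaxRetries}
	if opts == nil {
		return o
	}
	if opts.Timeout > 0 {
		o.Timeout = opts.Timeout
	}
	if opts.MaxRetries > 0 { // assumption: <= 0 means "unset"
		o.MaxRetries = opts.MaxRetries
	}
	return o
}

func main() {
	fmt.Println(withDefaults(nil))
	fmt.Println(withDefaults(&Options{Timeout: 2 * time.Second}))
}
```

Routing every constructor path through one helper like this keeps the doc comment and the effective behaviour from drifting apart again.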

Comment on lines +33 to +35
// Resolve signer path; allow absolute, relative to node root, or relative to CWD if resolution fails
signerPath, err := filepath.Abs(strings.TrimSuffix(config.Signer.SignerPath, "signer.json"))
if err != nil {

⚠️ Potential issue | 🟠 Major

Reject an empty signer_path before calling filepath.Abs.

filepath.Abs("") resolves to the process working directory, so a misconfigured file signer can silently create/load ./signer.json from wherever the node was started. That's a risky fallback for key material.

Proposed fix
-		signerPath, err := filepath.Abs(strings.TrimSuffix(config.Signer.SignerPath, "signer.json"))
+		signerPath := strings.TrimSpace(config.Signer.SignerPath)
+		if signerPath == "" {
+			return nil, fmt.Errorf("signer_path is required when using local file signer")
+		}
+		signerPath = strings.TrimSuffix(signerPath, "signer.json")
+		if signerPath == "" {
+			signerPath = "."
+		}
+		signerPath, err := filepath.Abs(filepath.Clean(signerPath))
 		if err != nil {
 			return nil, err
 		}
As per coding guidelines, `**/*.go`: "Validate all inputs from external sources".
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-// Resolve signer path; allow absolute, relative to node root, or relative to CWD if resolution fails
-signerPath, err := filepath.Abs(strings.TrimSuffix(config.Signer.SignerPath, "signer.json"))
-if err != nil {
+// Resolve signer path; allow absolute, relative to node root, or relative to CWD if resolution fails
+signerPath := strings.TrimSpace(config.Signer.SignerPath)
+if signerPath == "" {
+return nil, fmt.Errorf("signer_path is required when using local file signer")
+}
+signerPath = strings.TrimSuffix(signerPath, "signer.json")
+if signerPath == "" {
+signerPath = "."
+}
+signerPath, err := filepath.Abs(filepath.Clean(signerPath))
+if err != nil {
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/signer/factory.go` around lines 33 - 35, Reject empty signer_path in the
signer factory before calling filepath.Abs: retrieve the raw value used
(config.Signer.SignerPath or the trimmed value from
strings.TrimSuffix(config.Signer.SignerPath, "signer.json")), check if it is
empty (after trimming) and return a clear error instead of calling filepath.Abs;
only call filepath.Abs when the trimmed signer path is non-empty and then
continue with existing resolution logic that assigns to signerPath. Ensure the
validation happens in the same initialization area where signerPath is computed
(the code using filepath.Abs and signerPath) so misconfiguration cannot fall
back to the process CWD.
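
The validation order can be exercised in isolation. The sketch below assumes a hypothetical standalone function `resolveSignerDir` (the factory itself inlines this logic) and uses a placeholder path; it follows the proposed fix rather than the code currently in the PR.

```go
package main

import (
	"errors"
	"fmt"
	"path/filepath"
	"strings"
)

// resolveSignerDir is a hypothetical extraction of the path handling in the
// signer factory. It rejects empty or whitespace-only input before
// filepath.Abs can silently resolve it to the working directory.
func resolveSignerDir(raw string) (string, error) {
	p := strings.TrimSpace(raw)
	if p == "" {
		return "", errors.New("signer_path is required when using local file signer")
	}
	// Accept either a directory or a path ending in signer.json.
	p = strings.TrimSuffix(p, "signer.json")
	if p == "" {
		// The user explicitly wrote "signer.json", so the current
		// directory is an intentional choice here, not a fallback.
		p = "."
	}
	return filepath.Abs(filepath.Clean(p))
}

func main() {
	// The last entry is a placeholder path, not one used by the project.
	inputs := []string{"", "   ", "signer.json", "/var/lib/evnode/config/signer.json"}
	for _, in := range inputs {
		dir, err := resolveSignerDir(in)
		fmt.Printf("%q -> %q, err=%v\n", in, dir, err)
	}
}
```

Only the first two inputs fail; a bare `signer.json` still resolves to the working directory because that was requested explicitly.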

* Add remote signer GCP KMS

* Review feedback

* Minor updates
@alpe alpe changed the title feat: Add AWS KMS signer backend feat: Add KMS signer backend Mar 20, 2026

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 3

🧹 Nitpick comments (4)
docs/guides/operations/aws-kms-signer.md (1)

12-33: Add an explicit least-privilege warning for the example IAM policy.

The example policy is broad (Resource: "*") and includes key-management actions. Please label it as bootstrap/admin-only and recommend scoped runtime permissions (already partially covered in lines 35–38) to reduce accidental over-permissioning in production.

Also applies to: 35-38

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/guides/operations/aws-kms-signer.md` around lines 12 - 33, Add a clear
least-privilege warning above the example IAM policy JSON that currently uses
"Resource": "*" and includes key-management actions (e.g., "kms:CreateKey",
"kms:PutKeyPolicy", "kms:EnableKey", "kms:Sign"); label this policy as
bootstrap/admin-only and explicitly state it must not be used in production,
then reference and recommend the scoped runtime permissions described later (the
smaller set in the docs) as the safe alternative and suggest replacing wildcard
Resource with specific KMS key ARNs for production.
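
To make the contrast with the bootstrap policy concrete, here is a rough sketch that renders a runtime-only policy for a single key. The two actions come from the review's stated requirements (`kms:GetPublicKey` and `kms:Sign`); the `Sid`, the type names, and the ARN are placeholders chosen for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// statement and policy mirror the IAM policy JSON shape used in the guide.
type statement struct {
	Sid      string   `json:"Sid"`
	Effect   string   `json:"Effect"`
	Action   []string `json:"Action"`
	Resource string   `json:"Resource"`
}

type policy struct {
	Version   string      `json:"Version"`
	Statement []statement `json:"Statement"`
}

// runtimePolicy returns a policy limited to the two actions the signer needs
// at runtime, scoped to one key ARN instead of "*".
func runtimePolicy(keyARN string) policy {
	return policy{
		Version: "2012-10-17",
		Statement: []statement{{
			Sid:      "AllowSignerRuntime", // hypothetical Sid
			Effect:   "Allow",
			Action:   []string{"kms:GetPublicKey", "kms:Sign"},
			Resource: keyARN,
		}},
	}
}

func main() {
	// Placeholder ARN; substitute the ARN of the real signing key.
	p := runtimePolicy("arn:aws:kms:us-east-1:111122223333:key/EXAMPLE-KEY-ID")
	out, err := json.MarshalIndent(p, "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

The key-creation actions from the bootstrap policy are intentionally absent, since a running node only reads the public key and signs.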
test/e2e/evm_kms_e2e_test.go (1)

32-32: Calling flag.Parse() multiple times can cause issues

flag.Parse() is called in both TestEvmSequencerWithAWSKMSSignerE2E (line 32) and TestEvmSequencerWithGCPKMSSignerE2E (line 72). When both tests run in the same test binary execution, this can cause problems as flags are already parsed. The Go testing framework already handles flag parsing before tests run.

♻️ Remove redundant flag.Parse() calls
 func TestEvmSequencerWithAWSKMSSignerE2E(t *testing.T) {
 	if testing.Short() {
 		t.Skip("skip e2e in short mode")
 	}
-	flag.Parse()

 	kmsKeyID := os.Getenv("EVNODE_E2E_AWS_KMS_KEY_ID")
 func TestEvmSequencerWithGCPKMSSignerE2E(t *testing.T) {
 	if testing.Short() {
 		t.Skip("skip e2e in short mode")
 	}
-	flag.Parse()

 	kmsKeyName := os.Getenv("EVNODE_E2E_GCP_KMS_KEY_NAME")

Also applies to: 72-72

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/e2e/evm_kms_e2e_test.go` at line 32, The tests call flag.Parse()
redundantly in TestEvmSequencerWithAWSKMSSignerE2E and
TestEvmSequencerWithGCPKMSSignerE2E which can break when the test binary parses
flags once; remove the flag.Parse() calls from both test functions (search for
TestEvmSequencerWithAWSKMSSignerE2E and TestEvmSequencerWithGCPKMSSignerE2E) so
the standard Go test harness handles flag parsing, and run go test to confirm no
other tests rely on manual flag parsing.
pkg/signer/gcp/signer.go (2)

262-272: Inconsistent mutex pattern between GetPublic and GetAddress

GetPublic manually unlocks before the nil check while GetAddress uses defer s.mu.RUnlock(). For consistency and safety, consider using defer in both methods.

♻️ Use defer for consistency
 func (s *KmsSigner) GetPublic() (crypto.PubKey, error) {
 	s.mu.RLock()
-	pubKey := s.pubKey
-	s.mu.RUnlock()
+	defer s.mu.RUnlock()
+	pubKey := s.pubKey

 	if pubKey == nil {
 		return nil, fmt.Errorf("public key not loaded")
 	}

 	return pubKey, nil
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/signer/gcp/signer.go` around lines 262 - 272, GetPublic currently
acquires a read lock and manually unlocks before checking pubKey; change it to
mirror GetAddress by calling s.mu.RLock() then immediately deferring
s.mu.RUnlock(), assign pubKey := s.pubKey, and perform the nil check after the
defer so the lock-unlock pattern is consistent and safe (update the GetPublic
method accordingly to use defer s.mu.RUnlock()).

194-201: Consider using defer cancel() for the per-call timeout context

The cancel() function is called immediately after the API call, but if there's a panic between context creation and the cancel call, the context could leak. Using defer is the idiomatic pattern.

♻️ Use defer for cancel()
 	callCtx, cancel := context.WithTimeout(ctx, timeout)
+	defer cancel()
 	dataCRC32C := int64(crc32.Checksum(message, castagnoliTable))
 	out, err := s.client.AsymmetricSign(callCtx, &kmspb.AsymmetricSignRequest{
 		Name:       s.keyName,
 		Data:       message,
 		DataCrc32C: wrapperspb.Int64(dataCRC32C),
 	})
-	cancel()
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/signer/gcp/signer.go` around lines 194 - 201, The context cancel function
created by callCtx, cancel := context.WithTimeout(ctx, timeout) is invoked
immediately with cancel() which can leak the context on panic; after creating
callCtx and cancel, replace the immediate cancel() call with defer cancel() so
the context is always cancelled when the surrounding function returns (keep the
call to s.client.AsymmetricSign using callCtx and the DataCrc32C wrapper as-is).
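
One caveat worth sketching: if the sign call sits inside a retry loop (an assumption about the surrounding code, which is not shown here), a bare `defer cancel()` in the loop body only runs when the whole function returns. Moving the attempt into a small helper keeps the `defer` idiom while still releasing each context promptly. The names `signFunc`, `callWithTimeout`, `signWithRetry`, and `demo` are hypothetical.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// signFunc is a hypothetical stand-in for a single KMS sign call.
type signFunc func(ctx context.Context, msg []byte) ([]byte, error)

// callWithTimeout runs one attempt under its own deadline. Because the defer
// lives in this helper, cancel runs when the attempt returns (or panics),
// not when the surrounding retry loop finishes.
func callWithTimeout(ctx context.Context, timeout time.Duration, msg []byte, sign signFunc) ([]byte, error) {
	callCtx, cancel := context.WithTimeout(ctx, timeout)
	defer cancel()
	return sign(callCtx, msg)
}

// signWithRetry retries up to attempts times; backoff is omitted for brevity.
func signWithRetry(ctx context.Context, msg []byte, attempts int, timeout time.Duration, sign signFunc) ([]byte, error) {
	if attempts < 1 {
		attempts = 1
	}
	var lastErr error
	for i := 0; i < attempts; i++ {
		sig, err := callWithTimeout(ctx, timeout, msg, sign)
		if err == nil {
			return sig, nil
		}
		lastErr = err
	}
	return nil, fmt.Errorf("sign failed after %d attempts: %w", attempts, lastErr)
}

// demo exercises the retry path with a fake signer that fails twice.
func demo() (string, int, error) {
	calls := 0
	flaky := func(ctx context.Context, msg []byte) ([]byte, error) {
		calls++
		if calls < 3 {
			return nil, errors.New("transient failure")
		}
		return append([]byte("sig:"), msg...), nil
	}
	sig, err := signWithRetry(context.Background(), []byte("hello"), 4, time.Second, flaky)
	return string(sig), calls, err
}

func main() {
	sig, calls, err := demo()
	fmt.Println(sig, calls, err)
}
```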
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@apps/evm/go.mod`:
- Line 239: Update the grpc dependency entry to google.golang.org/grpc v1.79.3
in the go.mod (replace the existing google.golang.org/grpc v1.79.2 // indirect
line), then run go mod tidy (or go get google.golang.org/grpc@v1.79.3) to update
the lockfiles and verify module downloads; this addresses the CVE by ensuring
the project uses the patched grpc version.

In `@docs/guides/operations/aws-kms-signer.md`:
- Around line 72-75: Remove the misleading expected startup log "initialized AWS
KMS signer via factory" from the docs and either delete that sentence or replace
it with the actual log emitted by the AWS signer initialization path; update the
guide text that references the signer factory so it reflects the real behavior
of the AWS signer initialization path (see aws signer implementation in
pkg/signer/aws/signer.go and the factory behavior in pkg/signer/factory.go) so
operators are not told to expect a nonexistent log line.

In `@go.mod`:
- Line 48: The go.mod currently pins the vulnerable module
"google.golang.org/grpc v1.79.2"; update that dependency to a patched gRPC
release that fixes GHSA-p77j-4mvh-x3m3 (e.g., upgrade to a patched version >=
v1.79.3 or the latest patched release), run "go get
google.golang.org/grpc@<patched-version>" and "go mod tidy", run tests, and
ensure any other references (such as the duplicate pin in apps/evm/go.mod) are
likewise updated and no replace directives or transitive dependencies re-pin the
vulnerable v1.79.2.

---

Nitpick comments:
In `@docs/guides/operations/aws-kms-signer.md`:
- Around line 12-33: Add a clear least-privilege warning above the example IAM
policy JSON that currently uses "Resource": "*" and includes key-management
actions (e.g., "kms:CreateKey", "kms:PutKeyPolicy", "kms:EnableKey",
"kms:Sign"); label this policy as bootstrap/admin-only and explicitly state it
must not be used in production, then reference and recommend the scoped runtime
permissions described later (the smaller set in the docs) as the safe
alternative and suggest replacing wildcard Resource with specific KMS key ARNs
for production.

In `@pkg/signer/gcp/signer.go`:
- Around line 262-272: GetPublic currently acquires a read lock and manually
unlocks before checking pubKey; change it to mirror GetAddress by calling
s.mu.RLock() then immediately deferring s.mu.RUnlock(), assign pubKey :=
s.pubKey, and perform the nil check after the defer so the lock-unlock pattern
is consistent and safe (update the GetPublic method accordingly to use defer
s.mu.RUnlock()).
- Around line 194-201: The context cancel function created by callCtx, cancel :=
context.WithTimeout(ctx, timeout) is invoked immediately with cancel() which can
leak the context on panic; after creating callCtx and cancel, replace the
immediate cancel() call with defer cancel() so the context is always cancelled
when the surrounding function returns (keep the call to s.client.AsymmetricSign
using callCtx and the DataCrc32C wrapper as-is).

In `@test/e2e/evm_kms_e2e_test.go`:
- Line 32: The tests call flag.Parse() redundantly in
TestEvmSequencerWithAWSKMSSignerE2E and TestEvmSequencerWithGCPKMSSignerE2E
which can break when the test binary parses flags once; remove the flag.Parse()
calls from both test functions (search for TestEvmSequencerWithAWSKMSSignerE2E
and TestEvmSequencerWithGCPKMSSignerE2E) so the standard Go test harness handles
flag parsing, and run go test to confirm no other tests rely on manual flag
parsing.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: d986867e-d2c9-40c3-968d-ac865e15dbe4

📥 Commits

Reviewing files that changed from the base of the PR and between 5363cdb and 5006360.

⛔ Files ignored due to path filters (6)
  • apps/evm/go.sum is excluded by !**/*.sum
  • apps/grpc/go.sum is excluded by !**/*.sum
  • apps/testapp/go.sum is excluded by !**/*.sum
  • execution/evm/test/go.sum is excluded by !**/*.sum
  • go.sum is excluded by !**/*.sum
  • test/e2e/go.sum is excluded by !**/*.sum
📒 Files selected for processing (19)
  • apps/evm/go.mod
  • apps/grpc/go.mod
  • apps/testapp/go.mod
  • docs/guides/operations/aws-kms-signer.md
  • docs/guides/operations/gcp-kms-signer.md
  • execution/evm/test/go.mod
  • go.mod
  • pkg/cmd/run_node.go
  • pkg/config/config.go
  • pkg/config/config_test.go
  • pkg/config/defaults.go
  • pkg/signer/aws/README.md
  • pkg/signer/factory.go
  • pkg/signer/factory_test.go
  • pkg/signer/gcp/README.md
  • pkg/signer/gcp/signer.go
  • pkg/signer/gcp/signer_test.go
  • test/e2e/evm_kms_e2e_test.go
  • test/e2e/go.mod
✅ Files skipped from review due to trivial changes (5)
  • execution/evm/test/go.mod
  • docs/guides/operations/gcp-kms-signer.md
  • pkg/signer/gcp/README.md
  • pkg/signer/aws/README.md
  • pkg/signer/factory_test.go
🚧 Files skipped from review as they are similar to previous changes (8)
  • pkg/config/config_test.go
  • pkg/config/defaults.go
  • apps/grpc/go.mod
  • test/e2e/go.mod
  • apps/testapp/go.mod
  • pkg/signer/factory.go
  • pkg/cmd/run_node.go
  • pkg/config/config.go

Comment on lines +72 to +75
You should see a startup log line:

`initialized AWS KMS signer via factory`

⚠️ Potential issue | 🟠 Major

Remove or replace the expected startup log line; it does not exist in current code paths.

Lines 72–75 tell operators to expect `initialized AWS KMS signer via factory`, but neither the AWS signer initialization path nor the signer factory path emits that log (pkg/signer/aws/signer.go:52-110, pkg/signer/factory.go:50-77). This creates a false negative during bring-up.

Proposed doc fix
-You should see a startup log line:
-
-`initialized AWS KMS signer via factory`
+Confirm startup succeeded by checking that `evnode` starts without signer initialization errors.
+If startup fails, use the exact errors in the troubleshooting section below.
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-You should see a startup log line:
-`initialized AWS KMS signer via factory`
+Confirm startup succeeded by checking that `evnode` starts without signer initialization errors.
+If startup fails, use the exact errors in the troubleshooting section below.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/guides/operations/aws-kms-signer.md` around lines 72 - 75, Remove the
misleading expected startup log "initialized AWS KMS signer via factory" from
the docs and either delete that sentence or replace it with the actual log
emitted by the AWS signer initialization path; update the guide text that
references the signer factory so it reflects the real behavior of the AWS signer
initialization path (see aws signer implementation in pkg/signer/aws/signer.go
and the factory behavior in pkg/signer/factory.go) so operators are not told to
expect a nonexistent log line.

alpe added 2 commits March 23, 2026 10:07
* main:
  build(deps): Bump rustls-webpki from 0.103.8 to 0.103.10 in the cargo group across 1 directory (#3184)
  refactor: replace context.WithCancel with t.Context (#3183)
  build(deps): Bump google.golang.org/grpc from 1.79.2 to 1.79.3 (#3181)

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

♻️ Duplicate comments (2)
pkg/config/config.go (1)

315-315: ⚠️ Potential issue | 🟡 Minor

Minor inconsistency in signer type documentation persists.

The struct field comment on line 315 lists (file, kms) but the flag description on line 622 lists (file, grpc, kms). These should be consistent. The previous review flagged a similar inconsistency that was marked as addressed.

📝 Proposed fix

Either update the struct comment to include grpc if it's a valid type:

-	SignerType string          `mapstructure:"signer_type" yaml:"signer_type" comment:"Type of remote signer to use (file, kms)"`
+	SignerType string          `mapstructure:"signer_type" yaml:"signer_type" comment:"Type of remote signer to use (file, grpc, kms)"`

Or update the flag description to remove grpc if it's not supported:

-	cmd.Flags().String(FlagSignerType, def.Signer.SignerType, "type of signer to use (file, grpc, kms)")
+	cmd.Flags().String(FlagSignerType, def.Signer.SignerType, "type of signer to use (file, kms)")

Also applies to: 622-622

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/config/config.go` at line 315, The SignerType struct field comment on
SignerType (`SignerType string  'mapstructure:"signer_type" yaml:"signer_type"
comment:"Type of remote signer to use (file, kms)"'`) is inconsistent with the
flag description that lists `(file, grpc, kms)`; make them consistent by either
adding `grpc` to the struct field comment or removing `grpc` from the flag
description (whichever reflects actual supported signer types), and update both
the SignerType field comment and the corresponding flag description text so both
list the same valid signer types.
pkg/signer/aws/signer.go (1)

31-37: ⚠️ Potential issue | 🟡 Minor

Align the implementation with the documented default timeout.

Options documents the default sign timeout as 10s (line 33), but kmsSignerFromClient falls back to 1s when opts is nil or Timeout <= 0 (line 87). This makes the behavior more aggressive than advertised.

📝 Proposed fix
-	o := Options{Timeout: 1 * time.Second, MaxRetries: 3}
+	o := Options{Timeout: 10 * time.Second, MaxRetries: 3}

Also applies to: 87-95

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/signer/aws/signer.go` around lines 31 - 37, The Options struct documents
a default Timeout of 10s but kmsSignerFromClient currently falls back to 1s when
opts is nil or Timeout <= 0; update kmsSignerFromClient to use 10*time.Second as
the fallback/default for Options.Timeout (and keep existing handling for opts ==
nil and Timeout <= 0), referencing the Options type and the kmsSignerFromClient
function so the behavior matches the documented default; also scan the same
initialization logic in that function (the block handling opts.Timeout and
opts.MaxRetries) to ensure the MaxRetries default remains unchanged or is
explicitly set if needed.
🧹 Nitpick comments (1)
pkg/signer/aws/signer.go (1)

164-172: Unbounded exponential backoff may cause excessive delays.

The backoff calculation 100<<uint(attempt-1) grows without a cap. For MaxRetries=3 (4 attempts), the delays are 100ms, 200ms, and 400ms, which is fine. However, with a higher MaxRetries, attempt 10 would compute ~51 seconds of backoff (100ms << 9 = 51.2s).

The GCP signer implementation (lines 225-243) includes a maxRetryBackoff = 5 * time.Second cap. Consider adding the same safeguard here for consistency.

📝 Proposed fix
+const maxRetryBackoff = 5 * time.Second
+
 func (s *KmsSigner) Sign(ctx context.Context, message []byte) ([]byte, error) {
 	...
 	for attempt := 0; attempt < maxAttempts; attempt++ {
 		...
 		if attempt > 0 {
-			// Exponential backoff: 100ms, 200ms, 400ms, ...
-			backoff := time.Duration(100<<uint(attempt-1)) * time.Millisecond
+			// Exponential backoff with cap: 100ms, 200ms, 400ms, ... up to 5s.
+			backoff := time.Duration(100<<uint(attempt-1)) * time.Millisecond
+			if backoff > maxRetryBackoff {
+				backoff = maxRetryBackoff
+			}
 			select {
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/signer/aws/signer.go` around lines 164 - 172, The exponential backoff in
the KMS signing retry loop (where attempt > 0 and backoff is computed as
time.Duration(100<<uint(attempt-1))*time.Millisecond) is unbounded and can
produce excessive delays for large MaxRetries; update the retry backoff to cap
the computed value at a reasonable maximum (e.g. maxRetryBackoff = 5 *
time.Second) before using time.After, so compute backoff, if backoff >
maxRetryBackoff set backoff = maxRetryBackoff, then use the capped backoff in
the select along with the existing ctx cancellation handling in the Sign/KMS
retry loop.
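
The capped schedule can be isolated into a small pure function, which also makes it easy to unit test. This is a sketch under assumptions: `retryBackoff` and `baseRetryBackoff` are hypothetical names, and the 5s cap follows the value the review attributes to the GCP signer.

```go
package main

import (
	"fmt"
	"time"
)

const (
	baseRetryBackoff = 100 * time.Millisecond // hypothetical name
	maxRetryBackoff  = 5 * time.Second        // cap suggested in the review
)

// retryBackoff returns the delay before the given retry attempt (1-based):
// 100ms, 200ms, 400ms, ... capped at maxRetryBackoff.
func retryBackoff(attempt int) time.Duration {
	if attempt < 1 {
		return 0
	}
	// 100ms << 6 is 6.4s, which already exceeds the cap. Bounding the shift
	// here also avoids integer overflow for very large attempt values.
	if attempt-1 >= 6 {
		return maxRetryBackoff
	}
	backoff := baseRetryBackoff << uint(attempt-1)
	if backoff > maxRetryBackoff {
		backoff = maxRetryBackoff
	}
	return backoff
}

func main() {
	for attempt := 1; attempt <= 10; attempt++ {
		fmt.Printf("attempt %d: %v\n", attempt, retryBackoff(attempt))
	}
}
```

Guarding the shift count matters independently of the cap: a bare `100<<uint(attempt-1)` wraps around for large attempt values, and a wrapped result could slip past a simple `backoff > maxRetryBackoff` comparison.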
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@pkg/signer/gcp/signer.go`:
- Around line 53-66: The documented default timeout for Options is 10s but
kmsSignerFromClient currently falls back to 1s; update the logic so the
effective timeout matches the doc: make Options.timeout() return 10*time.Second
when o is nil or o.Timeout <= 0 (or otherwise ensure kmsSignerFromClient uses
that 10s default), and keep maxRetries behavior consistent (use
Options.maxRetries() or a similar default if nil). Update only the timeout
fallback (change 1s -> 10s) by modifying the timeout() helper or the fallback in
kmsSignerFromClient to use 10*time.Second.

---

Duplicate comments:
In `@pkg/config/config.go`:
- Line 315: The SignerType struct field comment on SignerType (`SignerType
string  'mapstructure:"signer_type" yaml:"signer_type" comment:"Type of remote
signer to use (file, kms)"'`) is inconsistent with the flag description that
lists `(file, grpc, kms)`; make them consistent by either adding `grpc` to the
struct field comment or removing `grpc` from the flag description (whichever
reflects actual supported signer types), and update both the SignerType field
comment and the corresponding flag description text so both list the same valid
signer types.

In `@pkg/signer/aws/signer.go`:
- Around line 31-37: The Options struct documents a default Timeout of 10s but
kmsSignerFromClient currently falls back to 1s when opts is nil or Timeout <= 0;
update kmsSignerFromClient to use 10*time.Second as the fallback/default for
Options.Timeout (and keep existing handling for opts == nil and Timeout <= 0),
referencing the Options type and the kmsSignerFromClient function so the
behavior matches the documented default; also scan the same initialization logic
in that function (the block handling opts.Timeout and opts.MaxRetries) to ensure
the MaxRetries default remains unchanged or is explicitly set if needed.

---

Nitpick comments:
In `@pkg/signer/aws/signer.go`:
- Around line 164-172: The exponential backoff in the KMS signing retry loop
(where attempt > 0 and backoff is computed as
time.Duration(100<<uint(attempt-1))*time.Millisecond) is unbounded and can
produce excessive delays for large MaxRetries; update the retry backoff to cap
the computed value at a reasonable maximum (e.g. maxRetryBackoff = 5 *
time.Second) before using time.After, so compute backoff, if backoff >
maxRetryBackoff set backoff = maxRetryBackoff, then use the capped backoff in
the select along with the existing ctx cancellation handling in the Sign/KMS
retry loop.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: c27ce2dd-c0f3-4b55-8419-0c29ec16f3b3

📥 Commits

Reviewing files that changed from the base of the PR and between d575881 and b4646ea.

📒 Files selected for processing (9)
  • docs/guides/operations/aws-kms-signer.md
  • pkg/config/config.go
  • pkg/signer/aws/signer.go
  • pkg/signer/aws/signer_test.go
  • pkg/signer/factory.go
  • pkg/signer/gcp/signer.go
  • test/e2e/evm_kms_e2e_test.go
  • types/utils.go
  • types/utils_test.go
✅ Files skipped from review due to trivial changes (1)
  • docs/guides/operations/aws-kms-signer.md
🚧 Files skipped from review as they are similar to previous changes (3)
  • types/utils_test.go
  • pkg/signer/aws/signer_test.go
  • pkg/signer/factory.go

Comment on lines +53 to +66
// Options configures optional KmsSigner behavior.
type Options struct {
// CredentialsFile is an optional path to a Google credentials JSON file.
// If empty, Application Default Credentials are used.
CredentialsFile string
// Timeout for individual KMS Sign API calls. Default: 10s.
Timeout time.Duration
// MaxRetries for transient KMS failures during Sign. Default: 3.
MaxRetries int
}

func (o *Options) timeout() time.Duration { return o.Timeout }

func (o *Options) maxRetries() int { return o.MaxRetries }

⚠️ Potential issue | 🟡 Minor

Align the implementation with the documented default timeout.

Options documents the default sign timeout as 10s (line 58), but kmsSignerFromClient falls back to 1s when opts is nil or Timeout <= 0 (line 108). This makes the behavior more aggressive than advertised.

📝 Proposed fix
-	o := Options{Timeout: 1 * time.Second, MaxRetries: 3}
+	o := Options{Timeout: 10 * time.Second, MaxRetries: 3}

Also applies to: 108-108

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/signer/gcp/signer.go` around lines 53 - 66, The documented default
timeout for Options is 10s but kmsSignerFromClient currently falls back to 1s;
update the logic so the effective timeout matches the doc: make
Options.timeout() return 10*time.Second when o is nil or o.Timeout <= 0 (or
otherwise ensure kmsSignerFromClient uses that 10s default), and keep maxRetries
behavior consistent (use Options.maxRetries() or a similar default if nil).
Update only the timeout fallback (change 1s -> 10s) by modifying the timeout()
helper or the fallback in kmsSignerFromClient to use 10*time.Second.
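
One way to apply the `timeout()` variant of this fix is to make the accessors nil-safe so callers never need their own fallback. The sketch below is illustrative only: the constant names are hypothetical, and treating a non-positive `MaxRetries` as "unset" is an assumption rather than confirmed behaviour.

```go
package main

import (
	"fmt"
	"time"
)

// Hypothetical constant names; the values follow the documented defaults.
const (
	defaultSignTimeout = 10 * time.Second
	defaultMaxRetries  = 3
)

// Options mirrors the struct quoted in the review comment.
type Options struct {
	CredentialsFile string
	Timeout         time.Duration
	MaxRetries      int
}

// timeout is safe to call on a nil *Options and never returns a
// non-positive duration.
func (o *Options) timeout() time.Duration {
	if o == nil || o.Timeout <= 0 {
		return defaultSignTimeout
	}
	return o.Timeout
}

// maxRetries is safe to call on a nil *Options.
func (o *Options) maxRetries() int {
	if o == nil || o.MaxRetries <= 0 { // assumption: <= 0 means "unset"
		return defaultMaxRetries
	}
	return o.MaxRetries
}

func main() {
	var unset *Options
	fmt.Println(unset.timeout(), unset.maxRetries())

	custom := &Options{Timeout: 3 * time.Second, MaxRetries: 5}
	fmt.Println(custom.timeout(), custom.maxRetries())
}
```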

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (1)
pkg/signer/gcp/signer.go (1)

36-50: Expose a shutdown path for the KMS client.

kms.NewKeyManagementClient returns a client with Close(), and the package docs say callers should close it when the client is no longer needed. NewKmsSigner owns that client today, but KMSClient/KmsSigner do not expose any close semantics, so these gRPC connections can only be reclaimed by process exit. (pkg.go.dev)

♻️ Suggested direction
 type KMSClient interface {
 	AsymmetricSign(ctx context.Context, req *kmspb.AsymmetricSignRequest) (*kmspb.AsymmetricSignResponse, error)
 	GetPublicKey(ctx context.Context, req *kmspb.GetPublicKeyRequest) (*kmspb.PublicKey, error)
+	Close() error
 }
@@
 func (c *cloudKMSClient) GetPublicKey(ctx context.Context, req *kmspb.GetPublicKeyRequest) (*kmspb.PublicKey, error) {
 	return c.client.GetPublicKey(ctx, req)
 }
+
+func (c *cloudKMSClient) Close() error {
+	return c.client.Close()
+}
@@
 type KmsSigner struct {
 	client  KMSClient
 	keyName string
 	opts    Options
 	mu      sync.RWMutex
 	pubKey  crypto.PubKey
 	address []byte
 }
+
+func (s *KmsSigner) Close() error {
+	if s.client == nil {
+		return nil
+	}
+	return s.client.Close()
+}

Then wire Close() into the node shutdown path where the signer is owned.

Also applies to: 80-95

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/signer/gcp/signer.go` around lines 36 - 50, The KMS client created by
NewKmsSigner isn't being closed; add a Close method to the KMSClient interface
and implement it on cloudKMSClient to call the underlying
kms.KeyManagementClient.Close(); then add a Close method on KmsSigner that
forwards to its KMSClient.Close (or type-asserts to the interface) so callers of
NewKmsSigner can cleanly shutdown the gRPC connection from the node shutdown
path; update NewKmsSigner users to call KmsSigner.Close when the signer is
disposed.
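
The type-assertion alternative mentioned in the prompt can be sketched without touching the client interface. Everything below uses hypothetical stand-in types (`signerClient`, `kmsSigner`, `fakeClient`, `plainClient`); the real `KMSClient` has signing methods that are omitted here.

```go
package main

import (
	"fmt"
	"io"
)

// signerClient is a hypothetical stand-in for the KMSClient interface.
type signerClient interface {
	Name() string
}

// kmsSigner is a hypothetical stand-in for KmsSigner.
type kmsSigner struct {
	client signerClient
}

// Close releases the underlying client when it implements io.Closer and is
// a no-op otherwise, so test doubles without Close keep compiling.
func (s *kmsSigner) Close() error {
	if s == nil || s.client == nil {
		return nil
	}
	if c, ok := s.client.(io.Closer); ok {
		return c.Close()
	}
	return nil
}

// fakeClient records whether Close was called.
type fakeClient struct{ closed bool }

func (f *fakeClient) Name() string { return "fake" }
func (f *fakeClient) Close() error { f.closed = true; return nil }

// plainClient has no Close method at all.
type plainClient struct{}

func (plainClient) Name() string { return "plain" }

func main() {
	f := &fakeClient{}
	fmt.Println((&kmsSigner{client: f}).Close(), f.closed)
	fmt.Println((&kmsSigner{client: plainClient{}}).Close())
}
```

Adding `Close() error` to the interface, as in the suggested diff, is stricter and forces every implementation to decide; the assertion-based variant trades that guarantee for less churn in existing mocks.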
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@pkg/signer/gcp/signer.go`:
- Around line 133-147: The fetchPublicKey and verifySignResponse paths must
validate KMS response names and CRC32C integrity fields: in fetchPublicKey
(function fetchPublicKey, call s.client.GetPublicKey and examine the returned
out) verify out.GetName() exactly matches s.keyName and is non-empty, compute
the CRC32C (Castagnoli) of out.GetPem() and compare it to the returned
out.GetPemCrc32C (or equivalent CRC field) and return an error on mismatch
before parsing the PEM; similarly, in verifySignResponse (the handler that
consumes the SignResponse/VerifyResponse), validate response.GetName() matches
s.keyName, compute CRC32C of the data that was signed and compare to
response.GetVerifiedDataCrc32C and compute CRC32C of response.GetSignature() and
compare to response.GetSignatureCrc32C (or their protobuf wrappers), returning
an error if any mismatch or missing fields are detected so corrupted or
mismatched KMS responses are rejected and never cached/accepted.

---

Nitpick comments:
In `@pkg/signer/gcp/signer.go`:
- Around line 36-50: The KMS client created by NewKmsSigner isn't being closed;
add a Close method to the KMSClient interface and implement it on cloudKMSClient
to call the underlying kms.KeyManagementClient.Close(); then add a Close method
on KmsSigner that forwards to its KMSClient.Close (or type-asserts to the
interface) so callers of NewKmsSigner can cleanly shutdown the gRPC connection
from the node shutdown path; update NewKmsSigner users to call KmsSigner.Close
when the signer is disposed.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 649390a9-8b6d-43a8-b65a-997d2e56b437

📥 Commits

Reviewing files that changed from the base of the PR and between b4646ea and aedf192.

⛔ Files ignored due to path filters (1)
  • execution/evm/test/go.sum is excluded by !**/*.sum
📒 Files selected for processing (3)
  • docs/guides/operations/gcp-kms-signer.md
  • execution/evm/test/go.mod
  • pkg/signer/gcp/signer.go
✅ Files skipped from review due to trivial changes (2)
  • execution/evm/test/go.mod
  • docs/guides/operations/gcp-kms-signer.md

Comment on lines +133 to +147
func (s *KmsSigner) fetchPublicKey(ctx context.Context) error {
	out, err := s.client.GetPublicKey(ctx, &kmspb.GetPublicKeyRequest{Name: s.keyName})
	if err != nil {
		return fmt.Errorf("KMS GetPublicKey failed: %w", err)
	}

	block, _ := pem.Decode([]byte(out.GetPem()))
	if block == nil {
		return fmt.Errorf("failed to decode PEM public key")
	}

	pub, err := x509.ParsePKIXPublicKey(block.Bytes)
	if err != nil {
		return fmt.Errorf("failed to parse KMS public key: %w", err)
	}

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

git ls-files | grep -E "signer|kms" | head -20

Repository: evstack/ev-node

Length of output: 640


🏁 Script executed:

fd "signer.go" --type f

Repository: evstack/ev-node

Length of output: 171


🏁 Script executed:

wc -l pkg/signer/gcp/signer.go 2>/dev/null || echo "File not found, searching..."

Repository: evstack/ev-node

Length of output: 87


🏁 Script executed:

find . -name "signer.go" -path "*/gcp/*" 2>/dev/null

Repository: evstack/ev-node

Length of output: 85


🏁 Script executed:

cat -n pkg/signer/gcp/signer.go

Repository: evstack/ev-node

Length of output: 10499


🏁 Script executed:

rg "kmspb\." pkg/signer/gcp/signer.go | head -20

Repository: evstack/ev-node

Length of output: 745


🏁 Script executed:

cat -n pkg/signer/gcp/signer_test.go | head -100

Repository: evstack/ev-node

Length of output: 3973


🏁 Script executed:

rg "GetName|GetPemCrc32C" pkg/signer/gcp/ 2>/dev/null || echo "Not found in gcp package"

Repository: evstack/ev-node

Length of output: 83


🌐 Web query:

Google Cloud KMS data integrity guidelines GetPublicKey response validation

💡 Result:

Google Cloud KMS provides data integrity guidelines for the GetPublicKey response through CRC32C checksum verification and resource name matching. These are detailed in the official documentation at https://cloud.google.com/kms/docs/data-integrity-guidelines and the API reference.

Key validation steps for the PublicKey response:

1. Verify that the 'name' field matches the requested CryptoKeyVersion resource name (e.g., projects/*/locations/*/keyRings/*/cryptoKeys/*/cryptoKeyVersions/*). A mismatch indicates request corruption in-transit.
2. Compute the CRC32C checksum of the 'pem' field (for non-PQC keys) and compare it to 'pemCrc32c.value'. For a specified publicKeyFormat (PQC/X-Wing), verify the CRC32C of 'publicKey.data' against 'publicKey.crc32c_checksum.value'.

If the checksums mismatch, discard the response and retry a limited number of times. A persistent mismatch may indicate a client-side CRC32C computation error.

Code examples from the docs (Go, Node.js, Python) demonstrate this:

Go:
crc32c := func(data []byte) uint32 {
	t := crc32.MakeTable(crc32.Castagnoli)
	return crc32.Checksum(data, t)
}
if int64(crc32c([]byte(result.Pem))) != result.PemCrc32c.Value {
	return fmt.Errorf("getPublicKey: response corrupted in-transit")
}

Node.js:
const crc32c = require('fast-crc32c');
if (crc32c.calculate(publicKey.pem) !== Number(publicKey.pemCrc32c.value)) {
	throw new Error('GetPublicKey: response corrupted in-transit');
}

Python:
if not public_key.pem_crc32c == crc32c(public_key.pem.encode("utf-8")):
	raise Exception("The response received from the server was corrupted in-transit.")

These checks ensure end-to-end integrity against client-side corruption or transit issues, supplementing Google Cloud's automatic protections.


🏁 Script executed:

grep -n "GetName\|GetPemCrc32C\|GetVerifiedDataCrc32C\|GetSignatureCrc32C" pkg/signer/gcp/signer.go

Repository: evstack/ev-node

Length of output: 146


🏁 Script executed:

rg "type PublicKey\|type AsymmetricSignResponse" --type go | head -20

Repository: evstack/ev-node

Length of output: 41


🏁 Script executed:

rg -A 5 "GetSignatureCrc32C\(\)" pkg/signer/gcp/signer.go

Repository: evstack/ev-node

Length of output: 253


🏁 Script executed:

python3 << 'EOF'
# Verify the proposed fix logic makes sense
# The code shows signature_crc32c is validated:
# signatureCRC32C := out.GetSignatureCrc32C()
# if signatureCRC32C == nil { return error }
# if signatureCRC32C.GetValue() != expectedCRC32C { return error }

# The proposed fix follows the exact same pattern for pem_crc32c:
# if out.GetPemCrc32C() == nil { return error }
# if got, want := out.GetPemCrc32C().GetValue(), int64(crc32.Checksum(...)...)
# if got != want { return error }

# This is consistent pattern - the fix is sound.
print("Pattern validation: CONSISTENT - proposed fix mirrors existing signature_crc32c validation")
print("\nGoogle Cloud KMS documentation confirms:")
print("1. Name field validation is required for GetPublicKey and AsymmetricSign")
print("2. pem_crc32c validation is required for GetPublicKey response")
print("3. These prevent in-transit corruption and request/response mismatches")
EOF

Repository: evstack/ev-node

Length of output: 398


Add integrity checks for Cloud KMS response fields.

The fetchPublicKey and verifySignResponse functions must validate the name field (to detect request/response mismatch) and the CRC32C checksum fields as required by Google Cloud KMS data integrity guidelines. Currently, corrupted or mismatched responses can be cached as this node's identity or returned as valid signatures without detection.

🛡️ Proposed fix
 func (s *KmsSigner) fetchPublicKey(ctx context.Context) error {
 	out, err := s.client.GetPublicKey(ctx, &kmspb.GetPublicKeyRequest{Name: s.keyName})
 	if err != nil {
 		return fmt.Errorf("KMS GetPublicKey failed: %w", err)
 	}
+	if out.GetName() != s.keyName {
+		return fmt.Errorf("KMS GetPublicKey integrity check failed: unexpected key name %q", out.GetName())
+	}
+	if out.GetPemCrc32C() == nil {
+		return fmt.Errorf("KMS GetPublicKey integrity check failed: pem_crc32c is missing")
+	}
+	if got, want := out.GetPemCrc32C().GetValue(), int64(crc32.Checksum([]byte(out.GetPem()), castagnoliTable)); got != want {
+		return fmt.Errorf("KMS GetPublicKey integrity check failed: pem_crc32c mismatch")
+	}
 
 	block, _ := pem.Decode([]byte(out.GetPem()))
@@
-		if err := verifySignResponse(out); err != nil {
+		if err := verifySignResponse(out, s.keyName); err != nil {
 			lastErr = err
 			continue
 		}
@@
-func verifySignResponse(out *kmspb.AsymmetricSignResponse) error {
+func verifySignResponse(out *kmspb.AsymmetricSignResponse, expectedName string) error {
+	if out.GetName() != expectedName {
+		return fmt.Errorf("KMS Sign integrity check failed: unexpected key name %q", out.GetName())
+	}
 	if !out.GetVerifiedDataCrc32C() {
 		return fmt.Errorf("KMS Sign integrity check failed: verified_data_crc32c is false")
 	}

Per coding guideline: "Validate all inputs from external sources in Go code". See https://cloud.google.com/kms/docs/data-integrity-guidelines.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/signer/gcp/signer.go` around lines 133 - 147, The fetchPublicKey and
verifySignResponse paths must validate KMS response names and CRC32C integrity
fields: in fetchPublicKey (function fetchPublicKey, call s.client.GetPublicKey
and examine the returned out) verify out.GetName() exactly matches s.keyName and
is non-empty, compute the CRC32C (Castagnoli) of out.GetPem() and compare it to
the returned out.GetPemCrc32C (or equivalent CRC field) and return an error on
mismatch before parsing the PEM; similarly, in verifySignResponse (which
validates the AsymmetricSignResponse), validate that response.GetName() matches
s.keyName, require response.GetVerifiedDataCrc32C() to be true (it is a boolean
the server sets after checking the request's data CRC32C, not a checksum to
recompute), and compute the CRC32C of response.GetSignature() and compare it to
response.GetSignatureCrc32C().GetValue(), returning an error if any field is
missing or mismatched so that corrupted or mismatched KMS responses are rejected
and never cached or accepted.

@alpe requested a review from julienrbrt March 23, 2026 13:21
@github-actions

PR Preview Action v1.8.1

🚀 View preview at
https://evstack.github.io/docs-preview/pr-3171/

Built to branch main at 2026-03-23 16:32 UTC.
Preview will be ready when the GitHub Pages deployment is complete.

Development

Successfully merging this pull request may close these issues.

Feature: Add KMS signer backend