
Add mirror command and API for selective package mirroring#40

Open: andrew wants to merge 1 commit into main from mirror-feature

andrew commented Mar 19, 2026

Adds a `proxy mirror` CLI command and `/api/mirror` REST endpoints for pre-populating the cache from multiple input sources.

Input modes (narrowest to broadest):

  • Versioned PURLs: `proxy mirror pkg:npm/lodash@4.17.21`
  • Unversioned PURLs (all versions): `proxy mirror pkg:npm/lodash`
  • SBOM files: `proxy mirror --sbom sbom.cdx.json` (CycloneDX JSON/XML, SPDX JSON/tag-value)
  • Full registry: `proxy mirror --registry npm` (stub; returns a clear not-yet-implemented error)
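
To illustrate the difference between the first two input modes, here is a minimal sketch of how a PURL argument could be classified as versioned or unversioned. `MirrorTarget` and `ParseTarget` are hypothetical names for this example, not the types in this PR; the sketch ignores qualifiers and subpaths and does no percent-decoding, so a real implementation would likely use a proper PURL library.

```go
package main

import (
	"fmt"
	"strings"
)

// MirrorTarget is a hypothetical type for this sketch only.
type MirrorTarget struct {
	Ecosystem string // PURL type, e.g. "npm"
	Name      string // namespace/name, e.g. "lodash"
	Version   string // empty means "mirror all versions"
}

// AllVersions reports whether the target is an unversioned PURL.
func (t MirrorTarget) AllVersions() bool { return t.Version == "" }

// ParseTarget handles the simple form pkg:type/name[@version].
// Qualifiers (?...) and subpaths (#...) are dropped, not interpreted.
func ParseTarget(purl string) (MirrorTarget, error) {
	if !strings.HasPrefix(purl, "pkg:") {
		return MirrorTarget{}, fmt.Errorf("not a PURL: %q", purl)
	}
	rest := strings.TrimPrefix(purl, "pkg:")
	if i := strings.IndexAny(rest, "?#"); i >= 0 {
		rest = rest[:i]
	}
	typ, path, ok := strings.Cut(rest, "/")
	if !ok || typ == "" || path == "" {
		return MirrorTarget{}, fmt.Errorf("missing type or name: %q", purl)
	}
	name, version := path, ""
	// A literal '@' in a namespace is percent-encoded in PURLs, so the
	// last '@' (if any) separates the name from the version.
	if i := strings.LastIndex(path, "@"); i >= 0 {
		name, version = path[:i], path[i+1:]
	}
	if name == "" {
		return MirrorTarget{}, fmt.Errorf("missing name: %q", purl)
	}
	return MirrorTarget{Ecosystem: typ, Name: name, Version: version}, nil
}
```

With this sketch, `pkg:npm/lodash@4.17.21` yields a single pinned version, while `pkg:npm/lodash` yields a target whose `AllVersions()` is true and would need the version list resolved from upstream metadata.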

Architecture:

  • New `internal/mirror/` package with `Source` interface and implementations
  • Reuses existing `handler.Proxy.GetOrFetchArtifact()` for fetch-and-cache
  • Bounded worker pool via `errgroup` with `--concurrency` flag (default 4)
  • `--dry-run` flag to preview what would be mirrored
  • Async job management for API usage with POST/GET/DELETE endpoints
  • `metadata_cache` database table for offline metadata serving

API endpoints:

  • POST /api/mirror - start a mirror job
  • GET /api/mirror/{id} - check job status
  • DELETE /api/mirror/{id} - cancel a running job

New dependencies:

  • github.com/CycloneDX/cyclonedx-go - CycloneDX SBOM parsing
  • github.com/spdx/tools-golang - SPDX SBOM parsing

Related to #20.

andrew force-pushed the mirror-feature branch 3 times, most recently from 88c9419 to 17a0bbd on March 20, 2026 08:19

Add a `proxy mirror` CLI command and `/api/mirror` API endpoints that
pre-populate the cache from various input sources: individual PURLs,
SBOM files (CycloneDX and SPDX), or full registry enumeration.

The mirror reuses the existing handler.Proxy.GetOrFetchArtifact()
pipeline so cached artifacts are identical to those fetched on demand.
A bounded worker pool controls download parallelism.

Metadata caching is opt-in via `cache_metadata: true` in config (or
PROXY_CACHE_METADATA=true). The mirror command always enables it. When
enabled, upstream metadata responses are stored for offline fallback
with ETag-based conditional revalidation.

New internal/mirror package with Source interface, PURLSource,
SBOMSource, RegistrySource, and async JobStore. New metadata_cache
database table for offline metadata serving.