
feat(platform): source infra config from the app-config file (PLT-475) #402

Merged
bdchatham merged 4 commits into main from plt475/config-file-migration
Jun 12, 2026

@bdchatham

Collaborator

What

Migrates the non-networking platform.Config fields off environment variables into the mounted app-config file (SEI_CONTROLLER_CONFIG -> platform.FileConfig) that #397 established for state-sync syncers. This is PR-1 of 2: the transitional half.

How

  • platform.FileConfig gains grouped infra sections: scheduling, storage, resources, snapshot, resultExport, genesis, images (schema in docs/controller-app-config.md).
  • platform.Load resolves Config at startup with file-wins-else-env semantics: a non-empty file value wins; an absent or empty one falls back to its historical SEI_* env var. An unset SEI_CONTROLLER_CONFIG reproduces the original all-env behavior.
  • platform.ReadFileConfig centralizes the file read/decode; the per-reconcile state-sync reader now shares it, which removes the duplicate os.ReadFile/Unmarshal and preserves the fail-closed-vs-transient distinction.
  • main.go calls platform.Load instead of the inline os.Getenv block.
  • Config.Validate error messages name both the file key and the env var.
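The file-wins-else-env rule can be sketched in a few lines of Go. This is an illustration, not the code from this PR: the helper name fileOrEnv is borrowed from the PR-2 commit message, but its signature here is an assumption, and SEI_EXAMPLE_IMAGE is a hypothetical variable name.

```go
package main

import (
	"fmt"
	"os"
)

// fileOrEnv returns the file value when it is non-empty; otherwise it
// falls back to the named environment variable (which may itself be unset,
// in which case the result is the empty string).
// Assumed signature: the real helper in load.go may differ.
func fileOrEnv(fileVal, envName string) string {
	if fileVal != "" {
		return fileVal
	}
	return os.Getenv(envName)
}

func main() {
	// SEI_EXAMPLE_IMAGE is a hypothetical variable used only for this demo.
	os.Setenv("SEI_EXAMPLE_IMAGE", "env-image:v1")

	// File value present: the file wins.
	fmt.Println(fileOrEnv("file-image:v2", "SEI_EXAMPLE_IMAGE"))

	// File value absent: the env var is used.
	fmt.Println(fileOrEnv("", "SEI_EXAMPLE_IMAGE"))
}
```

Under this rule an unset SEI_CONTROLLER_CONFIG leaves every file value empty, so each field resolves from its env var, which matches the all-env behavior described above.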

Decisions (aligned with the owner)

  • Env-fallback transitional release (not a hard cut): file wins, env fallback. A follow-up PR drops the fallback + the env vars once the sei-controller-config ConfigMap is verified populated in the live env. No flag-day.
  • Networking/gateway config excluded (SEI_GATEWAY_*, SEI_P2P_ENDPOINT_DOMAIN, SEI_NLB_TARGET_TYPE) — stays env-sourced pending its removal from the controller in PLT-451 (GitOps networking). Avoids migrate-then-delete churn.

Read model

Infra fields are read once at startup (an infra change warrants a restart); the stateSync section keeps its per-reconcile hot-reload. Same file, two read paths, by design.

Test

internal/platform/load_test.go: no-file (all-env), file-wins + env-fallback per field, configured-but-missing file (falls back, no error), malformed file (hard error — never silent env fallback), empty-path ReadFileConfig. go build, go vet, gofmt, and the platform + node-controller suites all pass.
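A rough sketch of the three file cases the tests cover (empty path, configured-but-missing file, malformed file) might look like the following. Everything here is an assumption made for illustration: the function name and its (config, found, error) return shape are hypothetical, and encoding/json stands in for the real decoder only so the sketch stays within the standard library. The only key taken from this PR thread is images.cosmosExporter.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"io/fs"
	"os"
)

// fileConfig is a tiny stand-in for platform.FileConfig (hypothetical shape).
type fileConfig struct {
	Images struct {
		CosmosExporter string `json:"cosmosExporter"`
	} `json:"images"`
}

// readFileConfig reports whether a config file was found and decoded.
// An empty path or a missing file is not an error (the caller may fall
// back); a file that exists but cannot be decoded is a hard error.
func readFileConfig(path string) (fileConfig, bool, error) {
	var cfg fileConfig
	if path == "" {
		return cfg, false, nil
	}
	data, err := os.ReadFile(path)
	if errors.Is(err, fs.ErrNotExist) {
		return cfg, false, nil
	}
	if err != nil {
		return cfg, false, fmt.Errorf("read %s: %w", path, err)
	}
	if err := json.Unmarshal(data, &cfg); err != nil {
		// Never return a partially decoded config alongside an error.
		return fileConfig{}, true, fmt.Errorf("decode %s: %w", path, err)
	}
	return cfg, true, nil
}

func main() {
	// Empty path: nothing configured, no error.
	_, found, err := readFileConfig("")
	fmt.Println(found, err)

	// Malformed content: surfaces as an error instead of a silent fallback.
	f, err := os.CreateTemp("", "app-config-*.json")
	if err != nil {
		panic(err)
	}
	defer os.Remove(f.Name())
	f.WriteString("{not valid")
	f.Close()
	_, found, err = readFileConfig(f.Name())
	fmt.Println(found, err != nil)
}
```

Separating "not found" from "found but invalid" is what lets a startup caller fall back for the first case while refusing to start for the second.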

Follow-up (PR-2, gated on ConfigMap populated + verified)

Drop the env fallback in platform.Load, remove the migrated env vars from config/manager/manager.yaml, finalize the CLAUDE.md convention to "file-authoritative."

🤖 Generated with Claude Code

Migrates the non-networking platform.Config fields off environment variables
and into the mounted app-config file (SEI_CONTROLLER_CONFIG -> FileConfig) that
#397 established for state-sync syncers.

- platform.FileConfig gains grouped infra sections: scheduling, storage,
  resources, snapshot, resultExport, genesis, images.
- platform.Load resolves Config at startup file-wins-else-env: a non-empty file
  value wins, an absent one falls back to its historical SEI_* env var. An unset
  SEI_CONTROLLER_CONFIG reproduces the original all-env behavior. This is the
  transitional half of PLT-475 (env fallback); a follow-up drops the fallback +
  the env vars once the ConfigMap is verified populated.
- platform.ReadFileConfig centralizes the file read/decode; the per-reconcile
  state-sync reader now shares it (removes the duplicate os.ReadFile/Unmarshal,
  preserves the fail-closed-vs-transient distinction).
- main.go calls platform.Load instead of the inline os.Getenv block.
- Config.Validate names both the file key and the env var per field.

Deliberately NOT migrated: networking/gateway config (SEI_GATEWAY_*,
SEI_P2P_ENDPOINT_DOMAIN, SEI_NLB_TARGET_TYPE) stays env-sourced, pending its
removal from the controller in the GitOps networking move (PLT-451) — avoids
migrate-then-delete churn.

Infra fields are read once at startup (an infra change needs a restart); the
stateSync section keeps its per-reconcile hot-reload. Schema documented in
docs/controller-app-config.md; CLAUDE.md convention updated.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@cursor

cursor Bot commented Jun 12, 2026


PR Summary

Medium Risk
Changes how scheduling, storage, images, and S3 buckets are resolved at controller startup; misconfiguration or bad YAML can block startup, though env fallback limits rollout risk during the transition.

Overview
Moves most platform infra knobs from startup SEI_* env reads into the GitOps-mounted app-config file (SEI_CONTROLLER_CONFIG), while keeping networking/gateway settings env-only until PLT-451.

platform.Load and ReadFileConfig resolve Config at startup: non-empty file fields win, missing infra fields still fall back to their historical env vars (transitional PLT-475). A present but invalid YAML file fails startup instead of silently using env. Infra from the file is read once per process (restart required); stateSync still re-reads the same file each reconcile via the shared reader in canonicalSyncers.

FileConfig grows grouped sections (scheduling, storage, resources, snapshot, resultExport, genesis, images); main drops the inline os.Getenv block in favor of Load + Validate. Config.Validate errors now cite both the file key and env var. New docs/controller-app-config.md documents the schema and read model; load_test.go covers file-wins, env fallback, missing file, and malformed file behavior.

Reviewed by Cursor Bugbot for commit 4c4452f. Bugbot is set up for automated code reviews on this repo.

bdchatham and others added 3 commits June 12, 2026 15:10
- Drop PLT ticket IDs from code comments except where they mark temporary
  scaffolding (the transitional env fallback; networking config pending its
  PLT-451 removal).
- docs/controller-app-config.md: call out that infra-section edits need a pod
  restart (only stateSync hot-reloads), and note the absent sizePerf.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Standardize the env-var contract: one const block in load.go is the single
source of truth for every SEI_* name, referenced by Load, Config.Validate, and
the tests instead of scattered string literals. Validate also moves from a
map to an ordered slice keyed by those constants, so a missing-field error is
deterministic. No behavior change.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
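
The map-to-slice change in the commit above matters because Go randomizes map iteration order, so an error assembled while ranging over a map can name missing fields in a different order on each run. A minimal sketch of the ordered-slice approach follows; the type, the field names, the SEI_EXAMPLE_* variables, and the message format are all hypothetical, not the ones in load.go.

```go
package main

import (
	"fmt"
	"strings"
)

// requiredField pairs a file key with its env var and the resolved value.
// Hypothetical type for illustration only.
type requiredField struct {
	fileKey string
	envVar  string
	value   string
}

// validate walks the fields in slice order, so the resulting error lists
// missing fields in the same order on every run.
func validate(fields []requiredField) error {
	var missing []string
	for _, f := range fields {
		if f.value == "" {
			missing = append(missing, fmt.Sprintf("%s (or %s)", f.fileKey, f.envVar))
		}
	}
	if len(missing) == 0 {
		return nil
	}
	return fmt.Errorf("missing required config: %s", strings.Join(missing, ", "))
}

func main() {
	// Hypothetical keys and variable names.
	err := validate([]requiredField{
		{"storage.exampleClass", "SEI_EXAMPLE_STORAGE_CLASS", ""},
		{"images.exampleSidecar", "SEI_EXAMPLE_SIDECAR_IMAGE", "sidecar:v1"},
		{"genesis.exampleBucket", "SEI_EXAMPLE_GENESIS_BUCKET", ""},
	})
	fmt.Println(err)
}
```

Keeping the names in one ordered table also gives tests a stable string to assert against.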
@bdchatham bdchatham merged commit 8d12881 into main Jun 12, 2026
5 checks passed
bdchatham added a commit that referenced this pull request Jun 13, 2026
…#403)

* feat(platform): make the app-config file authoritative (PLT-475 PR-2)

Drops the transitional env fallback established in PR-1 (#402): infra config
now comes solely from the mounted app-config file.

- platform.Load reads infra fields straight from FileConfig (no fileOrEnv);
  gateway fields stay env-sourced (pending PLT-451). The migrated env-name
  constants are removed.
- Config.Validate reports the file key for infra fields (no "or SEI_*"), and
  now also requires images.cosmosExporter (previously validated lazily at
  pod-build).
- config/manager/manager.yaml drops the migrated infra env vars; gateway +
  SEI_CONTROLLER_CONFIG remain.
- CLAUDE.md + docs/controller-app-config.md updated to "file-authoritative".
- Tests rewritten for file-authoritative behavior (file sourced, infra env
  ignored, missing field fails Validate, gateway from env).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* docs: scrub dead SEI_* infra env references (PR-2 cross-review)

Platform + kubernetes lenses flagged stale env references now that infra config
is file-authoritative:
- docs/controller-app-config.md: drop the per-key # SEI_* annotations and the
  "env-var fallback" framing; the file is the source.
- README.md: Platform Configuration section now points at the app-config file
  (was "reads all settings from environment variables").
- noderesource.go: kubeRBACProxy/cosmosExporter "not configured" errors name
  the file key, not the removed env var.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* test: update noderesource error-substring assertions to file keys

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>