fix: stop reconciling on standby management clusters #7
Open
jiazhiguang wants to merge 1 commit into
666f522 to
871b91a
Compare
Make standby handling cluster-aware so standby management clusters continue reconciling the global Cluster while skipping non-global business cluster resources.

Use global-aware wrappers for Cluster, Machine, MachineSet, MachineDeployment, MachineHealthCheck, MachinePool, topology, and ClusterResourceSetBinding reconcilers. Leave global/no-cluster controllers such as ClusterClass, ExtensionConfig, and CRD migrator unguarded for now.

Filter ClusterResourceSet target clusters in standby so only global clusters are applied. During deletion, keep the ClusterResourceSet finalizer and requeue if standby filtering skipped non-global clusters, so their ClusterResourceSetBindings can be cleaned up after failback.

Use GlobalClusterName for the reserved global cluster name, and do not add standby blocking at the ClusterCache layer.

Guard the Machine infrastructure providerID fallback against nil values.
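The cluster-aware rule and the ClusterResourceSet filtering described above could look roughly like the sketch below. It is a simplified illustration, not the code in this PR: the value `"global"` for `GlobalClusterName`, the helper names `shouldReconcile`, `filterTargets`, and `canRemoveFinalizer`, and the choice to identify the global cluster by name alone are all assumptions.

```go
package main

import "fmt"

// GlobalClusterName is the reserved name the PR refers to.
// Assumption: the literal value "global" is illustrative only.
const GlobalClusterName = "global"

// shouldReconcile reports whether an object belonging to clusterName may be
// reconciled. On an active management cluster everything is reconciled; on a
// standby only the global cluster is.
func shouldReconcile(standby bool, clusterName string) bool {
	return !standby || clusterName == GlobalClusterName
}

// filterTargets returns the clusters a ClusterResourceSet may apply to and
// whether standby filtering skipped any of them.
func filterTargets(standby bool, clusters []string) (kept []string, skipped bool) {
	for _, name := range clusters {
		if shouldReconcile(standby, name) {
			kept = append(kept, name)
		} else {
			skipped = true
		}
	}
	return kept, skipped
}

// canRemoveFinalizer says whether deletion may finish. If any target was
// skipped, the finalizer stays and the caller requeues, so the remaining
// bindings can be cleaned up after failback.
func canRemoveFinalizer(skipped bool) bool {
	return !skipped
}

func main() {
	targets := []string{GlobalClusterName, "business-a", "business-b"}

	kept, skipped := filterTargets(true, targets) // standby
	fmt.Println(kept, skipped, canRemoveFinalizer(skipped))

	kept, skipped = filterTargets(false, targets) // active
	fmt.Println(kept, skipped, canRemoveFinalizer(skipped))
}
```

Keeping the decision in one small predicate means every wrapped reconciler and the ClusterResourceSet target filter share the same definition of "global".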
871b91a to
5ffb6d4
Compare
Add a standby reconciler wrapper backed by the system etcd-sync ConfigMap so controllers fail closed when standby state cannot be determined and skip reconciliation while the management cluster is acting as DR standby.
Wire the wrapper into core, topology, experimental, runtime, CRD migrator, Machine, MachineSet, MachineDeployment, MachineHealthCheck, MachinePool, ClusterResourceSet, and ClusterResourceSetBinding reconcilers, and expose the system namespace through a manager flag.
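A minimal sketch of the fail-closed wrapper idea follows, using only the standard library. `Request`, `Result`, and `Reconciler` are simplified stand-ins for the controller-runtime types, and `StandbyChecker`, `WithStandbyGuard`, and `demo` are hypothetical names. In the PR the standby state comes from the etcd-sync ConfigMap; here it is just a function, so the ConfigMap lookup itself is not shown.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// Request and Result are simplified stand-ins for the controller-runtime
// types of the same name (assumption: not the real definitions).
type Request struct{ Namespace, Name string }
type Result struct{ Requeue bool }

// Reconciler mirrors the shape of a controller reconciler.
type Reconciler interface {
	Reconcile(ctx context.Context, req Request) (Result, error)
}

// ReconcileFunc adapts a plain function to the Reconciler interface.
type ReconcileFunc func(ctx context.Context, req Request) (Result, error)

func (f ReconcileFunc) Reconcile(ctx context.Context, req Request) (Result, error) {
	return f(ctx, req)
}

// StandbyChecker reports whether this management cluster is a DR standby.
// Hypothetical: the real check would read the etcd-sync ConfigMap.
type StandbyChecker func(ctx context.Context) (bool, error)

type standbyGuard struct {
	inner     Reconciler
	isStandby StandbyChecker
}

// WithStandbyGuard wraps inner so it only runs on an active cluster.
func WithStandbyGuard(inner Reconciler, check StandbyChecker) Reconciler {
	return &standbyGuard{inner: inner, isStandby: check}
}

func (g *standbyGuard) Reconcile(ctx context.Context, req Request) (Result, error) {
	standby, err := g.isStandby(ctx)
	if err != nil {
		// Fail closed: unknown state is treated as "do not reconcile".
		return Result{}, fmt.Errorf("determining standby state: %w", err)
	}
	if standby {
		return Result{}, nil // skip silently while acting as standby
	}
	return g.inner.Reconcile(ctx, req)
}

// demo runs one guarded reconcile and reports whether the inner reconciler ran.
func demo(standby, known bool) (innerCalled bool, err error) {
	inner := ReconcileFunc(func(context.Context, Request) (Result, error) {
		innerCalled = true
		return Result{}, nil
	})
	check := func(context.Context) (bool, error) {
		if !known {
			return false, errors.New("standby state unavailable")
		}
		return standby, nil
	}
	_, err = WithStandbyGuard(inner, check).Reconcile(context.Background(), Request{Name: "example"})
	return innerCalled, err
}

func main() {
	fmt.Println(demo(false, true)) // active cluster
	fmt.Println(demo(true, true))  // standby cluster
	fmt.Println(demo(false, false)) // state cannot be determined
}
```

Returning an error in the unknown case lets the controller's normal retry and backoff handle the requeue, rather than proceeding on a guess.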
Harden Machine node cleanup by validating NodeRef UID and providerID before drain, volume detach waits, or delete operations, and use UID preconditions for Node deletion to avoid mutating replacement nodes.
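The replacement-node safeguard can be illustrated with an in-memory sketch. Everything here is hypothetical (`Node`, `NodeRef`, `nodeStore`, `ownsNode`, `cleanupNode`), and treating an empty providerID as "cannot prove ownership" is an assumption rather than something the PR states. The store's `deleteWithUID` mimics a Kubernetes delete with a UID precondition, which client-go exposes through the `Preconditions` field of `metav1.DeleteOptions`.

```go
package main

import (
	"errors"
	"fmt"
)

// Node and NodeRef are simplified stand-ins for the Kubernetes objects.
type Node struct{ Name, UID, ProviderID string }
type NodeRef struct{ Name, UID string }

// ErrUIDMismatch is returned when the delete precondition does not hold.
var ErrUIDMismatch = errors.New("uid precondition failed")

// nodeStore is an in-memory stand-in for the workload cluster API.
type nodeStore map[string]Node

// deleteWithUID deletes the named Node only if its UID still matches,
// mimicking a delete request that carries a UID precondition.
func (s nodeStore) deleteWithUID(name, uid string) error {
	n, ok := s[name]
	if !ok {
		return nil // already gone
	}
	if n.UID != uid {
		return ErrUIDMismatch
	}
	delete(s, name)
	return nil
}

// ownsNode reports whether the live Node is the one the Machine recorded.
// Assumption: an empty UID or providerID cannot prove ownership.
func ownsNode(ref NodeRef, providerID string, live Node) bool {
	if ref.UID == "" || ref.UID != live.UID {
		return false
	}
	return providerID != "" && providerID == live.ProviderID
}

// cleanupNode deletes the Machine's Node only after validating identity.
// A real implementation would run the same check before drain and before
// waiting for volume detach.
func cleanupNode(store nodeStore, ref NodeRef, providerID string) (bool, error) {
	live, ok := store[ref.Name]
	if !ok {
		return false, nil
	}
	if !ownsNode(ref, providerID, live) {
		return false, nil // a replacement node reused the name: leave it alone
	}
	if err := store.deleteWithUID(ref.Name, ref.UID); err != nil {
		return false, err
	}
	return true, nil
}

func main() {
	store := nodeStore{"node-1": {Name: "node-1", UID: "new-uid", ProviderID: "provider://b"}}
	// The Machine still references the old node that had the same name.
	deleted, err := cleanupNode(store, NodeRef{Name: "node-1", UID: "old-uid"}, "provider://a")
	fmt.Println(deleted, err, len(store))
}
```

The up-front check and the UID precondition overlap on purpose: the check avoids draining the wrong node, while the precondition covers the window between the check and the delete.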
Add unit coverage for standby detection/wrapping and replacement-node cleanup safeguards.