Add opt-in RBAC management UI via rbacManagement flag#4865
Add opt-in RBAC management UI via rbacManagement flag#4865dimitri-nicolo wants to merge 4 commits into
Conversation
504568c to
8239694
Compare
| // that need a tenant-scoped Manager (the manager controller itself) must use | ||
| // GetManager from pkg/controller/manager. Non-manager controllers reading | ||
| // zero-tenant-only flags (e.g. spec.rbacManagement) belong here. | ||
| func GetZeroTenantManagerOrNil(ctx context.Context, c client.Client) (*operatorv1.Manager, error) { |
There was a problem hiding this comment.
Can't we move the GetManager func from manager_controller to the utils package and re-use it? There is no reason that apiserver should throw an error if manager is not present. We should however only fetch it when enterprise CRDs exist and we are not running in cloud.
There was a problem hiding this comment.
I see that you moved the func to here, but I also meant to delete this func entirely as well.
There was a problem hiding this comment.
I think the code is more clear if we delete this method, remove any notion of "zero tenant" from the code including comments (it's not yargon that is used today) and use the multitenant booleans explicitly at the caller. I think that will be more understandable to the average reader.
| // +optional | ||
| ManagerDeployment *ManagerDeployment `json:"managerDeployment,omitempty"` | ||
|
|
||
| // RBAC configures the RBAC management UI feature. Only honored in |
There was a problem hiding this comment.
I don't think we should leak cloud (internal) implications into the API comments. Our users don't need to be aware/think about those.
| // Disabled. | ||
| // +optional | ||
| // +kubebuilder:validation:Enum=Enabled;Disabled | ||
| Mode RBACMode `json:"mode,omitempty"` |
There was a problem hiding this comment.
What do you think of the following field names?
- Management: Enabled|Disabled
- UI: Enabled|Disabled
or values: Visible | Hidden
b9493a6 to
94159c6
Compare
|
|
||
| // RBAC management UI: when enabled on the Manager CR, the rbacsync | ||
| // controller runs inside calico-kube-controllers to reconcile the | ||
| // catalog of managed ClusterRoles (per-tier + 32 fine-grained + 6 |
There was a problem hiding this comment.
nit: This number is not likely to hold up over time, best to keep the comment succinct.
|
General comment: the AI generated comments in this code are a bit excessive. An extreme example may be the permissions. The func explains in comments twice which permissions are added + the code to add them. They are making the code less readable. And now we need to also keep the comments up to date when we add more. The operator code is not super complicated, I'd much rather have comments be only used if it adds critical information you couldn't get from simply reading the code (or have your ai read it for you). |
| }, | ||
| // Escalation coverage for webhook-mod resource roles (create/patch on | ||
| // Secrets to wire up webhook auth). | ||
| rbacv1.PolicyRule{ |
There was a problem hiding this comment.
Could we be more specific with secrets and configmaps and bind them to the namespaces we actually need?
There was a problem hiding this comment.
The named-resource access moved to a new Role + RoleBinding in calico-system, where tigera-idp-groups and tigera-idp-ldap-config actually live. The broad create is bound to that one namespace now.
Dropped the tigera-known-oidc-users rule (it lives in tigera-elasticsearch and is already granted by logstorage.go), and tightened the activation gate to !Tenant.MultiTenant() since the IdP resources are pinned to calico-system and the RBAC UI is zero-tenant-only anyway.
Is that what you meant by "bind them to the namespaces we actually need" — or were you pointing at something narrower?
8833288 to
2b749da
Compare
| // management UI's directory cache and the cascading cleanup of managed | ||
| // bindings when a group is removed). | ||
| { | ||
| APIGroups: []string{""}, |
There was a problem hiding this comment.
This one should be in a role.
Add Manager.spec.rbac.ui (Enabled/Disabled enum, defaults to Disabled), the RBAC/RBACUI types, and the nil-safe Manager.RBACManagementEnabled() helper. Regenerate deepcopy methods and the Manager CRD manifest. This is the opt-in toggle for the RBAC management UI feature; the render and controller wiring that consume it follow in subsequent commits. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Add rbac_management.go with the escalation-prevention rule set shared by the calico-manager and calico-kube-controllers roles, and gate the feature's RBAC on Manager.spec.rbac.ui == Enabled: - manager: extend calico-manager-role with the cluster-scoped UI rules and add a calico-system Role+RoleBinding scoping Secret/ConfigMap access to the IdP resources; both gated on !MultiTenant() (the feature is zero-tenant-only). Set RBAC_UI_ENABLED on ui-apis, and allow LDAP egress when OIDC authentication is configured. - apiserver: extend tigera-network-admin with the manage-roles rule. - kube-controllers: enable the rbacsync controller, grant its rules, and add a calico-system Role+RoleBinding reading tigera-idp-groups. The Manager and Authentication CRs are surfaced through the render configs here; the controllers that populate them follow. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Move GetManager from the manager controller into pkg/controller/utils so the apiserver and installation controllers can read the Manager CR without importing the manager controller package. Behaviour is unchanged: it keys on tigera-secure, namespacing the lookup only in multi-tenant mode, and returns the raw client error so callers decide whether absence is fatal. Update the manager controller and its tests to the new location. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Read the Manager CR in the installation and apiserver controllers and pass RBACManagementEnabled into their render configs, and watch the Manager CR so toggling spec.rbac re-runs each reconcile. The installation controller only reads it for the enterprise variant; absence is treated as RBAC management disabled. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2b749da to
6af6ea3
Compare
Description
Type: New feature (Calico Enterprise only). Addresses PMREQ-824.
Adds an opt-in RBAC management UI feature, toggled by a new
Manager.spec.rbac.uifield (Enabled|Disabled, defaults toDisabled). When enabled, the operator renders the additional RBAC and network access the UI needs to let an administrator manage role/group assignments from the Manager: maintaining a catalog of managedClusterRoles, binding IdP groups to roles via therbacsynccontroller, and resolving groups from an external LDAP/AD directory.This should be merged because it is the operator-side enablement for the RBAC management UI; without it the UI's backend has no RBAC or egress to function. Gating on
spec.rbac.uikeeps the feature — and its escalation-capable permissions — entirely dormant on clusters that don't opt in.What the flag turns on, by layer
api/v1/manager_types.go) — newRBACspec struct with aUI RBACUIenum field and a nil-safeManager.RBACManagementEnabled()helper (returnsfalsefor a nil Manager or any value other thanEnabled). Regenerated deepcopy and the Manager CRD.calico-managerrole (pkg/render/manager.go): adds the cluster-scoped RBAC management UI rules (full CRUD on managed RBAC objects + escalation-prevention coverage for tier/resource roles), plus a namespacedRole+RoleBindingincalico-systemthat scopes thecreate/named-resource access to the IdPConfigMap/Secrets (tigera-idp-groups,tigera-idp-ldap-config) instead of granting it cluster-wide. SetsRBAC_UI_ENABLEDon theui-apiscontainer and opens LDAP/AD egress (389/636) when OIDC authentication is configured.calico-kube-controllers(pkg/render/kubecontrollers/kube-controllers.go): enables therbacsynccontroller and its RBAC only when the flag is set (dormant otherwise), and adds a namespacedRole+RoleBindinggrantingrbacsyncread on thetigera-idp-groupsConfigMapincalico-system.tigera-network-admin(pkg/render/apiserver.go): grants thebind/escalate-capable rule a network-admin needs to assign managed roles via the UI, also gated on the flag.pkg/render/rbac_management.goso the manager and kube-controllers roles stay aligned (the SA writing a managed role must already hold every permission it grants, or K8s rejects the write as privilege escalation).Why the installation and apiserver controllers read the Manager CR
The flag conceptually belongs to the Manager, but two of the resources it gates are not rendered by the manager controller:
rbacsynccontroller runs insidecalico-kube-controllers, rendered by the installation controller.tigera-network-adminClusterRole is rendered by the apiserver controller.So each of those controllers reads the Manager CR itself and passes a
RBACManagementEnabledbool into its render config. To support this,GetManagerwas moved out of the manager controller intopkg/controller/utilsso the apiserver and installation controllers can reuse it without importing the manager controller package. Both callers treat a missing Manager as not-found (feature off) rather than an error, and the installation controller only reads it for the enterprise variant.Both controllers also add a watch on the
ManagerCR. Without it, togglingspec.rbac.uiwould not re-trigger an installation/apiserver reconcile, so therbacsynccontroller and the network-admin rule wouldn't appear or disappear until some unrelated event re-ran those controllers.Per-cluster behavior
The feature is keyed off each cluster's own
Manager.spec.rbac.ui, so a management cluster and a managed cluster opt in independently. Calico Enterprise evaluates a user's managed-cluster actions against that managed cluster's own RBAC (Voltron proxies the request to the managed cluster's apiserver), so the network-adminbind/escalategrant and the manager-SA rules are intentionally rendered on managed clusters too — that is where a user managing per-cluster RBAC needs them. In multi-tenant management clusters the manager-side rules and namespaced Role are not rendered (gated on!MultiTenant()).Testing: unit tests added for all three render surfaces (manager, apiserver, kube-controllers) covering the enabled/disabled gate and the multi-tenant exclusion;
make gen-filesproduces no drift;go build ./...and the affectedpkg/render*/pkg/controller/apiserversuites pass.Components affected: apiserver (
tigera-network-admin), manager (calico-managerrole + namespaced Role + network policy), kube-controllers (rbacsync), Manager CRD, installation & apiserver controllers,pkg/controller/utils.Release Note
For PR author
make gen-filesmake gen-versionsFor PR reviewers
A note for code reviewers - all pull requests must have the following:
kind/bugif this is a bugfix.kind/enhancementif this is a a new feature.enterpriseif this PR applies to Calico Enterprise only.