Skip to content

Add opt-in RBAC management UI via rbacManagement flag#4865

Open
dimitri-nicolo wants to merge 4 commits into
tigera:masterfrom
dimitri-nicolo:dimitri-PMREQ-824-rbac-mgmt
Open

Add opt-in RBAC management UI via rbacManagement flag#4865
dimitri-nicolo wants to merge 4 commits into
tigera:masterfrom
dimitri-nicolo:dimitri-PMREQ-824-rbac-mgmt

Conversation

@dimitri-nicolo

@dimitri-nicolo dimitri-nicolo commented May 26, 2026

Copy link
Copy Markdown
Contributor

Description

Type: New feature (Calico Enterprise only). Addresses PMREQ-824.

Adds an opt-in RBAC management UI feature, toggled by a new Manager.spec.rbac.ui field (Enabled | Disabled, defaults to Disabled). When enabled, the operator renders the additional RBAC and network access the UI needs to let an administrator manage role/group assignments from the Manager: maintaining a catalog of managed ClusterRoles, binding IdP groups to roles via the rbacsync controller, and resolving groups from an external LDAP/AD directory.

This should be merged because it is the operator-side enablement for the RBAC management UI; without it the UI's backend has no RBAC or egress to function. Gating on spec.rbac.ui keeps the feature — and its escalation-capable permissions — entirely dormant on clusters that don't opt in.

What the flag turns on, by layer

  • API (api/v1/manager_types.go) — new RBAC spec struct with a UI RBACUI enum field and a nil-safe Manager.RBACManagementEnabled() helper (returns false for a nil Manager or any value other than Enabled). Regenerated deepcopy and the Manager CRD.
  • Render
    • calico-manager role (pkg/render/manager.go): adds the cluster-scoped RBAC management UI rules (full CRUD on managed RBAC objects + escalation-prevention coverage for tier/resource roles), plus a namespaced Role+RoleBinding in calico-system that scopes the create/named-resource access to the IdP ConfigMap/Secrets (tigera-idp-groups, tigera-idp-ldap-config) instead of granting it cluster-wide. Sets RBAC_UI_ENABLED on the ui-apis container and opens LDAP/AD egress (389/636) when OIDC authentication is configured.
    • calico-kube-controllers (pkg/render/kubecontrollers/kube-controllers.go): enables the rbacsync controller and its RBAC only when the flag is set (dormant otherwise), and adds a namespaced Role+RoleBinding granting rbacsync read on the tigera-idp-groups ConfigMap in calico-system.
    • tigera-network-admin (pkg/render/apiserver.go): grants the bind/escalate-capable rule a network-admin needs to assign managed roles via the UI, also gated on the flag.
    • Shared escalation-prevention rules live in a new pkg/render/rbac_management.go so the manager and kube-controllers roles stay aligned (the SA writing a managed role must already hold every permission it grants, or K8s rejects the write as privilege escalation).

Why the installation and apiserver controllers read the Manager CR

The flag conceptually belongs to the Manager, but two of the resources it gates are not rendered by the manager controller:

  • The rbacsync controller runs inside calico-kube-controllers, rendered by the installation controller.
  • The tigera-network-admin ClusterRole is rendered by the apiserver controller.

So each of those controllers reads the Manager CR itself and passes a RBACManagementEnabled bool into its render config. To support this, GetManager was moved out of the manager controller into pkg/controller/utils so the apiserver and installation controllers can reuse it without importing the manager controller package. Both callers treat a missing Manager as not-found (feature off) rather than an error, and the installation controller only reads it for the enterprise variant.

Both controllers also add a watch on the Manager CR. Without it, toggling spec.rbac.ui would not re-trigger an installation/apiserver reconcile, so the rbacsync controller and the network-admin rule wouldn't appear or disappear until some unrelated event re-ran those controllers.

Per-cluster behavior

The feature is keyed off each cluster's own Manager.spec.rbac.ui, so a management cluster and a managed cluster opt in independently. Calico Enterprise evaluates a user's managed-cluster actions against that managed cluster's own RBAC (Voltron proxies the request to the managed cluster's apiserver), so the network-admin bind/escalate grant and the manager-SA rules are intentionally rendered on managed clusters too — that is where a user managing per-cluster RBAC needs them. In multi-tenant management clusters the manager-side rules and namespaced Role are not rendered (gated on !MultiTenant()).

Testing: unit tests added for all three render surfaces (manager, apiserver, kube-controllers) covering the enabled/disabled gate and the multi-tenant exclusion; make gen-files produces no drift; go build ./... and the affected pkg/render* / pkg/controller/apiserver suites pass.

Components affected: apiserver (tigera-network-admin), manager (calico-manager role + namespaced Role + network policy), kube-controllers (rbacsync), Manager CRD, installation & apiserver controllers, pkg/controller/utils.

Release Note

Added an opt-in `Manager.spec.rbacManagement.enabled` flag that enables the RBAC management UI. When enabled, the operator grants the additional RBAC and LDAP egress the UI requires; it defaults to disabled and is supported in zero-tenant management clusters only. Disabling the flag does not remove RBAC objects already created while it was enabled.

For PR author

  • Tests for change.
  • If changing pkg/apis/, run make gen-files
  • If changing versions, run make gen-versions

For PR reviewers

A note for code reviewers - all pull requests must have the following:

  • Milestone set according to targeted release.
  • Appropriate labels:
    • kind/bug if this is a bugfix.
    • kind/enhancement if this is a a new feature.
    • enterprise if this PR applies to Calico Enterprise only.

@marvin-tigera marvin-tigera added this to the v1.43.0 milestone May 26, 2026
@dimitri-nicolo dimitri-nicolo changed the title Dimitri pmreq 824 rbac mgmt manager: add opt-in RBAC management UI via rbacManagement flag May 26, 2026
@dimitri-nicolo dimitri-nicolo changed the title manager: add opt-in RBAC management UI via rbacManagement flag Add opt-in RBAC management UI via rbacManagement flag May 26, 2026
@dimitri-nicolo dimitri-nicolo force-pushed the dimitri-PMREQ-824-rbac-mgmt branch from 504568c to 8239694 Compare May 27, 2026 00:13
@dimitri-nicolo dimitri-nicolo marked this pull request as ready for review May 27, 2026 00:33
@dimitri-nicolo dimitri-nicolo requested a review from a team as a code owner May 27, 2026 00:33
Comment thread api/v1/manager_types.go Outdated
Comment thread api/v1/manager_types.go Outdated
Comment thread pkg/controller/utils/utils.go Outdated
// that need a tenant-scoped Manager (the manager controller itself) must use
// GetManager from pkg/controller/manager. Non-manager controllers reading
// zero-tenant-only flags (e.g. spec.rbacManagement) belong here.
func GetZeroTenantManagerOrNil(ctx context.Context, c client.Client) (*operatorv1.Manager, error) {

@rene-dekker rene-dekker May 27, 2026

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't we move the GetManager func from manager_controller to the utils package and re-use it? There is no reason that apiserver should throw an error if manager is not present. We should however only fetch it when enterprise CRDs exist and we are not running in cloud.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see that you moved the func to here, but I also meant to delete this func entirely as well.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the code is more clear if we delete this method, remove any notion of "zero tenant" from the code including comments (it's not yargon that is used today) and use the multitenant booleans explicitly at the caller. I think that will be more understandable to the average reader.

Comment thread api/v1/manager_types.go Outdated
// +optional
ManagerDeployment *ManagerDeployment `json:"managerDeployment,omitempty"`

// RBAC configures the RBAC management UI feature. Only honored in

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we should leak cloud (internal) implications into the API comments. Our users don't need to be aware/think about those.

Comment thread api/v1/manager_types.go Outdated
// Disabled.
// +optional
// +kubebuilder:validation:Enum=Enabled;Disabled
Mode RBACMode `json:"mode,omitempty"`

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you think of the following field names?

  • Management: Enabled|Disabled
  • UI: Enabled|Disabled

or values: Visible | Hidden

@dimitri-nicolo dimitri-nicolo force-pushed the dimitri-PMREQ-824-rbac-mgmt branch from b9493a6 to 94159c6 Compare May 28, 2026 21:53

// RBAC management UI: when enabled on the Manager CR, the rbacsync
// controller runs inside calico-kube-controllers to reconcile the
// catalog of managed ClusterRoles (per-tier + 32 fine-grained + 6

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: This number is not likely to hold up over time, best to keep the comment succinct.

@rene-dekker

rene-dekker commented May 28, 2026

Copy link
Copy Markdown
Member

General comment: the AI generated comments in this code are a bit excessive. An extreme example may be the permissions. The func explains in comments twice which permissions are added + the code to add them. They are making the code less readable. And now we need to also keep the comments up to date when we add more. The operator code is not super complicated, I'd much rather have comments be only used if it adds critical information you couldn't get from simply reading the code (or have your ai read it for you).

Comment thread pkg/render/manager.go Outdated
},
// Escalation coverage for webhook-mod resource roles (create/patch on
// Secrets to wire up webhook auth).
rbacv1.PolicyRule{

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we be more specific with secrets and configmaps and bind them to the namespaces we actually need?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The named-resource access moved to a new Role + RoleBinding in calico-system, where tigera-idp-groups and tigera-idp-ldap-config actually live. The broad create is bound to that one namespace now.

Dropped the tigera-known-oidc-users rule (it lives in tigera-elasticsearch and is already granted by logstorage.go), and tightened the activation gate to !Tenant.MultiTenant() since the IdP resources are pinned to calico-system and the RBAC UI is zero-tenant-only anyway.

Is that what you meant by "bind them to the namespaces we actually need" — or were you pointing at something narrower?

Comment thread pkg/render/manager.go Outdated
@dimitri-nicolo dimitri-nicolo force-pushed the dimitri-PMREQ-824-rbac-mgmt branch from 8833288 to 2b749da Compare May 29, 2026 22:19
Comment thread pkg/render/rbac_management.go Outdated
// management UI's directory cache and the cascading cleanup of managed
// bindings when a group is removed).
{
APIGroups: []string{""},

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one should be in a role.

@radTuti radTuti modified the milestones: v1.43.0, v1.44.0 Jun 12, 2026
dimitri-nicolo and others added 4 commits June 18, 2026 12:14
Add Manager.spec.rbac.ui (Enabled/Disabled enum, defaults to Disabled),
the RBAC/RBACUI types, and the nil-safe Manager.RBACManagementEnabled()
helper. Regenerate deepcopy methods and the Manager CRD manifest.

This is the opt-in toggle for the RBAC management UI feature; the render
and controller wiring that consume it follow in subsequent commits.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Add rbac_management.go with the escalation-prevention rule set shared by
the calico-manager and calico-kube-controllers roles, and gate the
feature's RBAC on Manager.spec.rbac.ui == Enabled:

- manager: extend calico-manager-role with the cluster-scoped UI rules
  and add a calico-system Role+RoleBinding scoping Secret/ConfigMap
  access to the IdP resources; both gated on !MultiTenant() (the feature
  is zero-tenant-only). Set RBAC_UI_ENABLED on ui-apis, and allow LDAP
  egress when OIDC authentication is configured.
- apiserver: extend tigera-network-admin with the manage-roles rule.
- kube-controllers: enable the rbacsync controller, grant its rules, and
  add a calico-system Role+RoleBinding reading tigera-idp-groups.

The Manager and Authentication CRs are surfaced through the render
configs here; the controllers that populate them follow.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Move GetManager from the manager controller into pkg/controller/utils so
the apiserver and installation controllers can read the Manager CR
without importing the manager controller package. Behaviour is
unchanged: it keys on tigera-secure, namespacing the lookup only in
multi-tenant mode, and returns the raw client error so callers decide
whether absence is fatal. Update the manager controller and its tests to
the new location.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Read the Manager CR in the installation and apiserver controllers and
pass RBACManagementEnabled into their render configs, and watch the
Manager CR so toggling spec.rbac re-runs each reconcile. The installation
controller only reads it for the enterprise variant; absence is treated
as RBAC management disabled.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@dimitri-nicolo dimitri-nicolo force-pushed the dimitri-PMREQ-824-rbac-mgmt branch from 2b749da to 6af6ea3 Compare June 18, 2026 21:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants