Improve platform detection to reduce unknown telemetry values#1655

Draft
hors wants to merge 2 commits into main from detect-more-platforms

hors commented Jun 26, 2026

Collaborator

Replace CRD-based detection (which relied on optional addons being installed) with discovery API group checks and API server URL patterns.

Add detection for AKS, DOKS, OKE, ACK, NKP, Platform9, Tanzu, and Rancher alongside the existing GKE, EKS, and OpenShift detections.

CHECKLIST

Jira

  • Is the Jira ticket created and referenced properly?
  • Does the Jira ticket have the proper statuses for documentation (Needs Doc) and QA (Needs QA)?
  • Does the Jira ticket link to the proper milestone (Fix Version field)?

Tests

  • Is an E2E test/test case added for the new feature/change?
  • Are unit tests added where appropriate?

Config/Logging/Testability

  • Are all needed new/changed options added to default YAML files?
  • Are all needed new/changed options added to the Helm Chart?
  • Did we add proper logging messages for operator actions?
  • Did we ensure compatibility with the previous version or cluster upgrade process?
  • Does the change support the oldest and newest supported PG versions?
  • Does the change support the oldest and newest supported Kubernetes versions?

Copilot AI review requested due to automatic review settings June 26, 2026 11:00

Copilot AI left a comment

Contributor

Pull request overview

This PR updates the operator’s platform/managed-Kubernetes detection used for telemetry, moving away from CRD-based heuristics toward API discovery group checks and API server host pattern checks to reduce “unknown” platform values.

Changes:

  • Added a hasAPIGroup helper that detects platforms by checking for the presence of API groups via the discovery API.
  • Updated existing platform detection (notably GKE/EKS) to use API groups and API server host patterns rather than optional addon CRDs.
  • Added detection and telemetry labels for AKS, DOKS, OKE, ACK, NKP, Platform9, Tanzu, and Rancher.

Comment on lines +361 to +370
// hasAPIGroup returns true if the cluster exposes the given API group name.
// Uses the discovery API which is available to all authenticated users with no extra RBAC.
func hasAPIGroup(ctx context.Context, cfg *rest.Config, groupName string) bool {
	client, err := discovery.NewDiscoveryClientForConfig(cfg)
	assertNoError(err)

	if err != nil {
		return false
	}
	groups, err := client.ServerGroups()
	if err != nil {
		assertNoError(err)
		return false
Comment on lines 389 to +396
func isEKS(ctx context.Context, cfg *rest.Config) bool {
	log := logging.FromContext(ctx)
	// EKS API server hostnames always end with .eks.amazonaws.com.
	if strings.Contains(cfg.Host, ".eks.amazonaws.com") {
		logging.FromContext(ctx).Info("detected EKS environment")
		return true
	}
	return false
}
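
The comment in the hunk above says EKS hostnames end with `.eks.amazonaws.com`, while the check uses `strings.Contains`, which also matches the fragment in the middle of a host. A stricter suffix check might look like the sketch below. `hostHasSuffix` is a hypothetical helper, not part of the PR, and it assumes the host value may be a bare hostname, a host:port pair, or a full URL.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// hostHasSuffix reports whether the hostname part of host ends with suffix.
// host may be a bare hostname, a host:port pair, or a full URL.
func hostHasSuffix(host, suffix string) bool {
	// url.Parse needs a scheme to recognise the authority part.
	if !strings.Contains(host, "://") {
		host = "https://" + host
	}
	u, err := url.Parse(host)
	if err != nil {
		return false
	}
	// Hostname() strips the port; DNS names are case-insensitive.
	return strings.HasSuffix(strings.ToLower(u.Hostname()), suffix)
}

func main() {
	const suffix = ".eks.amazonaws.com"
	hosts := []string{
		"https://example.eks.amazonaws.com",
		"example.eks.amazonaws.com:443",
		// Contains the fragment, but does not end with it.
		"https://example.eks.amazonaws.com.internal.test",
	}
	for _, h := range hosts {
		fmt.Println(h, hostHasSuffix(h, suffix))
	}
}
```

The third host would pass a `strings.Contains` check but fails the suffix check, which is the difference the sketch is meant to show.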
Comment thread cmd/postgres-operator/main.go Outdated
Comment on lines +470 to +493
case isTanzu(ctx, cfg):
	return "tanzu"
case isRancher(ctx, cfg):
	return "rancher"
Copilot AI review requested due to automatic review settings June 26, 2026 13:53
hors force-pushed the detect-more-platforms branch from 44406c3 to 6cbdb1e on June 26, 2026 13:53

Copilot AI left a comment

Contributor

Pull request overview

Copilot reviewed 1 out of 1 changed files in this pull request and generated 3 comments.

Comment on lines 392 to +396
func isEKS(ctx context.Context, cfg *rest.Config) bool {
	log := logging.FromContext(ctx)
	// crd.k8s.amazonaws.com is registered on all EKS clusters by the AWS controllers.
	if hasAPIGroup(ctx, cfg, "crd.k8s.amazonaws.com") {
		logging.FromContext(ctx).Info("detected EKS environment")
		return true
Comment on lines 365 to +369
	client, err := discovery.NewDiscoveryClientForConfig(cfg)
	assertNoError(err)

	if err != nil {
		log.V(1).Info("platform detection: could not create discovery client", "error", err.Error())
		return false
	}
Comment on lines +361 to +363
// hasAPIGroup returns true if the cluster exposes the given API group name.
// Uses the discovery API; note that hardened clusters may restrict discovery access.
func hasAPIGroup(ctx context.Context, cfg *rest.Config, groupName string) bool {
Comment on lines +428 to +435
func isACK(ctx context.Context, cfg *rest.Config) bool {
	// ACK API server hostnames always end with .aliyuncs.com.
	if strings.Contains(cfg.Host, ".aliyuncs.com") {
		logging.FromContext(ctx).Info("detected ACK environment")
		return true
	}
	return false
}

Contributor

I'm a little hesitant to add these without first verifying that the API group method actually works for them.

Comment on lines +481 to +496
case isAKS(ctx, cfg):
	return "aks"
case isDOKS(ctx, cfg):
	return "doks"
case isOKE(ctx, cfg):
	return "oke"
case isACK(ctx, cfg):
	return "ack"
case isNKP(ctx, cfg):
	return "nkp"
case isPlatform9(ctx, cfg):
	return "platform9"
case isTanzu(ctx, cfg):
	return "tanzu"
case isRancher(ctx, cfg):
	return "rancher"

Contributor

Maybe we can also avoid the repetition here, because every new platform currently needs both a new function and a new case in this switch. We could have generic logic that calls hasAPIGroup(ctx, ...) and iterates over platforms defined like this:

var platformProbes = []platformProbe{
	{name: "gke", label: "GKE", apiGroups: []string{"networking.gke.io"}},
	{name: "eks", label: "EKS", apiGroups: []string{"crd.k8s.amazonaws.com"}},
	{name: "aks", label: "AKS", hosts: []string{".azmk8s.io"}},
	// ...
}
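
One way the table-driven idea could be fleshed out is sketched below. Only the three probe entries come from the suggestion above; the `platformProbe` field layout, the `detectPlatform` function, the pre-fetched `groups` set, and the `"unknown"` fallback value are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// platformProbe describes how to recognise one platform (hypothetical layout).
type platformProbe struct {
	name      string   // value reported in telemetry
	label     string   // human-readable name for log messages
	apiGroups []string // any of these API groups identifies the platform
	hosts     []string // any of these API server host fragments identifies it
}

var platformProbes = []platformProbe{
	{name: "gke", label: "GKE", apiGroups: []string{"networking.gke.io"}},
	{name: "eks", label: "EKS", apiGroups: []string{"crd.k8s.amazonaws.com"}},
	{name: "aks", label: "AKS", hosts: []string{".azmk8s.io"}},
}

// detectPlatform returns the name of the first probe that matches.
// groups is the set of API group names fetched once from discovery,
// so adding a platform costs one table entry and no extra API calls.
func detectPlatform(host string, groups map[string]bool) string {
	for _, p := range platformProbes {
		for _, g := range p.apiGroups {
			if groups[g] {
				return p.name
			}
		}
		for _, h := range p.hosts {
			if strings.Contains(host, h) {
				return p.name
			}
		}
	}
	return "unknown" // assumed fallback value
}

func main() {
	groups := map[string]bool{"apps": true, "crd.k8s.amazonaws.com": true}
	fmt.Println(detectPlatform("https://10.0.0.1:6443", groups))
}
```

Because probes are checked in slice order, the order of the table decides which platform wins when more than one probe matches.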

hors force-pushed the detect-more-platforms branch from 6cbdb1e to ef39113 on June 29, 2026 10:14
@JNKPercona

Collaborator
| Test Name | Result | Time |
|---|---|---|
| backup-enable-disable | passed | 00:13:03 |
| builtin-extensions | passed | 00:06:21 |
| cert-manager-tls | passed | 00:09:29 |
| custom-envs | passed | 00:19:46 |
| custom-tls | passed | 00:06:14 |
| database-init-sql | passed | 00:02:38 |
| demand-backup | passed | 00:23:43 |
| demand-backup-offline-snapshot | passed | 00:13:40 |
| dynamic-configuration | passed | 00:03:25 |
| finalizers | passed | 00:03:51 |
| init-deploy | passed | 00:03:28 |
| huge-pages | passed | 00:03:16 |
| major-upgrade-14-to-15 | passed | 00:11:20 |
| major-upgrade-15-to-16 | passed | 00:10:31 |
| major-upgrade-16-to-17 | passed | 00:11:40 |
| major-upgrade-17-to-18 | passed | 00:09:18 |
| ldap | passed | 00:05:59 |
| ldap-tls | passed | 00:06:59 |
| monitoring | passed | 00:08:27 |
| monitoring-pmm3 | passed | 00:09:19 |
| one-pod | passed | 00:06:21 |
| operator-self-healing | passed | 00:10:31 |
| pitr | passed | 00:11:58 |
| scaling | passed | 00:05:57 |
| scheduled-backup | passed | 00:27:29 |
| self-healing | passed | 00:09:26 |
| sidecars | passed | 00:03:07 |
| standby-pgbackrest | passed | 00:18:28 |
| standby-streaming | passed | 00:13:27 |
| start-from-backup | passed | 00:14:33 |
| tablespaces | passed | 00:08:21 |
| telemetry-transfer | passed | 00:05:19 |
| upgrade-consistency | passed | 00:06:34 |
| upgrade-minor | passed | 00:08:46 |
| users | passed | 00:05:15 |

| Summary | Value |
|---|---|
| Tests Run | 35/35 |
| Job Duration | 01:49:12 |
| Total Test Time | 05:38:18 |

commit: ef39113
image: perconalab/percona-postgresql-operator:PR-1655-ef3911315


4 participants