Add launcher type and cluster URL to root execution span #255

Merged
morgan-wowk merged 1 commit into master from execution-tracing-launcher-type on Jun 5, 2026
Conversation

@morgan-wowk morgan-wowk commented May 22, 2026

Collaborator

Add execution.launcher and execution.cloud_provider OTel trace attributes to root execution span

execution.launcher is derived from the top-level key of launcher_data (e.g. kubernetes, kubernetes_job, skypilot) and distinguishes the launcher mechanism used for a given execution.

k8s.cluster.url on the root span allows GKE vs Nebius cluster identification by URL pattern in oasis-backend's multi-cloud setup, populated from cluster_server inside the launcher's data block.
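The derivation of these two attributes could look roughly like the sketch below. It is not the PR's actual code: the helper name `launcher_span_attributes` is hypothetical, and it assumes `launcher_data` is a mapping with a single top-level key whose value is the launcher's data block.

```python
from typing import Any, Mapping, Optional


def launcher_span_attributes(
    launcher_data: Optional[Mapping[str, Any]],
) -> dict[str, str]:
    """Build root-span attributes from launcher_data (hypothetical helper)."""
    attributes: dict[str, str] = {}
    if not launcher_data:
        return attributes

    # Assumption: the first (and only) top-level key names the launcher,
    # e.g. "kubernetes", "kubernetes_job", or "skypilot".
    launcher_type = next(iter(launcher_data))
    attributes["execution.launcher"] = launcher_type

    # cluster_server is read from inside the launcher's own data block.
    block = launcher_data[launcher_type]
    if isinstance(block, Mapping) and block.get("cluster_server"):
        attributes["k8s.cluster.url"] = str(block["cluster_server"])
    return attributes
```

Launchers without a `cluster_server` entry would simply produce no `k8s.cluster.url` attribute under this sketch.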

execution.cloud_provider is read from the cloud-pipelines.net/orchestration/cloud_provider task_spec annotation, set at routing time by callers such as MultiLauncherContainerLauncher. This enables traces to be searched by cloud provider (e.g. gke, nebius) without relying on URL pattern matching against cluster hostnames. Launchers with a fixed cloud affinity can also set this annotation directly.

The CLOUD_PROVIDER_ANNOTATION_KEY constant is defined in common_annotations so it can be shared across launchers and the tracing layer without duplication.
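A minimal sketch of how the annotation might be turned into a span attribute follows. The constant's value is the annotation key quoted above; the helper name `cloud_provider_attributes` and the assumption that annotations arrive as a plain string mapping are illustrative, not taken from the PR.

```python
from typing import Mapping, Optional

# Annotation key as quoted in the description. In the PR it is defined in
# common_annotations; it is redefined here only to keep the sketch self-contained.
CLOUD_PROVIDER_ANNOTATION_KEY = "cloud-pipelines.net/orchestration/cloud_provider"


def cloud_provider_attributes(
    annotations: Optional[Mapping[str, str]],
) -> dict[str, str]:
    """Return the execution.cloud_provider attribute, if annotated (hypothetical helper)."""
    provider = (annotations or {}).get(CLOUD_PROVIDER_ANNOTATION_KEY)
    # Omit the attribute entirely when the annotation is missing or empty.
    return {"execution.cloud_provider": provider} if provider else {}
```

Sharing the key as a single constant means a router that writes the annotation and the tracing code that reads it cannot drift apart on the string.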

Screenshots

Screenshot 2026-05-22 at 8.19.10 PM.png

@morgan-wowk morgan-wowk force-pushed the execution-tracing-launcher-type branch 2 times, most recently from 57ba606 to 54e0001 Compare May 23, 2026 00:05
@morgan-wowk morgan-wowk force-pushed the execution-tracing-launcher-data branch from 3aa355d to 3e85d85 Compare May 23, 2026 00:31
@morgan-wowk morgan-wowk force-pushed the execution-tracing-launcher-type branch from 54e0001 to 6f8a30a Compare May 23, 2026 00:31
@morgan-wowk morgan-wowk force-pushed the execution-tracing-launcher-data branch from 3e85d85 to 239a70c Compare May 23, 2026 02:57
@morgan-wowk morgan-wowk force-pushed the execution-tracing-launcher-type branch from 6f8a30a to a8e9b03 Compare May 23, 2026 02:57
@morgan-wowk morgan-wowk marked this pull request as ready for review May 23, 2026 03:22
@morgan-wowk morgan-wowk requested a review from Ark-kun as a code owner May 23, 2026 03:22
@morgan-wowk morgan-wowk force-pushed the execution-tracing-launcher-data branch from 239a70c to 6521dc7 Compare May 26, 2026 23:27
@morgan-wowk morgan-wowk force-pushed the execution-tracing-launcher-type branch from a8e9b03 to 5a111d0 Compare May 26, 2026 23:27
@morgan-wowk morgan-wowk force-pushed the execution-tracing-launcher-data branch from 6521dc7 to 374f9cf Compare May 27, 2026 00:22
@morgan-wowk morgan-wowk force-pushed the execution-tracing-launcher-type branch from 5a111d0 to 3baf255 Compare May 27, 2026 00:22

@yuechao-qin yuechao-qin left a comment

Collaborator

Doesn't this PR also distinguish between Nebius and non-Nebius executions? Can you then filter on:

cloud_provider = Nebius && status = <SOMETHING, Maybe Failures>

Comment thread tests/instrumentation/test_execution_tracing.py Outdated
@morgan-wowk morgan-wowk force-pushed the execution-tracing-launcher-type branch from 3baf255 to d7605f3 Compare May 29, 2026 18:52
@morgan-wowk morgan-wowk force-pushed the execution-tracing-launcher-data branch from 374f9cf to 22df75e Compare May 29, 2026 18:52
@morgan-wowk

Collaborator Author

🤖 Exactly — with this PR you can filter traces on execution.cloud_provider = "nebius" && execution.status = "FAILED" (or any status). The cloud_provider comes from the task_spec annotation set at routing time by the MultiLauncherContainerLauncher in oasis-backend.
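To illustrate the kind of filter described, here is a small in-memory sketch over span attribute dictionaries. The real query syntax depends on the tracing backend; the function name `filter_spans` and the dictionary representation of spans are assumptions made for illustration.

```python
from typing import Iterable, Mapping


def filter_spans(
    spans: Iterable[Mapping[str, str]],
    cloud_provider: str,
    status: str,
) -> list[Mapping[str, str]]:
    """Keep spans whose attributes match both the provider and the status."""
    return [
        span
        for span in spans
        if span.get("execution.cloud_provider") == cloud_provider
        and span.get("execution.status") == status
    ]


# Hypothetical span attributes, not real trace data.
example_spans = [
    {"execution.cloud_provider": "nebius", "execution.status": "FAILED"},
    {"execution.cloud_provider": "gke", "execution.status": "FAILED"},
    {"execution.cloud_provider": "nebius", "execution.status": "SUCCEEDED"},
]
failed_on_nebius = filter_spans(example_spans, "nebius", "FAILED")
```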

@morgan-wowk morgan-wowk force-pushed the execution-tracing-launcher-data branch from 22df75e to b7aa164 Compare May 29, 2026 19:03
@morgan-wowk morgan-wowk force-pushed the execution-tracing-launcher-type branch from d7605f3 to 4871348 Compare May 29, 2026 19:04
Comment thread tests/instrumentation/test_execution_tracing.py Outdated
@morgan-wowk morgan-wowk force-pushed the execution-tracing-launcher-data branch from b7aa164 to eee983b Compare June 1, 2026 20:02
@morgan-wowk morgan-wowk force-pushed the execution-tracing-launcher-type branch from 4871348 to 65e59b8 Compare June 1, 2026 20:02
@morgan-wowk morgan-wowk requested a review from yuechao-qin June 1, 2026 20:16
@morgan-wowk morgan-wowk force-pushed the execution-tracing-launcher-type branch from 65e59b8 to bb72260 Compare June 2, 2026 23:53

morgan-wowk commented Jun 5, 2026

Collaborator Author

Merge activity

  • Jun 5, 5:07 PM UTC: A user started a stack merge that includes this pull request via Graphite.
  • Jun 5, 5:11 PM UTC: Graphite rebased this pull request as part of a merge.
  • Jun 5, 5:11 PM UTC: @morgan-wowk merged this pull request with Graphite.

@morgan-wowk morgan-wowk changed the base branch from execution-tracing-launcher-data to graphite-base/255 June 5, 2026 17:08
@morgan-wowk morgan-wowk changed the base branch from graphite-base/255 to master June 5, 2026 17:09
execution.launcher = top-level key from launcher_data (e.g. 'kubernetes',
'kubernetes_job', 'skypilot') distinguishes launcher mechanism.
k8s.cluster.url on root span allows GKE vs Nebius cluster identification
by URL pattern in oasis-backend's multi-cloud setup.
@morgan-wowk morgan-wowk force-pushed the execution-tracing-launcher-type branch from bb72260 to 6151e3c Compare June 5, 2026 17:10
@morgan-wowk morgan-wowk merged commit ec3decd into master Jun 5, 2026
5 of 7 checks passed
