bugsnag: propagate cloud_provider into contextual logging for queued executions#268
bugsnag: propagate cloud_provider into contextual logging for queued executions#268morgan-wowk wants to merge 1 commit into
Conversation
|
Warning This pull request is not mergeable via GitHub because a downstack PR is open. Once all requirements are satisfied, merge this PR as a stack on Graphite.
This stack of pull requests is managed by Graphite. Learn more about stacking. |
65d8571 to
3bb4cc1
Compare
|
CC: @yuechao-qin This will allow you to filter errors to Nebius or GKE. |
3bb4cc1 to
faed3ca
Compare
abe0fb0 to
4730f88
Compare
faed3ca to
a2974f3
Compare
…executions Adds CLOUD_PROVIDER_ANNOTATION_KEY to common_annotations so the annotation key has a single definition in the OSS layer. In the queued-execution processing span, reads the cloud_provider value from task_spec annotations (already in memory) and adds it to the contextual logging context when present. The existing _before_notify hook then includes it automatically in the tangle_context tab on every Bugsnag event raised during that execution's processing.
a2974f3 to
410f807
Compare
| ctx: dict[str, str] = {"execution_id": execution.id} | ||
| cloud_provider = ( | ||
| (execution.task_spec or {}) | ||
| .get("annotations", {}) |
There was a problem hiding this comment.
🤖 This is an AI-generated code review comment.
annotations: None (key present, value None) makes the chained .get crash.
cloud_provider = (
(execution.task_spec or {})
.get("annotations", {}) # returns None, NOT {}, if annotations is None
.get(_CLOUD_PROVIDER_ANNOTATION_KEY)
).get("annotations", {}) only uses the default when the key is missing. If task_spec = {"annotations": None} (key present, value None — common in JSON-roundtripped data where keys are emitted even when unset), .get("annotations", {}) returns None and the trailing .get(...) raises AttributeError: 'NoneType' object has no attribute 'get'. The crash propagates out of execution_logging_context, breaking the with block setup at the call site and silently degrading log correlation for that execution.
Fix: chain or {} after the inner .get too:
cloud_provider = (
((execution.task_spec or {}).get("annotations") or {})
.get(_CLOUD_PROVIDER_ANNOTATION_KEY)
)| from contextlib import contextmanager | ||
| from typing import Any, Optional | ||
|
|
||
| from .. import backend_types_sql as bts |
There was a problem hiding this comment.
🤖 This is an AI-generated code review comment.
New instrumentation → core import is a circular-import hazard.
from .. import backend_types_sql as btsThe instrumentation package was previously dependency-free w.r.t. core model code — bugsnag_instrumentation.py, error_normalization.py, and the rest don't import from ... This is the first instrumentation → core import. If backend_types_sql ever wants to call back into instrumentation for logging (a plausible future change — e.g. logging inside an __post_init__ or validator), you get an import cycle.
The only runtime use of bts here is the type annotation on execution: bts.ExecutionNode. That doesn't need a real import.
Fix: guard with TYPE_CHECKING:
from typing import TYPE_CHECKING
if TYPE_CHECKING:
from .. import backend_types_sql as btsbts.ExecutionNode becomes a forward reference resolved by the type checker, with zero runtime cost and no cycle risk.
| _context_metadata.set(prev_metadata) | ||
|
|
||
|
|
||
| def execution_logging_context(execution: bts.ExecutionNode): |
There was a problem hiding this comment.
🤖 This is an AI-generated code review comment.
No unit tests for execution_logging_context.
The new function has several non-trivial branches: task_spec is None, task_spec without annotations, annotations without the cloud-provider key, the key set to a string, the key set to None, the key with an empty string. None of these is pinned by a test, and the orchestrator call site relies on the helper being correct.
Fix: add a handful of cases in tests/instrumentation/test_contextual_logging.py:
def test_execution_logging_context_no_annotations():
exc = make_execution_node(id="e1", task_spec={})
with contextual_logging.execution_logging_context(exc):
assert contextual_logging.get_context_metadata("execution_id") == "e1"
assert contextual_logging.get_context_metadata("cloud_provider") is None
def test_execution_logging_context_with_cloud_provider():
exc = make_execution_node(
id="e1",
task_spec={"annotations": {"cloud-pipelines.net/orchestration/cloud_provider": "gcp"}},
)
with contextual_logging.execution_logging_context(exc):
assert contextual_logging.get_context_metadata("cloud_provider") == "gcp"
def test_execution_logging_context_annotations_is_none():
# Regression guard for None handling (see other comment).
exc = make_execution_node(id="e1", task_spec={"annotations": None})
with contextual_logging.execution_logging_context(exc):
assert contextual_logging.get_context_metadata("cloud_provider") is None|
|
||
| from .. import backend_types_sql as bts | ||
|
|
||
| _CLOUD_PROVIDER_ANNOTATION_KEY = "cloud-pipelines.net/orchestration/cloud_provider" |
There was a problem hiding this comment.
🤖 This is an AI-generated code review comment.
Module docstring's metadata-key list should mention cloud_provider.
After this PR, cloud_provider starts appearing in logs for queued executions. The module docstring at the top of this file enumerates the common metadata keys (request_id, pipeline_run_id, execution_id, etc.) — but cloud_provider isn't listed. People grepping the docstring to learn "what metadata keys exist?" won't find it.
Fix: add one line to the docstring's "Common metadata keys" list:
- cloud_provider: From task_spec annotations, set by execution_logging_context for orchestrated runs.
|
|
||
| from .. import backend_types_sql as bts | ||
|
|
||
| _CLOUD_PROVIDER_ANNOTATION_KEY = "cloud-pipelines.net/orchestration/cloud_provider" |
There was a problem hiding this comment.
nebius_launchers.py has this same exact constant. Would it make sense to create a shared config/model/etc file that both of them reference against? Or some way to not duplicate the variable?

Adds CLOUD_PROVIDER_ANNOTATION_KEY to common_annotations so the annotation
key has a single definition in the OSS layer.
In the queued-execution processing span, reads the cloud_provider value from
task_spec annotations (already in memory) and adds it to the contextual
logging context when present. The existing _before_notify hook then includes
it automatically in the tangle_context tab on every Bugsnag event raised
during that execution's processing.