feat: Add enrichers for attribute-based telemetry#5725
feat: Add enrichers for attribute-based telemetry#5725sentrivana wants to merge 2 commits intomasterfrom
Conversation
| attributes["http.request.method"] = asgi_scope.get("method") | ||
|
|
||
| headers = _filter_headers(_get_headers(asgi_scope)) | ||
| # TODO[span-first]: Correctly merge headers if duplicate | ||
| for header, value in headers.items(): | ||
| attributes[f"http.request.headers.{header.lower()}"] = [value] | ||
|
|
||
| attributes["http.query"] = _get_query(asgi_scope) |
There was a problem hiding this comment.
Null values assigned to attributes may cause downstream issues
Both asgi_scope.get('method') (line 119) and _get_query(asgi_scope) (line 126) can return None, but the Attributes type (defined in _types.py) does not include None as a valid AttributeValue. While the return type annotation is Dict[str, Any], the internal annotation uses Attributes, and downstream consumers expecting valid attribute values may encounter unexpected None values that don't conform to OpenTelemetry attribute semantics.
Verification
Read _types.py lines 226-240 confirming AttributeValue excludes None. Read _get_query function (lines 60-67) confirming it returns None when query string is falsy. Verified asgi_scope.get('method') can also return None.
Suggested fix: Only set attributes when values are not None
| attributes["http.request.method"] = asgi_scope.get("method") | |
| headers = _filter_headers(_get_headers(asgi_scope)) | |
| # TODO[span-first]: Correctly merge headers if duplicate | |
| for header, value in headers.items(): | |
| attributes[f"http.request.headers.{header.lower()}"] = [value] | |
| attributes["http.query"] = _get_query(asgi_scope) | |
| method = asgi_scope.get("method") | |
| if method is not None: | |
| attributes["http.request.method"] = method | |
| query = _get_query(asgi_scope) | |
| if query is not None: | |
| attributes["http.query"] = query |
Identified by Warden code-review · ZV9-3LC
| segment = telemetry.segment | ||
| source = segment.get_attributes.get("sentry.span.source") | ||
| already_set = ( | ||
| segment is not None | ||
| and segment.name != _DEFAULT_TRANSACTION_NAME | ||
| and source | ||
| in [ | ||
| SegmentSource.COMPONENT, | ||
| SegmentSource.ROUTE, | ||
| SegmentSource.CUSTOM, | ||
| ] | ||
| ) |
There was a problem hiding this comment.
Null check for segment occurs after segment is accessed, causing potential AttributeError
Line 394 accesses segment.get_attributes before the null check on line 395-397 (segment is not None). If telemetry.segment were ever None, line 394 would raise AttributeError: 'NoneType' object has no attribute 'get_attributes' before the null check executes. The check should happen before accessing segment's attributes. While current StreamedSpan implementation always sets _segment to a non-None value, this is a defensive programming issue that could cause crashes if the implementation changes.
Verification
Read lines 393-404 in the hunk: line 393 assigns segment = telemetry.segment, line 394 accesses segment.get_attributes, but segment is not None check is part of the condition on lines 395-397. The access precedes the check. Verified StreamedSpan.init (traces.py line 268) sets self._segment = segment or self so segment is never None for valid StreamedSpan, but the code structure is incorrect.
Identified by Warden find-bugs · QDC-MFJ
Description
Event processors are responsible for enriching events with data, contexts, etc. at the end of the pipeline. They only apply to event-based telemetry (transactions, errors, etc.).
There is currently no replacement for event processors for attributes-based telemetry, most importantly for streamed spans.
This PR introduces the concept of enrichers, which are essentially event processors for streamed spans, and showcases them by implementing one global enricher (stdlib) and one local (asgi).
Note that event processors will NOT be removed in the next major -- even if the SDK is not sending transaction-based traces anymore, event processors are still used for enriching other events, like errors. So the two concepts will need to co-exist.
Regarding naming: something like "telemetry processors" or "span processors" would've been nicer and more in line with the old naming style than "enricher". However, both of those refer to different concepts (telemetry processor is the new telemetry buffer concept, span processor is an OTel term), so in the interest of avoiding confusion I came up with a different name.
Issues
Closes #5722
Reminders
tox -e linters.feat:,fix:,ref:,meta:)