Skip to content

Conversation

felixge
Copy link
Member

@felixge felixge commented Jul 10, 2025

This resolves the ambiguity when a ScopesProfiles message contains
more than one profiles entry matching the default_sample_type.

Also add a comment to clarify which profile viewers should display by
default. This was previously also ambiguous.

This change creates a minor problem when it comes to ensuring that pprof
profiles can round-trip through otel conversion as it may require the
ScopeProfiles.profiles entries to be reordered in order to honor the
default_sample_type of the original pprof message. Naively converting
the resulting otel profile back to pprof would cause the order of
Profile.sample_types to be changed.

Merging this change indiciates consensus within the profiling group that
this issue should be handled by adding a pprof.profile_order attribute
(name TBD) to the ScopeProfiles.attributes during the initial
pprof->otel conversion. This label will allow converting the resulting
otel profile back to pprof without any information loss.

See gh-633 for additional details.

@felixge felixge requested review from a team July 10, 2025 09:58
@felixge felixge force-pushed the push-rnmukzywmwrs branch from 7a98df8 to 23bdc47 Compare July 10, 2025 09:59
@felixge felixge force-pushed the push-rnmukzywmwrs branch 3 times, most recently from 0601985 to cb85d1c Compare July 10, 2025 10:07
@reyang
Copy link
Member

reyang commented Jul 11, 2025

@felixge might be good to have a changelog entry?

@aalexand
Copy link
Member

@felixge might be good to have a changelog entry?

The profiling signal is still experimental and WIP, we don't add changelog entries for its changes right now. We will start doing this once 1.0 version is out.

@felixge
Copy link
Member Author

felixge commented Jul 14, 2025

In the last SIG meeting I promised that I would provide an example of how a pprof.profile_order (name TBD) label could be used to ensure that we can round trip pprof files without information loss.

Example pprof input (inspired by the Go memory profiles, but simplified):

# Profile
sample_type: [
  {type: "allocations", unit: "bytes"},
  {type: "allocations", unit: "count"},
],
default_sample_type: "allocations/count",
samples: [
  {value: [320, 10]},
  {value: [640, 2]},
],

Converted to OpenTelemetry:

# ProfilesData
resource_profiles: [
  {
    scope_profiles: [
      scope: {
        attributes: [
          {key: "pprof.profile_order": [1, 0]},
        ],
      },
      profiles: [
        {
          sample_type: {type: "allocations", unit: "count"},
          sample: [
            {values: [10]},
            {values: [2]},
          ]
        },
        {
          sample_type: {type: "allocations", unit: "bytes"},
          sample: [
            {values: [320]},
            {values: [640]},
          ]
        }
      ],
    ],
  },
]

This should round trip back to the original pprof file.

However, there is a small ambiguity here when default_sample_type is omitted. pprof says that in this case, clients should default to the last sample (type?) value. We will probably need an additional or more complex label if we want to round trip pprof files that omit the default_sample_type field precisely.

Anyway, I don't think this is a blocker for this PR. It's certainly a solvable problem.

This resolves the ambiguity when a `ScopesProfiles` message contains
more than one `profiles` entry matching the `default_sample_type`.

Also add a comment to clarify which profile viewers should display by
default. This was previously also ambiguous.

This change creates a minor problem when it comes to ensuring that pprof
profiles can round-trip through otel conversion as it may require the
`ScopeProfiles.profiles` entries to be reordered in order to honor the
`default_sample_type` of the original pprof message. Naively converting
the resulting otel profile back to pprof would cause the order of
`Profile.sample_types` to be changed.

Merging this change indiciates consensus within the profiling group that
this issue should be handled by adding a `pprof.profile_order` attribute
(name TBD) to the `ScopeProfiles.attributes` during the initial
pprof->otel conversion. This label will allow converting the resulting
otel profile back to pprof without any information loss.

See [open-telemetrygh-633][] for additional details.

[open-telemetrygh-633]: open-telemetry#633 (comment)
@felixge felixge force-pushed the push-rnmukzywmwrs branch from cb85d1c to a13103f Compare July 14, 2025 10:27
@felixge
Copy link
Member Author

felixge commented Jul 14, 2025

This should be ready for merge now 🙇 . My force-push was a rebase.

@tigrannajaryan tigrannajaryan merged commit 8048149 into open-telemetry:main Aug 6, 2025
15 checks passed
@pellared pellared mentioned this pull request Aug 29, 2025
tigrannajaryan pushed a commit that referenced this pull request Sep 2, 2025
### Changed

- profiles: drop gzip requirement. [#661](#661)
- profiles: avoid `optional` keyword usage. [#659](#659)
- profiles: make `profile_id` optional. [#665](#665)
- profiles: use single `Profile.sample_type` and clarify use of timestamps. [#649](#649)
- all: add notes about the attribute values restrictions. [#683](https://github.com/open-telemetry/opentelemetry-proto/pull/683)<br>⚠️ **IMPORTANT**: These restrictions can be dropped in a future minor release.
- profiles: clarify usage of the zero value as the first element of tables in `ProfilesDictionary`. [#688](#688), [#698](#698)
- profiles: unsigned `time_nanos` and `duration_nanos` in `Profile`. [#692](#692)
- profiles: improve attribute encoding in `ProfilesDictionary`. [#672](#672)
- profiles: simplify profile stack trace representation. [#708](#708)

### Fixed

- examples: fix OTLP JSON Event example body. [#666](#666)
- docs: minor specification fixes around `UNAVAILABLE` and `RetryInfo`. [#669](#669)

### Removed

- profiles: remove `default_sample_type`. [#679](#679)
- profiles: remove `has_*` debug info fields, they are moving to attributes. [#595](#595)
- profiles: remove `Location.is_folded`. [#690](#690)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants