-
Notifications
You must be signed in to change notification settings - Fork 38
fix: error message should not be set with other fields #710
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: error message should not be set with other fields #710
Conversation
The server should just return the flag if the agent is unidentified (service.name not known).
cc @trentm @LikeTheSalad (please, let me know if there's an APM wide Github group I can mention) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cheers!
I think @elastic/apm-agent-approvers should do for this PR. |
rc.logger.Error(fmt.Sprintf("error retrieving remote configuration: %s", err), agentUidField) | ||
// remote config client could not identify the agent | ||
if errors.Is(err, apmconfig.UnidentifiedAgent) { | ||
serverToAgent.Flags = uint64(protobufs.ServerToAgentFlags_ServerToAgentFlags_ReportFullState) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My understanding of the OpAMP spec is that it is not appropriate to be using the ReportFullState
flag for this case -- where the AgentToServer.agent_description.identifying_attributes happens to not have some key that this particular OpAMP server wants to see (in this case service.name
). The OpAMP spec suggests that agents "SHOULD" include service.name
, but there aren't any required identifying_attributes
.
The ReportFullState
flag is entirely meant to allow Agent Status Compression, so that the typical AgentToServer
message can be small (skipping details that the server likely already has for the agent).
For example, if the server receives an AgentToServer
message and its local info for that agent (using the instanceUid
) does not include an AgentDescription
at all, then the server should respond including the ReportFullState
flag.
(This OpAMP server impl should also be using the sequenceNum
to decide whether to potentially set the ReportFullState
flag as well. I found the example server in opamp-go.git illustrative here. See https://github.com/open-telemetry/opamp-go/blob/4e0574d43367de4a6ec7ac5e9df770d57faa8ca2/internal/examples/server/opampsrv/opampsrv.go#L109-L110 and https://github.com/open-telemetry/opamp-go/blob/4e0574d43367de4a6ec7ac5e9df770d57faa8ca2/internal/examples/server/data/agent.go#L233-L287)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I see that the apmconfig.UnidentifiedAgent
error is currently doing double-duty:
- It is handling the case where
loadedAgent
has no value at all foridentifyingAttributes
, which would happen when the OpAMP server is restarted and agents are sending periodic AgentToServer messages without theagent_description
property. In this case the server should respond with ReportFullState. - It is also handling the case where the
loadedAgent
does haveidentifyingAttributes
, but those attributes do not includeservice.name
. In this case the server should not respond with ReportFullState.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The opamp_callbacks.go file is a generic implementation of an OpAMP server for remote configurations, but it is up to the RemoteConfig client implementation to decide wheter to return an apmconfig.UnidentifiedAgent
error or not.
In the case of the (only) elasticsearch fetcher implementation, it assumes that the Elastic SDKs will always include the service.name
. Thus, the Elasticserach fetcher implementation of the RemoteConfig client does not comply to the following OpAMP spec suggestion:
The OpAMP spec suggests that agents "SHOULD" include service.name, but there aren't any required identifying_attributes.
If Elastic SDKs are able to omit the service.name
resource attribute in the identifyingAttributes
field, perhaps it would be best to open a separate issue where we can create and discuss the fix (and the correct usage of ReportFullState
).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
issue: #720
The server should just return the flag if the agent is unidentified (service.name not known).
Fixes #708