Skip to content

Conversation

MadLittleMods
Copy link
Contributor

@MadLittleMods MadLittleMods commented Jun 3, 2025

Distinguish all vs local events being persisted in the "Event Send Time Quantiles" graph in Grafana

Before After (notice the green dots for local events)
Event Send Time graph with ambiguous 'Events' Event Send Time graph with all vs local event persist rates

As discovered by @devonh, we use synapse_storage_events_persisted_events_total (which tracks all persisted events) for the "Events" rate in the "Event Send Time Quantiles" graph. This is pretty misleading as I would expect it to be the rate of events being sent given the graph title, "Event Send Time Quantiles".

Since the event persistence queues are shared for local and remote events from federation and will block local events being sent, I think it does still make sense to have the event persist rate. I've updated the graph to include the rate of "Local events being persisted" and the rate of "All events being persisted". I think this properly disambiguates and clarifies what the graph is trying to show.

Pull Request Checklist

  • Pull request is based on the develop branch
  • Pull request includes a changelog file. The entry should:
    • Be a short description of your change which makes sense to users. "Fixed a bug that prevented receiving messages from other servers." instead of "Moved X method from EventStore to EventWorkerStore.".
    • Use markdown where necessary, mostly for code blocks.
    • End with either a period (.) or an exclamation mark (!).
    • Start with a capital letter.
    • Feel free to credit yourself, by adding a sentence "Contributed by @github_username." or "Contributed by [Your Name]." to the end of the entry.
  • Code style is correct (run the linters)

{
"$$hashKey": "object:329",
"color": "#B877D9",
"hideTooltip": true,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added the event persist rate back to the tooltip so it's possible to get an accurate reading.

Hovering over the Event Send Time graph showing the tooltip which now includes the event persist rate

"yBucketBound": "auto"
},
{
"aliasColors": {},
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The random changes are just whatever Grafana 10.4.3 is deciding to do (https://grafana.vpn.infra.element.dev/api/health)

For reference, we're using Grafana 10.4.1 on https://grafana.matrix.org/api/health

@MadLittleMods MadLittleMods marked this pull request as ready for review June 3, 2025 19:41
@MadLittleMods MadLittleMods requested a review from a team as a code owner June 3, 2025 19:41
Copy link
Contributor

@reivilibre reivilibre left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks :-)

@MadLittleMods MadLittleMods merged commit e80bc4b into develop Jun 5, 2025
35 checks passed
@MadLittleMods MadLittleMods deleted the madlittlemods/local-vs-all-events-being-persisted-in-event-send-time-graph-in-grafana branch June 5, 2025 20:30
@MadLittleMods
Copy link
Contributor Author

Thanks for the review @reivilibre 🐌

Michael-Ixo pushed a commit to ixoworld/synapse that referenced this pull request Jun 25, 2025
- Improvements to generate config documentation from JSON Schema file. ([\element-hq#18522](element-hq#18522))

- Add support for [MSC4155](matrix-org/matrix-spec-proposals#4155) Invite Filtering. ([\element-hq#18288](element-hq#18288))
- Add experimental `user_may_send_state_event` module API callback. ([\element-hq#18455](element-hq#18455))
- Add experimental `get_media_config_for_user` and `is_user_allowed_to_upload_media_of_size` module API callbacks that allow overriding of media repository maximum upload size. ([\element-hq#18457](element-hq#18457))
- Add experimental `get_ratelimit_override_for_user` module API callback that allows overriding of per-user ratelimits. ([\element-hq#18458](element-hq#18458))
- Pass `room_config` argument to `user_may_create_room` spam checker module callback. ([\element-hq#18486](element-hq#18486))
- Support configuration of default and extra user types. ([\element-hq#18456](element-hq#18456))
- Successful requests to `/_matrix/app/v1/ping` will now force Synapse to reattempt delivering transactions to appservices. ([\element-hq#18521](element-hq#18521))
- Support the import of the `RatelimitOverride` type from `synapse.module_api` in modules and rename `messages_per_second` to `per_second`. ([\element-hq#18513](element-hq#18513))

- Remove destinations from sending if not whitelisted. ([\element-hq#18484](element-hq#18484))
- Fixed room summary API incorrectly returning that a room is private in the room summary response when the join rule is omitted by the remote server. Contributed by @nexy7574. ([\element-hq#18493](element-hq#18493))
- Prevent users from adding themselves to their own user ignore list. ([\element-hq#18508](element-hq#18508))

- Generate config documentation from JSON Schema file. ([\element-hq#17892](element-hq#17892))
- Mention `CAP_NET_BIND_SERVICE` as an alternative to running Synapse as root in order to bind to a privileged port. ([\element-hq#18408](element-hq#18408))
- Surface hidden Admin API documentation regarding fetching of scheduled tasks. ([\element-hq#18516](element-hq#18516))
- Mark the new module APIs in this release as experimental. ([\element-hq#18536](element-hq#18536))

- Mark dehydrated devices in the [List All User Devices Admin API](https://element-hq.github.io/synapse/latest/admin_api/user_admin_api.html#list-all-devices). ([\element-hq#18252](element-hq#18252))
- Reduce disk wastage by cleaning up `received_transactions` older than 1 day, rather than 30 days. ([\element-hq#18310](element-hq#18310))
- Distinguish all vs local events being persisted in the "Event Send Time Quantiles" graph (Grafana). ([\element-hq#18510](element-hq#18510))
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants