split up large rule files into smaller files when exceeding 4MiB #634

@cxdy

Description

We have a lot of SLOs, and a large share of them are for a single system class, which leaves us with a massive rules file for those specific SLOs. Unfortunately, this breaks in Kubernetes when our ConfigMap(s) exceed 4MiB, and it causes our custom SlothSLOGenerationFailure alert to fire. The alert is defined with the following expression:

expr: "sum(rate(kooper_controller_processed_event_duration_seconds_count{job=\"prometheus/sloth-kube-prometheus\",success=\"false\"}[30m])) > 0"

Ideally, we should be able to detect when a rule file is larger than 4MiB and, if so, split it up into multiple files (ConfigMaps).

The size limit seems to be hardcoded in the prometheus-operator: https://github.com/prometheus-operator/prometheus-operator/blob/370a2ea18a48000e2ea4bc05acb093502915f5c9/pkg/operator/rules.go#L55-L59

https://github.com/prometheus-operator/prometheus-operator/blob/370a2ea18a48000e2ea4bc05acb093502915f5c9/pkg/operator/rules.go#L192-L196
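
To illustrate the splitting idea, here is a rough sketch in Python (not Sloth's actual code, which is written in Go). It assumes the rule groups are already serialized to strings, and greedily packs them into chunks that stay under a byte limit. The names `split_rule_groups` and `MAX_CHUNK_BYTES` are hypothetical, and the 4MiB default is just the figure from this issue, not a value read from the operator. It also ignores the YAML wrapper and ConfigMap metadata overhead, so a real limit would probably need some headroom.

```python
from typing import List

# Assumed limit, taken from the issue text; the real threshold may differ.
MAX_CHUNK_BYTES = 4 * 1024 * 1024


def split_rule_groups(groups: List[str], max_bytes: int = MAX_CHUNK_BYTES) -> List[List[str]]:
    """Greedily pack serialized rule groups into chunks of at most max_bytes.

    Groups keep their original order. A single group that is larger than
    max_bytes cannot be split here, so it raises ValueError.
    """
    chunks: List[List[str]] = []
    current: List[str] = []
    current_size = 0

    for group in groups:
        size = len(group.encode("utf-8"))  # measure bytes, not characters
        if size > max_bytes:
            raise ValueError(f"rule group of {size} bytes exceeds limit of {max_bytes}")
        # Start a new chunk when adding this group would cross the limit.
        if current and current_size + size > max_bytes:
            chunks.append(current)
            current = []
            current_size = 0
        current.append(group)
        current_size += size

    if current:
        chunks.append(current)
    return chunks
```

Each returned chunk would then map to one rule file (one ConfigMap), for example with a numeric suffix on the name.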
