Skip to content

Conversation

wbollock
Copy link
Collaborator

This allows for more expensive SLO recording rules by letting users create a custom Prometheus rule_group.interval.

https://prometheus.io/docs/prometheus/latest/configuration/recording_rules/#rule_group

Now instead of all rule groups using the default global evaluation interval, a custom interval can be set on all sets of recording rules for an SLO.

If no interval is set then the global default will be assumed, matching current Sloth behavior.

Resolves: slok#367

This allows for more expensive SLO recording rules by letting users
create a custom Prometheus `rule_group.interval`.

https://prometheus.io/docs/prometheus/latest/configuration/recording_rules/#rule_group

Now instead of all rule groups using the default global evaluation
interval, a custom interval can be set on all sets of recording rules
for an SLO.

If no `interval` is set then the global default will be assumed,
matching current Sloth behavior.

Resolves: slok#367
@wbollock wbollock marked this pull request as draft June 14, 2023 00:02
wbollock added 3 commits June 13, 2023 20:06
Related to how the unit test coverage works, this will make sure
RuleGroupInterval is not appended to RuleGroups unless it is defined and
not empty.

Not requring the `yaml` file does an okay job of this but expliclity
writing the rules without RuleGroupInterval is safer.

It's a bit ugly and repetitive but I think that's just how Go works..
This updates tests include `interval`'s and also includes a test to make
sure the rules render correctly when interval is not included.
@wbollock wbollock marked this pull request as ready for review June 14, 2023 21:55
Copy link
Member

@wbh1 wbh1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall, looks good! Since Sloth produces 3 rules groups, I think it'd be nice to support specifying separate intervals for each of them, though. (sloth-slo-meta-recordings, sloth-slo-alerts, and sloth-slo-sli-recordings are the 3 prefixes). For example, you may want an interval of 2m on the SLI recording rules but still 1m on the alerts.

@wbollock wbollock requested a review from wbh1 June 26, 2023 21:23
wbollock added 3 commits June 26, 2023 17:25
Instead of a singular global default, now a rule_group interval can be
set for every individual type of rule_group Sloth generates.

The generic, `interval:all` rule will also stay and can "fill in" any missing per-rule
group defaults. Along with the default behavior of doing nothing if no
`interval` is specified.
Specific ones and general/specific combined
@wbollock wbollock force-pushed the feat/rule_group_interval branch from eb73e69 to dc3994b Compare June 26, 2023 21:28
@wbollock wbollock closed this Jun 26, 2023
@wbollock wbollock reopened this Jun 26, 2023
@wbollock
Copy link
Collaborator Author

@wbh1 commits added for per rule group intervals:
a94fee9
ca78450
dc3994b

Copy link
Member

@wbh1 wbh1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approving as it looks good as-is, but left some comments on form.

Comment on lines 53 to 80
// 0s is default empty string value for time.Duration
if slo.SLO.RuleGroupInterval.String() != "0s" || slo.SLO.SLIErrorRulesInterval.String() != "0s" {
var ruleGroupIntervalDuration prommodel.Duration
var err error

// if we have a valid meta rule rule_group interval, use that first and overwrite any generic ones
if slo.SLO.SLIErrorRulesInterval.String() != "0s" {
ruleGroupIntervalDuration, err = prommodel.ParseDuration(slo.SLO.SLIErrorRulesInterval.String())
} else {
ruleGroupIntervalDuration, err = prommodel.ParseDuration(slo.SLO.RuleGroupInterval.String())
}

if err != nil {
return fmt.Errorf("could not parse rule_group interval duration %w", err)
}

ruleGroups.Groups = append(ruleGroups.Groups, ruleGroupYAMLv2{
Name: fmt.Sprintf("sloth-slo-sli-recordings-%s", slo.SLO.ID),
RuleGroupInterval: ruleGroupIntervalDuration,
Rules: slo.Rules.SLIErrorRecRules,
})
} else {
ruleGroups.Groups = append(ruleGroups.Groups, ruleGroupYAMLv2{
Name: fmt.Sprintf("sloth-slo-sli-recordings-%s", slo.SLO.ID),
Rules: slo.Rules.SLIErrorRecRules,
})

}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you can simplify this with a switch statement like:

    group := ruleGroupYAMLv2{
					Name:             fmt.Sprintf("sloth-slo-sli-recordings-%s", slo.SLO.ID),
					Rules:             slo.Rules.SLIErrorRecRules,
				}
	switch {
	case AlertRulesInterval.String() != "0s":
		ruleGroupIntervalDuration, err = time.ParseDuration(AlertRulesInterval.String())
        if err != nil {
            return fmt.Errorf("could not parse rule_group interval duration for alerts %w", err)
        } else {
            group.RuleGroupInterval = ruleGroupIntervalDuration
        }
	case RuleGroupInterval.String() != "0s":
		ruleGroupIntervalDuration, err = time.ParseDuration(RuleGroupInterval.String())
        if err != nil {
            return fmt.Errorf("could not parse default ("all") rule_group interval duration %w", err)
        } else {
            group.RuleGroupInterval = ruleGroupIntervalDuration
        }
	}

    ruleGroups.Groups = append(ruleGroups.Groups, group)

@wbollock wbollock merged commit bd62a45 into linode-obs:main Jul 3, 2023
@wbollock wbollock deleted the feat/rule_group_interval branch July 3, 2023 13:22
@wbollock wbollock restored the feat/rule_group_interval branch July 3, 2023 13:27
cxdy referenced this pull request in cxdy/sloth Oct 29, 2024
cxdy added a commit that referenced this pull request Oct 31, 2024
…issues

Dennisme/bump versions fix build issues
cxdy added a commit to cxdy/sloth that referenced this pull request Jan 8, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add ability to specify rule group interval
2 participants