-
Notifications
You must be signed in to change notification settings - Fork 3.6k
feat: Adds property based testing framework for compaction #26783
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
tsdb/engine/tsm1/compact_test.go
Outdated
@@ -4517,6 +4498,10 @@ func TestEnginePlanCompactions(t *testing.T) { | |||
e.Scheduler.SetDepth(1, mockGroupLen) | |||
e.Scheduler.SetDepth(2, mockGroupLen) | |||
|
|||
// Validate our test case based on compact_property_test.go | |||
validationErr := ValidateTestCase(test, test.expectedResult()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks like this is validating the expectedResults() twice: once as they are extracted from test
and then again as the second argument.
It should be validationErr := ValidateTestCase(test, level1Groups, level2Groups, Level3Groups, Level4Groups, Level5Groups))
where the groups come from e.PlanCompactions()
... line 4526/4511
This seems unnecessarily complex for what it does. Would something as simple and stupid as https://goplay.tools/snippet/sKHiARnmzi9 work instead? |
betweenGroup, inGroup := fileToGroup[betweenFile.filename] | ||
|
||
if inGroup && betweenGroup != groupIdx { | ||
return fmt.Errorf("gap detected: files %s and %s are in group %d, but file %s (between them) is in group %d", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
expected results: No Gaps Rule violation: gap detected: files 04-04.tsm and 09-04.tsm are in group 1, but file 05-03.tsm (between them) is in group 0
This error message is tough as the group indices don't tell me anything about which level group it came from. I think it should know which level group its talking about.
also: "no gaps rule" is odd description of the property. Maybe its me: the rule is that tsm files within a group must be adjacent in the master list of files. "Adjacency rule" seems more accurate. a "gap" sounds like a missing tsm file instead of one in the wrong place.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also think the validation should tell me if the problem is in the expected results (aka I made a bad test case) or in the output of the compaction planner (the compaction planner is wrong). I know if the expected results are correct and valid and the compaction planner produces a bad plan, the test case will still fail, but more information is always helpful.
nvm -- i see that it does this!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
also, it'd be nice if all instances of the issues were reported instead of stopping at the first adjacency rule violation.
tsdb/engine/tsm1/compact_test.go
Outdated
level4Groups: Level4Groups, | ||
level5Groups: Level5Groups, | ||
}) | ||
require.NoError(t, validationErr, "test validation failed") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let the rest of the test run; assert don't require.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good 👍
// validateNoGaps checks that there are no gaps within the same compaction level | ||
// A gap occurs when files A and C are in the same level group, but file B (between A and C) | ||
// is in a different level group, creating a "hole" in the sequence | ||
func validateNoGaps(allFiles []string, groups []tsm1.CompactionGroup) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the algorithm in here is strange; why did you write it this way? I believe it works but it seems more complicated than it needs to be.
It also sorts the files that are in the input CompactionGroups... which hides errors of files being out of order in the group (yes this should get caught elsewhere too).
I don't understand why the algorithm isn't:
- parse the master file list filenames to get gens and seqs
- sort by generation and then seq within generation into a "sorted master slice"
- invert slice to a map[filename]->index in sorted master list
- for each group, for each file in the group check if file[i] has an index in the master map that is -1 from file[i+1] in the master map.
This will fail if the group's files aren't sorted or if the group's files aren't adjacent.
e.g.
sorted master files list -> index in sorted slice
0-5 -> 0
1-5 -> 1
2-4 -> 2
3-4 -> 3
4-4 -> 4
5-2 -> 5
6-1 -> 6
7-1 -> 7
Any group has to be a strict subset.
Group0: <-- PASS
1-5 -> 1
2-4 -> 2
Group1:
3-4 -> 3
4-4 -> 4
6-1 -> 6 <--- failure. index 6 is not 4+1. Adjacency failure.
Group2:
6-1 -> 6
5-2 -> 5 <--- failure. index 5 is not 6+1; group file ordering failure.
For the Group1, it'd be easy to print the files in between because you have the index 4 and index 6, so you print the file at index 5 e.g.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh nice. I just noticed that @davidby-influx proposed exactly the same algorithm 6 hours ago in his comment.
While rewriting the algorithm for this I stumbled upon this test case (which I believe is wrong)
Notice the generation 3 TSM file at the end of the test input. This would be a violation. The property test validator should account for this as well I assume? I'm removing complexity, getting rid of the sort for input that would cause this to pass, and cleaning this up. |
The input tsm file list has a duplicate named tsm file, namely For this pr; please focus on just the adjacency aspect; let's get that right and merged so we can return to PF; we can add other checks (unique list in input files, sorted files in groups, etc) in separate prs -- they are checks are properties too, which is nice, but they are less important. |
The most important thing is that we don't change the ordering of input so this will stop that, as we check the index anyways 🤔
func validateFileAdjacency(allFiles []string, groups []tsm1.CompactionGroup) error { | ||
var inputFiles []fileInfo | ||
for i, file := range allFiles { | ||
gen, seq, err := tsm1.DefaultParseFileName(file) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if you're not going to sort the input list, why do we need to parse the filename? We only need gen and seq for sorting the input file list.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we sort the input list? For tests, we know the files will be in order, but for random input from production code, does Glob
have order guarantees?
|
||
var fileMap = make(map[string]fileInfo, len(inputFiles)) | ||
for _, file := range inputFiles { | ||
fileMap[file.fileName] = file |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
without the sorting this can be just
for i, file := range allFiles {
fileMap[file] = i
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah yeah, I can probably condense this down and just build the map directly now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
a couple minor changes but please make them.
|
||
// validateFileAdjacency checks that there are no gaps between compaction groups | ||
// An adjacency violation occurs when files A and C are in different groups, but file B (between A and C) | ||
// is also in a different group, creating overlapping or non-contiguous ranges |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comment is confusing to me. It says if A and C are in different compaction groups and B is in a different group too that's a problem. I don't think that's a problem.
All the files within a planned compaction group must be adjacent with respect to the master list of files (aka allFiles
).
Would you update this comment please?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How does this sound?
// validateFileAdjacency checks that there are no adjacency violations between TSM files
// The following example will highlight an adjacency violation:
// Given the following list of files [01-01.tsm, 02-02.tsm, 03-03.tsm] let's say we have the following compaction plans created
// Group 1: [01-01.tsm, 03-03.tsm] & Group 2: [02-02.tsm]
// This violates file adjacency as the first compaction group sees [01-01.tsm, X, 03-03.tsm] and the second group: [X, 02-02.tsm, X]
// these are non-contiguous blocks, when the first group performs compaction we will have two files that are out of order compacted together.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sounds good.
You could also mention that we have this rule because the ordering of the files indicates which overwrite of a data point wins (newest write wins).
var errs []error | ||
|
||
// Extract all filenames from test case | ||
var allFiles []string |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
since you're going to touch this for the comment change, please preallocate this since you know how big it will be. ci pays for allocations too.
And there are some failing tests according to ci that need to be addressed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some comments, perhaps actionable, but worth understanding the issues
Validator: validateFileAdjacency, | ||
} | ||
|
||
type fileInfo struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason for the wrapping struct
?
func validateFileAdjacency(allFiles []string, groups []tsm1.CompactionGroup) error { | ||
var inputFiles []fileInfo | ||
for i, file := range allFiles { | ||
gen, seq, err := tsm1.DefaultParseFileName(file) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we sort the input list? For tests, we know the files will be in order, but for random input from production code, does Glob
have order guarantees?
// An adjacency violation occurs when files A and C are in different groups, but file B (between A and C) | ||
// is also in a different group, creating overlapping or non-contiguous ranges | ||
func validateFileAdjacency(allFiles []string, groups []tsm1.CompactionGroup) error { | ||
var fileMap = make(map[string]fileInfo, len(allFiles)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we sort files? Tests will surely have the correct order, but if we use this in Production code, does Glob
have ordering promises?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Glob
actually does not have ordering promises in production. When I was implementing a psuedo
version of this in the PR where I implemented adjacency checks (https://github.com/influxdata/influxdb/pull/26766/files#diff-e0d6c334f09b26e18b81206ad3b242f2fd1ef09a7b23ea326a13eea3ac916a9aR2478-R2483) I did have to sort, if we were to ever pull this out and utilize it for other than tests I think that it would be beneficial to keep the sorting.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tests will surely have the correct order
apparently not!
as well as a better description of the issue. Re-add sorting
@@ -3519,15 +3500,15 @@ func TestEnginePlanCompactions(t *testing.T) { | |||
}, | |||
{ | |||
FileStat: tsm1.FileStat{ | |||
Path: "03-03.tsm1", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is a duplicate tsm file in this test. there are two 03-03.tsm1
files in the input list.
The question is what do we think the intent of the test is. Do we
- delete the duplicate
- rename it as 05-03
- rename it as 04-03
I think the intent was that this should be the third file in the 04 generation as the 04 generation is a level 1 generation. By making this file 05-03... there's now a level 3 generation after the level 1 generation. I don't think that's the intent of this test (although it is an interesting test case).
please rename to 04-03
return testLevelResults{ | ||
level1Groups: []tsm1.PlannedCompactionGroup{ | ||
{ | ||
tsm1.CompactionGroup{"05-01.tsm", "06-01.tsm", "07-01.tsm", "08-01.tsm", "09-01.tsm", "10-01.tsm", "11-01.tsm", "12-01.tsm"}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
again, what is the intent of the original test. Mock another planned level inside scheduler aggress blocks middle
is tough for me to understand, but you've significantly changed the expectations of this test so i think you've changed its intent too much. There's no level1 plan at all anymore!
I think you need to keep the tsm files the same names but properly sort them in files: []tsm1.ExtFileStat{
test setup. That should keep the test expectations unchanged. Please make it so
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree. Sort the input test files, but change nothing else.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As @philjb suggests
return testLevelResults{ | ||
level1Groups: []tsm1.PlannedCompactionGroup{ | ||
{ | ||
tsm1.CompactionGroup{"05-01.tsm", "06-01.tsm", "07-01.tsm", "08-01.tsm", "09-01.tsm", "10-01.tsm", "11-01.tsm", "12-01.tsm"}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree. Sort the input test files, but change nothing else.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Adds property based testing framework for our compaction testings.
Current we support checking for "gaps" within adjacent compaction groups.
For example:
Would be invalid and cause our validator to fail due to the file
"000017095-000000005.tsm"
inlevel5Groups
.