Skip to content

Conversation

Fricounet
Copy link
Contributor

The walkBlob function was called with a function having a side effect: appending a descriptor to a list. However, the function could be called multiple times for the same descriptor (once in walkBlob and once in walkBlobVariantsOnly in that case) leading to duplicate entries in the final list. Instead of the list, we can use a map to store the descriptors. With the digest as key, it will ensure we don't store duplicates.

This fixes one of the problem encountered in #6088

cc @ktock

@Fricounet Fricounet force-pushed the fricounet/upstream/fix-digest-duplication branch from 92d0f5a to 34ade39 Compare July 17, 2025 08:59
Copy link
Collaborator

@ktock ktock left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks.
Please squash the commits.

@Fricounet Fricounet force-pushed the fricounet/upstream/fix-digest-duplication branch from 30b33bc to 0858b9e Compare July 17, 2025 11:37
@Fricounet
Copy link
Contributor Author

@ktock done, thanks for the review 🙇

@Fricounet
Copy link
Contributor Author

@ktock any idea why the tests would be failing? I'm don't really understand the errors + it was not failing before I sqashed 🤔

@ktock
Copy link
Collaborator

ktock commented Jul 18, 2025

TestGetRemotes failure doesn't look like caused by this change (posted a possible fix #6099). I'm rerunning the test.

https://github.com/moby/buildkit/actions/runs/16344069969/job/46175402958?pr=6092#step:8:4328

=== FAIL: cache TestGetRemotes (1.05s)
    manager_test.go:1783: 
        	Error Trace:	/src/cache/manager_test.go:1783
        	            				/src/vendor/golang.org/x/sync/errgroup/errgroup.go:130
        	            				/usr/local/go/src/runtime/asm_amd64.s:1700
        	Error:      	An error is expected but got nil.
        	Test:       	TestGetRemotes

@Fricounet
Copy link
Contributor Author

Another failure but it also looks unrelated

Copy link
Member

@tonistiigi tonistiigi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I look at the implementation of walkBlob it is calling walkBlobVariantsOnly with visited map to avoid duplications. Isn't that one working correctly, and if it is not then probably should fix it there.

@Fricounet
Copy link
Contributor Author

@tonistiigi it doesn't work as expected because:
in a case with 2 compressions (gzip and zstd) where the gzip image was built first, then the zstd one and you decide to rebuild + export cache for the gzip image:

  • f is called once in walkBlob for the gzip layer
  • on the first iteration of walkBlobVariantOnly, f is called once for the zstd layer here
  • then walkBlobVariantOnly is called again with the zstd layer (which will have the gzip layer in its content store info because it was built after the initial gzip image) which means that f will be called a third time for the gzip layer there because the visited check only occur after

I thought about fixing by moving the visited check to about here however this would mean that walkBlobVariantOnly never calls f for the inital blob which is a breaking change and It means we need to consider the consequences for all the cases where f is different when calling walkBlobVariantOnly and if it has, or not, side effects.
So I felt that just making sure that there can't be duplicates in the end result of the function used to call walkBlob (by using a map) is safer as it means we don't have to care about how many times the function gets called.
What do you think?

@Fricounet
Copy link
Contributor Author

@tonistiigi do you mind giving me your updated opinion based on my message above?

cache/remote.go Outdated
@@ -112,9 +112,11 @@ func getAvailableBlobs(ctx context.Context, cs content.Store, chain *solver.Remo
if err != nil {
return nil, err
}
var descs []ocispecs.Descriptor
var descs = map[digest.Digest]ocispecs.Descriptor{}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The current patch seems to make output order potentially undeterministic. Could this be a problem?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tonistiigi that's a good point. I tried to track all the usages of descs in the calling functions but it branches out fast and I can't guarantee that nondeterministic ordering won't ever be an issue.
I've modified the code to revert to using a slice and do a basic for loop on it to ensure deduplication (which is think is fine performance wise, see updated comment in 622757e)
What do you think?

@Fricounet Fricounet force-pushed the fricounet/upstream/fix-digest-duplication branch from 332db0f to 622757e Compare August 11, 2025 08:56
cache/remote.go Outdated
// Looping over the list is preferable:
// 1. to avoid using a map, which don't preserve the order of descriptors,
// 2. descs will have a length the number of compression variants for a blob, which is usually very small
for _, d := range descs {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can be done with https://pkg.go.dev/slices#ContainsFunc in a slightly cleaner way in the new Go.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

addressed in 7526d59

@Fricounet Fricounet force-pushed the fricounet/upstream/fix-digest-duplication branch from 594e9ce to 7526d59 Compare August 11, 2025 12:51
Copy link
Member

@tonistiigi tonistiigi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please squash commits as well

The walkBlob function was called with a function having a side effect: appending a descriptor to a list. However, the function could be called multiple times for the same descriptor (once in `walkBlob` and once in `walkBlobVariantsOnly` in that case) leading to duplicate entries in the final list.
Instead, we will check each time we want to add an element, that it's not already in the list using slice.ContainsFunc.
This should be fine from a performance perspective as the list is small (at most 2 compressions in the vast majority of cases).

Signed-off-by: Baptiste Girard-Carrabin <[email protected]>
@Fricounet Fricounet force-pushed the fricounet/upstream/fix-digest-duplication branch from 7526d59 to 786d74c Compare August 11, 2025 14:36
@Fricounet
Copy link
Contributor Author

commits squashed @tonistiigi

@Fricounet
Copy link
Contributor Author

tests seem to have irrelevant failures again 😓 can they be retried?

@tonistiigi tonistiigi merged commit cea1cad into moby:master Aug 12, 2025
167 of 169 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants