-
Notifications
You must be signed in to change notification settings - Fork 1.6k
std::atomic<std::shared_ptr>::wait
should compare control blocks
#3655
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
Co-authored-by: Alex Guteniev <[email protected]>
This comment was marked as resolved.
This comment was marked as resolved.
Resolved trivial adjacent-edit merge conflict in tests/std/include/test_atomic_wait.hpp.
After hiding from this PR for a year 😹, I've finally looked at it and I believe I'm comfortable with the change (after pushing a fix/test for "owning the null pointer"). I'm definitely happy that @AlexGuteniev's suggested test case has been added, I don't think that this poses significant regression risks, and I don't think that there's any vNext interaction (this is just a behavioral change; mix-and-match scenarios shouldn't be harmed beyond maybe not getting the fix). Edit: I've been convinced on Discord that we are, in fact, doomed. ☠️ |
I'm afraid the PR doesn't fully fix hang in any of the scenario - it only makes hangs unlikely, but still possible, thus introducing inconsistency. |
I don't want to provide an example anymore as there's a consensus that the problem is very infrequently reproducible, yet existent. @StephanTLavavej suggested that we can have timeout for correctness, @BillyONeal suggested that exponential back-off is an acceptable implementation of atomic wait. We can try our best and make atomic wait with exponential timeout. This is not abi-breaking, as we still wait on the same address. I think that the PR should be closed with this PR with exponential back-out added. Another issue can be open to consider something better for |
I'm speculatively mirroring this to the MSVC-internal repo - please notify me if any further changes are pushed. |
for (;;) { | ||
auto _Rep = _Repptr._Lock_and_load(); | ||
bool _Equal = _Ptr.load(memory_order_relaxed) == _Old; | ||
bool _Equal = _Ptr.load(memory_order_relaxed) == _Old_ptr && _Rep == _Old_rep; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No change requested; we should do this in a followup if at all to avoid resetting testing, since it's not a correctness issue. Should this be reordered as:
bool _Equal = _Ptr.load(memory_order_relaxed) == _Old_ptr && _Rep == _Old_rep; | |
bool _Equal = _Rep == _Old_rep && _Ptr.load(memory_order_relaxed) == _Old_ptr; |
so we short-circuit when the core-local _Rep == _Old_rep
is false
before potentially loading _Ptr
's cache line from some other core's data cache? (I suspect I understand cache coherence protocols just well enough to be dangerous.) Would any difference just be noise compared to the expense of _Repptr._Lock_and_load()
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(Pulling in @AlexGuteniev to render an opinion and/or tell me how wrong I am 😄.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My vague understanding is that a relaxed load has no barriers, so it has no special costs. It would be fine to reorder, though, since the .load
is at least a debug mode function call.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Aren't _Repptr and _Ptr in the same cache line? Sure, whatever they point to can be anywhere, but the pointers themselves are (as far as I can see) right beside each other.
(STL's comment sounds accurate, though.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree that they are always in the same cache line (it is really always, not just most of the time, because we over align to alignas(2 * sizeof(void*))
). So from the cache perspective it does not matter.
I agree also that eliminating debug mode call can be potentially beneficial.
Still I more like the original order, because we compare _Rep == _Old_rep;
the very last thing, as close to wait as possible, so reduce the probability of resorting to timeout.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Considering _Rep is just a stack-local and the real load is on the line above, I can't quite follow your reasoning.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Considering _Rep is just a stack-local and the real load is on the line above, I can't quite follow your reasoning.
Hm, you are right, we done loading it into register at the same time anyway.
Then debug mode call saving could be the main decision factor here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I'd forgotten that we overalign - my cache coherence point is invalid. +1 to avoiding the call in debug compiles, though. We really want to minimize the time between load and check.
Thanks for fixing this runtime correctness bug - what a rollercoaster of despair and elation! 📉 📈 😹 🎉 |
Fixes #3602