<bit>: Use popcount for has_single_bit() if it's always available #5534
@AlexGuteniev, do I need to update the benchmarks, or can I simply put benchmark numbers in the commit message?
I think the current benchmarks are fine; just post before/after numbers.
Ideally, we'd need ARM64 numbers as well. But I think we can do without them.
You also need to pass somehow
Shouldn't unit tests automatically run with/without these switches?
Unit tests should.
Also, please edit the PR description to link it with the issue using one of the GitHub keywords.
I'm getting conflicting results for the test: popcount on my PC is somehow slower by around 30%.
Wow, I had accidentally enabled the "Only __inline" (/Ob1) option. Apparently, this cripples most std code that isn't marked inline. IMO, all these functions should be marked inline.
No; I asked a while ago, see #619
✋ We don't do that here; see https://github.com/microsoft/STL/wiki/Benchmarking-the-STL#calculating-speedups. You probably mean a 1.05 to 1.1 times speedup.
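For readers unfamiliar with the "times speedup" phrasing, here is a minimal sketch, assuming the speedup is the time before the change divided by the time after it. The helper name `speedup` is hypothetical and is not part of the STL or its benchmarks.

```cpp
// Hypothetical helper: express a benchmark change as a ratio of run times.
// Assumption: speedup = time before the change / time after the change,
// so values above 1.0 mean the new code is faster.
constexpr double speedup(double time_before, double time_after) noexcept {
    return time_before / time_after;
}
```

With made-up timings of 110 ns before and 100 ns after, `speedup(110.0, 100.0)` is about 1.1, which would be reported as a "1.1 times speedup" rather than as a percentage.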
Updated. I based it on my previous commit message: #5367 (the actual git history doesn't have that anyway).
Mixed results for i5-1235U (Alder Lake), P cores and E cores.
…NT_INTRINSICS` defined.
Thanks! 😻 Results on my 5950X, comparing
I'm not worried about the 0.96; that looks like noise (I reran and it was slightly better), and the 1.41 speedups are very healthy indeed.
I'm mirroring this to the MSVC-internal repo - please notify me if any further changes are pushed.
No changes planned.
0️⃣ 1️⃣ 0️⃣
Provides a 1.05 to 1.1 times speedup on x64 targets.
x64 results
Fixes #5359
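The idea in the title can be sketched as follows. This is an illustration only, not the STL's actual implementation: the function names `single_bit_via_trick`, `single_bit_via_popcount`, `count_set_bits`, and `check_agreement` are hypothetical, and the `__cpp_lib_bitops` check is just a way to keep the sketch compilable before C++20, not the mechanism the PR uses to decide when popcount is always available.

```cpp
#include <bit>      // std::popcount when compiled as C++20 or later
#include <cassert>
#include <cstdint>

// Bit-trick form: clearing the lowest set bit of a value with exactly
// one bit set leaves zero; zero itself is excluded explicitly.
constexpr bool single_bit_via_trick(std::uint32_t x) noexcept {
    return x != 0 && (x & (x - 1)) == 0;
}

// Hypothetical helper: std::popcount where the library provides it,
// otherwise a portable loop that clears one set bit per iteration.
constexpr int count_set_bits(std::uint32_t x) noexcept {
#if defined(__cpp_lib_bitops)
    return std::popcount(x);
#else
    int n = 0;
    for (; x != 0; x &= x - 1) {
        ++n;
    }
    return n;
#endif
}

// Popcount form: a value has a single bit set exactly when its
// population count is 1.
constexpr bool single_bit_via_popcount(std::uint32_t x) noexcept {
    return count_set_bits(x) == 1;
}

static_assert(single_bit_via_popcount(64u), "64 has one bit set");
static_assert(!single_bit_via_popcount(0u), "0 has no bits set");
static_assert(!single_bit_via_popcount(6u), "6 has two bits set");

// Checks that both forms agree on small values and on every power of two.
inline void check_agreement() {
    for (std::uint32_t x = 0; x < (1u << 16); ++x) {
        assert(single_bit_via_trick(x) == single_bit_via_popcount(x));
    }
    for (int k = 0; k < 32; ++k) {
        const std::uint32_t x = std::uint32_t{1} << k;
        assert(single_bit_via_trick(x) && single_bit_via_popcount(x));
    }
}
```

Both forms return the same answer for every input; which one is faster depends on whether the target can be assumed to have a hardware popcount instruction, which is why the change is conditional on popcount always being available.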