
AlexGuteniev
Contributor

The approach is similar to the find vectorization, except for the shifted load, which is similar to unique in #5092. Nothing novel 🥱

This time I haven't tried to avoid the double load. I trust past experience that it is faster than blending with the previous value, or using less than the whole vector as a step.
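For readers unfamiliar with the "double load" idea, here is a minimal sketch for byte elements, assuming SSE2. It is not the code from this PR: the function name `adjacent_find_bytes` and the macro `SKETCH_HAS_SSE2` are hypothetical, and the PR's implementation may be structured differently (wider vectors, other element sizes). Each step loads the same region twice, once at offset 0 and once shifted by one element, and compares the two vectors.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

#if defined(__SSE2__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 2)
#include <emmintrin.h> // SSE2 intrinsics
#define SKETCH_HAS_SSE2 1 // hypothetical macro, for this sketch only
#else
#define SKETCH_HAS_SSE2 0
#endif

// Hypothetical helper: returns the index of the first i such that
// data[i] == data[i + 1], or size if there is no such pair.
std::size_t adjacent_find_bytes(const std::uint8_t* data, std::size_t size) {
    assert(data != nullptr || size == 0);
    std::size_t i = 0;
#if SKETCH_HAS_SSE2
    // One step compares elements [i, i + 16) against [i + 1, i + 17),
    // so at least 17 elements must remain.
    while (size - i >= 17) {
        // The "double load": two unaligned loads, the second shifted by one element.
        const __m128i cur  = _mm_loadu_si128(reinterpret_cast<const __m128i*>(data + i));
        const __m128i next = _mm_loadu_si128(reinterpret_cast<const __m128i*>(data + i + 1));
        const unsigned mask =
            static_cast<unsigned>(_mm_movemask_epi8(_mm_cmpeq_epi8(cur, next)));
        if (mask != 0) {
            // The lowest set bit marks the first equal adjacent pair in this step.
            unsigned bit = 0;
            while ((mask & (1u << bit)) == 0) {
                ++bit;
            }
            return i + bit;
        }
        i += 16; // advance by a whole vector
    }
#endif
    // Scalar tail (and the whole range when SSE2 is unavailable).
    for (; i + 1 < size; ++i) {
        if (data[i] == data[i + 1]) {
            return i;
        }
    }
    return size;
}
```

The alternative mentioned above would keep the previous vector and build the shifted one by blending, which saves a load but adds shuffle work per step.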

I calculated Speedup as Before divided by After, using spreadsheet software.

| Benchmark | Before | After | Speedup |
|---|---|---|---|
| bm<AlgType::Std, char>/2525/1142 | 317 ns | 17.2 ns | 18.43 |
| bm<AlgType::Std, short>/2525/1142 | 295 ns | 49.6 ns | 5.95 |
| bm<AlgType::Std, int>/2525/1142 | 285 ns | 88.2 ns | 3.23 |
| bm<AlgType::Std, long long>/2525/1142 | 284 ns | 161 ns | 1.76 |
| bm<AlgType::Rng, char>/2525/1142 | 282 ns | 20.4 ns | 13.82 |
| bm<AlgType::Rng, short>/2525/1142 | 283 ns | 47.1 ns | 6.01 |
| bm<AlgType::Rng, int>/2525/1142 | 280 ns | 82.3 ns | 3.40 |
| bm<AlgType::Rng, long long>/2525/1142 | 289 ns | 142 ns | 2.04 |
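The Speedup column can be reproduced with a one-line helper. This is only an illustration of the arithmetic; the function name `speedup` is hypothetical and the inputs are the nanosecond timings from the table.

```cpp
// Hypothetical helper: speedup is the "Before" time divided by the "After" time.
// Both arguments must use the same unit (nanoseconds in the table above).
double speedup(double before_ns, double after_ns) {
    return before_ns / after_ns;
}
```

For example, `speedup(317.0, 17.2)` gives roughly 18.43, matching the first row.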

@AlexGuteniev AlexGuteniev requested a review from a team as a code owner March 11, 2025 19:42
@github-project-automation github-project-automation bot moved this to Initial Review in STL Code Reviews Mar 11, 2025
@StephanTLavavej StephanTLavavej added the performance Must go faster label Mar 11, 2025
@StephanTLavavej StephanTLavavej self-assigned this Mar 11, 2025
@StephanTLavavej
Member

Thanks! 😸 I pushed minor changes and one performance fix to activate the optimization properly in C++14/17 mode.

@StephanTLavavej StephanTLavavej removed their assignment Mar 17, 2025
@StephanTLavavej StephanTLavavej moved this from Initial Review to Ready To Merge in STL Code Reviews Mar 17, 2025
@StephanTLavavej StephanTLavavej moved this from Ready To Merge to Merging in STL Code Reviews Mar 21, 2025
@StephanTLavavej StephanTLavavej self-assigned this Mar 21, 2025
@StephanTLavavej
Member

I'm mirroring this to the MSVC-internal repo - please notify me if any further changes are pushed.

StephanTLavavej added a commit to StephanTLavavej/STL that referenced this pull request Mar 21, 2025
@StephanTLavavej
Member

I had to push an additional commit because we forgot the _M_ARM64EC guards again.
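As background on why such guards matter: MSVC defines `_M_X64` when targeting ARM64EC as well as x64, so a check for `_M_X64` alone would also select x64-specific code on ARM64EC. A rough sketch of this kind of guard follows. The macro `SKETCH_USE_X86_VECTOR_PATH` and the function are hypothetical and are not the STL's actual names or conditions.

```cpp
// Hypothetical guard, for illustration only; the STL's real condition may
// include additional checks.
#if (defined(_M_IX86) || defined(_M_X64)) && !defined(_M_ARM64EC)
#define SKETCH_USE_X86_VECTOR_PATH 1 // x86/x64 target: vectorized path allowed
#else
#define SKETCH_USE_X86_VECTOR_PATH 0 // ARM64EC or any other target: scalar path
#endif

// Reports which path this translation unit was compiled to use.
bool uses_x86_vector_path() noexcept {
    return SKETCH_USE_X86_VECTOR_PATH != 0;
}
```

Without the `!defined(_M_ARM64EC)` part, an ARM64EC build would take the first branch.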

@StephanTLavavej StephanTLavavej merged commit 5afb032 into microsoft:main Mar 24, 2025
39 checks passed
@github-project-automation github-project-automation bot moved this from Merging to Done in STL Code Reviews Mar 24, 2025
@StephanTLavavej
Member

Thanks for another vectorization PR! 😻 💚 🔍

@AlexGuteniev AlexGuteniev deleted the adjacent_find branch March 25, 2025 05:24