Vectorize unique
#5092
Conversation
Less error prone, especially if implementing _copy someday
The speedups are less significant on my 5950X, but good across the board with no regressions:
I'm mirroring this to the MSVC-internal repo - please notify me if any further changes are pushed.
I've discovered that something is missing. Unlike …, the vectorized path performs extra writes. For performance, this is clearly a missed opportunity, though the vectorization improvement should outweigh the negative effect of the extra writes. For correctness, I'm not sure. [algorithms.requirements]/3 says:
However, as the extra writes store equal integer values, they are not observable by concurrent reads, even if the container violates alignment requirements and the writes are not atomic. The only case where the extra writes could be observable is running this algorithm on read-only data without adjacent duplicates, but that is a very silly use case. It is easily fixable with …
Thanks for the one-of-a-kind PR! 😹 🚀 🎉
Not really unique, modelled on #4987
⏬ Double load

To compare adjacent values, the same memory is loaded twice, with a one-element shift. It is possible to reuse the previous vector instead, mixing it with the current one, to save one load, at the cost of some extra mixing instructions and a loop-carried dependency. On the SSE path this is possible with `_mm_alignr_epi8` (except for 8-bit elements). For AVX it would be far more complex due to AVX lanes. Benchmarking shows that the double load is faster than any reuse attempt. To some extent this result overlaps with #4958.
⏱️ Benchmark results