@MihaZupan MihaZupan commented Jul 28, 2025

ExtractNarrowingSaturate and UnzipEven are very similar in this case, but UnzipEven is a bit cheaper (better searching throughput).

UnzipEven may lead to more false positive matches of the anchor characters (some non-ASCII chars will look like matching ASCII ones), but we'll rule those out in the verification step if we happen to hit that on all 3 chars. Since we're only using the Teddy approach if all anchor chars are ASCII, this seems like a good tradeoff.
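
To illustrate the false-positive case described above, here is a minimal scalar sketch. It is not the vectorized code from this PR: `NarrowSaturate` and `NarrowTruncate` are hypothetical helpers that model, under the assumption of little-endian UTF-16 data, what a saturating narrow and an even-byte unzip do to a single char.

```csharp
using System;

public static class NarrowingSketch
{
    // Scalar model of a saturating narrow (assumed to mirror ExtractNarrowingSaturate):
    // any UTF-16 code unit above 0xFF clamps to 0xFF, so it can never equal an ASCII byte.
    public static byte NarrowSaturate(char c) => (byte)Math.Min((int)c, 0xFF);

    // Scalar model of keeping only the low byte of each char (assumed to mirror UnzipEven
    // on little-endian data): the high byte is dropped, so U+0153 becomes 0x53 ('S').
    public static byte NarrowTruncate(char c) => unchecked((byte)c);

    public static void Main()
    {
        // 'S' is a real anchor char; U+0153 and U+0148 share low bytes with 'S' and 'H'.
        foreach (char c in new[] { 'S', '\u0153', '\u0148' })
        {
            Console.WriteLine(
                $"U+{(int)c:X4}: saturate=0x{NarrowSaturate(c):X2}, truncate=0x{NarrowTruncate(c):X2}");
        }
    }
}
```

With truncation, U+0153 and U+0148 become indistinguishable from `S` and `H` after packing, which is the kind of candidate the verification step has to reject.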

From Cobalt 100

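| Method          | Toolchain | Mean     | Error    | Ratio |
|-----------------|-----------|---------:|---------:|------:|
| SV_Throughput   | Main      | 24.03 μs | 0.005 μs | 1.00  |
| SV_Throughput   | PR        | 20.98 μs | 0.002 μs | 0.87  |
| SV_ThroughputIC | Main      | 26.05 μs | 0.003 μs | 1.00  |
| SV_ThroughputIC | PR        | 23.54 μs | 0.014 μs | 0.90  |
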
```c#
public class SearchValuesStringTests_Teddy
{
    private static readonly SearchValues<string> s_values = SearchValues.Create(["Sherlock", "Holmes"], StringComparison.Ordinal);
    private static readonly SearchValues<string> s_valuesIC = SearchValues.Create(["Sherlock", "Holmes"], StringComparison.OrdinalIgnoreCase);
    private static readonly string s_text_noMatches = new('a', 100_000);

    [Benchmark] public bool SV_Throughput() => s_text_noMatches.AsSpan().ContainsAny(s_values);
    [Benchmark] public bool SV_ThroughputIC() => s_text_noMatches.AsSpan().ContainsAny(s_valuesIC);
}
```
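
The benchmark above only measures throughput on text with no candidates at all. As a complement, the following sketch checks the observable behavior on a lookalike input. `FalsePositiveCheck`, `Pad`, and `Contains` are hypothetical names used for illustration, and whether the Teddy path is actually taken depends on the hardware and input length, so this only shows that the result must stay correct either way.

```csharp
using System;
using System.Buffers;

public static class FalsePositiveCheck
{
    private static readonly SearchValues<string> s_values =
        SearchValues.Create(["Sherlock", "Holmes"], StringComparison.Ordinal);

    // Surround the interesting part with filler so the input is long enough
    // for a vectorized search to be plausible.
    public static string Pad(string middle) =>
        new string('a', 1_000) + middle + new string('a', 1_000);

    public static bool Contains(string text) => text.AsSpan().ContainsAny(s_values);

    public static void Main()
    {
        // U+0153 has the same low byte as 'S', but "\u0153herlock" is not "Sherlock".
        Console.WriteLine(Contains(Pad("\u0153herlock"))); // False
        Console.WriteLine(Contains(Pad("Sherlock")));      // True
    }
}
```

If a packed comparison flags the lookalike as a candidate, verification against the original chars still has to reject it, so the first call returns `False`.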

@MihaZupan MihaZupan added this to the 10.0.0 milestone Jul 28, 2025
@MihaZupan MihaZupan requested a review from stephentoub July 28, 2025 01:33
@MihaZupan MihaZupan self-assigned this Jul 28, 2025
@Copilot Copilot AI review requested due to automatic review settings July 28, 2025 01:33

@Copilot Copilot AI left a comment

Pull Request Overview

This PR optimizes the SearchValues Teddy algorithm performance on Arm64 by replacing ExtractNarrowingSaturate operations with UnzipEven for character packing. The change aims to improve throughput by using a more efficient instruction while maintaining correctness through verification steps.

  • Switches from ExtractNarrowingSaturate to UnzipEven for better performance on Arm64
  • Updates the platform requirement from general AdvSimd to AdvSimd.Arm64 specifically
  • Accepts potential false positives in exchange for better throughput, relying on verification to filter them out

Tagging subscribers to this area: @dotnet/area-system-memory
See info in area-owners.md if you want to be subscribed.

@MihaZupan MihaZupan merged commit 12a954f into dotnet:main Jul 30, 2025
144 of 146 checks passed
@github-actions github-actions bot locked and limited conversation to collaborators Aug 30, 2025