Skip to content

Conversation

snickolls-arm
Copy link
Contributor

  • Improves random data coverage for these tests
  • Simplifies Vector pinning
  • Fixes some errors in some mask-related helpers when inputs have irregular mask data
  • Fixes a saturation error in a shift helper
  • Fixes overflow error in FusedAddRoundedHalving helper
  • Bias generator to produce zeroes for masks 50% of the time
  • Suppress generation of NaN values

* Improves random data coverage for these tests
* Simplifies Vector pinning
* Fixes some errors in some mask-related helpers when inputs have irregular mask data
* Fixes a saturation error in a shift helper
* Fixes overflow error in FusedAddRoundedHalving helper
* Bias generator to produce zeroes for masks 50% of the time
* Suppress generation of NaN values
@github-actions github-actions bot added the needs-area-label An area label is needed to ensure this gets routed to the appropriate area owners label Aug 14, 2025
@dotnet-policy-service dotnet-policy-service bot added the community-contribution Indicates that the PR has been added by a community member label Aug 14, 2025
@snickolls-arm
Copy link
Contributor Author

@dotnet/arm64-contrib

@snickolls-arm
Copy link
Contributor Author

It looks like there are more failures related to the increased coverage that I didn't see when I ran the tests, so I'll see if I can fix these.

@vcsjones vcsjones added area-System.Runtime.Intrinsics and removed needs-area-label An area label is needed to ensure this gets routed to the appropriate area owners labels Aug 15, 2025
Copy link
Contributor

Tagging subscribers to this area: @dotnet/area-system-runtime-intrinsics
See info in area-owners.md if you want to be subscribed.

@snickolls-arm
Copy link
Contributor Author

The errors on Sve2.(Min|Max)NumberPairwise look like a codegen bug:

z0 -> mask
z1 -> left
z2 -> right

0xffffabfe5c60:      ptrue   p0.s
0xffffabfe5c64:      cmpne   p0.s, p0/z, z0.s, #0
...
0xffffabfe5c6c:      movprfx z3.s, p0/z, z1.s
0xffffabfe5c70:      fminnmp z3.s, p0/m, z3.s, z2.s

The movprfx instruction is causing the left input to be masked before the operation happens. This isn't valid if the operation is a pairwise operation, because pairwise operations are operating on interleaved pairs of values e.g. min(left[0], left[1]), min(right[0], right[1]), min(left[2], left[3]), .... The mask applies to the result of the operation on each pair, so masking one of the inputs before the operation like this leads to a wrong result.

I think the fix would be to generate merging or unpredicated movprfx in this scenario. With a preference for the latter as it is supposed to have a performance advantage.

@a74nh FYI

@a74nh
Copy link
Contributor

a74nh commented Aug 15, 2025

I think the fix would be to generate merging or unpredicated movprfx in this scenario. With a preference for the latter as it is supposed to have a performance advantage.

Agreed with unpredicted movprfx too.

Ideally we should push this PR for NET10 too as it's a bug in the implementation.

@JulieLeeMSFT JulieLeeMSFT added this to the 10.0.0 milestone Aug 18, 2025
@jeffhandley
Copy link
Member

@JulieLeeMSFT I consulted with @tannergooding on this and we're inclined to move this to 11.0.0 and only accept into main without backporting to 10.0.0. The value here is improving tests and not addressing functional correctness in the product code. I'm changing the milestone to reflect that, but please chime in if you see value in backporting to 10. Thanks.

@jeffhandley jeffhandley modified the milestones: 10.0.0, 11.0.0 Sep 4, 2025
@a74nh
Copy link
Contributor

a74nh commented Sep 5, 2025

@JulieLeeMSFT I consulted with @tannergooding on this and we're inclined to move this to 11.0.0 and only accept into main without backporting to 10.0.0. The value here is improving tests and not addressing functional correctness in the product code. I'm changing the milestone to reflect that, but please chime in if you see value in backporting to 10. Thanks.

My only concern is there is a bug fix in here for SVE2 MaxNumberPairwise and MinNumberPairwise. Movprfx is being used in an unpredicatable way, so the API can return wrong results. It wasn't spotted in testing because the mask generation was limited.

How about putting the two .cpp Jit changes in a separate PR and pushing just that for .NET10 ? It's a small change.

@snickolls-arm
Copy link
Contributor Author

@JulieLeeMSFT I consulted with @tannergooding on this and we're inclined to move this to 11.0.0 and only accept into main without backporting to 10.0.0. The value here is improving tests and not addressing functional correctness in the product code. I'm changing the milestone to reflect that, but please chime in if you see value in backporting to 10. Thanks.

My only concern is there is a bug fix in here for SVE2 MaxNumberPairwise and MinNumberPairwise. Movprfx is being used in an unpredicatable way, so the API can return wrong results. It wasn't spotted in testing because the mask generation was limited.

How about putting the two .cpp Jit changes in a separate PR and pushing just that for .NET10 ? It's a small change.

I've added this PR here: #119390

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-System.Runtime.Intrinsics community-contribution Indicates that the PR has been added by a community member
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants