@jgphpc jgphpc commented Aug 26, 2025

UENV=prgenv-gnu/25.06:rc5 ./R -c cscs-reframe-tests.git/checks/system/slurm/slurm.py -J p=normal -r
  • You can manually trigger 1 (or more) CI pipelines by adding a cscs-ci run alps-daint-uenv comment in the Pull Request
  • By default all UENVs will be tested, however you can restrict to a single UENV with: cscs-ci run alps-daint-uenv;MY_UENV=cp2k/2025.1:v2

@jgphpc jgphpc marked this pull request as draft August 26, 2025 11:17
@jgphpc jgphpc self-assigned this Aug 26, 2025
@jgphpc jgphpc requested a review from gppezzi August 26, 2025 11:20
@gppezzi gppezzi requested a review from teojgo August 26, 2025 12:33

gppezzi commented Aug 26, 2025

@jgphpc the tests are passing on Daint, but are you addressing the gres issue reported on SD-66684?

It is unclear, as the old test (which you are deleting in this PR) still fails and the updated one (SlurmGPUGresTest) doesn't.

@jgphpc jgphpc requested a review from ekouts August 26, 2025 13:08

jgphpc commented Aug 26, 2025

@jgphpc the tests are passing on Daint, but are you addressing the gres issue reported on SD-66684?

It is unclear, as the old test (which you are deleting in this PR) still fails and the updated one (DefaultRequestGPUSetsGRES) doesn't.

  • I did not delete gres_gpu.py, I moved it into slurm.py (no reason to keep it separate).
  • This PR is not really addressing the issue in SD-66684; it's more about sending the right data to Elastic (I updated the tests in the file while I was reading it).

jgphpc commented Aug 26, 2025

gppezzi commented Aug 26, 2025

@jgphpc the tests are passing on Daint, but are you addressing the gres issue reported on SD-66684?
It is unclear, as the old test (which you are deleting in this PR) still fails and the updated one (DefaultRequestGPUSetsGRES) doesn't.

* I did not delete gres_gpu.py, I moved it into slurm.py (no reason to keep it separate).

* This PR is not really addressing the issue in SD-66684; it's more about sending the right data to Elastic (I updated the tests in the file while I was reading it).

OK, now I found SlurmGPUGresTest (I was looking at the wrong gres test), but my question remains: all tests in this PR now pass on Daint, while the old gres_gpu.py fails.

That is odd if this PR is not addressing SD-66684, but could it be that I am running it wrong?

// int chunk = 33554432; // 32M
// int chunk = 67108864; // 64M
// int chunk = 134217728; // 128M
int chunk = 134217728; // 128M

maybe leave a comment to document why you are changing this?
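
As a starting point for such a comment, the arithmetic behind the "32M/64M/128M" labels in the diff can be checked with a short sketch. The PR does not state what unit `chunk` counts (bytes or elements) or why 128M was chosen, so the snippet below only verifies that each literal equals its label times 2^20; the names `MIB`, `chunk_literals` and `matches_label` are hypothetical and not part of the PR.

```python
# Sketch only: the meaning and unit of `chunk` are not stated in the PR,
# so this just checks the arithmetic behind the "32M/64M/128M" labels.
MIB = 1024 * 1024  # 2**20

# Literals copied from the diff, keyed by the label in their trailing comment.
chunk_literals = {
    32: 33554432,
    64: 67108864,
    128: 134217728,
}


def matches_label(label_m: int, literal: int) -> bool:
    """Return True if `literal` equals label_m * 2**20."""
    return literal == label_m * MIB


if __name__ == "__main__":
    for label, literal in chunk_literals.items():
        print(f"{label}M -> {literal}: {matches_label(label, literal)}")
```

All three literals match their labels, and 128 * 2^20 = 2^27 stays below 2^31 - 1, so the value fits in an `int` on platforms where `int` is 32 bits wide.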

@jgphpc jgphpc changed the title from "Update slurm tests" to "Update slurm tests returned perf_values" Aug 28, 2025