Conversation

@omera-nv omera-nv commented Jul 8, 2025

Description

This PR adds several checks that ensure trtllm-bench does not hang indefinitely if an exception is thrown during inference:

  • asyncio tasks are checked for errors when done, and the manager is stopped if an error occurs (a minimal sketch of these checks follows this list).
  • The number of perf items is checked against the number of requests submitted.
  • CancelledErrors are caught for a cleaner traceback.
  • Added an assert in py_executor.py to check for sampling failures.
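
A minimal sketch of how the first three checks might fit together. The names (`submit_request`, `run_benchmark`, `perf_items`) are illustrative assumptions, not the actual trtllm-bench code, and the py_executor.py sampling assert is not shown:

```python
import asyncio

async def submit_request(request_id: int) -> dict:
    """Placeholder for one inference request; raises if inference fails."""
    await asyncio.sleep(0)
    return {"request_id": request_id, "latency_ms": 1.0}

async def run_benchmark(num_requests: int) -> list[dict]:
    tasks = [asyncio.create_task(submit_request(i)) for i in range(num_requests)]
    perf_items: list[dict] = []
    try:
        for finished in asyncio.as_completed(tasks):
            # Check each task for errors when it is done: awaiting it
            # re-raises any exception, so the benchmark fails fast
            # instead of hanging while waiting for missing results.
            perf_items.append(await finished)
    except asyncio.CancelledError:
        # Catch cancellation explicitly for a cleaner traceback.
        print("benchmark cancelled")
        raise
    except Exception as exc:
        # On any error, cancel the remaining work (stop the "manager"
        # in the real code) rather than waiting on it forever.
        for task in tasks:
            task.cancel()
        raise RuntimeError("inference failed, stopping benchmark") from exc
    # The number of perf items must match the number of requests submitted.
    assert len(perf_items) == num_requests, (
        f"expected {num_requests} perf items, got {len(perf_items)}")
    return perf_items

if __name__ == "__main__":
    print(asyncio.run(run_benchmark(4)))
```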

omera-nv added 3 commits July 8, 2025 13:40
Signed-off-by: Omer Ullman Argov <[email protected]>
Signed-off-by: Omer Ullman Argov <[email protected]>
Signed-off-by: Omer Ullman Argov <[email protected]>
@omera-nv omera-nv requested review from a team as code owners July 8, 2025 10:51
@omera-nv omera-nv requested a review from achartier July 8, 2025 10:51

omera-nv commented Jul 8, 2025

/bot run

@tensorrt-cicd

PR_Github #11287 [ run ] triggered by Bot

@tensorrt-cicd

PR_Github #11287 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #8347 completed with status: 'SUCCESS'

@omera-nv omera-nv merged commit d6d2ab2 into NVIDIA:main Jul 9, 2025
3 checks passed
zhou-yuxin pushed a commit to zhou-yuxin/TensorRT-LLM that referenced this pull request Jul 15, 2025