-
-
Notifications
You must be signed in to change notification settings - Fork 10.1k
[benchmark] Add triton version in the moe tuned config #24769
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Signed-off-by: Jee Jee Li <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request adds the Triton version to the tuned MoE configuration files. The changes involve updating the benchmark script to save the version and modifying the MoE layer to handle this new metadata when loading configurations. The version of the Triton placeholder is also updated.
My review identifies a potential robustness issue in how the configuration is loaded. The current approach of explicitly removing the triton_version
key is brittle. I've suggested a more resilient method that filters for numeric keys, which will prevent crashes if other metadata is added to the configuration files in the future.
@@ -720,7 +720,10 @@ def get_moe_configs( | |||
logger.info("Using configuration from %s for MoE layer.", | |||
config_file_path) | |||
# If a configuration has been found, return it | |||
return {int(key): val for key, val in json.load(f).items()} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The proposed change to explicitly pop triton_version
is brittle. If other metadata is added to the configuration file in the future, this code will fail with a ValueError
when trying to convert a non-integer string key to an integer.
A more robust approach is to filter for keys that are valid integers. By checking key.isdigit()
, you can automatically ignore any non-numeric keys, including triton_version
and any future metadata, making the code more resilient to future changes.
return {int(key): val for key, val in json.load(f).items()} | |
return {int(key): val for key, val in json.load(f).items() if key.isdigit()} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There should also be similar tuning done for fp8 gemm? I'm not super sure though but I recall the config is different. cc @mgoin
Purpose
See: #24112 (comment)
The tuned configuration will include the triton_version, like:
Test Plan
Test Result
Essential Elements of an Effective PR Description Checklist
supported_models.md
andexamples
for a new model.