[CI] Speed up model unit tests in CI #24253
Merged, +123 −21, 44 commits
f65ff24 test-pipeline (afeldman-nm)
d5fa076 Merge branch 'main' into speed_model_ci (afeldman-nm)
35a331b basic tests (afeldman-nm)
d630671 common and embedding markers (afeldman-nm)
7d31d30 parallelize basic tests (afeldman-nm)
6c8c6e9 adjust (afeldman-nm)
3709f82 wip (afeldman-nm)
0a12b97 Merge branch 'main' into speed_model_ci (afeldman-nm)
183f97d revise (afeldman-nm)
bba582f burn changes (afeldman-nm)
b2a3d02 Merge branch 'main' into speed_model_ci (afeldman-nm)
36a4875 Break out different initialization tests (afeldman-nm)
e32e371 shard hybrid models (afeldman-nm)
1d883af Standard language models test (afeldman-nm)
774b9e4 small fix (afeldman-nm)
4941627 Merge branch 'main' into speed_model_ci (afeldman-nm)
1fd0c68 wip (afeldman-nm)
0093c1f typo (afeldman-nm)
62e8055 Merge branch 'main' into speed_model_ci (afeldman-nm)
0892a96 test (afeldman-nm)
a5ecbcf test (afeldman-nm)
fa64ce0 test (afeldman-nm)
1ac85fd test (afeldman-nm)
4f4f60c wip (afeldman-nm)
5fc5910 wip (afeldman-nm)
39df6a1 wip (afeldman-nm)
0b9c6e7 wip (afeldman-nm)
8f58547 different approach to sharding (afeldman-nm)
d3713c0 merge (afeldman-nm)
e70c8cd percent N (afeldman-nm)
682d675 fix (afeldman-nm)
ee2b0dc more sharding (afeldman-nm)
4a26031 Merge branch 'main' into speed_model_ci (afeldman-nm)
c288ab3 isolate test; lower parallelism (afeldman-nm)
6ee7661 Merge branch 'main' into speed_model_ci (afeldman-nm)
ff660fc Merge branch 'main' into speed_model_ci (afeldman-nm)
e66b7e6 lower shard factor and reorganize (afeldman-nm)
28239d4 explanatory comments (afeldman-nm)
009458f wip (afeldman-nm)
63015ab Merge branch 'main' into speed_model_ci (afeldman-nm)
aef7215 test names (afeldman-nm)
09e84a4 Merge branch 'main' into speed_model_ci (afeldman-nm)
bb15907 Merge branch 'main' into speed_model_ci (afeldman-nm)
a806634 Merge branch 'main' into speed_model_ci (afeldman-nm)
@@ -571,44 +571,98 @@ steps:
 
 ##### models test #####
 
-- label: Basic Models Test # 57min
-  timeout_in_minutes: 75
+- label: Basic Models Tests (Initialization)
+  timeout_in_minutes: 45
   mirror_hardwares: [amdexperimental]
   torch_nightly: true
   source_file_dependencies:
   - vllm/
-  - tests/models
+  - tests/models/test_initialization.py
   commands:
-    - pytest -v -s models/test_transformers.py
-    - pytest -v -s models/test_registry.py
-    - pytest -v -s models/test_utils.py
-    - pytest -v -s models/test_vision.py
-    - pytest -v -s models/test_initialization.py
+    # Run a subset of model initialization tests
+    - pytest -v -s models/test_initialization.py::test_can_initialize_small_subset
 
-- label: Language Models Test (Standard) # 35min
+- label: Basic Models Tests (Extra Initialization) %N
   timeout_in_minutes: 45
   mirror_hardwares: [amdexperimental]
   torch_nightly: true
   source_file_dependencies:
+  - vllm/model_executor/models/
+  - tests/models/test_initialization.py
+  commands:
+    # Only when vLLM model source is modified - test initialization of a large
+    # subset of supported models (the complement of the small subset in the above
+    # test.) Also run if model initialization test file is modified
+    - pytest -v -s models/test_initialization.py \
+      -k 'not test_can_initialize_small_subset' \
+      --num-shards=$$BUILDKITE_PARALLEL_JOB_COUNT \
+      --shard-id=$$BUILDKITE_PARALLEL_JOB
+  parallelism: 2
+
+- label: Basic Models Tests (Other)
+  timeout_in_minutes: 45
+  mirror_hardwares: [amdexperimental]
+  torch_nightly: true
+  source_file_dependencies:
   - vllm/
+  - tests/models/test_transformers.py
+  - tests/models/test_registry.py
+  - tests/models/test_utils.py
+  - tests/models/test_vision.py
+  commands:
+    - pytest -v -s models/test_transformers.py \
+      models/test_registry.py \
+      models/test_utils.py \
+      models/test_vision.py
+
+- label: Language Models Tests (Standard)
+  timeout_in_minutes: 25
+  mirror_hardwares: [amdexperimental]
+  torch_nightly: true
+  source_file_dependencies:
+  - vllm/
   - tests/models/language
   commands:
+    # Test standard language models, excluding a subset of slow tests
     - pip freeze | grep -E 'torch'
-    - pytest -v -s models/language -m core_model
+    - pytest -v -s models/language -m 'core_model and (not slow_test)'
 
-- label: Language Models Test (Hybrid) # 35 min
+- label: Language Models Tests (Extra Standard) %N
   timeout_in_minutes: 45
   mirror_hardwares: [amdexperimental]
   torch_nightly: true
   source_file_dependencies:
+  - vllm/model_executor/models/
+  - tests/models/language/pooling/test_embedding.py
+  - tests/models/language/generation/test_common.py
+  - tests/models/language/pooling/test_classification.py
+  commands:
+    # Shard slow subset of standard language models tests. Only run when model
+    # source is modified, or when specified test files are modified
+    - pip freeze | grep -E 'torch'
+    - pytest -v -s models/language -m 'core_model and slow_test' \
+      --num-shards=$$BUILDKITE_PARALLEL_JOB_COUNT \
+      --shard-id=$$BUILDKITE_PARALLEL_JOB
+  parallelism: 2
+
+- label: Language Models Tests (Hybrid) %N
+  timeout_in_minutes: 75
+  mirror_hardwares: [amdexperimental]
+  torch_nightly: true
+  source_file_dependencies:
   - vllm/
   - tests/models/language/generation
   commands:
     # Install fast path packages for testing against transformers
     # Note: also needed to run plamo2 model in vLLM
     - uv pip install --system --no-build-isolation 'git+https://github.com/state-spaces/[email protected]'
     - uv pip install --system --no-build-isolation 'git+https://github.com/Dao-AILab/[email protected]'
-    - pytest -v -s models/language/generation -m hybrid_model
+    # Shard hybrid language model tests
+    - pytest -v -s models/language/generation \
+      -m hybrid_model \
+      --num-shards=$$BUILDKITE_PARALLEL_JOB_COUNT \
+      --shard-id=$$BUILDKITE_PARALLEL_JOB
+  parallelism: 2
 
 - label: Language Models Test (Extended Generation) # 80min
   timeout_in_minutes: 110
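The sharded steps in this diff pass `--num-shards` and `--shard-id` to pytest so that each of the `parallelism: 2` jobs runs a disjoint slice of one test collection. A minimal sketch of that idea follows, assuming test IDs are assigned to shards by hashing. The helper names `shard_of` and `select_shard` and the example test IDs are hypothetical, and the sketch makes no claim about how the sharding plugin used in CI is actually implemented.

```python
import hashlib


def shard_of(test_id: str, num_shards: int) -> int:
    """Map a test ID to a shard index in [0, num_shards), deterministically."""
    digest = hashlib.sha256(test_id.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % num_shards


def select_shard(test_ids, num_shards: int, shard_id: int):
    """Return the tests that one parallel job would run (hypothetical helper)."""
    if num_shards < 1:
        raise ValueError("num_shards must be at least 1")
    if not 0 <= shard_id < num_shards:
        raise ValueError("shard_id must be in [0, num_shards)")
    # Order is preserved; each test lands in exactly one shard.
    return [t for t in test_ids if shard_of(t, num_shards) == shard_id]


if __name__ == "__main__":
    # Hypothetical test IDs standing in for a collected pytest session.
    collected = [f"models/test_example.py::test_case[{i}]" for i in range(10)]
    for job in range(2):  # mirrors `parallelism: 2`
        print(job, select_shard(collected, num_shards=2, shard_id=job))
```

Because the assignment depends only on the test ID and the shard count, every job can compute its own slice independently, and the slices together cover the whole collection. In Buildkite, `BUILDKITE_PARALLEL_JOB` is the zero-based index of the job and `BUILDKITE_PARALLEL_JOB_COUNT` is the total number of parallel jobs, which is why they map directly onto `--shard-id` and `--num-shards`.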