🐛 Describe the bug
It is not possible to spin up LoRA adapters with the latest Mistral models.
pip list | grep vllm
vllm 0.10.1.1
vllm serve mistralai/Mistral-Small-3.2-24B-Instruct-2506 \
--tokenizer_mode mistral \
--config_format mistral \
--load_format mistral \
--tool-call-parser mistral \
--enable-auto-tool-choice \
--limit-mm-per-prompt '{"image":10}' \
--tensor-parallel-size 2 \
--max-model-len 2048 \
--lora-modules 'myft=/home/tim/lora_adapter'
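For context, the end goal is just to call the adapter by its served name once the server comes up, roughly like this (a sketch, assuming the default port 8000 and no API key; `myft` is the adapter name registered via `--lora-modules` above and the prompt is a placeholder):

```python
from openai import OpenAI

# Point the OpenAI client at the local vLLM OpenAI-compatible server.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

resp = client.chat.completions.create(
    model="myft",  # served LoRA adapter name from --lora-modules
    messages=[{"role": "user", "content": "Hello"}],
)
print(resp.choices[0].message.content)
```

The server never gets that far, though; it crashes while registering the static LoRA module at startup, as shown below.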
(VllmWorker TP0 pid=1402898) ERROR 09-11 17:45:52 [multiproc_executor.py:596] WorkerProc hit an exception.
(VllmWorker TP0 pid=1402898) ERROR 09-11 17:45:52 [multiproc_executor.py:596] Traceback (most recent call last):
(VllmWorker TP0 pid=1402898) ERROR 09-11 17:45:52 [multiproc_executor.py:596] File "/home/tim/anaconda3/envs/torch/lib/python3.10/site-packages/vllm/v1/executor/multiproc_executor.py", line 591, in worker_busy_loop
(VllmWorker TP0 pid=1402898) ERROR 09-11 17:45:52 [multiproc_executor.py:596] output = func(*args, **kwargs)
(VllmWorker TP0 pid=1402898) ERROR 09-11 17:45:52 [multiproc_executor.py:596] File "/home/tim/anaconda3/envs/torch/lib/python3.10/site-packages/vllm/v1/worker/gpu_worker.py", line 403, in add_lora
(VllmWorker TP0 pid=1402898) ERROR 09-11 17:45:52 [multiproc_executor.py:596] return self.model_runner.add_lora(lora_request)
(VllmWorker TP0 pid=1402898) ERROR 09-11 17:45:52 [multiproc_executor.py:596] File "/home/tim/anaconda3/envs/torch/lib/python3.10/site-packages/vllm/v1/worker/lora_model_runner_mixin.py", line 160, in add_lora
(VllmWorker TP0 pid=1402898) ERROR 09-11 17:45:52 [multiproc_executor.py:596] if not self.lora_manager:
(VllmWorker TP0 pid=1402898) ERROR 09-11 17:45:52 [multiproc_executor.py:596] AttributeError: 'GPUModelRunner' object has no attribute 'lora_manager'
(VllmWorker TP1 pid=1402899) ERROR 09-11 17:45:52 [multiproc_executor.py:596] WorkerProc hit an exception.
(VllmWorker TP1 pid=1402899) ERROR 09-11 17:45:52 [multiproc_executor.py:596] Traceback (most recent call last):
(VllmWorker TP1 pid=1402899) ERROR 09-11 17:45:52 [multiproc_executor.py:596] File "/home/tim/anaconda3/envs/torch/lib/python3.10/site-packages/vllm/v1/executor/multiproc_executor.py", line 591, in worker_busy_loop
(VllmWorker TP1 pid=1402899) ERROR 09-11 17:45:52 [multiproc_executor.py:596] output = func(*args, **kwargs)
(VllmWorker TP1 pid=1402899) ERROR 09-11 17:45:52 [multiproc_executor.py:596] File "/home/tim/anaconda3/envs/torch/lib/python3.10/site-packages/vllm/v1/worker/gpu_worker.py", line 403, in add_lora
(VllmWorker TP1 pid=1402899) ERROR 09-11 17:45:52 [multiproc_executor.py:596] return self.model_runner.add_lora(lora_request)
(VllmWorker TP1 pid=1402899) ERROR 09-11 17:45:52 [multiproc_executor.py:596] File "/home/tim/anaconda3/envs/torch/lib/python3.10/site-packages/vllm/v1/worker/lora_model_runner_mixin.py", line 160, in add_lora
(VllmWorker TP1 pid=1402899) ERROR 09-11 17:45:52 [multiproc_executor.py:596] if not self.lora_manager:
(VllmWorker TP1 pid=1402899) ERROR 09-11 17:45:52 [multiproc_executor.py:596] AttributeError: 'GPUModelRunner' object has no attribute 'lora_manager'
(EngineCore_0 pid=1402751) ERROR 09-11 17:45:52 [core.py:769] Invocation of add_lora method failed
(EngineCore_0 pid=1402751) ERROR 09-11 17:45:52 [core.py:769] Traceback (most recent call last):
(EngineCore_0 pid=1402751) ERROR 09-11 17:45:52 [core.py:769] File "/home/tim/anaconda3/envs/torch/lib/python3.10/site-packages/vllm/v1/engine/core.py", line 766, in _handle_client_request
(EngineCore_0 pid=1402751) ERROR 09-11 17:45:52 [core.py:769] result = method(*self._convert_msgspec_args(method, args))
(EngineCore_0 pid=1402751) ERROR 09-11 17:45:52 [core.py:769] File "/home/tim/anaconda3/envs/torch/lib/python3.10/site-packages/vllm/v1/engine/core.py", line 383, in add_lora
(EngineCore_0 pid=1402751) ERROR 09-11 17:45:52 [core.py:769] return self.model_executor.add_lora(lora_request)
(EngineCore_0 pid=1402751) ERROR 09-11 17:45:52 [core.py:769] File "/home/tim/anaconda3/envs/torch/lib/python3.10/site-packages/vllm/executor/executor_base.py", line 157, in add_lora
(EngineCore_0 pid=1402751) ERROR 09-11 17:45:52 [core.py:769] return all(self.collective_rpc("add_lora", args=(lora_request, )))
(EngineCore_0 pid=1402751) ERROR 09-11 17:45:52 [core.py:769] File "/home/tim/anaconda3/envs/torch/lib/python3.10/site-packages/vllm/v1/executor/multiproc_executor.py", line 243, in collective_rpc
(EngineCore_0 pid=1402751) ERROR 09-11 17:45:52 [core.py:769] result = get_response(w, dequeue_timeout)
(EngineCore_0 pid=1402751) ERROR 09-11 17:45:52 [core.py:769] File "/home/tim/anaconda3/envs/torch/lib/python3.10/site-packages/vllm/v1/executor/multiproc_executor.py", line 230, in get_response
(EngineCore_0 pid=1402751) ERROR 09-11 17:45:52 [core.py:769] raise RuntimeError(
(EngineCore_0 pid=1402751) ERROR 09-11 17:45:52 [core.py:769] RuntimeError: Worker failed with error ''GPUModelRunner' object has no attribute 'lora_manager'', please check the stack trace above for the root cause
(VllmWorker TP1 pid=1402899) INFO 09-11 17:45:52 [multiproc_executor.py:520] Parent process exited, terminating worker
(VllmWorker TP0 pid=1402898) INFO 09-11 17:45:52 [multiproc_executor.py:520] Parent process exited, terminating worker
(APIServer pid=1402460) Traceback (most recent call last):
(APIServer pid=1402460) File "/home/tim/anaconda3/envs/torch/bin/vllm", line 7, in <module>
(APIServer pid=1402460) sys.exit(main())
(APIServer pid=1402460) File "/home/tim/anaconda3/envs/torch/lib/python3.10/site-packages/vllm/entrypoints/cli/main.py", line 54, in main
(APIServer pid=1402460) args.dispatch_function(args)
(APIServer pid=1402460) File "/home/tim/anaconda3/envs/torch/lib/python3.10/site-packages/vllm/entrypoints/cli/serve.py", line 50, in cmd
(APIServer pid=1402460) uvloop.run(run_server(args))
(APIServer pid=1402460) File "/home/tim/anaconda3/envs/torch/lib/python3.10/site-packages/uvloop/__init__.py", line 82, in run
(APIServer pid=1402460) return loop.run_until_complete(wrapper())
(APIServer pid=1402460) File "uvloop/loop.pyx", line 1518, in uvloop.loop.Loop.run_until_complete
(APIServer pid=1402460) File "/home/tim/anaconda3/envs/torch/lib/python3.10/site-packages/uvloop/__init__.py", line 61, in wrapper
(APIServer pid=1402460) return await main
(APIServer pid=1402460) File "/home/tim/anaconda3/envs/torch/lib/python3.10/site-packages/vllm/entrypoints/openai/api_server.py", line 1850, in run_server
(APIServer pid=1402460) await run_server_worker(listen_address, sock, args, **uvicorn_kwargs)
(APIServer pid=1402460) File "/home/tim/anaconda3/envs/torch/lib/python3.10/site-packages/vllm/entrypoints/openai/api_server.py", line 1878, in run_server_worker
(APIServer pid=1402460) await init_app_state(engine_client, vllm_config, app.state, args)
(APIServer pid=1402460) File "/home/tim/anaconda3/envs/torch/lib/python3.10/site-packages/vllm/entrypoints/openai/api_server.py", line 1668, in init_app_state
(APIServer pid=1402460) await state.openai_serving_models.init_static_loras()
(APIServer pid=1402460) File "/home/tim/anaconda3/envs/torch/lib/python3.10/site-packages/vllm/entrypoints/openai/serving_models.py", line 85, in init_static_loras
(APIServer pid=1402460) raise ValueError(load_result.error.message)
(APIServer pid=1402460) ValueError: Call to add_lora method failed: Worker failed with error ''GPUModelRunner' object has no attribute 'lora_manager'', please check the stack trace above for the root cause
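For what it's worth, the AttributeError suggests that nothing ever assigned `lora_manager` on the GPU model runner for this model, so the check inside `add_lora` trips over a missing attribute instead of reporting that LoRA is not enabled. A toy illustration of that failure mode in plain Python (not vLLM internals, just the pattern):

```python
# Unguarded check: touching an attribute that was never assigned raises
# AttributeError, which is the error surfaced in the worker traceback above.
class Runner:
    def add_lora(self, lora_request):
        if not self.lora_manager:  # AttributeError if lora_manager was never set
            raise RuntimeError("LoRA is not enabled.")

# Guarded check: getattr degrades gracefully and reports the intended message.
class GuardedRunner:
    def add_lora(self, lora_request):
        if getattr(self, "lora_manager", None) is None:
            raise RuntimeError("LoRA is not enabled.")

try:
    Runner().add_lora(None)
except AttributeError as e:
    print("unguarded check:", e)   # same error class as in the log above

try:
    GuardedRunner().add_lora(None)
except RuntimeError as e:
    print("guarded check:", e)     # clear "LoRA is not enabled" message
```

So any code path that skips LoRA initialization for this model but still receives the startup `add_lora` call would hit exactly this error.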
Before submitting a new issue...
- Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the documentation page, which can answer lots of frequently asked questions.