Pull requests: vllm-project/vllm

Pull requests list

simplify duplicative device check for cuda
#24793 opened Sep 13, 2025 by ziliangpeng
[Docs] Fix warnings in mkdocs build (continued)
Labels: deepseek, llama, qwen
#24791 opened Sep 13, 2025 by Zerohertz
[Chore] Remove unused batched RoPE op & kernel
Labels: ready
#24789 opened Sep 13, 2025 by WoosukKwon
[gpt-oss] streaming add item id, content id
Labels: frontend, gpt-oss
#24788 opened Sep 12, 2025 by qandrew
5 tasks
D82005826: [vllm][gptoss] pass toolcall turn to kv cache mgr
Labels: frontend, gpt-oss, v1
#24787 opened Sep 12, 2025 by kingsmad
1 of 5 tasks
[KVConnector][P/D] Add kv connector metrics
Labels: v1
#24786 opened Sep 12, 2025 by lacora
3 of 5 tasks
[Doc]: Remove 404 hyperlinks
Labels: documentation
#24785 opened Sep 12, 2025 by rozeappletree
[Perf] Fix DeepGEMM Contiguous Layout Issue, 5.5% Throughput Improvement
Labels: deepseek, performance, ready
#24783 opened Sep 12, 2025 by yewentao256
[Bugfix] Dtype error with sequence classification model and lora.
#24775 opened Sep 12, 2025 by pb-sameerreddy
1 of 5 tasks
[Bug] Fix is_flashmla_supported Check Error
Labels: v1
#24774 opened Sep 12, 2025 by yewentao256
[Core][Multimodal] Cache supports_kw
#24773 opened Sep 12, 2025 by lgeiger
[CI][Spec Decode] Adjust threshold for flaky ngram spec decoding test again
Labels: ready, v1
#24771 opened Sep 12, 2025 by wwl2755
[benchmark] Add triton version in the moe tuned config
Labels: performance
#24769 opened Sep 12, 2025 by jeejeelee
5 tasks
[Bugfix] Update import path for bc_linter_include
Labels: v1
#24766 opened Sep 12, 2025 by mmangkad
5 tasks
[Misc] Correct an outdated comment.
Labels: v1
#24765 opened Sep 12, 2025 by russellb
[Bugfix] canot clean s3 model tmp cache dir
#24764 opened Sep 12, 2025 by lengrongfu
5 tasks
[Bugfix] Fix GPUModelRunner has no attribute lora_manager
Labels: ready, v1
#24762 opened Sep 12, 2025 by jeejeelee
5 tasks
[UX] Enforce valid choices for envs like VLLM_ATTENTION_BACKEND, etc
#24761 opened Sep 12, 2025 by mgoin
5 tasks
[gpt-oss][1a] create_responses stream outputs BaseModel type, api server is SSE still
Labels: frontend, gpt-oss
#24759 opened Sep 12, 2025 by qandrew
5 tasks
[Perf] Use NVIDIA hardware-accelerated instruction for float to fp8_e4m3 quantization
Labels: performance, quantization, ready
#24757 opened Sep 12, 2025 by elvischenv
5 tasks
[CI Failure] Fix test_flashinfer_cutlass_mxfp4_mxfp8_fused_moe
Labels: ci-failure, ready
#24750 opened Sep 12, 2025 by mgoin
5 tasks
Remove source compilation of nixl dependency
Labels: ci/build, documentation
#24749 opened Sep 12, 2025 by bbartels
5 tasks