feat(text_embed): Add vLLM as a provider #5136
base: main
Conversation
Greptile Summary
This PR adds vLLM as a new text embedding provider to Daft's AI framework, following the established provider pattern used by existing providers like OpenAI and SentenceTransformers. The implementation includes four main components:
- Provider Class (`daft/ai/vllm/provider.py`): Implements `VLLMProvider` with the standard constructor pattern and a `get_text_embedder` method that returns a `vLLMTextEmbedderDescriptor`.
- Protocol Implementation (`daft/ai/vllm/protocols/text_embedder.py`): Contains the core embedding logic with the `vLLMTextEmbedderDescriptor` and `vLLMTextEmbedder` classes. The descriptor handles model configuration and dimension detection using `AutoConfig.from_pretrained()`, while the embedder wraps a vLLM `LLM` instance for the actual text embedding operations.
- Provider Registry Update (`daft/ai/provider.py`): Adds a `load_vllm()` function and registers it in the `PROVIDERS` dictionary, enabling users to instantiate the provider via `load_provider("vllm")`.
- Test Coverage (`tests/ai/test_vllm.py`): Provides comprehensive unit tests using mocking to validate the provider interface and embedding functionality without requiring actual vLLM dependencies.
The implementation integrates seamlessly with Daft's existing AI infrastructure, allowing users to leverage vLLM's high-performance inference capabilities for text embedding tasks. vLLM is known for its optimized inference engine for large language models, making this addition valuable for production environments requiring fast embedding generation. The code follows established patterns for lazy dependency loading, consistent interfaces, and proper error handling with `ProviderImportError` when vLLM is not installed.
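For context, a minimal usage sketch of the resulting surface area, based only on the entry points named above (`load_provider`, `get_text_embedder`); the `instantiate()` step is assumed from the descriptor pattern and may not match Daft's actual method name:

```python
from daft.ai.provider import load_provider

# Instantiate the new provider by name, as registered in the PROVIDERS dict.
provider = load_provider("vllm")

# Request a text embedder descriptor; per this PR the model defaults to
# "sentence-transformers/all-MiniLM-L6-v2" when none is given.
descriptor = provider.get_text_embedder()
embedder = descriptor.instantiate()  # assumption: descriptors expose an instantiate() step
embeddings = embedder.embed_text(["hello", "world"])
```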
Confidence score: 4/5
- This PR is safe to merge with only minor issues that don't affect core functionality
- Score reflects solid implementation following established patterns, but with a few inconsistencies in option handling and some missing validation
- Pay close attention to `daft/ai/vllm/provider.py` for the unused `_options` parameter and `daft/ai/vllm/protocols/text_embedder.py` for potential error handling improvements
5 files reviewed, 2 comments
```python
def __init__(self, name: str | None = None, **options: Any):
    self._name = name if name else "vllm"
    self._options = options
```
style: The `_options` are stored but never used in `get_text_embedder`. Consider passing them to the descriptor like `OpenAIProvider` does, or remove if truly unused.
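A minimal sketch of the first option (forwarding the stored options), assuming `vLLMTextEmbedderDescriptor` accepts an options dict as it does elsewhere in this PR; the merge semantics are illustrative:

```python
from typing import Any


class VLLMProvider:
    def __init__(self, name: str | None = None, **options: Any):
        self._name = name if name else "vllm"
        self._options = options

    def get_text_embedder(self, model: str | None = None, **options: Any) -> "TextEmbedderDescriptor":
        from daft.ai.vllm.protocols.text_embedder import vLLMTextEmbedderDescriptor

        # Forward provider-level options, letting call-site options win on
        # conflict, so _options is no longer dead state.
        merged = {**self._options, **options}
        return vLLMTextEmbedderDescriptor(model or "sentence-transformers/all-MiniLM-L6-v2", merged)
```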
```python
def embed_text(self, text: list[str]) -> list[Embedding]:
    outputs = self.model.embed(text)
    embeddings = torch.tensor([o.outputs.embedding for o in outputs])
    return embeddings.cpu().numpy()
```
logic: Return type annotation says `list[Embedding]` but the function returns a numpy array. Should return `embeddings.cpu().numpy().tolist()` or update the annotation.
Suggested change:

```diff
 def embed_text(self, text: list[str]) -> list[Embedding]:
     outputs = self.model.embed(text)
     embeddings = torch.tensor([o.outputs.embedding for o in outputs])
-    return embeddings.cpu().numpy()
+    return embeddings.cpu().numpy().tolist()
```
+1

The `Embedding` type is currently an alias to `np.ndarray`, but these protocols should return a `list[np.ndarray]`.
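Given that `Embedding` aliases `np.ndarray`, one way to satisfy `list[Embedding]` without flattening to Python floats is a per-row conversion; a sketch under that assumption (the dtype choice is illustrative):

```python
import numpy as np


def embed_text(self, text: list[str]) -> "list[Embedding]":
    outputs = self.model.embed(text)
    # One np.ndarray per input string; avoids the torch round-trip entirely.
    return [np.asarray(o.outputs.embedding, dtype=np.float32) for o in outputs]
```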
Codecov Report

❌ Patch coverage is

Additional details and impacted files:

```
@@            Coverage Diff             @@
##             main    #5136      +/-   ##
==========================================
- Coverage   76.54%   71.92%    -4.63%
==========================================
  Files         953      955        +2
  Lines      130653   130581       -72
==========================================
- Hits       100006    93918     -6088
- Misses      30647    36663     +6016
```
```python
def get_text_embedder(self, model: str | None = None, **options: Any) -> TextEmbedderDescriptor:
    from daft.ai.vllm.protocols.text_embedder import vLLMTextEmbedderDescriptor

    return vLLMTextEmbedderDescriptor(model or "sentence-transformers/all-MiniLM-L6-v2", options)
```
We should include the provider name and options in the descriptor. Please see `OpenAITextEmbedderDescriptor` for an example. The provider name answers "who produced this descriptor?", and the options are useful when the provider requires late initialization of something, which for OpenAI is the client: the provider takes the client options, but ultimately they must be plumbed into the descriptor->model instantiation.
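A sketch of what that plumbing could look like, loosely modeled on the `OpenAITextEmbedderDescriptor` pattern the comment points to; the field and method names are illustrative rather than Daft's actual API, and the vLLM constructor arguments are an assumption:

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class vLLMTextEmbedderDescriptor:
    provider_name: str  # answers "who produced this descriptor?"
    model_name: str
    options: dict[str, Any] = field(default_factory=dict)

    def instantiate(self) -> "vLLMTextEmbedder":
        # Late initialization: options captured at provider construction are
        # only consumed here, when the vLLM engine is actually built.
        from vllm import LLM

        model = LLM(model=self.model_name, task="embed", **self.options)  # task kwarg assumed
        return vLLMTextEmbedder(model)
```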
```python
self.options = options

def embed_text(self, text: list[str]) -> list[Embedding]:
    outputs = self.model.embed(text)
```
That easy huh? 😄
Do you know if we need to do any batching on our own? I'm fine with just firing away to vLLM until we learn more, mostly asking out of curiosity.
vLLM does batching internally. Tbf there's a lot more that can be done here, but I want to move on and support endpoints like gemini (and maybe hack together something for ev)
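Purely for reference, since vLLM batches internally: a hypothetical manual-chunking fallback if request sizes ever become a problem (the chunk size and helper name are invented for illustration):

```python
import numpy as np


def embed_text_chunked(self, text: list[str], chunk_size: int = 256) -> "list[Embedding]":
    # self.model is assumed to be the wrapped vLLM LLM instance from this PR.
    results: "list[Embedding]" = []
    for i in range(0, len(text), chunk_size):
        outputs = self.model.embed(text[i : i + chunk_size])
        results.extend(np.asarray(o.outputs.embedding) for o in outputs)
    return results
```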
Changes Made
Adds vLLM as a provider for text embedding.