
Conversation

@declark1 (Collaborator) commented Oct 15, 2024

This adds handling to return a streaming response when `stream: true`, and applies type updates based on initial testing with vLLM & llama-3-8b-instruct.

Notes:

  • The stream error handling probably needs more work; we can do this as part of the orchestrator chat implementation (a sketch of the streaming branch itself is below)
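For context, a minimal sketch of the branching this adds, assuming an axum-style handler; the handler shape, types, and JSON bodies here are illustrative placeholders, not the PR's actual code:

```rust
// Sketch only. Branches on `stream: true`, returning server-sent events
// instead of a single JSON body. Assumes axum + futures; the request/response
// bodies are placeholders, not the PR's actual types.
use std::convert::Infallible;

use axum::response::{
    sse::{Event, Sse},
    IntoResponse, Json, Response,
};
use futures::stream;

async fn chat_completions(Json(request): Json<serde_json::Value>) -> Response {
    // OpenAI treats a missing `stream` field as false.
    let streaming = request
        .get("stream")
        .and_then(|v| v.as_bool())
        .unwrap_or(false);

    if streaming {
        // Each event carries one `chat.completion.chunk` body; a real handler
        // would forward chunks from the backing model server (e.g. vLLM).
        let chunks = stream::iter([Ok::<_, Infallible>(
            Event::default().data(r#"{"object":"chat.completion.chunk"}"#),
        )]);
        Sse::new(chunks).into_response()
    } else {
        Json(serde_json::json!({ "object": "chat.completion" })).into_response()
    }
}
```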

@evaline-ju (Collaborator) left a comment:

some small initial comments, thanks for the addition!

> The stream error handling probably needs more work; we can do this as part of the orchestrator chat implementation

Let’s make sure to note this as part of #192 or whichever issue is appropriate?

@evaline-ju (Collaborator) left a comment:

LGTM with nit

```rust
/// The tool calls generated by the model, if any.
pub tool_calls: Vec<ToolCall>,
/// The role of the author of this message.
#[serde(skip_serializing_if = "Option::is_none")]
pub role: Option<String>,
```

nit: is `role` actually optional? I didn't happen to see an "or null" on https://platform.openai.com/docs/api-reference/chat/streaming

@declark1 (Collaborator, Author) commented Oct 30, 2024:

I made it an `Option` for the `ChatCompletionDelta` object (used in streaming responses), since their documentation has examples without it included, e.g. the second one below.

{"id":"chatcmpl-123","object":"chat.completion.chunk","created":1694268190,"model":"gpt-4o-mini", "system_fingerprint": "fp_44709d6fcb", "choices":[{"index":0,"delta":{"role":"assistant","content":""},"logprobs":null,"finish_reason":null}]}

{"id":"chatcmpl-123","object":"chat.completion.chunk","created":1694268190,"model":"gpt-4o-mini", "system_fingerprint": "fp_44709d6fcb", "choices":[{"index":0,"delta":{"content":"Hello"},"logprobs":null,"finish_reason":null}]}

@declark1 (Collaborator, Author) commented Oct 30, 2024:

I just remembered: I believe this is optional (although not clearly documented) for streaming responses because each chunk is part of the assistant's response. In the example above, `{"content":"Hello"}` is part of the assistant's response, so omitting `{"role":"assistant"}` from every chunk keeps the messages smaller.
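To make that concrete, a hypothetical client-side sketch (not part of the PR) of folding deltas into a single assistant message, picking up `role` from the first chunk that carries it:

```rust
// Hypothetical client-side sketch, not part of the PR: deltas fold into one
// assistant message, so only the first chunk needs to carry `role`.
#[derive(Default)]
struct AssembledMessage {
    role: Option<String>, // taken from the first chunk that carries it
    content: String,      // concatenation of every chunk's content
}

fn fold(mut msg: AssembledMessage, role: Option<&str>, content: Option<&str>) -> AssembledMessage {
    if msg.role.is_none() {
        msg.role = role.map(str::to_owned);
    }
    if let Some(chunk) = content {
        msg.content.push_str(chunk);
    }
    msg
}

fn main() {
    // The deltas from the two example chunks above, plus one more.
    let deltas = [
        (Some("assistant"), Some("")),
        (None, Some("Hello")),
        (None, Some(" there")),
    ];
    let msg = deltas
        .into_iter()
        .fold(AssembledMessage::default(), |m, (r, c)| fold(m, r, c));

    assert_eq!(msg.role.as_deref(), Some("assistant"));
    assert_eq!(msg.content, "Hello there");
}
```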

@declark1 merged commit 521c80f into foundation-model-stack:main on Oct 30, 2024
2 checks passed