[gpt-oss][1b] streaming add item id, content id #24788

qandrew · 2025-09-12T23:32:00Z

Purpose

Implement the FIXMEs so that streaming events now track item_id and content_index, as defined in the OpenAI spec (e.g. https://platform.openai.com/docs/api-reference/responses_streaming/response/content_part/added).
Added a unit test.

Test Plan

curl http://localhost:20001/v1/responses   -H "Content-Type: application/json"   -N   -d '{
    "model": "/data/users/axia/checkpoints/gpt-oss-120b",
    "input": [
        {
            "role": "user",
            "content": "Hello."
        }
    ],
    "temperature": 0.7,
    "max_output_tokens": 256,
    "stream": true
}'
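For scripting the same check, the curl call above can be mirrored in Python. This is only a sketch using the standard library: the helper name `build_streaming_request` is hypothetical, and the host, port, and model path are simply the values from the curl command.

```python
import json
import urllib.request


def build_streaming_request(base_url: str, model: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) a streaming POST to the /v1/responses endpoint."""
    payload = {
        "model": model,
        "input": [{"role": "user", "content": prompt}],
        "temperature": 0.7,
        "max_output_tokens": 256,
        "stream": True,  # ask the server for server-sent events
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/responses",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Same values as the curl command; nothing is sent over the network here.
req = build_streaming_request(
    "http://localhost:20001",
    "/data/users/axia/checkpoints/gpt-oss-120b",
    "Hello.",
)

# To actually stream the events, something like this should work against a
# running server:
#   with urllib.request.urlopen(req) as resp:
#       for raw_line in resp:
#           print(raw_line.decode("utf-8"), end="")
```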

Test Result

Note: item_id and content_index change as expected when the stream moves from the reasoning item to the message item.

event: response.output_item.added
data: {"item":{"id":"msg_81a71aae8e5f4af0a7c5d2f0e6bc6324","summary":[],"type":"reasoning","content":null,"encrypted_content":null,"status":"in_progress"},"output_index":0,"sequence_number":2,"type":"response.output_item.added"}

event: response.content_part.added
data: {"content_index":0,"item_id":"msg_81a71aae8e5f4af0a7c5d2f0e6bc6324","output_index":0,"part":{"annotations":[],"text":"","type":"output_text","logprobs":[]},"sequence_number":3,"type":"response.content_part.added"}

event: response.reasoning_text.delta
data: {"content_index":0,"delta":"The","item_id":"msg_81a71aae8e5f4af0a7c5d2f0e6bc6324","output_index":0,"sequence_number":4,"type":"response.reasoning_text.delta"}

event: response.output_item.done
data: {"item":{"id":"msg_81a71aae8e5f4af0a7c5d2f0e6bc6324","summary":[],"type":"reasoning","content":[{"text":"The user says \"Hello.\" Likely they want a greeting. We respond politely.","type":"reasoning_text"}],"encrypted_content":null,"status":"completed"},"output_index":1,"sequence_number":22,"type":"response.output_item.done"}

event: response.output_item.added
data: {"item":{"id":"msg_c8b805f6ff3b42948a1debd89a2961eb","content":[],"role":"assistant","status":"in_progress","type":"message"},"output_index":1,"sequence_number":23,"type":"response.output_item.added"}

event: response.content_part.added
data: {"content_index":1,"item_id":"msg_c8b805f6ff3b42948a1debd89a2961eb","output_index":1,"part":{"annotations":[],"text":"","type":"output_text","logprobs":[]},"sequence_number":24,"type":"response.content_part.added"}

event: response.output_text.delta
data: {"content_index":1,"delta":"Hello","item_id":"msg_c8b805f6ff3b42948a1debd89a2961eb","logprobs":[],"output_index":1,"sequence_number":25,"type":"response.output_text.delta"}

event: response.output_text.delta
data: {"content_index":1,"delta":"!","item_id":"msg_c8b805f6ff3b42948a1debd89a2961eb","logprobs":[],"output_index":1,"sequence_number":26,"type":"response.output_text.delta"}

event: response.output_text.delta
data: {"content_index":1,"delta":" How","item_id":"msg_c8b805f6ff3b42948a1debd89a2961eb","logprobs":[],"output_index":1,"sequence_number":27,"type":"response.output_text.delta"}

...
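One way to eyeball the IDs in a stream like the one above is to parse the `event:`/`data:` pairs and group the `content_index` values by `item_id`. The sketch below is illustrative only: `parse_sse` and `ids_by_item` are hypothetical helpers, not part of vLLM, and the sample payloads are trimmed copies of the events above.

```python
import json
from collections import defaultdict

# Trimmed copies of four events from the test result (fields shortened).
SAMPLE_STREAM = """\
event: response.content_part.added
data: {"content_index":0,"item_id":"msg_81a71aae8e5f4af0a7c5d2f0e6bc6324","output_index":0,"sequence_number":3,"type":"response.content_part.added"}

event: response.reasoning_text.delta
data: {"content_index":0,"delta":"The","item_id":"msg_81a71aae8e5f4af0a7c5d2f0e6bc6324","output_index":0,"sequence_number":4,"type":"response.reasoning_text.delta"}

event: response.content_part.added
data: {"content_index":1,"item_id":"msg_c8b805f6ff3b42948a1debd89a2961eb","output_index":1,"sequence_number":24,"type":"response.content_part.added"}

event: response.output_text.delta
data: {"content_index":1,"delta":"Hello","item_id":"msg_c8b805f6ff3b42948a1debd89a2961eb","output_index":1,"sequence_number":25,"type":"response.output_text.delta"}
"""


def parse_sse(text):
    """Yield (event_name, payload_dict) for each event/data pair in the text."""
    event_name = None
    for line in text.splitlines():
        if line.startswith("event:"):
            event_name = line[len("event:"):].strip()
        elif line.startswith("data:") and event_name is not None:
            yield event_name, json.loads(line[len("data:"):].strip())
            event_name = None


def ids_by_item(events):
    """Map each item_id to the sorted content_index values seen on its events."""
    seen = defaultdict(set)
    for _, payload in events:
        item_id = payload.get("item_id")
        if item_id is not None and "content_index" in payload:
            seen[item_id].add(payload["content_index"])
    return {item_id: sorted(indexes) for item_id, indexes in seen.items()}


summary = ids_by_item(parse_sse(SAMPLE_STREAM))
for item_id, indexes in summary.items():
    print(item_id, indexes)
```

Each item should appear with its own ID, and every delta should share the ID and index of the content part it belongs to.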

OpenAI example, for comparison:

ResponseCreatedEvent(response=Response(id='resp_68bf41ccb7f881a3b89e1bbc39dca02008013d49ada9bc0a', created_at=1757364684.0, error=None, incomplete_details=None, instructions=None, metadata={}, model='gpt-5-2025-08-07', object='response', output=[], parallel_tool_calls=True, temperature=1.0, tool_choice='auto', tools=[], top_p=1.0, background=False, conversation=None, max_output_tokens=None, max_tool_calls=None, previous_response_id=None, prompt=None, prompt_cache_key=None, reasoning=Reasoning(effort='medium', generate_summary=None, summary='detailed'), safety_identifier=None, service_tier='auto', status='in_progress', text=ResponseTextConfig(format=ResponseFormatText(type='text'), verbosity='medium'), top_logprobs=0, truncation='disabled', usage=None, user=None, store=True), sequence_number=0, type='response.created')
ResponseInProgressEvent(response=Response(id='resp_68bf41ccb7f881a3b89e1bbc39dca02008013d49ada9bc0a', created_at=1757364684.0, error=None, incomplete_details=None, instructions=None, metadata={}, model='gpt-5-2025-08-07', object='response', output=[], parallel_tool_calls=True, temperature=1.0, tool_choice='auto', tools=[], top_p=1.0, background=False, conversation=None, max_output_tokens=None, max_tool_calls=None, previous_response_id=None, prompt=None, prompt_cache_key=None, reasoning=Reasoning(effort='medium', generate_summary=None, summary='detailed'), safety_identifier=None, service_tier='auto', status='in_progress', text=ResponseTextConfig(format=ResponseFormatText(type='text'), verbosity='medium'), top_logprobs=0, truncation='disabled', usage=None, user=None, store=True), sequence_number=1, type='response.in_progress')
ResponseOutputItemAddedEvent(item=ResponseReasoningItem(id='rs_68bf41cd40c081a3a772340ccea427bf08013d49ada9bc0a', summary=[], type='reasoning', content=None, encrypted_content=None, status=None), output_index=0, sequence_number=2, type='response.output_item.added')
ResponseReasoningSummaryPartAddedEvent(item_id='rs_68bf41cd40c081a3a772340ccea427bf08013d49ada9bc0a', output_index=0, part=Part(text='', type='summary_text'), sequence_number=3, summary_index=0, type='response.reasoning_summary_part.added')
ResponseReasoningSummaryTextDeltaEvent(delta='**Multip', item_id='rs_68bf41cd40c081a3a772340ccea427bf08013d49ada9bc0a', output_index=0, sequence_number=4, summary_index=0, type='response.reasoning_summary_text.delta', obfuscation='gunpSbbS')
ResponseReasoningSummaryTextDeltaEvent(delta='lying', item_id='rs_68bf41cd40c081a3a772340ccea427bf08013d49ada9bc0a', output_index=0, sequence_number=5, summary_index=0, type='response.reasoning_summary_text.delta', obfuscation='Y5Doun05WUq')
ResponseReasoningSummaryTextDeltaEvent(delta=' with', item_id='rs_68bf41cd40c081a3a772340ccea427bf08013d49ada9bc0a', output_index=0, sequence_number=6, summary_index=0, type='response.reasoning_summary_text.delta', obfuscation='fIXeF6P3dpv')
ResponseReasoningSummaryTextDeltaEvent(delta=' precision', item_id='rs_68bf41cd40c081a3a772340ccea427bf08013d49ada9bc0a', output_index=0, sequence_number=7, summary_index=0, type='response.reasoning_summary_text.delta', obfuscation='EXuwuv')
...
ResponseReasoningSummaryTextDoneEvent(item_id='rs_68bf41cd40c081a3a772340ccea427bf08013d49ada9bc0a', output_index=0, sequence_number=105, summary_index=0, text='**Multiplying with precision**\n\nI need to multiply 3.2342 by 233.1123123. The user likely meant "multiply" when they wrote "multiple," so I’ll compute the product accurately. \n\nI\'ll start by using standard multiplication methods and breaking down the calculations into parts. First, I\'ll calculate b * 3 and then b * 0.2342. After summing up everything, I arrive at approximately 753.9318. Now I should double-check my work to ensure the accuracy of this result.', type='response.reasoning_summary_text.done')
ResponseReasoningSummaryPartDoneEvent(item_id='rs_68bf41cd40c081a3a772340ccea427bf08013d49ada9bc0a', output_index=0, part=Part(text='**Multiplying with precision**\n\nI need to multiply 3.2342 by 233.1123123. The user likely meant "multiply" when they wrote "multiple," so I’ll compute the product accurately. \n\nI\'ll start by using standard multiplication methods and breaking down the calculations into parts. First, I\'ll calculate b * 3 and then b * 0.2342. After summing up everything, I arrive at approximately 753.9318. Now I should double-check my work to ensure the accuracy of this result.', type='summary_text'), sequence_number=106, summary_index=0, type='response.reasoning_summary_part.done')
ResponseReasoningSummaryPartAddedEvent(item_id='rs_68bf41cd40c081a3a772340ccea427bf08013d49ada9bc0a', output_index=0, part=Part(text='', type='summary_text'), sequence_number=107, summary_index=1, type='response.reasoning_summary_part.added')
ResponseReasoningSummaryTextDeltaEvent(delta='**Ver', item_id='rs_68bf41cd40c081a3a772340ccea427bf08013d49ada9bc0a', output_index=0, sequence_number=108, summary_index=1, type='response.reasoning_summary_text.delta', obfuscation='fAEej8BrrXI')
ResponseReasoningSummaryTextDeltaEvent(delta='ifying', item_id='rs_68bf41cd40c081a3a772340ccea427bf08013d49ada9bc0a', output_index=0, sequence_number=109, summary_index=1, type='response.reasoning_summary_text.delta', obfuscation='y3VTtzA8kO')
ResponseReasoningSummaryTextDeltaEvent(delta=' multiplication', item_id='rs_68bf41cd40c081a3a772340ccea427bf08013d49ada9bc0a', output_index=0, sequence_number=110, summary_index=1, type='response.reasoning_summary_text.delta', obfuscation='O')
ResponseReasoningSummaryTextDeltaEvent(delta=' accuracy', item_id='rs_68bf41cd40c081a3a772340ccea427bf08013d49ada9bc0a', output_index=0, sequence_number=111, summary_index=1, type='response.reasoning_summary_text.delta', obfuscation='8EOQcwh')
ResponseReasoningSummaryTextDeltaEvent(delta='**\n\nI', item_id='rs_68bf41cd40c081a3a772340ccea427bf08013d49ada9bc0a', output_index=0, sequence_number=112, summary_index=1, type='response.reasoning_summary_text.delta', obfuscation='PNZJtO5JRJd')
ResponseReasoningSummaryTextDeltaEvent(delta='’m', item_id='rs_68bf41cd40c081a3a772340ccea427bf08013d49ada9bc0a', output_index=0, sequence_number=113, summary_index=1, type='response.reasoning_summary_text.delta', obfuscation='bD9H4GV0ImvosW')
ResponseReasoningSummaryTextDeltaEvent(delta=' considering', item_id='rs_68bf41cd40c081a3a772340ccea427bf08013d49ada9bc0a', output_index=0, sequence_number=114, summary_index=1, type='response.reasoning_summary_text.delta', obfuscation='q7tS')
ResponseReasoningSummaryTextDeltaEvent(delta=' using', item_id='rs_68bf41cd40c081a3a772340ccea427bf08013d49ada9bc0a', output_index=0, sequence_number=115, summary_index=1, type='response.reasoning_summary_text.delta', obfuscation='ozdIa1egB2')
ResponseReasoningSummaryTextDeltaEvent(delta=' Python', item_id='rs_68bf41cd40c081a3a772340ccea427bf08013d49ada9bc0a', output_index=0, sequence_number=116, summary_index=1, type='response.reasoning_summary_text.delta', obfuscation='ZqZS6kqzp')
...
ResponseReasoningSummaryPartDoneEvent(item_id='rs_68bf41cd40c081a3a772340ccea427bf08013d49ada9bc0a', output_index=0, part=Part(text='**Determining significant digits**\n\nI can confirm that the multiplication of \\( 3.2342 \\) (4 decimal places) and \\( 233.1123123 \\) (7 decimal places) results in a number with at most 11 decimal places. My computed result, \\( 753.93184044066 \\), indeed has 11 digits after the decimal. \n\nI should correct the user’s phrasing of "multiple" to "multiply." It’s good to be concise, so I’ll present the product clearly, maybe with a brief explanation or a one-liner to keep it straightforward. Let\'s go ahead and show that computed result!', type='summary_text'), sequence_number=371, summary_index=2, type='response.reasoning_summary_part.done')
ResponseOutputItemDoneEvent(item=ResponseReasoningItem(id='rs_68bf41cd40c081a3a772340ccea427bf08013d49ada9bc0a', summary=[Summary(text='**Multiplying with precision**\n\nI need to multiply 3.2342 by 233.1123123. The user likely meant "multiply" when they wrote "multiple," so I’ll compute the product accurately. \n\nI\'ll start by using standard multiplication methods and breaking down the calculations into parts. First, I\'ll calculate b * 3 and then b * 0.2342. After summing up everything, I arrive at approximately 753.9318. Now I should double-check my work to ensure the accuracy of this result.', type='summary_text'), Summary(text='**Verifying multiplication accuracy**\n\nI’m considering using Python for high-precision calculations of \\( a \\times b \\). First, I’ll confirm the previous result: \\( 3.2342 \\times 233.1123123 \\). I recalled that \\( 233.1123123 \\times 3 \\) equals approximately \\( 699.3369369 \\), which checks out. \n\nNow, I’ll look closely at calculating \\( 0.2342 \\times b \\). A direct computation gives around \\( 54.59490354066 \\), and it matches my previous sum. The total is accurately \\( 753.93184044066 \\), which I’ll present as the final rounded result.', type='summary_text'), Summary(text='**Determining significant digits**\n\nI can confirm that the multiplication of \\( 3.2342 \\) (4 decimal places) and \\( 233.1123123 \\) (7 decimal places) results in a number with at most 11 decimal places. My computed result, \\( 753.93184044066 \\), indeed has 11 digits after the decimal. \n\nI should correct the user’s phrasing of "multiple" to "multiply." It’s good to be concise, so I’ll present the product clearly, maybe with a brief explanation or a one-liner to keep it straightforward. Let\'s go ahead and show that computed result!', type='summary_text')], type='reasoning', content=None, encrypted_content=None, status=None), output_index=0, sequence_number=372, type='response.output_item.done')
ResponseOutputItemAddedEvent(item=ResponseOutputMessage(id='msg_68bf41e9d4f481a3a52a8532f944df3808013d49ada9bc0a', content=[], role='assistant', status='in_progress', type='message'), output_index=1, sequence_number=373, type='response.output_item.added')
ResponseContentPartAddedEvent(content_index=0, item_id='msg_68bf41e9d4f481a3a52a8532f944df3808013d49ada9bc0a', output_index=1, part=ResponseOutputText(annotations=[], text='', type='output_text', logprobs=[]), sequence_number=374, type='response.content_part.added')
ResponseTextDeltaEvent(content_index=0, delta='3', item_id='msg_68bf41e9d4f481a3a52a8532f944df3808013d49ada9bc0a', logprobs=[], output_index=1, sequence_number=375, type='response.output_text.delta', obfuscation='5UpTHBZr8ovDHaN')
ResponseTextDeltaEvent(content_index=0, delta='.', item_id='msg_68bf41e9d4f481a3a52a8532f944df3808013d49ada9bc0a', logprobs=[], output_index=1, sequence_number=376, type='response.output_text.delta', obfuscation='M130VTwll5a8aEc')
ResponseTextDeltaEvent(content_index=0, delta='234', item_id='msg_68bf41e9d4f481a3a52a8532f944df3808013d49ada9bc0a', logprobs=[], output_index=1, sequence_number=377, type='response.output_text.delta', obfuscation='An2UEHoufSxRq')
ResponseTextDeltaEvent(content_index=0, delta='2', item_id='msg_68bf41e9d4f481a3a52a8532f944df3808013d49ada9bc0a', logprobs=[], output_index=1, sequence_number=378, type='response.output_text.delta', obfuscation='Lt6gv4l7d2FzIxv')

Essential Elements of an Effective PR Description Checklist

The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
The test plan, such as providing test command.
The test results, such as pasting the results comparison before and after, or e2e results
(Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.
(Optional) Release notes update. If your change is user facing, please update the release notes draft in the Google Doc.

Signed-off-by: Andrew Xia <[email protected]>

yeqcharlotte · 2025-09-14T00:26:20Z

tests/entrypoints/openai/test_response_api_with_harmony.py

        )

+        current_item_id = ""
+        current_content_index = -1


this test case doesn't capture multiple subsequent streaming items?

Hmm, I'm not sure I understand your question. I haven't enabled tool calling for streaming yet, so currently we're only testing reasoning output -> final output items.
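The reasoning -> final output sequence mentioned in this reply could be checked with a small tracker built around the two variables in the diff (`current_item_id`, `current_content_index`). The following is a sketch under assumptions, not the test added in this PR: `find_id_mismatches` is a hypothetical helper, and the event dicts are trimmed from the Test Result in the description.

```python
DELTA_TYPES = {"response.reasoning_text.delta", "response.output_text.delta"}


def find_id_mismatches(events):
    """Return a list of problems where a streamed event disagrees with the
    item id / content index announced by the preceding 'added' events."""
    current_item_id = ""
    current_content_index = -1
    problems = []
    for event in events:
        etype = event["type"]
        seq = event.get("sequence_number")
        if etype == "response.output_item.added":
            # A new output item starts; remember its id and reset the index.
            current_item_id = event["item"]["id"]
            current_content_index = -1
        elif etype == "response.content_part.added":
            if event["item_id"] != current_item_id:
                problems.append(f"seq {seq}: content part for unexpected item")
            current_content_index = event["content_index"]
        elif etype in DELTA_TYPES:
            if event["item_id"] != current_item_id:
                problems.append(f"seq {seq}: delta item_id mismatch")
            if event["content_index"] != current_content_index:
                problems.append(f"seq {seq}: delta content_index mismatch")
    return problems


REASONING_ID = "msg_81a71aae8e5f4af0a7c5d2f0e6bc6324"
MESSAGE_ID = "msg_c8b805f6ff3b42948a1debd89a2961eb"

# Trimmed events: one reasoning item followed by one message item.
GOOD_EVENTS = [
    {"type": "response.output_item.added", "item": {"id": REASONING_ID}, "sequence_number": 2},
    {"type": "response.content_part.added", "item_id": REASONING_ID, "content_index": 0, "sequence_number": 3},
    {"type": "response.reasoning_text.delta", "item_id": REASONING_ID, "content_index": 0, "sequence_number": 4},
    {"type": "response.output_item.added", "item": {"id": MESSAGE_ID}, "sequence_number": 23},
    {"type": "response.content_part.added", "item_id": MESSAGE_ID, "content_index": 1, "sequence_number": 24},
    {"type": "response.output_text.delta", "item_id": MESSAGE_ID, "content_index": 1, "sequence_number": 25},
]

print(find_id_mismatches(GOOD_EVENTS))
```

Returning a list of problems rather than asserting inline makes it easier to report every mismatch in one run; extending the tracker to tool-call items would need additional event types.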

chaunceyjiang

Could you provide a sample response from the OpenAI online service?

qandrew · 2025-09-15T17:12:25Z

Could you provide a sample response from the OpenAI online service?

yep, added in the description

chaunceyjiang

Thanks~

It looks consistent with OpenAI’s format.

mergify bot added the frontend and gpt-oss (Related to GPT-OSS models) labels Sep 12, 2025

item id, content id

e366b57

Signed-off-by: Andrew Xia <[email protected]>

qandrew force-pushed the andrew/gpt-oss-streaming-ids branch from 4beb7e5 to e366b57 Compare September 12, 2025 23:43

qandrew marked this pull request as ready for review September 12, 2025 23:56

qandrew requested review from DarkLight1337, robertgshaw2-redhat, simon-mo, aarnphm, NickLucche and chaunceyjiang as code owners September 12, 2025 23:56

unit test

f2bf243

Signed-off-by: Andrew Xia <[email protected]>

qandrew force-pushed the andrew/gpt-oss-streaming-ids branch from d2fd245 to f2bf243 Compare September 13, 2025 00:04

qandrew mentioned this pull request Sep 13, 2025

[gpt-oss][1][bugfix] fix streaming final output #24466

Merged

5 tasks

yeqcharlotte mentioned this pull request Sep 13, 2025

[Feature][Responses API] Streaming ID Alignment #23218

Open

1 task

yeqcharlotte reviewed Sep 14, 2025

yeqcharlotte added this to gpt-oss Issues & Enhancements Sep 14, 2025

github-project-automation bot moved this to To Triage in gpt-oss Issues & Enhancements Sep 14, 2025

yeqcharlotte moved this from To Triage to In progress in gpt-oss Issues & Enhancements Sep 14, 2025

chaunceyjiang reviewed Sep 15, 2025

qandrew requested review from chaunceyjiang and yeqcharlotte September 15, 2025 17:12

qandrew changed the title from "[gpt-oss] streaming add item id, content id" to "[gpt-oss][1b] streaming add item id, content id" Sep 15, 2025

qandrew added 2 commits September 15, 2025 13:26

Merge branch 'main' into andrew/gpt-oss-streaming-ids

e809bdb

Signed-off-by: Andrew Xia <[email protected]>

fix merge conflict

1bd029f

Signed-off-by: Andrew Xia <[email protected]>

chaunceyjiang approved these changes Sep 16, 2025

github-project-automation bot moved this from In progress to Ready in gpt-oss Issues & Enhancements Sep 16, 2025

chaunceyjiang self-assigned this Sep 16, 2025

chaunceyjiang added the ready (ONLY add when PR is ready to merge/full CI is needed) label Sep 16, 2025

Merge branch 'main' into andrew/gpt-oss-streaming-ids

54cab3f

zou3519 enabled auto-merge (squash) September 16, 2025 17:07

zou3519 merged commit f4d6eb9 into vllm-project:main Sep 16, 2025
44 checks passed

github-project-automation bot moved this from Ready to Done in gpt-oss Issues & Enhancements Sep 16, 2025

frank-wei pushed a commit to frank-wei/vllm that referenced this pull request Sep 23, 2025

[gpt-oss][1b] streaming add item id, content id (vllm-project#24788)

2c74a3c

Signed-off-by: Andrew Xia <[email protected]>

langc23 pushed a commit to zte-riscv/vllm that referenced this pull request Sep 23, 2025

[gpt-oss][1b] streaming add item id, content id (vllm-project#24788)

8a63892

Signed-off-by: Andrew Xia <[email protected]>
