@pramodith pramodith commented Sep 23, 2025

What does this PR do?

Fixes the notebook to disable bf16, because the model and LoRA weights are configured to load in bf16.

Fixes # (issue)

Who can review?

Feel free to tag members/contributors who may be interested in your PR.
@stevhliu

Check out this pull request on ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.

@pramodith pramodith changed the title Pramodith patch 1 Fix GRPO Reasoning Advanced Reward Tutorial Sep 23, 2025
@stevhliu
Member

Pinging @behroozazarkhalili who contributed this notebook

@behroozazarkhalili
Contributor

> Pinging @behroozazarkhalili who contributed this notebook

Hi,
I'll correct it and send a new pull request this weekend, as I'm approaching the ACL deadline.
