claudiosv
Collaborator

In this PR, I add process_gsm8k.py, process_fever.py, and process_mbpp.py. These scripts process the datasets into the format used by the optimizer (AutoPDL), including the construction of agentic trajectories.
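
To illustrate the kind of processing described above, here is a minimal sketch, not the code from this PR. It relies only on the public GSM8K convention that each `answer` field ends with a `####` marker followed by the final numeric answer. The function names (`split_gsm8k_answer`, `to_record`) and the output field names (`question`, `reasoning`, `answer`) are hypothetical. The actual format expected by AutoPDL and the construction of agentic trajectories are not shown in this conversation, so they are left out.

```python
import json


def split_gsm8k_answer(answer: str) -> tuple[str, str]:
    """Split a GSM8K answer into (reasoning, final_answer) at the last '####' marker."""
    reasoning, sep, final = answer.rpartition("####")
    if not sep:
        raise ValueError("no '####' marker found in answer")
    # Drop thousands separators so "1,000" and "1000" compare equal.
    return reasoning.strip(), final.strip().replace(",", "")


def to_record(example: dict) -> dict:
    """Build one output record; the field names are assumptions for illustration."""
    reasoning, final = split_gsm8k_answer(example["answer"])
    return {
        "question": example["question"],
        "reasoning": reasoning,
        "answer": final,
    }


# A hand-written example in GSM8K's question/answer shape (not real dataset content).
example = {
    "question": "Tom has 3 apples and buys 4 more. How many apples does he have?",
    "answer": "Tom has 3 + 4 = 7 apples.\n#### 7",
}

# Emit one JSON line, a common on-disk format for processed datasets.
print(json.dumps(to_record(example)))
```

Keeping the reasoning text separate from the final answer lets an optimizer score predictions against the answer alone while still having the worked solution available for few-shot demonstrations.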

Signed-off-by: Claudio Spiess <[email protected]>
Collaborator

mandel left a comment

Looks great.

I think that you have everything now to finish autopdl.md where you left a placeholder on how to build the dataset.

claudiosv
Collaborator Author

> Looks great.
>
> I think that you have everything now to finish autopdl.md where you left a placeholder on how to build the dataset.

Thanks for the review! I'll finish up the md file, dependency, and merge.

claudiosv added 3 commits July 2, 2025 14:18
Signed-off-by: Claudio Spiess <[email protected]>
claudiosv merged commit e84afe8 into main Jul 7, 2025
11 of 15 checks passed
claudiosv deleted the optim-data branch July 7, 2025 17:24