[ENH] (somewhat) systematic evaluation tool for finding optimal hyperparams.

It'd be nice to have a benchmarking mechanism that tells us whether a change to the retrieval process or chunking method actually produces better retrieval results.

Initial plan:
Use a large, powerful reranker as the grader. The grader should produce a score for each keyword-result pair, which lets us directly measure how well the reranker that VectorCode uses during querying performs.
- The grader model should be sufficiently large and able to handle very long contexts (long enough to fit the largest single file in the test repository);
- Change as few hyperparameters as possible in each run;
- Potentially implement a grid-search-like procedure for tuning the hyperparameters;
- This would also let us evaluate changes to the chunking/reranking mechanism.
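
To make the plan above concrete, here is a rough sketch of what the grading and grid-search loop could look like. Everything in it is an assumption for illustration: `retrieve`, `grade`, `evaluate`, `grid_search` and the `n_results` parameter are hypothetical names, not part of VectorCode's API, and the toy `retrieve`/`grade` functions only stand in for the real query pipeline and the large grader reranker.

```python
from itertools import product
from statistics import mean
from typing import Callable, Dict, List, Sequence, Tuple

Params = Dict[str, object]
# Hypothetical signatures: retrieve(query, params) -> list of result texts,
# grade(query, result_text) -> relevance score from the grader model.
Retrieve = Callable[[str, Params], List[str]]
Grade = Callable[[str, str], float]


def evaluate(retrieve: Retrieve, grade: Grade,
             queries: Sequence[str], params: Params) -> float:
    """Mean grader score over every query-result pair for one configuration."""
    scores = [
        grade(query, result)
        for query in queries
        for result in retrieve(query, params)
    ]
    return mean(scores) if scores else 0.0


def grid_search(retrieve: Retrieve, grade: Grade, queries: Sequence[str],
                grid: Dict[str, Sequence[object]]) -> List[Tuple[float, Params]]:
    """Score every combination in the grid; best configuration first."""
    names = list(grid)
    runs = []
    for values in product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        runs.append((evaluate(retrieve, grade, queries, params), params))
    # Sort on the score only, so the parameter dicts are never compared.
    runs.sort(key=lambda run: run[0], reverse=True)
    return runs


# Toy stand-ins (assumptions): a real run would call the actual retrieval
# pipeline and a long-context reranker as the grader.
DOCS = ["def parse(path): ...", "class Parser: ...", "README: how to install"]


def toy_retrieve(query: str, params: Params) -> List[str]:
    ranked = sorted(DOCS, key=lambda d: d.lower().count(query.lower()),
                    reverse=True)
    return ranked[: int(params["n_results"])]


def toy_grade(query: str, document: str) -> float:
    return 1.0 if query.lower() in document.lower() else 0.0


runs = grid_search(toy_retrieve, toy_grade,
                   queries=["parse", "install"],
                   grid={"n_results": [1, 2, 3]})
for score, params in runs:
    print(f"{score:.2f} {params}")
```

Keeping the grid to one or two keys per run fits the goal of changing as few hyperparameters as possible at a time; a chunking or reranking change could be evaluated the same way by swapping in a different `retrieve` callable and comparing the mean scores.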

[ENH] (somewhat) systematic evaluation tool for finding optimal hyperparams. #67
