Motivation
Today, in `app.py`, if we look at the `ask_question` method, we can see that all the questions get passed into the prompt along with the user query.
I want to enable the following behaviour:
- Every question can be tagged with a label or a list of labels
- `ask_question` should receive a list of labels
- Based on the labels, the questions should get filtered when answering a particular query
- Only the filtered questions should be injected into the prompt that answers the user's query
Solution Proposal
Modify the `Question` object to also include a list of tags, something like:
```python
from dataclasses import dataclass, field

@dataclass
class Question:
    id: str
    variants: list[str]
    answer: str
    tags: list[str] = field(default_factory=list)
```
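For illustration, a tagged question might then look like this (all field values here are made up):

```python
q = Question(
    id="q-hours",
    variants=["What are your opening hours?", "When are you open?"],
    answer="We are open 9am to 5pm, Monday to Friday.",
    tags=["hours", "general"],
)
```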
Allow the `ask_question` method to receive an optional list of tags/labels, something as follows:
```python
async def ask_question(self, question: str, tags: Optional[list[str]] = None) -> Answer:
```
Based on the tags, filter the questions and stuff only those into the context instead of directly adding all the questions. Allow the `_format_background_info` function to receive a list of tags and implement the filtering in a new method, something as follows:
```python
def _format_background_info(self, tags: Optional[list[str]] = None) -> str:
    filtered_questions = self._filter_questions(tags)
```
And use `filtered_questions` instead of `self._questions` in the rest of the `_format_background_info` function.
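A minimal sketch of what `_filter_questions` could look like, assuming the questions live in `self._questions` and a question matches if it shares at least one tag with the requested list (both the method body and the any-overlap semantics are just one possible choice):

```python
def _filter_questions(self, tags: Optional[list[str]] = None) -> list[Question]:
    # No tags requested: preserve today's behaviour and use every question.
    if not tags:
        return self._questions
    requested = set(tags)
    # Keep a question if it shares at least one tag with the request.
    return [q for q in self._questions if requested & set(q.tags)]
```

With this, untagged calls behave exactly as before, so the change stays backward compatible. One detail worth settling in review: under any-overlap matching, a question with an empty `tags` list is excluded whenever tags are passed; if untagged questions should always be visible, the condition needs an extra `or not q.tags`.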
Discussion
A later optimization could be a function that identifies which tags/labels a user query belongs to (a rough sketch appears after this list). The more specific the context, the better, because:
- It gives users control over which QnAs the LLM should or shouldn't see when answering specific things
- Specific context -> fewer QnAs -> fewer tokens -> lower latency -> lower cost
- Possibly improved accuracy, because the LLM now deals with less, more focused information and is less likely to be affected by recency bias. When generating the next token, LLMs have been observed to pay more attention to recent tokens than to tokens that occur much earlier, so if the relevant answer sits in the 1st or 2nd question out of around 150, it may get overlooked. If tag-based filtering reduces those 150 questions to, say, 20, the probability of finding the answer improves compared to the 150-question case.
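As a rough illustration of that tag-identification step, here is a naive keyword-based sketch; in practice it could be a small classifier or an extra LLM call, and both the function name and the keyword mapping below are made up:

```python
# Hypothetical keyword-to-tag mapping; real tags would come from the question set.
TAG_KEYWORDS = {
    "billing": ["invoice", "payment", "refund"],
    "hours": ["open", "close", "hours"],
}

def identify_tags(query: str) -> list[str]:
    lowered = query.lower()
    # Return every tag whose keywords appear anywhere in the query.
    return [tag for tag, words in TAG_KEYWORDS.items()
            if any(word in lowered for word in words)]
```

The caller could then do something like `await bot.ask_question(query, tags=identify_tags(query) or None)`, falling back to the unfiltered behaviour when no tag matches.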