Description
Description of the feature request:
If I understand correctly, both the `generate_query` and `web_research` nodes currently write to the same `search_query` key in `OverallState`, which is declared as:
search_query: Annotated[list, operator.add]
Because of the `operator.add` reducer, `search_query` accumulates both the originally planned queries and the queries that have been executed, leading to duplication.
We should probably separate the concepts of planned queries and executed queries in state management.
What problem are you trying to solve with this feature?
This design makes `len(state["search_query"])` misleading, since it double-counts queries (planned + executed).
In the current flow:
- `generate_query` sets `search_query = ["q1", "q2", "q3"]`
- Each `web_research` branch appends its own query, giving `["q1", "q2", "q3", "q1", "q2", "q3"]`
- Reflection calculates `number_of_ran_queries = len(state["search_query"])` (6, but only 3 actually ran)
- This inflated number is used to offset IDs for follow-up queries. The IDs remain unique, so the flow still works, but the count itself is incorrect, and any future logic relying on an accurate count will be wrong.
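The double count can be reproduced without running the graph. The following is a minimal sketch in plain Python: `apply_update` is a hypothetical stand-in for how a reducer-annotated key gets merged, not LangGraph's actual implementation, and the query strings are placeholders.

```python
import operator
from typing import Annotated, TypedDict, get_type_hints


class OverallState(TypedDict, total=False):
    # Same declaration as in the issue: updates are concatenated.
    search_query: Annotated[list, operator.add]


def apply_update(state: dict, update: dict) -> dict:
    """Hypothetical helper: merge an update using each key's annotated reducer."""
    hints = get_type_hints(OverallState, include_extras=True)
    merged = dict(state)
    for key, value in update.items():
        reducer = hints[key].__metadata__[0]  # operator.add for search_query
        merged[key] = reducer(merged[key], value) if key in merged else value
    return merged


state: dict = {}

# generate_query writes the planned queries.
state = apply_update(state, {"search_query": ["q1", "q2", "q3"]})

# Each web_research branch writes its own query back to the same key.
for query in ["q1", "q2", "q3"]:
    state = apply_update(state, {"search_query": [query]})

# Reflection counts the list, which now holds every query twice.
number_of_ran_queries = len(state["search_query"])
print(state["search_query"], number_of_ran_queries)
```

Only three searches ran, but the list holds six entries because the planned and executed writes share one key.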
Any other information you'd like to share?
Two possible solutions:
1) Separate state keys
- `planned_queries: Annotated[list[str], operator.add]` -> updated by `generate_query` and follow-up generation
- `executed_queries: Annotated[list[str], operator.add]` -> updated by `web_research`
- `number_of_ran_queries = len(state.get("executed_queries", []))`
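A rough sketch of option 1, again in plain Python rather than against the real graph. The node bodies are simplified placeholders, and `apply_update` is a hypothetical merge helper that only mimics reducer behaviour for illustration.

```python
import operator
from typing import Annotated, TypedDict, get_type_hints


class OverallState(TypedDict, total=False):
    planned_queries: Annotated[list[str], operator.add]
    executed_queries: Annotated[list[str], operator.add]


def apply_update(state: dict, update: dict) -> dict:
    """Hypothetical helper: merge an update using each key's annotated reducer."""
    hints = get_type_hints(OverallState, include_extras=True)
    merged = dict(state)
    for key, value in update.items():
        reducer = hints[key].__metadata__[0]
        merged[key] = reducer(merged[key], value) if key in merged else value
    return merged


def generate_query(state: dict) -> dict:
    # Placeholder: the real node would ask the model for queries.
    return {"planned_queries": ["q1", "q2", "q3"]}


def web_research(query: str) -> dict:
    # Placeholder: the real node would run the search, then record the query.
    return {"executed_queries": [query]}


def count_ran_queries(state: dict) -> int:
    # What reflection would use instead of len(state["search_query"]).
    return len(state.get("executed_queries", []))


state: dict = {}
state = apply_update(state, generate_query(state))
for query in state["planned_queries"]:
    state = apply_update(state, web_research(query))

number_of_ran_queries = count_ran_queries(state)
print(number_of_ran_queries)
```

With the two lists kept apart, the count reflects only searches that actually ran, and `planned_queries` stays available for anything that needs the plan.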
2) Keep one list but stop re-adding in web_research
- Remove the `search_query` write from `web_research`
- Add a `ran_count: Annotated[int, operator.add]` counter instead
- Reflection uses `ran_count`
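Option 2 might look roughly like this. As before, this is a plain-Python sketch under assumptions: `apply_update` is a hypothetical merge helper, and the node bodies are placeholders for the real ones.

```python
import operator
from typing import Annotated, TypedDict, get_type_hints


class OverallState(TypedDict, total=False):
    search_query: Annotated[list, operator.add]
    ran_count: Annotated[int, operator.add]  # proposed counter


def apply_update(state: dict, update: dict) -> dict:
    """Hypothetical helper: merge an update using each key's annotated reducer."""
    hints = get_type_hints(OverallState, include_extras=True)
    merged = dict(state)
    for key, value in update.items():
        reducer = hints[key].__metadata__[0]
        merged[key] = reducer(merged[key], value) if key in merged else value
    return merged


def generate_query(state: dict) -> dict:
    # Placeholder: the only node that writes search_query.
    return {"search_query": ["q1", "q2", "q3"]}


def web_research(query: str) -> dict:
    # Placeholder: no search_query write; each branch adds 1 to the counter.
    return {"ran_count": 1}


state: dict = {}
state = apply_update(state, generate_query(state))
for query in state["search_query"]:
    state = apply_update(state, web_research(query))

number_of_ran_queries = state.get("ran_count", 0)
print(state["search_query"], number_of_ran_queries)
```

This keeps a single query list and changes less state, at the cost of no longer recording which queries were executed, only how many.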