Description of the feature request:
I would like to request a feature that allows tiered control over the web search domains the agent uses, configured via environment variables or a configuration file. The system would include three tiers (a rough sketch of how they could work follows the list):
- Whitelist: A list of mandatory domains. If this list is populated, the agent must use sources from these domains exclusively; all other domains are ignored.
- Blacklist: A list of forbidden domains. The agent must never use sources from these domains, even if they appear in search results.
- Graylist: A list of acceptable but not preferred domains. The agent can use these sources, but it should prioritize whitelisted sources where possible. Results from graylisted sites could be flagged in the UI to indicate a lower trust level.
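To make the idea concrete, here is a minimal sketch of how the tiers might be loaded from environment variables and applied to search results. The variable names (`SEARCH_DOMAIN_WHITELIST`, etc.), the helper functions, and the exact precedence between whitelist and graylist are all illustrative assumptions on my part, not existing configuration in this project:

```python
# Illustrative sketch only; variable names and precedence rules are assumptions.
import os
from urllib.parse import urlparse


def _load_domains(var: str) -> set[str]:
    """Read a comma-separated domain list from an environment variable."""
    return {d.strip().lower() for d in os.getenv(var, "").split(",") if d.strip()}


WHITELIST = _load_domains("SEARCH_DOMAIN_WHITELIST")
BLACKLIST = _load_domains("SEARCH_DOMAIN_BLACKLIST")
GRAYLIST = _load_domains("SEARCH_DOMAIN_GRAYLIST")


def classify(url: str) -> str | None:
    """Return 'trusted', 'gray', or None (drop) for a search-result URL."""
    domain = urlparse(url).netloc.lower().removeprefix("www.")
    if domain in BLACKLIST:
        return None              # forbidden: always dropped
    if domain in WHITELIST:
        return "trusted"         # highest-trust tier
    if domain in GRAYLIST:
        return "gray"            # usable, but flagged as lower trust in the UI
    # Unlisted domains: ignored when a whitelist is configured,
    # otherwise treated as ordinary results.
    return None if WHITELIST else "trusted"


def filter_results(urls: list[str]) -> list[tuple[str, str]]:
    """Keep allowed URLs, each tagged with its trust tier."""
    return [(url, tier) for url in urls if (tier := classify(url)) is not None]
```

The trust tag returned here is what the UI could use to flag graylisted sources; the precedence (blacklist first, then whitelist, then graylist) is just one possible interpretation of the three lists.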
What problem are you trying to solve with this feature?
This tiered system provides nuanced control over the agent's information sources, which is important for applications where source reliability matters.
- Whitelist: ensures the agent adheres to the highest-trust sources for critical queries.
- Blacklist: prevents the agent from accessing known unreliable sources, competitors' websites, or irrelevant content, improving answer safety and focus.
- Graylist: offers flexibility, allowing the agent to access a broader range of information while still acknowledging that these sources may not be as authoritative as whitelisted ones. This is useful for topics where primary sources are limited.
Any other information you'd like to share?
I've considered workarounds like forcing the site: operator through prompt engineering or modifying the agent's graph code directly. However, a built-in feature would be far more reliable and easier for developers using this project to configure.
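For reference, the prompt-level workaround amounts to rewriting the search query with site:/-site: operators, roughly like the sketch below (the function name and behavior are hypothetical). It only works as well as the search backend honors those operators, which is why a built-in filter would be preferable:

```python
# Hypothetical workaround sketch: rewrite the query with site: operators.
def restrict_query(query: str, allowed: list[str], blocked: list[str]) -> str:
    """Append site:/-site: filters to a search query string."""
    include = " OR ".join(f"site:{d}" for d in allowed)
    exclude = " ".join(f"-site:{d}" for d in blocked)
    parts = (query, f"({include})" if include else "", exclude)
    return " ".join(p for p in parts if p)


print(restrict_query("langgraph streaming", ["python.langchain.com"], ["example.com"]))
# -> langgraph streaming (site:python.langchain.com) -site:example.com
```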