feat(ai-gateway): AI Semantic Response Guard plugin #2757
Merged
9 commits
3fe82d3 Add scaffold plugin structure (tomek-labuk)
44d48ed Add WIP contend and examples (tomek-labuk)
4ee2ae8 change config example (tomek-labuk)
5e86edb fix (tomek-labuk)
47c3b0e Merge branch 'release/gateway-3.12' into feat/ai-semantic-response-guard (tomek-labuk)
41f876a Update ai gw landing page (tomek-labuk)
88d5d84 Add pgvecotr example (tomek-labuk)
1d2d6a4 fixes (Guaris)
954149f icons (Guaris)
@@ -0,0 +1,5 @@
---
content_type: reference

---
## Changelog
75 changes: 75 additions & 0 deletions
app/_kong_plugins/ai-semantic-response-guard/examples/allow-and-deny-responses.yaml
@@ -0,0 +1,75 @@
description: Block or allow LLM responses based on semantic similarity to defined rules.

extended_description: |
  The AI Semantic Response Guard plugin analyzes the full response from an LLM service and filters it
  based on semantic similarity to configured allow or deny patterns.

  Deny rules take precedence over allow rules. Responses matching a deny pattern are blocked,
  even if they also match an allow pattern. Responses not matching any allow pattern are blocked
  when allow rules are set.

title: 'Allow and deny responses'

weight: 900

requirements:
  - "[AI Proxy plugin](/plugins/ai-proxy/) or [AI Proxy Advanced plugin](/plugins/ai-proxy-advanced/) configured with an LLM service."
  - "A [Redis](https://redis.io/docs/latest/) instance or another supported vector database."
  - "Port `6379`, or your custom Redis port, is open and reachable from {{site.base_gateway}}."

variables:
  header_value:
    value: $OPENAI_API_KEY
    description: Your OpenAI API key
  redis_host:
    value: $REDIS_HOST
    description: The host where your Redis instance runs

config:
  embeddings:
    auth:
      header_name: Authorization
      header_value: Bearer ${header_value}
    model:
      name: text-embedding-3-small
      provider: openai
  search:
    threshold: 0.7
  vectordb:
    strategy: redis
    distance_metric: cosine
    threshold: 0.7
    dimensions: 1024
    redis:
      host: ${redis_host}
      port: 6379
  rules:
    allow_responses:
      - Network troubleshooting and diagnostics
      - Cloud infrastructure management (AWS, Azure, GCP)
      - Cybersecurity best practices and incident response
      - DevOps workflows and automation
      - Programming concepts and language usage
      - IT policy and compliance guidance
      - Software development lifecycle and CI/CD
      - Documentation writing and technical explanation
      - System administration and configuration
      - Productivity and collaboration tools usage
    deny_responses:
      - Hacking techniques or penetration testing without authorization
      - Bypassing software licensing or digital rights management
      - Instructions on exploiting vulnerabilities or writing malware
      - Circumventing security controls or access restrictions
      - Gathering personal or confidential employee information
      - Using AI to impersonate or phish others
      - Social engineering tactics or manipulation techniques
      - Guidance on violating company IT policies
      - Content unrelated to work, such as entertainment or dating
      - Political, religious, or sensitive non-work-related discussions

tools:
  - deck
  - admin-api
  - konnect-api
  - kic
  - terraform
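The example above sets `distance_metric: cosine` with a `threshold` of `0.7`. As a rough illustration of what a cosine-based match against a rule list could look like, here is a minimal Python sketch. It is not the plugin's implementation: the function names `cosine_similarity` and `best_match` are hypothetical, the vectors are toy values rather than real embeddings, and it assumes the threshold is a minimum similarity score, which this PR does not state.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def best_match(response_vec, rule_vecs, threshold=0.7):
    """Return (rule, score) for the closest rule scoring at or above
    `threshold`, or None if no rule is close enough.

    ASSUMPTION: `threshold` is treated as a minimum similarity; the
    plugin's own interpretation may differ.
    """
    best = None
    for rule, vec in rule_vecs.items():
        score = cosine_similarity(response_vec, vec)
        if score >= threshold and (best is None or score > best[1]):
            best = (rule, score)
    return best
```

In a real deployment the vectors would come from the configured embeddings model and the search would run inside the vector database, not in application code.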
62 changes: 62 additions & 0 deletions
app/_kong_plugins/ai-semantic-response-guard/examples/allow-responses.yaml
@@ -0,0 +1,62 @@
description: Allow only specific LLM responses based on semantic similarity to defined rules.

extended_description: |
  The AI Semantic Response Guard plugin analyzes the full response from an LLM service and permits it
  only if it semantically matches one of the configured allow patterns.

  If a response does not match any of the allow patterns, it is blocked with a 400 Bad Request.

title: 'Allow only responses'

weight: 900

requirements:
  - "[AI Proxy plugin](/plugins/ai-proxy/) or [AI Proxy Advanced plugin](/plugins/ai-proxy-advanced/) configured with an LLM service."
  - "A [Redis](https://redis.io/docs/latest/) instance or another supported vector database."
  - "Port `6379`, or your custom Redis port, is open and reachable from {{site.base_gateway}}."

variables:
  header_value:
    value: $OPENAI_API_KEY
    description: Your OpenAI API key
  redis_host:
    value: $REDIS_HOST
    description: The host where your Redis instance runs

config:
  embeddings:
    auth:
      header_name: Authorization
      header_value: Bearer ${header_value}
    model:
      name: text-embedding-3-small
      provider: openai
  search:
    threshold: 0.7
  vectordb:
    strategy: redis
    distance_metric: cosine
    threshold: 0.7
    dimensions: 1024
    redis:
      host: ${redis_host}
      port: 6379
  rules:
    allow_responses:
      - Network troubleshooting and diagnostics
      - Cloud infrastructure management (AWS, Azure, GCP)
      - Cybersecurity best practices and incident response
      - DevOps workflows and automation
      - Programming concepts and language usage
      - IT policy and compliance guidance
      - Software development lifecycle and CI/CD
      - Documentation writing and technical explanation
      - System administration and configuration
      - Productivity and collaboration tools usage

tools:
  - deck
  - admin-api
  - konnect-api
  - kic
  - terraform
62 changes: 62 additions & 0 deletions
app/_kong_plugins/ai-semantic-response-guard/examples/deny-responses.yaml
@@ -0,0 +1,62 @@
description: Block specific LLM responses based on semantic similarity to defined rules.

extended_description: |
  The AI Semantic Response Guard plugin analyzes the full response from an LLM service and blocks it
  if it semantically matches one of the configured deny patterns.

  Responses that do not match any deny pattern are permitted.

title: 'Deny only responses'

weight: 900

requirements:
  - "[AI Proxy plugin](/plugins/ai-proxy/) or [AI Proxy Advanced plugin](/plugins/ai-proxy-advanced/) configured with an LLM service."
  - "A [Redis](https://redis.io/docs/latest/) instance or another supported vector database."
  - "Port `6379`, or your custom Redis port, is open and reachable from {{site.base_gateway}}."

variables:
  header_value:
    value: $OPENAI_API_KEY
    description: Your OpenAI API key
  redis_host:
    value: $REDIS_HOST
    description: The host where your Redis instance runs

config:
  embeddings:
    auth:
      header_name: Authorization
      header_value: Bearer ${header_value}
    model:
      name: text-embedding-3-small
      provider: openai
  search:
    threshold: 0.7
  vectordb:
    strategy: redis
    distance_metric: cosine
    threshold: 0.7
    dimensions: 1024
    redis:
      host: ${redis_host}
      port: 6379
  rules:
    deny_responses:
      - Hacking techniques or penetration testing without authorization
      - Bypassing software licensing or digital rights management
      - Instructions on exploiting vulnerabilities or writing malware
      - Circumventing security controls or access restrictions
      - Gathering personal or confidential employee information
      - Using AI to impersonate or phish others
      - Social engineering tactics or manipulation techniques
      - Guidance on violating company IT policies
      - Content unrelated to work, such as entertainment or dating
      - Political, religious, or sensitive non-work-related discussions

tools:
  - deck
  - admin-api
  - konnect-api
  - kic
  - terraform
@@ -0,0 +1,64 @@
---
title: 'AI Semantic Response Guard'
name: 'AI Semantic Response Guard'

content_type: plugin
tier: ai_gateway_enterprise

publisher: kong-inc
description: 'Permit or block LLM responses based on semantic similarity to defined rules, preventing unwanted or unsafe responses to llm/v1/chat or llm/v1/completions requests'

products:
  - gateway
  - ai-gateway

works_on:
  - on-prem
  - konnect

min_version:
  gateway: '3.12'

topologies:
  on_prem:
    - hybrid
    - db-less
    - traditional
  konnect_deployments:
    - hybrid
    - cloud-gateways
    - serverless

icon: plugin-slug.png # e.g. acme.svg or acme.png

tags:
  - ai
---

# AI Semantic Response Guard

The AI Semantic Response Guard plugin extends the AI Prompt Guard plugin by filtering LLM responses based on semantic similarity to predefined rules. It helps prevent unwanted or unsafe responses when serving `llm/v1/chat`, `llm/v1/completions`, or `llm/v1/embeddings` requests through Kong AI Gateway.

You can use a combination of `allow` and `deny` response rules to maintain integrity and compliance when returning responses from an LLM service.

## How it works

The plugin analyzes the semantic content of the full LLM response before it is returned to the client. The matching behavior is as follows:

* If any `deny_responses` are set and the response matches a pattern in the deny list, the response is blocked with a `400 Bad Request`.
* If any `allow_responses` are set, but the response matches none of the allowed patterns, the response is also blocked with a `400 Bad Request`.
* If any `allow_responses` are set and the response matches one of the allowed patterns, the response is permitted.
* If both `deny_responses` and `allow_responses` are set, the `deny` condition takes precedence. A response that matches a deny pattern will be blocked, even if it also matches an allow pattern. If the response does not match any deny pattern, it must still match an allow pattern to be permitted.
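
The precedence rules listed above can be summarized in a few lines of Python. This is only a sketch of the documented behavior, not the plugin's source: `Verdict` and `decide` are hypothetical names, and the boolean inputs stand in for the results of the semantic search.

```python
from enum import Enum


class Verdict(Enum):
    PERMIT = "permit"
    BLOCK = "block"  # surfaced to the client as 400 Bad Request


def decide(deny_hit: bool, allow_hit: bool, allow_configured: bool) -> Verdict:
    """Apply the documented precedence.

    deny_hit:          the response matched a `deny_responses` pattern
    allow_hit:         the response matched an `allow_responses` pattern
    allow_configured:  at least one `allow_responses` pattern is set
    """
    # Deny always wins, even when an allow pattern also matches.
    if deny_hit:
        return Verdict.BLOCK
    # With allow rules present, a response must match one of them.
    if allow_configured and not allow_hit:
        return Verdict.BLOCK
    return Verdict.PERMIT
```

Checking deny first means the allow list never needs to be consulted for a response that is already blocked.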

## Response processing

To enforce these rules, the plugin:

1. **Disables streaming** (`stream=false`) to ensure the full response body is buffered before analysis.
2. **Intercepts the response body** using the `guard-response` filter.
3. **Extracts response text**, supporting JSON parsing of multiple LLM formats and gzipped content.
4. **Generates embeddings** for the extracted text.
5. **Searches the vector database** (Redis, Pgvector, or other) against configured `allow_responses` or `deny_responses`.
6. **Applies the decision rules** described above.

If a response is blocked or if a system error occurs during evaluation, the plugin returns a `400 Bad Request` to the client without exposing that the Semantic Response Guard blocked it.
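
To make the text-extraction step more concrete, here is a hedged Python sketch of pulling the assistant text out of a buffered, possibly gzipped response body. It assumes OpenAI-style payloads (`choices[].message.content` for chat, `choices[].text` for completions). The function name `extract_response_text` is hypothetical, and the plugin's actual parser, which supports more formats, is not shown in this PR.

```python
import gzip
import json


def extract_response_text(body: bytes, content_encoding: str = "") -> str:
    """Return the concatenated text of all choices in an LLM response body.

    ASSUMPTION: the payload follows the OpenAI chat or completions shape.
    """
    # Undo gzip compression if the upstream sent a compressed body.
    if content_encoding.lower() == "gzip":
        body = gzip.decompress(body)

    payload = json.loads(body.decode("utf-8"))

    parts = []
    for choice in payload.get("choices", []):
        message = choice.get("message") or {}
        # Chat responses carry text in message.content; completions in text.
        text = message.get("content") or choice.get("text") or ""
        if text:
            parts.append(text)
    return "\n".join(parts)
```

The extracted string is what would then be embedded and compared against the configured rules.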
@@ -0,0 +1,3 @@
---
content_type: reference
---