5 changes: 5 additions & 0 deletions app/_kong_plugins/ai-semantic-response-guard/changelog.md
@@ -0,0 +1,5 @@
---
content_type: reference
---

## Changelog
@@ -0,0 +1,75 @@
description: Block or allow LLM responses based on semantic similarity to defined rules.

extended_description: |
  The AI Semantic Response Guard plugin analyzes the full response from an LLM service and filters it
  based on semantic similarity to configured allow or deny patterns.

  Deny rules take precedence over allow rules: responses matching a deny pattern are blocked,
  even if they also match an allow pattern. When allow rules are set, responses that don't match
  any allow pattern are also blocked.

title: 'Allow and deny responses'

weight: 900

requirements:
- "[AI Proxy plugin](/plugins/ai-proxy/) or [AI Proxy Advanced plugin](/plugins/ai-proxy-advanced/) configured with an LLM service."
- "A [Redis](https://redis.io/docs/latest/) instance or another supported vector database."
- "Port `6379`, or your custom Redis port, is open and reachable from {{site.base_gateway}}."

variables:
  header_value:
    value: $OPENAI_API_KEY
    description: Your OpenAI API key
  redis_host:
    value: $REDIS_HOST
    description: The host where your Redis instance runs

config:
  embeddings:
    auth:
      header_name: Authorization
      header_value: Bearer ${header_value}
    model:
      name: text-embedding-3-small
      provider: openai
  search:
    threshold: 0.7
  vectordb:
    strategy: redis
    distance_metric: cosine
    threshold: 0.7
    dimensions: 1024
    redis:
      host: ${redis_host}
      port: 6379
  rules:
    allow_responses:
      - Network troubleshooting and diagnostics
      - Cloud infrastructure management (AWS, Azure, GCP)
      - Cybersecurity best practices and incident response
      - DevOps workflows and automation
      - Programming concepts and language usage
      - IT policy and compliance guidance
      - Software development lifecycle and CI/CD
      - Documentation writing and technical explanation
      - System administration and configuration
      - Productivity and collaboration tools usage
    deny_responses:
      - Hacking techniques or penetration testing without authorization
      - Bypassing software licensing or digital rights management
      - Instructions on exploiting vulnerabilities or writing malware
      - Circumventing security controls or access restrictions
      - Gathering personal or confidential employee information
      - Using AI to impersonate or phish others
      - Social engineering tactics or manipulation techniques
      - Guidance on violating company IT policies
      - Content unrelated to work, such as entertainment or dating
      - Political, religious, or sensitive non-work-related discussions

tools:
- deck
- admin-api
- konnect-api
- kic
- terraform
@@ -0,0 +1,62 @@
description: Allow only specific LLM responses based on semantic similarity to defined rules.

extended_description: |
  The AI Semantic Response Guard plugin analyzes the full response from an LLM service and permits it
  only if it semantically matches one of the configured allow patterns.

  If a response does not match any of the allow patterns, it is blocked with a 400 Bad Request.

title: 'Allow only responses'

weight: 900

requirements:
- "[AI Proxy plugin](/plugins/ai-proxy/) or [AI Proxy Advanced plugin](/plugins/ai-proxy-advanced/) configured with an LLM service."
- "A [Redis](https://redis.io/docs/latest/) instance or another supported vector database."
- "Port `6379`, or your custom Redis port, is open and reachable from {{site.base_gateway}}."

variables:
  header_value:
    value: $OPENAI_API_KEY
    description: Your OpenAI API key
  redis_host:
    value: $REDIS_HOST
    description: The host where your Redis instance runs

config:
  embeddings:
    auth:
      header_name: Authorization
      header_value: Bearer ${header_value}
    model:
      name: text-embedding-3-small
      provider: openai
  search:
    threshold: 0.7
  vectordb:
    strategy: redis
    distance_metric: cosine
    threshold: 0.7
    dimensions: 1024
    redis:
      host: ${redis_host}
      port: 6379
  rules:
    allow_responses:
      - Network troubleshooting and diagnostics
      - Cloud infrastructure management (AWS, Azure, GCP)
      - Cybersecurity best practices and incident response
      - DevOps workflows and automation
      - Programming concepts and language usage
      - IT policy and compliance guidance
      - Software development lifecycle and CI/CD
      - Documentation writing and technical explanation
      - System administration and configuration
      - Productivity and collaboration tools usage

tools:
- deck
- admin-api
- konnect-api
- kic
- terraform
@@ -0,0 +1,62 @@
description: Block specific LLM responses based on semantic similarity to defined rules.

extended_description: |
  The AI Semantic Response Guard plugin analyzes the full response from an LLM service and blocks it
  if it semantically matches one of the configured deny patterns.

  Responses that do not match any deny pattern are permitted.

title: 'Deny only responses'

weight: 900

requirements:
- "[AI Proxy plugin](/plugins/ai-proxy/) or [AI Proxy Advanced plugin](/plugins/ai-proxy-advanced/) configured with an LLM service."
- "A [Redis](https://redis.io/docs/latest/) instance or another supported vector database."
- "Port `6379`, or your custom Redis port, is open and reachable from {{site.base_gateway}}."

variables:
  header_value:
    value: $OPENAI_API_KEY
    description: Your OpenAI API key
  redis_host:
    value: $REDIS_HOST
    description: The host where your Redis instance runs

config:
  embeddings:
    auth:
      header_name: Authorization
      header_value: Bearer ${header_value}
    model:
      name: text-embedding-3-small
      provider: openai
  search:
    threshold: 0.7
  vectordb:
    strategy: redis
    distance_metric: cosine
    threshold: 0.7
    dimensions: 1024
    redis:
      host: ${redis_host}
      port: 6379
  rules:
    deny_responses:
      - Hacking techniques or penetration testing without authorization
      - Bypassing software licensing or digital rights management
      - Instructions on exploiting vulnerabilities or writing malware
      - Circumventing security controls or access restrictions
      - Gathering personal or confidential employee information
      - Using AI to impersonate or phish others
      - Social engineering tactics or manipulation techniques
      - Guidance on violating company IT policies
      - Content unrelated to work, such as entertainment or dating
      - Political, religious, or sensitive non-work-related discussions

tools:
- deck
- admin-api
- konnect-api
- kic
- terraform
64 changes: 64 additions & 0 deletions app/_kong_plugins/ai-semantic-response-guard/index.md
@@ -0,0 +1,64 @@
---
title: 'AI Semantic Response Guard'
name: 'AI Semantic Response Guard'

content_type: plugin
tier: ai_gateway_enterprise

publisher: kong-inc
description: 'Permit or block LLM responses based on semantic similarity to defined rules, preventing unwanted or unsafe content in llm/v1/chat or llm/v1/completions responses'

products:
- gateway
- ai-gateway

works_on:
- on-prem
- konnect

min_version:
gateway: '3.12'

topologies:
on_prem:
- hybrid
- db-less
- traditional
konnect_deployments:
- hybrid
- cloud-gateways
- serverless

icon: plugin-slug.png # e.g. acme.svg or acme.png

tags:
- ai
---

# AI Semantic Response Guard

The AI Semantic Response Guard plugin extends the AI Semantic Prompt Guard plugin by filtering LLM responses based on semantic similarity to predefined rules. It helps prevent unwanted or unsafe responses when serving `llm/v1/chat`, `llm/v1/completions`, or `llm/v1/embeddings` requests through Kong AI Gateway.

You can use a combination of `allow` and `deny` response rules to maintain integrity and compliance when returning responses from an LLM service.

## How it works

The plugin analyzes the semantic content of the full LLM response before it is returned to the client. The matching behavior is as follows, with a code sketch of the decision logic after the list:

* If any `deny_responses` are set and the response matches a pattern in the deny list, the response is blocked with a `400 Bad Request`.
* If any `allow_responses` are set, but the response matches none of the allowed patterns, the response is also blocked with a `400 Bad Request`.
* If any `allow_responses` are set and the response matches one of the allowed patterns, the response is permitted.
* If both `deny_responses` and `allow_responses` are set, the `deny` condition takes precedence. A response that matches a deny pattern will be blocked, even if it also matches an allow pattern. If the response does not match any deny pattern, it must still match an allow pattern to be permitted.
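
To make the precedence concrete, here is a minimal sketch of the decision logic in Python. It is illustrative only, not the plugin's implementation; the `matches` callable is a hypothetical stand-in for the plugin's embedding-based similarity check against the configured threshold.

```python
def is_blocked(response_text, allow_responses, deny_responses, matches):
    """Illustrative sketch of the allow/deny precedence described above.

    `matches(text, rule)` stands in for the plugin's semantic similarity
    check (embedding generation plus vector search against the threshold).
    """
    # Deny rules take precedence: any deny match blocks the response.
    if deny_responses and any(matches(response_text, rule) for rule in deny_responses):
        return True  # blocked with 400 Bad Request

    # When allow rules are set, the response must match at least one of them.
    if allow_responses and not any(matches(response_text, rule) for rule in allow_responses):
        return True  # blocked with 400 Bad Request

    # No deny match, and either no allow rules or an allow match: permitted.
    return False
```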

## Response processing

To enforce these rules, the plugin performs the following steps, sketched in code after the list:

1. **Disables streaming** (`stream=false`) to ensure the full response body is buffered before analysis.
2. **Intercepts the response body** using the `guard-response` filter.
3. **Extracts response text**, supporting JSON parsing of multiple LLM formats and gzipped content.
4. **Generates embeddings** for the extracted text.
5. **Searches the vector database** (Redis, Pgvector, or other) against configured `allow_responses` or `deny_responses`.
6. **Applies the decision rules** described above.
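
The sketch below walks through those steps in simplified form. It is illustrative only: `embed` and `vectordb_search` are hypothetical stand-ins for the configured embeddings provider and vector database, and the JSON parsing assumes an OpenAI-style chat payload, whereas the real `guard-response` filter supports multiple LLM formats.

```python
import gzip
import json


def guard_response(raw_body, headers, embed, vectordb_search, has_allow_rules):
    """Illustrative sketch of the response-processing steps listed above."""
    # 1. Streaming is disabled upstream (stream=false), so raw_body holds the full response.

    # 2-3. Decompress gzipped content and extract the response text from the LLM JSON.
    if headers.get("Content-Encoding") == "gzip":
        raw_body = gzip.decompress(raw_body)
    payload = json.loads(raw_body)
    text = payload["choices"][0]["message"]["content"]

    # 4. Generate an embedding for the extracted text.
    vector = embed(text)

    # 5. Search the vector database against the configured deny and allow rules.
    deny_match = vectordb_search(vector, ruleset="deny_responses")
    allow_match = vectordb_search(vector, ruleset="allow_responses")

    # 6. Apply the decision rules: deny takes precedence, then allow (if configured).
    if deny_match or (has_allow_rules and not allow_match):
        return 400  # blocked; the client sees a generic 400 Bad Request
    return payload  # permitted
```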

If a response is blocked, or if a system error occurs during evaluation, the plugin returns a `400 Bad Request` to the client without revealing that the AI Semantic Response Guard plugin blocked it.
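
From the client's side, a blocked response is therefore indistinguishable from any other `400 Bad Request`. The snippet below is a hypothetical client-side check; the proxy address, the `/chat` route, and the OpenAI-style request body are assumptions for illustration.

```python
import requests

# Hypothetical route served by AI Proxy with AI Semantic Response Guard enabled.
KONG_PROXY = "http://localhost:8000"
ROUTE_PATH = "/chat"  # assumption; substitute your own route

resp = requests.post(
    f"{KONG_PROXY}{ROUTE_PATH}",
    json={
        "messages": [
            {"role": "user", "content": "How do I troubleshoot a flapping BGP session?"}
        ]
    },
    timeout=60,
)

if resp.status_code == 400:
    # The gateway blocked the response (or evaluation failed); the reason is not disclosed.
    print("Response was blocked by the gateway")
else:
    resp.raise_for_status()
    print(resp.json()["choices"][0]["message"]["content"])
```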
3 changes: 3 additions & 0 deletions app/_kong_plugins/ai-semantic-response-guard/reference.md
@@ -0,0 +1,3 @@
---
content_type: reference
---
4 changes: 4 additions & 0 deletions app/_landing_pages/ai-gateway.yaml
@@ -431,6 +431,10 @@ rows:
- type: plugin
config:
slug: ai-aws-guardrails
- blocks:
- type: plugin
config:
slug: ai-semantic-response-guard
- blocks:
- type: card
config: