Commit 7bba93f

tomek-labuk and Guaris authored
feat(ai-gateway): AI Semantic Response Guard plugin (#2757)

* Add scaffold plugin structure
* Add WIP content and examples
* Change config example
* Fix
* Update AI Gateway landing page
* Add pgvector example
* Fixes
* Icons

Co-authored-by: Angel <[email protected]>

1 parent 44ade39 commit 7bba93f

File tree

9 files changed: +366 additions, -0 deletions

Lines changed: 9 additions & 0 deletions
@@ -0,0 +1,9 @@
{
  "3.12.0.0": [
    {
      "message": "Added new plugin to permit or block prompts based on semantic similarity to known LLM responses, preventing misuse of llm/v1/chat or llm/v1/completions requests",
      "scope": "Plugin",
      "type": "feature"
    }
  ]
}
Lines changed: 89 additions & 0 deletions
@@ -0,0 +1,89 @@
description: Block or allow LLM responses based on semantic similarity to defined rules.

extended_description: |
  The AI Semantic Response Guard plugin analyzes the full response from an LLM service and filters it
  based on semantic similarity to configured allow or deny patterns.

  Deny rules take precedence over allow rules. Responses matching a deny pattern are blocked,
  even if they also match an allow pattern. When allow rules are set, responses that match no
  allow pattern are also blocked.

title: 'Allow and deny responses using pgvector as a vector database'

weight: 900

requirements:
  - "[AI Proxy plugin](/plugins/ai-proxy/) or [AI Proxy Advanced plugin](/plugins/ai-proxy-advanced/) configured with an LLM service."
  - "A [PostgreSQL database with the pgvector extension](https://github.com/pgvector/pgvector) installed and reachable from {{site.base_gateway}}."
  - "Port `5432`, or your custom PostgreSQL port, is open and reachable from {{site.base_gateway}}."

variables:
  header_value:
    value: $OPENAI_API_KEY
    description: Your OpenAI API key
  pgvector_host:
    value: $PGVECTOR_HOST
    description: The host where your pgvector-enabled PostgreSQL instance runs
  pgvector_user:
    value: $PGVECTOR_USER
    description: Database user for pgvector
  pgvector_password:
    value: $PGVECTOR_PASSWORD
    description: Database password for pgvector

config:
  embeddings:
    auth:
      header_name: Authorization
      header_value: Bearer ${header_value}
    model:
      name: text-embedding-3-small
      provider: openai
  search:
    threshold: 0.7
  vectordb:
    strategy: pgvector
    distance_metric: cosine
    threshold: 0.7
    dimensions: 1024
    pgvector:
      host: ${pgvector_host}
      port: 5432
      database: kong-pgvector
      user: ${pgvector_user}
      password: ${pgvector_password}
      ssl: false
      ssl_required: false
      ssl_verify: false
      ssl_version: tlsv1_2
      timeout: 5000
  rules:
    allow_responses:
      - Troubleshooting networks and connectivity issues
      - Managing cloud platforms (AWS, Azure, GCP)
      - Security hardening and incident response strategies
      - DevOps pipelines, automation, and observability
      - Software engineering concepts and language syntax
      - IT governance, compliance, and regulatory guidance
      - Continuous integration and deployment practices
      - Writing documentation and explaining technical concepts
      - Operating system administration and configuration
      - Best practices for collaboration and productivity tools
    deny_responses:
      - Unauthorized penetration testing or hacking tutorials
      - Methods for bypassing software licensing or DRM
      - Step-by-step instructions for exploiting vulnerabilities
      - Techniques to evade or disable security controls
      - Collecting or exposing personal or employee data
      - Using AI for impersonation, phishing, or fraud
      - Manipulative social engineering techniques
      - Advice on breaking internal IT or security policies
      - Entertainment, dating, or other non-work topics
      - Political, religious, or otherwise sensitive discussions unrelated to work

tools:
  - deck
  - admin-api
  - konnect-api
  - kic
  - terraform
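To make the `distance_metric: cosine` and `threshold: 0.7` settings concrete, here is a minimal Python sketch of the comparison a cosine-based vector search performs between a rule embedding and a response embedding. This is an illustration only, not the plugin's implementation, and it assumes the threshold is applied to cosine distance (lower means more similar); consult the plugin reference for the exact semantics.

```python
import math


def cosine_distance(a, b):
    """Cosine distance = 1 - cosine similarity; 0 means identical direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return 1.0 - dot / (norm_a * norm_b)


def matches(rule_embedding, response_embedding, threshold=0.7):
    # A rule matches when the distance is within the configured threshold.
    return cosine_distance(rule_embedding, response_embedding) <= threshold
```

Identical vectors have distance 0 and always match; orthogonal vectors have distance 1.0 and fall outside a 0.7 threshold.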
Lines changed: 75 additions & 0 deletions
@@ -0,0 +1,75 @@
description: Block or allow LLM responses based on semantic similarity to defined rules.

extended_description: |
  The AI Semantic Response Guard plugin analyzes the full response from an LLM service and filters it
  based on semantic similarity to configured allow or deny patterns.

  Deny rules take precedence over allow rules. Responses matching a deny pattern are blocked,
  even if they also match an allow pattern. When allow rules are set, responses that match no
  allow pattern are also blocked.

title: 'Allow and deny responses using Redis as a vector database'

weight: 900

requirements:
  - "[AI Proxy plugin](/plugins/ai-proxy/) or [AI Proxy Advanced plugin](/plugins/ai-proxy-advanced/) configured with an LLM service."
  - "A [Redis](https://redis.io/docs/latest/) instance or another supported vector database."
  - "Port `6379`, or your custom Redis port, is open and reachable from {{site.base_gateway}}."

variables:
  header_value:
    value: $OPENAI_API_KEY
    description: Your OpenAI API key
  redis_host:
    value: $REDIS_HOST
    description: The host where your Redis instance runs

config:
  embeddings:
    auth:
      header_name: Authorization
      header_value: Bearer ${header_value}
    model:
      name: text-embedding-3-small
      provider: openai
  search:
    threshold: 0.7
  vectordb:
    strategy: redis
    distance_metric: cosine
    threshold: 0.7
    dimensions: 1024
    redis:
      host: ${redis_host}
      port: 6379
  rules:
    allow_responses:
      - Network troubleshooting and diagnostics
      - Cloud infrastructure management (AWS, Azure, GCP)
      - Cybersecurity best practices and incident response
      - DevOps workflows and automation
      - Programming concepts and language usage
      - IT policy and compliance guidance
      - Software development lifecycle and CI/CD
      - Documentation writing and technical explanation
      - System administration and configuration
      - Productivity and collaboration tools usage
    deny_responses:
      - Hacking techniques or penetration testing without authorization
      - Bypassing software licensing or digital rights management
      - Instructions on exploiting vulnerabilities or writing malware
      - Circumventing security controls or access restrictions
      - Gathering personal or confidential employee information
      - Using AI to impersonate or phish others
      - Social engineering tactics or manipulation techniques
      - Guidance on violating company IT policies
      - Content unrelated to work, such as entertainment or dating
      - Political, religious, or sensitive non-work-related discussions

tools:
  - deck
  - admin-api
  - konnect-api
  - kic
  - terraform
Lines changed: 62 additions & 0 deletions
@@ -0,0 +1,62 @@
description: Allow only specific LLM responses based on semantic similarity to defined rules.

extended_description: |
  The AI Semantic Response Guard plugin analyzes the full response from an LLM service and permits it
  only if it semantically matches one of the configured allow patterns.

  If a response does not match any of the allow patterns, it is blocked with a 400 Bad Request.

title: 'Allow only responses'

weight: 900

requirements:
  - "[AI Proxy plugin](/plugins/ai-proxy/) or [AI Proxy Advanced plugin](/plugins/ai-proxy-advanced/) configured with an LLM service."
  - "A [Redis](https://redis.io/docs/latest/) instance or another supported vector database."
  - "Port `6379`, or your custom Redis port, is open and reachable from {{site.base_gateway}}."

variables:
  header_value:
    value: $OPENAI_API_KEY
    description: Your OpenAI API key
  redis_host:
    value: $REDIS_HOST
    description: The host where your Redis instance runs

config:
  embeddings:
    auth:
      header_name: Authorization
      header_value: Bearer ${header_value}
    model:
      name: text-embedding-3-small
      provider: openai
  search:
    threshold: 0.7
  vectordb:
    strategy: redis
    distance_metric: cosine
    threshold: 0.7
    dimensions: 1024
    redis:
      host: ${redis_host}
      port: 6379
  rules:
    allow_responses:
      - Network troubleshooting and diagnostics
      - Cloud infrastructure management (AWS, Azure, GCP)
      - Cybersecurity best practices and incident response
      - DevOps workflows and automation
      - Programming concepts and language usage
      - IT policy and compliance guidance
      - Software development lifecycle and CI/CD
      - Documentation writing and technical explanation
      - System administration and configuration
      - Productivity and collaboration tools usage

tools:
  - deck
  - admin-api
  - konnect-api
  - kic
  - terraform
Lines changed: 62 additions & 0 deletions
@@ -0,0 +1,62 @@
description: Block specific LLM responses based on semantic similarity to defined rules.

extended_description: |
  The AI Semantic Response Guard plugin analyzes the full response from an LLM service and blocks it
  if it semantically matches one of the configured deny patterns.

  Responses that do not match any deny pattern are permitted.

title: 'Deny only responses'

weight: 900

requirements:
  - "[AI Proxy plugin](/plugins/ai-proxy/) or [AI Proxy Advanced plugin](/plugins/ai-proxy-advanced/) configured with an LLM service."
  - "A [Redis](https://redis.io/docs/latest/) instance or another supported vector database."
  - "Port `6379`, or your custom Redis port, is open and reachable from {{site.base_gateway}}."

variables:
  header_value:
    value: $OPENAI_API_KEY
    description: Your OpenAI API key
  redis_host:
    value: $REDIS_HOST
    description: The host where your Redis instance runs

config:
  embeddings:
    auth:
      header_name: Authorization
      header_value: Bearer ${header_value}
    model:
      name: text-embedding-3-small
      provider: openai
  search:
    threshold: 0.7
  vectordb:
    strategy: redis
    distance_metric: cosine
    threshold: 0.7
    dimensions: 1024
    redis:
      host: ${redis_host}
      port: 6379
  rules:
    deny_responses:
      - Hacking techniques or penetration testing without authorization
      - Bypassing software licensing or digital rights management
      - Instructions on exploiting vulnerabilities or writing malware
      - Circumventing security controls or access restrictions
      - Gathering personal or confidential employee information
      - Using AI to impersonate or phish others
      - Social engineering tactics or manipulation techniques
      - Guidance on violating company IT policies
      - Content unrelated to work, such as entertainment or dating
      - Political, religious, or sensitive non-work-related discussions

tools:
  - deck
  - admin-api
  - konnect-api
  - kic
  - terraform
Lines changed: 62 additions & 0 deletions
@@ -0,0 +1,62 @@
---
title: 'AI Semantic Response Guard'
name: 'AI Semantic Response Guard'

content_type: plugin
tier: ai_gateway_enterprise

publisher: kong-inc
description: 'Permit or block prompts based on semantic similarity to known LLM responses, preventing misuse of llm/v1/chat or llm/v1/completions requests'

products:
  - gateway
  - ai-gateway

works_on:
  - on-prem
  - konnect

min_version:
  gateway: '3.12'

topologies:
  on_prem:
    - hybrid
    - db-less
    - traditional
  konnect_deployments:
    - hybrid
    - cloud-gateways
    - serverless

icon: ai-semantic-response-guard.png

tags:
  - ai
---

The AI Semantic Response Guard plugin extends the AI Prompt Guard plugin by filtering LLM responses based on semantic similarity to predefined rules. It helps prevent unwanted or unsafe responses when serving `llm/v1/chat`, `llm/v1/completions`, or `llm/v1/embeddings` requests through Kong AI Gateway.

You can use a combination of `allow` and `deny` response rules to maintain integrity and compliance when returning responses from an LLM service.

## How it works

The plugin analyzes the semantic content of the full LLM response before it is returned to the client. The matching behavior is as follows:

* If any `deny_responses` are set and the response matches a pattern in the deny list, the response is blocked with a `400 Bad Request`.
* If any `allow_responses` are set and the response matches one of the allowed patterns, the response is permitted.
* If any `allow_responses` are set but the response matches none of the allowed patterns, the response is blocked with a `400 Bad Request`.
* If both `deny_responses` and `allow_responses` are set, the `deny` condition takes precedence: a response that matches a deny pattern is blocked even if it also matches an allow pattern, and a response that matches no deny pattern must still match an allow pattern to be permitted.
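The rules above can be condensed into a small decision function. The Python sketch below is illustrative only (it is not the plugin's actual implementation); the function name and boolean inputs are hypothetical, chosen just to mirror the precedence described.

```python
def guard_decision(matches_deny, matches_allow, has_deny_rules, has_allow_rules):
    """Illustrative precedence check: deny wins, then allow-list membership."""
    if has_deny_rules and matches_deny:
        return "blocked"      # a deny match always blocks
    if has_allow_rules and not matches_allow:
        return "blocked"      # an allow list is set, but nothing matched
    return "permitted"        # no deny match; allow satisfied or unset
```

For example, a response matching both a deny and an allow pattern is still blocked, while a response matching neither is blocked only when an allow list is configured.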

## Response processing

To enforce these rules, the plugin:

1. **Disables streaming** (`stream=false`) to ensure the full response body is buffered before analysis.
2. **Intercepts the response body** using the `guard-response` filter.
3. **Extracts the response text**, with support for JSON parsing of multiple LLM formats and gzipped content.
4. **Generates embeddings** for the extracted text.
5. **Searches the vector database** (Redis, pgvector, or another supported store) against the configured `allow_responses` or `deny_responses`.
6. **Applies the decision rules** described above.

If a response is blocked, or if a system error occurs during evaluation, the plugin returns a `400 Bad Request` to the client without revealing that the Semantic Response Guard blocked it.
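As an illustration of step 3 above, the sketch below shows how a buffered response body might be decompressed and its text pulled out of `llm/v1/chat`-style and `llm/v1/completions`-style JSON. It is a hypothetical Python rendering of the idea, not the plugin's actual `guard-response` filter code, and the exact set of formats the plugin parses may differ.

```python
import gzip
import json


def extract_response_text(raw, content_encoding=None):
    """Decompress (if gzipped) and pull model text out of a JSON response body."""
    if content_encoding == "gzip":
        raw = gzip.decompress(raw)
    body = json.loads(raw)
    parts = []
    for choice in body.get("choices", []):
        message = choice.get("message", {})
        if "content" in message:            # llm/v1/chat shape
            parts.append(message["content"])
        elif "text" in choice:              # llm/v1/completions shape
            parts.append(choice["text"])
    return "\n".join(parts)
```

The extracted text is what gets embedded and searched against the rule vectors in steps 4 and 5.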
Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
---
content_type: reference
---

app/_landing_pages/ai-gateway.yaml

Lines changed: 4 additions & 0 deletions
@@ -431,6 +431,10 @@ rows:
       - type: plugin
         config:
           slug: ai-aws-guardrails
+  - blocks:
+      - type: plugin
+        config:
+          slug: ai-semantic-response-guard
   - blocks:
       - type: card
         config:
