Commit 3f10966

Add unstable supervision tests
1 parent 2dc52a0 commit 3f10966

2 files changed, with 28 additions and 11 deletions

tests/core/common/engines/alpha/steps/events.py

Lines changed: 4 additions & 4 deletions
@@ -240,8 +240,8 @@ def then_the_message_contains(

     assert context.sync_await(
         nlp_test(
-            context=f"Here's a message in the context of a conversation: {message}",
-            condition=f"the text contains {something}",
+            context=f"Here's a message from an AI agent to a customer, in the context of a conversation: {message}",
+            condition=f"The message contains {something}",
         )
     ), f"message: '{message}', expected to contain: '{something}'"


@@ -257,8 +257,8 @@ def then_the_message_mentions(

     assert context.sync_await(
         nlp_test(
-            context=f"Here's a message in the context of a conversation: {message}",
-            condition=f"the text mentions {something}",
+            context=f"Here's a message from an AI agent to a customer, in the context of a conversation: {message}",
+            condition=f"The message mentions {something}",
         )
     ), f"message: '{message}', expected to contain: '{something}'"

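
The two hunks above change only the prompt strings passed to `nlp_test`. As a minimal sketch of how those strings are assembled, the snippet below builds them with a hypothetical helper, `build_nlp_test_args`, and evaluates them with `stub_nlp_test`, a naive substring-matching stand-in. Both names are assumptions made for illustration; the real `nlp_test` is not shown in this commit and may behave quite differently.

```python
import asyncio


def build_nlp_test_args(message: str, something: str, verb: str = "contains") -> dict:
    """Build the context/condition strings in the wording introduced by this commit.

    Hypothetical helper: the steps in the diff inline these f-strings directly.
    """
    return {
        "context": (
            "Here's a message from an AI agent to a customer, "
            f"in the context of a conversation: {message}"
        ),
        "condition": f"The message {verb} {something}",
    }


async def stub_nlp_test(context: str, condition: str) -> bool:
    """Assumed stand-in for nlp_test: a case-insensitive substring check."""
    # Everything after the first ": " in the context is the agent message.
    message = context.split(": ", 1)[1]
    # The condition reads "The message <verb> <something>"; keep <something>.
    target = condition.split(" ", 3)[3]
    return target.lower() in message.lower()


args = build_nlp_test_args("Hello! How can I help?", "hello")
result = asyncio.run(stub_nlp_test(**args))
print(args["condition"], "->", result)
```

A substring check is only a placeholder here. Conditions such as "no discount offers" in the feature file need a judgment about meaning, which a stub like this cannot make.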

tests/core/unstable/engines/alpha/features/baseline/supervision.feature

Lines changed: 24 additions & 7 deletions
@@ -22,19 +22,36 @@ Feature: Supervision
         And the message contains "hello" as the first word
         And the message contains a recommendation for turpolance soup, also known as carrots and sweet potato soup

-    Scenario: The agent does not offer information it's not given
+
+
+    Scenario: Preference for customer request over guideline account_related_questions
+        Given a guideline "discount_for_frustration" to offer a 20 percent discount when the customer expresses frustration
+        And a customer message, "I'm not interested in any of your products, let alone your discounts. You are doing an awful job."
+        And that the "discount_for_frustration" guideline is proposed with a priority of 10 because "The customer is displeased with our service, and expresses frustration"
+        When messages are emitted
+        Then a single message event is emitted
+        And the message contains no discount offers.
+
+    Scenario: The agent does not offer information it's not given (1)
         Given the alpha engine
-        And an agent whose job is to serve the bank's customers
+        And an agent whose job is to serve the bank's clients
         And a customer message, "Hey, how can I schedule an appointment?"
         When processing is triggered
         Then a single message event is emitted
         And the message contains no instructions for how to schedule an appointment
         And the message mentions that the agent doesn't know or can't help with this

+    Scenario: The agent does not offer information it's not given (2)
+        Given an agent whose job is to serve the insurance company's clients
+        And a customer message, "How long is a normal consultation appointment?"
+        When messages are emitted
+        Then a single message event is emitted
+        And the message mentions only that there's not enough information or that there's no knowledge of that

-    Scenario: Preference for customer request over guideline account_related_questions
-        Given a guideline "discount_for_frustration" to offer a 20 percent discount when the customer expresses frustration
-        And a customer message, "I'm not interested in any of your products, let alone your discounts. You are doing an awful job."
-        And that the "discount_for_frustration" guideline is proposed with a priority of 10 because "The customer is displeased with our service, and expresses frustration"
+    Scenario: The agent does not offer information it's not given (3)
+        Given an agent whose job is to serve the bank's clients
+        And a customer message, "limits"
         When messages are emitted
-        Then the message contains no discount offers.
+        Then a single message event is emitted
+        And the message contains no specific information on limits of any kind
+        And the message contains no suggestive examples of what the could have been meant
