Client refactor #220

declark1 · 2024-10-01T21:19:14Z

This PR is a consolidation of several changes related to clients.

- Refactors clients, including splitting detector implementations into standalone clients; see Refactor clients to better align with requirements #217 for additional details.
  - TextContextChatDetectorClient is a placeholder and not fully implemented yet
- Adds OpenAiClient with chat completions and completions support
- Adds GenerationProvider::OpenAi variant
- Adds ChatGenerationConfig (OrchestratorConfig.chat_generation field)
- Adds DetectorType (DetectorConfig.type field)
- Requires that only the hostname (as opposed to url) is set for client services (ServiceConfig.hostname field) and adds hostname validation
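
For readers unfamiliar with the hostname-only requirement, here is a minimal sketch of what such a check might look like. It assumes RFC 1123 rules (labels of letters, digits, and hyphens, at most 63 characters each and 253 in total). The function name is_valid_hostname() appears in this PR's commit titles, but the body below is an illustration, not the PR's implementation; base_url() and its parameters are hypothetical and only show how a scheme could be inferred from whether TLS is configured.

```rust
/// Returns true if `hostname` looks like a valid RFC 1123 hostname.
/// Illustrative only; the PR's actual validation rules may differ.
pub fn is_valid_hostname(hostname: &str) -> bool {
    // A hostname must be non-empty and at most 253 characters long.
    if hostname.is_empty() || hostname.len() > 253 {
        return false;
    }
    // Each dot-separated label must be 1 to 63 characters of ASCII
    // letters, digits, or hyphens, and must not start or end with a hyphen.
    hostname.split('.').all(|label| {
        !label.is_empty()
            && label.len() <= 63
            && !label.starts_with('-')
            && !label.ends_with('-')
            && label.bytes().all(|b| b.is_ascii_alphanumeric() || b == b'-')
    })
}

/// Hypothetical helper: builds a base URL from a bare hostname,
/// choosing the scheme from whether TLS is enabled for the service.
pub fn base_url(hostname: &str, port: u16, tls_enabled: bool) -> Option<String> {
    if !is_valid_hostname(hostname) {
        return None;
    }
    let scheme = if tls_enabled { "https" } else { "http" };
    Some(format!("{scheme}://{hostname}:{port}"))
}
```

Under these rules a value such as http://localhost or localhost:8080 is rejected, because the scheme and port characters are not valid in a hostname label.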

Health check related tweaks / simplifications

The /info response structure is changed as clients are no longer grouped by type. Example of new format with all changes:

{
    "services": {
        "generation": {
            "status": "HEALTHY"
        },
        "chunker1": {
            "status": "HEALTHY"
        },
        "detector1": {
            "status": "UNKNOWN",
            "code": 404,
            "reason": "Not Found"
        },
        "detector2": {
            "status": "UNKNOWN",
            "code": 404,
            "reason": "Not Found"
        }
    }
}
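
The flattened shape above can be modelled as a single map keyed by service name. The following is a rough sketch under stated assumptions: the type names HealthStatus and HealthCheckResult, the Unhealthy variant, and the render_info() helper are all hypothetical, and the JSON is rendered by hand (without string escaping) only to keep the example free of dependencies. The real code derives serde's Serialize instead.

```rust
use std::collections::BTreeMap;
use std::fmt;

/// Hypothetical status enum; only HEALTHY and UNKNOWN appear in the example.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum HealthStatus {
    Healthy,
    Unhealthy, // assumed variant, not shown in the example response
    Unknown,
}

impl fmt::Display for HealthStatus {
    // Mirrors the upper-case rendering seen in the /info example.
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str(match self {
            HealthStatus::Healthy => "HEALTHY",
            HealthStatus::Unhealthy => "UNHEALTHY",
            HealthStatus::Unknown => "UNKNOWN",
        })
    }
}

/// Hypothetical per-service result; `code` and `reason` are omitted when absent.
#[derive(Debug, Clone, PartialEq)]
pub struct HealthCheckResult {
    pub status: HealthStatus,
    pub code: Option<u16>,
    pub reason: Option<String>,
}

/// Renders one flat "services" object, with no grouping by client type.
/// Note: no JSON escaping is performed; this is a sketch only.
pub fn render_info(services: &BTreeMap<String, HealthCheckResult>) -> String {
    let entries: Vec<String> = services
        .iter()
        .map(|(name, result)| {
            let mut fields = vec![format!("\"status\": \"{}\"", result.status)];
            if let Some(code) = result.code {
                fields.push(format!("\"code\": {code}"));
            }
            if let Some(reason) = &result.reason {
                fields.push(format!("\"reason\": \"{reason}\""));
            }
            format!("\"{name}\": {{{}}}", fields.join(", "))
        })
        .collect();
    format!("{{\"services\": {{{}}}}}", entries.join(", "))
}
```

Because every service lives in the same map, a consumer can look up a service by name without knowing whether it is a generation, chunker, or detector client.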

Moved clients::Error and HttpClient and related items out of clients.rs to separate files

Closes #217
Closes #165
Closes #142
Closes #191
Closes #194
Closes #198

Signed-off-by: declark1 <[email protected]>

pscoro

Overall LGTM. I had one idea for a potential improvement that I was digging into on Friday. I think it may be over-engineered, but I'll mention it anyway:

Every client implementor seems to have exactly one "inner client" that's either a tonic or HTTP client. The tonic gRPC clients need to be cloned in every handler function, while the HTTP ones are accessed by reference. We could potentially add an associated Inner type and a client() method to the Client trait to access it. For clients with an HTTP inner client, client() would return &HttpClient; for tonic clients, it would return a clone of the client, e.g. NlpServiceClient<LoadBalancedChannel>. This on its own doesn't sound too bad, but the catch is that it introduces the need for AnyClient, because the ClientMap:

pub struct ClientMap(HashMap<String, Box<dyn Client>>);

will now need the client to specify the Inner type (e.g. Client<Inner = &HttpClient>), so instead we can use something like:

pub struct ClientMap(HashMap<String, Box<dyn AnyClient>>);

Setting up this type-erased, blanket-implemented trait is doable, and there may be more use cases for it in the future, but for now I think it may just add more confusing code to the project without adding much value. The only real benefit for now is that any client just needs to call client() to get its inner client, and in the case of gRPC this means we could get rid of the clone line that's at the top of every handler.
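
To make the trade-off concrete, here is a minimal sketch of the idea, not code from this PR. HttpClient and GrpcClient are hypothetical stand-ins for the real inner clients (the latter for a tonic client such as NlpServiceClient<LoadBalancedChannel>), and DetectorClient, NlpClient, as_any(), and get_as() are illustrative names. It assumes a generic associated type so that Inner can borrow from the client, which is what makes Client unusable as a trait object and motivates AnyClient.

```rust
use std::any::Any;
use std::collections::HashMap;

// Hypothetical stand-ins for the real inner clients.
#[derive(Debug, Clone, PartialEq)]
pub struct HttpClient { pub base_url: String }
#[derive(Debug, Clone, PartialEq)]
pub struct GrpcClient { pub endpoint: String }

/// Typed trait: each client names the type of its inner client.
/// The generic associated type lets `Inner` be a borrow or an owned clone.
pub trait Client: Any {
    type Inner<'a> where Self: 'a;
    fn client<'a>(&'a self) -> Self::Inner<'a>;
}

/// Type-erased companion trait, so heterogeneous clients fit in one map.
pub trait AnyClient {
    fn as_any(&self) -> &dyn Any;
}

// Blanket implementation: every `Client` is automatically an `AnyClient`.
impl<T: Client> AnyClient for T {
    fn as_any(&self) -> &dyn Any { self }
}

pub struct DetectorClient { pub http: HttpClient }
impl Client for DetectorClient {
    // HTTP inner clients are handed out by reference.
    type Inner<'a> = &'a HttpClient where Self: 'a;
    fn client<'a>(&'a self) -> Self::Inner<'a> { &self.http }
}

pub struct NlpClient { pub grpc: GrpcClient }
impl Client for NlpClient {
    // gRPC inner clients are handed out as clones.
    type Inner<'a> = GrpcClient where Self: 'a;
    fn client<'a>(&'a self) -> Self::Inner<'a> { self.grpc.clone() }
}

#[derive(Default)]
pub struct ClientMap(HashMap<String, Box<dyn AnyClient>>);

impl ClientMap {
    pub fn insert<C: Client>(&mut self, key: impl Into<String>, client: C) {
        self.0.insert(key.into(), Box::new(client));
    }

    /// Looks up a client by key and downcasts it to the concrete type `C`.
    pub fn get_as<C: Client>(&self, key: &str) -> Option<&C> {
        let boxed = self.0.get(key)?;
        boxed.as_any().downcast_ref::<C>()
    }
}
```

With this in place a handler would call get_as::<NlpClient>(key) and then client(), instead of cloning the inner client by hand. The cost is the extra trait and the downcast, which matches the concern above about added complexity for limited benefit.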

src/clients/openai.rs

gkumbhat

Nice refactor Dan. Left a few comments, suggestions and questions

docs/api/orchestrator_openapi_0_1_0.yaml

docs/architecture/adrs/006-detector-type.md

src/clients/generation.rs

src/config.rs

gkumbhat · 2024-10-07T21:05:15Z

src/health.rs

-
 /// Health status determined for or returned by a client service.
 #[derive(Debug, Clone, PartialEq, Serialize, Deserialize)]
+#[serde(rename_all = "UPPERCASE")]


gkumbhat · 2024-10-07T21:09:38Z

src/orchestrator.rs

-                panic!("Unexpected error during client health probing: {}", e);
-            });
+            info!("Probing client health...");
+            let client_health = self.client_health(true).await;


Should this be done in a tokio task so that it doesn't block startup? (We can address it in the future, but I just wanted to make a note here.)

I assume Orchestrator::on_start_up() tasks, i.e. checks, should "complete" prior to startup, but I don't really have an opinion on this. @pscoro any thoughts on this?

On one hand, it's good to check that everything is wired up correctly. On the other hand, if the connections and this verification take time, it will halt orchestrator bootup and make it look like the orchestrator is having an issue.

So I think we should make this async and emit error-level logs if this startup check fails for some downstreams.

(can be handled in separate PR though)

I do recall that we wanted to leverage health checks as an extension of config validation, so that we catch misconfigured services at start-up. I believe this was the original intention of running probes at start-up.

Yep, that's accurate, but then we realized that this has an unwanted side effect: it makes the orchestrator's startup fail when even one of the downstreams is not responding. So it looks like the orchestrator is having an issue (from looking at the deployment), but the real problem is one of the detectors.

So the idea pivoted to the orchestrator doing this check, but instead of blocking its own startup / boot, it exposes this information via an API call.

> I assume Orchestrator::on_start_up() tasks, i.e. checks, should "complete" prior to startup, but I don't really have an opinion on this. @pscoro any thoughts on this?

This brings up a good point: the naming of that function may carry some legacy from when we wanted the orchestrator to fatally error if the start-up health check failed. Now that we have decided that the orchestrator does not treat its clients as dependencies, there is no behaviour currently in that function that needs to be checked synchronously before really starting the orchestrator.

I think on_start_up() should imply behaviour executed before start-up; maybe in the future this could include some tasks that need to run synchronously. I think what we should do is run the call to client_health in a new tokio task that's started inside on_start_up(). Thoughts on this?

> I do recall that we wanted to leverage health checks as an extension of config validation, so that we catch misconfigured services at start-up. I believe this was the original intention of running probes at start-up.

This would still catch misconfig errors at start-up time. Since we have already decided that the state of clients (as they are known via their configs) does not affect the liveness of the orchestrator, and failures are only reported as non-fatal errors, I don't think there is functionally any difference between:

- starting an async health check, then starting the orchestrator and receiving requests, then handling and reporting the response of the health check without crashing on failure, and
- starting a sync health check, waiting for the response, then handling and reporting the response of the health check without crashing on failure, and then starting the orchestrator
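
The non-blocking option discussed in this thread can be sketched roughly as follows. This is an illustration under stated assumptions, not the orchestrator's code: to stay dependency-free it uses std::thread::spawn where the real orchestrator would use tokio::spawn, and client_health() here is a hypothetical stand-in that returns canned results instead of probing services.

```rust
use std::collections::HashMap;
use std::thread::{self, JoinHandle};

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum HealthStatus {
    Healthy,
    Unknown,
}

/// Hypothetical stand-in for the real probe, which would perform network I/O.
fn client_health() -> HashMap<String, HealthStatus> {
    HashMap::from([
        ("generation".to_string(), HealthStatus::Healthy),
        ("detector1".to_string(), HealthStatus::Unknown),
    ])
}

/// Starts the health probe in the background and returns immediately,
/// so start-up is not blocked by slow or unreachable client services.
pub fn on_start_up() -> JoinHandle<HashMap<String, HealthStatus>> {
    thread::spawn(|| {
        let health = client_health();
        for (name, status) in &health {
            // Failures are reported but never treated as fatal.
            if *status != HealthStatus::Healthy {
                eprintln!("client {name} is not healthy: {status:?}");
            }
        }
        health
    })
}
```

A caller can ignore the returned handle and begin serving requests right away, or join it later to read the results. Either way an unhealthy client only produces a log line, which is the behaviour both options in the comparison above share.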

src/orchestrator.rs

mdevino

Changes look good to me!
Having detectors in separate files makes it much easier to navigate the code.

I left a few comments around breaking down large functions to start a discussion, but I'd suggest leaving these for a separate PR (assuming we agree on these changes).

src/clients/detector/text_context_doc.rs

src/clients/http.rs

src/config.rs

gkumbhat

Looks good to me! Thanks for all the updates and aligning clients design with goals.

declark1 changed the title ~~Client refactor (DRAFT)~~ Client refactor Oct 2, 2024

declark1 force-pushed the client-refactor branch from adbc8f8 to 2d7935e Compare October 2, 2024 22:21

declark1 mentioned this pull request Oct 3, 2024

Add detector type to config #205

Closed

2 tasks

declark1 and others added 3 commits October 3, 2024 11:37

Add initial client refactor code (wip)

bb77f61

Signed-off-by: declark1 <[email protected]>

Add is_valid_hostname(), update hostname validation, infer protocol f…

3861b12

…rom tls config Co-authored-by: Mateus Devino <[email protected]> Signed-off-by: declark1 <[email protected]>

Add detector type ADR

3b9d4c3

Signed-off-by: Mateus Devino <[email protected]>

declark1 force-pushed the client-refactor branch from 3c12706 to 3e2bb07 Compare October 3, 2024 18:44

Rebase and add header passthrough to detectors, update tests

9495057

Signed-off-by: declark1 <[email protected]>

declark1 force-pushed the client-refactor branch from 3e2bb07 to 9495057 Compare October 3, 2024 18:49

Apply health check related tweaks

c27e37a

Signed-off-by: declark1 <[email protected]>

declark1 force-pushed the client-refactor branch from a086d5b to c27e37a Compare October 4, 2024 18:23

declark1 marked this pull request as ready for review October 4, 2024 18:59

declark1 requested review from gkumbhat and evaline-ju as code owners October 4, 2024 18:59

declark1 requested review from mdevino and pscoro October 4, 2024 19:00

Update openapi spec

ad1a562

Signed-off-by: declark1 <[email protected]>

declark1 force-pushed the client-refactor branch from 5f530a9 to ad1a562 Compare October 4, 2024 19:25

pscoro approved these changes Oct 7, 2024

evaline-ju reviewed Oct 7, 2024

src/clients/openai.rs (resolved)

Updates to align OpenAI Chat Completions with current spec and includ…

9a615ef

…e OpenAI-specific items, drop Completions API. Signed-off-by: declark1 <[email protected]>

declark1 force-pushed the client-refactor branch from 6816dd9 to 9a615ef Compare October 7, 2024 20:04

gkumbhat reviewed Oct 7, 2024

declark1 and others added 6 commits October 7, 2024 14:38

Update docs/architecture/adrs/006-detector-type.md

c19783b

Co-authored-by: Gaurav Kumbhat <[email protected]> Signed-off-by: Dan Clark <[email protected]>

Update docs/architecture/adrs/006-detector-type.md

975ee69

Co-authored-by: Gaurav Kumbhat <[email protected]> Signed-off-by: Dan Clark <[email protected]>

Update docs/architecture/adrs/006-detector-type.md

046e285

Co-authored-by: Gaurav Kumbhat <[email protected]> Signed-off-by: Dan Clark <[email protected]>

Update docs/architecture/adrs/006-detector-type.md

42d431b

Co-authored-by: Gaurav Kumbhat <[email protected]> Signed-off-by: Dan Clark <[email protected]>

Rename TextContextChatDetector to TextChatDetector, update DetectorTy…

4dd61fd

…pe and example config Signed-off-by: declark1 <[email protected]>

Drop provider from ChatGenerationConfig as it will always useopenai, …

e7b6226

…drop GenerationProvider::OpenAi variant Signed-off-by: declark1 <[email protected]>

mdevino approved these changes Oct 8, 2024

src/clients/detector/text_context_doc.rs (resolved)

src/clients/http.rs (resolved)

src/config.rs (outdated, resolved)

declark1 added 2 commits October 8, 2024 09:48

Split config validation rules into methods

355de7d

Signed-off-by: declark1 <[email protected]>

Add health_service to DetectorConfig and ChatGenerationConfig, add he…

ee1e782

…alth_client to detector clients and OpenAiClient Co-authored-by: Paul Scoropan <[email protected]> Signed-off-by: declark1 <[email protected]>

pscoro mentioned this pull request Oct 8, 2024

Client optional health port configuration #224

Closed

Move inner client creation back to client constructor

428c806

Signed-off-by: declark1 <[email protected]>

mdevino mentioned this pull request Oct 10, 2024

Add option to whether or not a detector implements a health endpoint #206

Closed

2 tasks

gkumbhat approved these changes Oct 10, 2024

gkumbhat merged commit b78edf0 into foundation-model-stack:main Oct 10, 2024
2 checks passed

gkumbhat deleted the client-refactor branch October 10, 2024 21:14

evaline-ju mentioned this pull request Oct 14, 2024

whole doc chunker not working with text_contents detectors #228

Closed
