-
-
Notifications
You must be signed in to change notification settings - Fork 4.1k
Closed
Labels
Description
The Feature
https://docs.aws.amazon.com/bedrock/latest/userguide/latency-optimized-inference.html
Motivation, pitch
This feature decreases the latency of a couple of models:
- Anthropic Claude 3.5 Haiku | us.anthropic.claude-3-5-haiku-20241022-v1:0 | US East (Ohio)
- Meta Llama 3.1 70B Instruct | us.meta.llama3-1-70b-instruct-v1:0 | US East (Ohio)
- Llama 3.1 405B Instruct
Are you a ML Ops Team?
No
Twitter / LinkedIn details
No response