Description
I have a stream that is receiving about 50 messages per second per shard, and Benthos processes them fast enough that it makes more than 5 GetRecords calls per second per shard (the AWS per-shard rate limit). Despite what the Amazon documentation says, the AWS Go SDK's GetRecords function does not return a ProvisionedThroughputExceededException (or any error) when the API rate limit is hit; it simply returns zero records. The issue is therefore not obvious from the client side, but it can be seen in the CloudWatch metrics.
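For illustration, here is a minimal probe (assuming the AWS SDK for Go v2; `probeShard` is a hypothetical helper, and `client` and `iterator` are assumed to exist) showing why the throttling is invisible client-side:

```go
package main

import (
	"context"
	"fmt"

	"github.com/aws/aws-sdk-go-v2/service/kinesis"
)

// probeShard illustrates the ambiguity: under per-shard throttling,
// GetRecords can return err == nil with an empty Records slice, which
// looks exactly like an idle shard.
func probeShard(ctx context.Context, client *kinesis.Client, iterator *string) error {
	out, err := client.GetRecords(ctx, &kinesis.GetRecordsInput{
		ShardIterator: iterator,
	})
	if err != nil {
		// Per the docs, a ProvisionedThroughputExceededException should
		// surface here when throttled -- in practice it often does not.
		return err
	}
	if len(out.Records) == 0 {
		// Empty shard or throttled? Indistinguishable from here; only the
		// ReadProvisionedThroughputExceeded CloudWatch metric tells you.
		fmt.Println("no records returned")
	}
	return nil
}
```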
I have tried various permutations of batch periods and sleep durations in the aws_kinesis batch policy, but I can't find a combination that consistently avoids hitting the rate limit while maintaining throughput.
I believe the aws_kinesis input already backs off when GetRecords returns zero messages, but in this scenario the rate limit has already been hit by that point.
One potential way to handle this is to add a configurable delay before each GetRecords call, defaulting to no delay since it is not always necessary. Typically, adding a 200-250 ms delay should ensure that fewer than 5 GetRecords API calls are made per second for a shard, assuming we are the only consumer. This would increase latency slightly but should not impact throughput.
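As a rough sketch of what I mean, again assuming the AWS SDK for Go v2 and a hypothetical `getRecordsDelay` option (default 0, i.e. today's behavior), not the actual input implementation:

```go
package main

import (
	"context"
	"time"

	"github.com/aws/aws-sdk-go-v2/service/kinesis"
)

// pollShard sleeps for getRecordsDelay before each GetRecords call so
// that a single consumer stays under the 5 calls/second/shard API limit.
// A delay of 200-250 ms yields at most 4-5 calls per second.
func pollShard(ctx context.Context, client *kinesis.Client, iterator *string, getRecordsDelay time.Duration) error {
	for iterator != nil {
		if getRecordsDelay > 0 {
			select {
			case <-time.After(getRecordsDelay):
			case <-ctx.Done():
				return ctx.Err()
			}
		}
		out, err := client.GetRecords(ctx, &kinesis.GetRecordsInput{
			ShardIterator: iterator,
		})
		if err != nil {
			return err
		}
		// ... hand out.Records to the pipeline here ...
		_ = out.Records
		iterator = out.NextShardIterator
	}
	return nil
}
```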