In the current solution, we use a heap only to select the right process
and fall back to linear search to select a member within that process.
As a result, use cases where many threads run within the same process
can see slow assignment. The number of threads in a process should not
scale arbitrarily (our benchmark assumption of 50 threads in a single
process already seems quite extreme), but we can still optimize for
this case to reduce the runtime further.
Other assignment algorithms assign directly on the member-level, but we
cannot do this in Kafka Streams, since we cannot assign tasks to
processes that already own the task. Defining a heap directly on members
would mean that we may have to skip dozens of members before finding
one that belongs to a process that does not yet own the task.
Instead, we can define a separate heap for each process, which keeps the
members of the process ordered by load. Such a heap stays valid only as
long as we change the load of the top-most member alone (which is what we
usually do). This means we keep track of a lot of heaps, but since heaps
are backed by arrays in Java, this should not result in extreme memory
inefficiencies.
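The two-level selection described above can be sketched roughly as follows. This is an illustration only, not the code in StickyTaskAssignor: the class names `TwoLevelHeapSketch`, `Process`, and `Member`, the `assign` helper, and the choice of "number of assigned tasks" as the load metric are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashSet;
import java.util.List;
import java.util.PriorityQueue;
import java.util.Set;

final class TwoLevelHeapSketch {

    /** Hypothetical member (thread) of a process; load = number of assigned tasks. */
    static final class Member {
        final String id;
        int load;
        Member(String id) { this.id = id; }
    }

    /** Hypothetical process that keeps its own heap of members ordered by load. */
    static final class Process {
        final String id;
        final Set<String> tasks = new HashSet<>();
        final PriorityQueue<Member> members = new PriorityQueue<>(
            Comparator.comparingInt((Member m) -> m.load).thenComparing(m -> m.id));
        int load;
        Process(String id, int memberCount) {
            this.id = id;
            for (int i = 0; i < memberCount; i++) {
                members.add(new Member(id + "-" + i));
            }
        }
    }

    /** Outer heap: processes ordered by load (ties broken by id). */
    static PriorityQueue<Process> newProcessHeap() {
        return new PriorityQueue<>(
            Comparator.comparingInt((Process p) -> p.load).thenComparing(p -> p.id));
    }

    /**
     * Assigns the task to the least-loaded member of the least-loaded process
     * that does not yet own the task. Returns null if every process owns it.
     */
    static Member assign(PriorityQueue<Process> processes, String task) {
        List<Process> skipped = new ArrayList<>();
        Process chosen = null;
        while (!processes.isEmpty()) {
            Process p = processes.poll();
            if (!p.tasks.contains(task)) {
                chosen = p;
                break;
            }
            skipped.add(p); // already owns the task, put back later
        }
        Member member = null;
        if (chosen != null) {
            // Only the top-most member is touched: remove it, change its load,
            // and re-insert it, so the per-process heap stays consistent.
            member = chosen.members.poll();
            member.load++;
            chosen.members.add(member);
            chosen.load++;
            chosen.tasks.add(task);
            processes.add(chosen);
        }
        processes.addAll(skipped);
        return member;
    }
}
```

Skipping happens at the process level only, so the number of skipped entries is bounded by the number of processes that already own the task rather than by the number of their members.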
In our worst-performing benchmark, this improves the runtime by ~2x on
top of the optimization above.
Also piggybacked are some minor optimizations / clean-ups:
- initialize HashMaps and ArrayLists with the right capacity
- fix some comments
- improve logging output
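As a rough illustration of the capacity clean-up, the sketch below pre-sizes a HashMap and an ArrayList for a known number of entries. The class `CapacitySketch` and its helpers are hypothetical and not taken from this commit; the sizing rule relies on HashMap's default load factor of 0.75.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class CapacitySketch {

    /**
     * Smallest initial capacity that lets a HashMap hold expectedSize entries
     * without resizing, assuming the default load factor of 0.75.
     */
    static int hashMapCapacity(int expectedSize) {
        return (int) Math.ceil(expectedSize / 0.75);
    }

    /** Builds a member-to-load map without intermediate rehashing. */
    static Map<String, Integer> initialLoads(List<String> memberIds) {
        Map<String, Integer> loads = new HashMap<>(hashMapCapacity(memberIds.size()));
        for (String id : memberIds) {
            loads.put(id, 0);
        }
        return loads;
    }

    /** Copies the ids into a list whose backing array is allocated once. */
    static List<String> copyIds(List<String> memberIds) {
        List<String> copy = new ArrayList<>(memberIds.size());
        copy.addAll(memberIds);
        return copy;
    }
}
```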
Note that this is a pure performance change, so there are no changes to
the unit tests.
Reviewers: Bill Bejeck
group-coordinator/src/main/java/org/apache/kafka/coordinator/group/streams/assignor/AssignmentMemberSpec.java
3 additions & 3 deletions
@@ -27,9 +27,9 @@
  *
  * @param instanceId The instance ID if provided.
  * @param rackId The rack ID if provided.
- * @param activeTasks Reconciled active tasks
- * @param standbyTasks Reconciled standby tasks
- * @param warmupTasks Reconciled warm-up tasks
+ * @param activeTasks Current target active tasks
+ * @param standbyTasks Current target standby tasks
+ * @param warmupTasks Current target warm-up tasks
  * @param processId The process ID.
  * @param clientTags The client tags for a rack-aware assignment.
  * @param taskOffsets The last received cumulative task offsets of assigned tasks or dormant tasks.
group-coordinator/src/main/java/org/apache/kafka/coordinator/group/streams/assignor/StickyTaskAssignor.java