@nikita-savelyevv (Collaborator) commented Aug 28, 2025

Changes

As in the title.

Reason for changes

This PR reduces the memory footprint of the Fast Bias Correction algorithm. Collecting raw activations is not required to obtain their shapes, so avoiding raw reducers saves the memory that would otherwise be allocated for those activations.
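To illustrate the idea, here is a minimal, framework-free sketch. It is not NNCF code: `Activation`, `make_activation`, `RawCollector`, and `ShapeCollector` are hypothetical names invented for this example. It assumes that the only thing needed from an activation is its shape, and compares a collector that keeps every raw activation with one that records just the shape.

```python
from dataclasses import dataclass
from math import prod


@dataclass
class Activation:
    """Hypothetical stand-in for a framework tensor (not an NNCF type)."""
    shape: tuple
    data: bytes  # raw float32 payload, 4 bytes per element


def make_activation(shape):
    """Build a zero-filled activation of the given shape."""
    return Activation(shape=shape, data=bytes(4 * prod(shape)))


class RawCollector:
    """Keeps every activation, even though only the shape is read later."""

    def __init__(self):
        self.samples = []

    def register(self, act):
        self.samples.append(act)

    def shape(self):
        return self.samples[0].shape

    def stored_bytes(self):
        return sum(len(a.data) for a in self.samples)


class ShapeCollector:
    """Records the shape of the first activation and drops the payload."""

    def __init__(self):
        self._shape = None

    def register(self, act):
        if self._shape is None:
            self._shape = act.shape

    def shape(self):
        return self._shape

    def stored_bytes(self):
        return 0  # no activation payload is retained


raw, shape_only = RawCollector(), ShapeCollector()
for _ in range(4):  # 4 calibration samples, mirroring the example run
    act = make_activation((1, 8, 16, 16))  # illustrative shape only
    raw.register(act)
    shape_only.register(act)

print(raw.shape() == shape_only.shape())
print(raw.stored_bytes(), shape_only.stored_bytes())
```

Both collectors report the same shape, but only the raw one holds on to the activation payloads, and its retained memory grows with the number of calibration samples.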

Example quantization run on the vision encoder from OpenGVLab/InternVL2-1B with 4 calibration data samples:

[Figures: system_memory_usage_from-zero plots, Before and After]

Since there is no need to allocate so much memory, statistics collection time also improves.

Related tickets

172800

Tests

Existing tests cover the new changes.

https://ci-adas-icv.iotg.sclab.intel.com/view/all/job/NNCF/job/manual/job/post_training_quantization/714/artifact/results.html

@github-actions bot added the "NNCF Common" label (Pull request that updates NNCF Common) Aug 28, 2025
@nikita-savelyevv marked this pull request as ready for review August 28, 2025 15:28
@nikita-savelyevv requested a review from a team as a code owner August 28, 2025 15:28