[CVPR 2025] MDP: Multidimensional Vision Model Pruning with Latency Constraint

License: Apache-2.0 (see the LICENSE and THIRD_PARTY_LICENSES files)

🎯 MDP: Multidimensional Vision Model Pruning with Latency Constraint

This repository contains the official implementation of the MDP method introduced in our CVPR 2025 paper:

MDP: Multidimensional Vision Model Pruning with Latency Constraint
Xinglong Sun, Barath Lakshmanan, Maying Shen, Shiyi Lan, Jingde Chen, Jose M. Alvarez

📄 License

Please check the LICENSE file. HALP may be used non-commercially, meaning for research or evaluation purposes only. For business inquiries, please contact [email protected].

📢 News

  • [2025/09] Release license obtained. ResNet50 and ablation study code are now available; remaining code will be cleaned up and released soon.
  • [2025/06] I presented MDP in a CVPR 2025 tutorial on Full-Stack, GPU-based Acceleration of Deep Learning and Foundation Models. You can watch the tutorial video here!

📝 Introduction

Current structural pruning methods face two significant limitations:

  1. They often limit pruning to finer-grained levels like channels, making aggressive parameter reduction challenging
  2. They focus heavily on parameter and FLOP reduction, with existing latency-aware methods frequently relying on simplistic, suboptimal linear models that fail to generalize well to transformers
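
To make the second limitation concrete, here is a small illustrative sketch. It is not taken from the MDP codebase. It assumes a made-up staircase latency curve, where cost changes only when the channel count crosses a multiple of 32, and compares it against a least-squares line. The function names, the step size of 32, and the unit cost are assumptions chosen for illustration; real latency curves depend on the hardware and should be measured.

```python
import math


def staircase_latency(channels: int, step: int = 32, cost_per_step: float = 1.0) -> float:
    """Hypothetical latency: cost rises only when another `step`-wide tile is needed."""
    return cost_per_step * math.ceil(channels / step)


def fit_line(xs, ys):
    """Ordinary least-squares fit y ~ a * x + b in plain Python."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    a = cov / var
    b = mean_y - a * mean_x
    return a, b


# "Measure" the made-up layer at every channel count from 1 to 128.
channels = list(range(1, 129))
measured = [staircase_latency(c) for c in channels]

# Fit the linear latency model and record its worst absolute error.
slope, intercept = fit_line(channels, measured)
linear_pred = [slope * c + intercept for c in channels]
worst_gap = max(abs(p - m) for p, m in zip(linear_pred, measured))

# Pruning 64 -> 33 channels stays on the same stair, so nothing is saved,
# yet the linear model credits a saving proportional to the 31 removed channels.
true_saving = staircase_latency(64) - staircase_latency(33)
linear_saving = slope * (64 - 33)

print(f"worst linear-model error: {worst_gap:.3f}")
print(f"true saving: {true_saving:.3f}, linear-model saving: {linear_saving:.3f}")
```

Under these assumptions, the linear model rewards a pruning step that does not change the simulated latency at all, which is the kind of mismatch a more expressive latency model is meant to avoid.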

In this paper, we address both limitations by introducing Multi-Dimensional Pruning (MDP), a novel paradigm that:

  • Jointly optimizes across various pruning granularities (channels, query, key, heads, embeddings, and blocks)
  • Employs advanced latency modeling to accurately capture latency variations
  • Reformulates pruning as a Mixed-Integer Nonlinear Program (MINLP)
  • Supports both CNNs and transformers
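
As a rough illustration of what a latency-constrained, multi-granularity selection problem can look like, the toy sketch below combines a binary keep/remove decision per block with an integer number of channel groups per kept block. It is not the formulation or the solver from the paper: the importance and latency tables are invented numbers, `solve` is a hypothetical helper, and the search is plain brute force over a tiny model rather than a real MINLP solver. The nonlinearity here comes from the table lookups and from the block decision gating the channel decision.

```python
from itertools import product

# Hypothetical importance of keeping k channel groups in each block (index = k).
CHANNEL_IMPORTANCE = [
    [0.0, 3.0, 5.0, 6.0],  # block 0
    [0.0, 1.0, 1.5, 1.8],  # block 1
    [0.0, 4.0, 7.0, 9.0],  # block 2
]

# Hypothetical latency lookup in ms for keeping k groups; note it is not linear in k.
CHANNEL_LATENCY = [
    [0.0, 2.0, 2.0, 3.5],
    [0.0, 1.5, 2.5, 2.5],
    [0.0, 2.0, 3.0, 5.0],
]


def solve(budget_ms: float):
    """Brute-force the toy problem: maximize importance subject to latency <= budget.

    Returns (score, keep, groups, latency), where keep[i] is the binary block
    decision and groups[i] is the integer channel-group count for block i.
    """
    n_blocks = len(CHANNEL_IMPORTANCE)
    best = None
    for keep in product((0, 1), repeat=n_blocks):
        # A removed block contributes nothing; a kept block keeps 1..3 groups.
        choices = [range(1, 4) if keep[i] else (0,) for i in range(n_blocks)]
        for groups in product(*choices):
            latency = sum(CHANNEL_LATENCY[i][g] for i, g in enumerate(groups))
            if latency > budget_ms:
                continue  # violates the latency constraint
            score = sum(CHANNEL_IMPORTANCE[i][g] for i, g in enumerate(groups))
            if best is None or score > best[0]:
                best = (score, keep, groups, latency)
    return best


score, keep, groups, latency = solve(budget_ms=7.0)
print(f"score={score}, keep={keep}, groups={groups}, latency={latency} ms")
```

With the invented tables and a 7 ms budget, the search removes block 1 entirely and spends the budget on blocks 0 and 2, which shows why deciding block removal and channel counts jointly can beat channel-only pruning. Exhaustive enumeration only works at this toy scale; a real network needs a proper solver.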

🎨 Framework

Overview of our MDP method

📊 Results

Our extensive experiments demonstrate MDP's superior performance:

ImageNet Classification

  • ResNet50: 28% speed increase with +1.4 Top-1 accuracy improvement over prior art
  • DEIT-Base: 37% additional acceleration with +0.7 Top-1 accuracy improvement over prior art

3D Object Detection

  • Higher speed (×1.18) and higher mAP (0.451 vs. 0.449) compared to the dense baseline

MDP exhibits Pareto dominance with both CNNs and Transformers across various tasks

🚀 Installation

Please check the README in the folder for the task you want to run.

💻 Usage

Please check the README in the folder for the task you want to run.

🙏 Acknowledgements

Some of the infrastructure, data loading, and foundational code is adapted from the HALP and NVIT works. We sincerely thank the authors of these works for their contributions.

📚 Citation

If you find this repository useful for your research, please cite our paper:

@misc{sun2025mdpmultidimensionalvisionmodel,
    title={MDP: Multidimensional Vision Model Pruning with Latency Constraint},
    author={Xinglong Sun and Barath Lakshmanan and Maying Shen and Shiyi Lan and Jingde Chen and Jose M. Alvarez},
    year={2025},
    eprint={2504.02168},
    archivePrefix={arXiv},
    primaryClass={cs.CV},
    url={https://arxiv.org/abs/2504.02168}
}
