TY - GEN
T1 - Accelerating CNN via Dynamic Pattern-based Pruning Network
AU - Lee, Gwanghan
AU - Shin, Saebyeol
AU - Woo, Simon S.
N1 - Publisher Copyright:
© 2022 ACM.
PY - 2022/10/17
Y1 - 2022/10/17
N2 - Recently, dynamic pruning methods have been actively researched, as they have shown very effective and remarkable performance in reducing computation complexity of deep neural networks. Nevertheless, most dynamic pruning methods fail to achieve actual acceleration due to the extra overheads caused by indexing and weight-copying to implement the dynamic sparse patterns for every input sample. To address this issue, we propose Dynamic Pattern-based Pruning Network (DPPNet), which preserves the advantages of both static and dynamic networks. First, our method statically prunes the weight kernel into various sparse patterns. Then, the dynamic convolution kernel is generated via aggregating input-dependent attention weights and static kernels. Unlike previous dynamic pruning methods, our novel method dynamically fuses static kernel patterns, enhancing the kernel's representational power without additional overhead. Moreover, our dynamic sparse pattern enables an efficient process using BLAS libraries, accomplishing actual acceleration. We demonstrate the effectiveness of the proposed DPPNet on CIFAR and ImageNet, outperforming the state-of-the-art methods achieving better accuracy with lower computational cost. For example, on ImageNet classification, ResNet34 utilizing DPP module achieves state-of-the-art performance with 65.6% FLOPs reduction and the inference speed increased by 35.9% without loss in accuracy. Code is available at https://github.com/lee-gwang/DPPNet.
AB - Recently, dynamic pruning methods have been actively researched, as they have shown very effective and remarkable performance in reducing computation complexity of deep neural networks. Nevertheless, most dynamic pruning methods fail to achieve actual acceleration due to the extra overheads caused by indexing and weight-copying to implement the dynamic sparse patterns for every input sample. To address this issue, we propose Dynamic Pattern-based Pruning Network (DPPNet), which preserves the advantages of both static and dynamic networks. First, our method statically prunes the weight kernel into various sparse patterns. Then, the dynamic convolution kernel is generated via aggregating input-dependent attention weights and static kernels. Unlike previous dynamic pruning methods, our novel method dynamically fuses static kernel patterns, enhancing the kernel's representational power without additional overhead. Moreover, our dynamic sparse pattern enables an efficient process using BLAS libraries, accomplishing actual acceleration. We demonstrate the effectiveness of the proposed DPPNet on CIFAR and ImageNet, outperforming the state-of-the-art methods achieving better accuracy with lower computational cost. For example, on ImageNet classification, ResNet34 utilizing DPP module achieves state-of-the-art performance with 65.6% FLOPs reduction and the inference speed increased by 35.9% without loss in accuracy. Code is available at https://github.com/lee-gwang/DPPNet.
KW - model compression
KW - neural network pruning
KW - structured pruning
UR - https://www.scopus.com/pages/publications/85140832744
U2 - 10.1145/3511808.3557225
DO - 10.1145/3511808.3557225
M3 - Conference contribution
AN - SCOPUS:85140832744
T3 - International Conference on Information and Knowledge Management, Proceedings
SP - 1034
EP - 1043
BT - CIKM 2022 - Proceedings of the 31st ACM International Conference on Information and Knowledge Management
PB - Association for Computing Machinery
T2 - 31st ACM International Conference on Information and Knowledge Management, CIKM 2022
Y2 - 17 October 2022 through 21 October 2022
ER -