TY - GEN
T1 - Work-in-progress
T2 - 2019 International Conference on Compilers, Architectures and Synthesis for Embedded Systems, CASES 2019
AU - Lee, Kwangbae
AU - Kim, Hoseung
AU - Lee, Hayun
AU - Shin, Dongkun
N1 - Publisher Copyright:
© 2019 Association for Computing Machinery.
PY - 2019/10/13
Y1 - 2019/10/13
N2 - Network pruning is a promising compression technique for reducing the computation and memory access costs of deep neural networks. In this paper, we propose a novel group-level pruning method to accelerate deep neural networks on mobile GPUs, in which several adjacent weights are pruned as a group while maintaining high accuracy. Although several group-level pruning techniques have been proposed, previous techniques cannot achieve the desired accuracy at high sparsity. We therefore propose an unaligned approach to improve the accuracy of the compressed model.
AB - Network pruning is a promising compression technique for reducing the computation and memory access costs of deep neural networks. In this paper, we propose a novel group-level pruning method to accelerate deep neural networks on mobile GPUs, in which several adjacent weights are pruned as a group while maintaining high accuracy. Although several group-level pruning techniques have been proposed, previous techniques cannot achieve the desired accuracy at high sparsity. We therefore propose an unaligned approach to improve the accuracy of the compressed model.
UR - https://www.scopus.com/pages/publications/85077344386
U2 - 10.1145/3349569.3351537
DO - 10.1145/3349569.3351537
M3 - Conference contribution
AN - SCOPUS:85077344386
T3 - Proceedings of the International Conference on Compilers, Architectures and Synthesis for Embedded Systems Companion, CASES 2019
BT - Proceedings of the International Conference on Compilers, Architectures and Synthesis for Embedded Systems Companion, CASES 2019
PB - Association for Computing Machinery, Inc
Y2 - 13 October 2019 through 18 October 2019
ER -