TY - GEN
T1 - Filter-Wise Quantization of Deep Neural Networks for IoT Devices
AU - Kim, Hoseung
AU - Jo, Geunhye
AU - Lee, Hayun
AU - Shin, Dongkun
N1 - Publisher Copyright: © 2021 IEEE.
PY - 2021/1/10
Y1 - 2021/1/10
N2 - Network quantization is an effective technique for compressing deep neural networks (DNNs) for on-device machine learning on consumer devices. Existing layer-wise quantization techniques allocate different bitwidths to different network layers. In this paper, we propose a filter-wise quantization technique based on differentiable neural architecture search (DNAS). We use a two-level network structure and a novel candidate generation algorithm, which can substantially prune the large search space. The effectiveness of our technique was validated with MobileNetV2 on ImageNet.
AB - Network quantization is an effective technique for compressing deep neural networks (DNNs) for on-device machine learning on consumer devices. Existing layer-wise quantization techniques allocate different bitwidths to different network layers. In this paper, we propose a filter-wise quantization technique based on differentiable neural architecture search (DNAS). We use a two-level network structure and a novel candidate generation algorithm, which can substantially prune the large search space. The effectiveness of our technique was validated with MobileNetV2 on ImageNet.
UR - https://www.scopus.com/pages/publications/85106056923
U2 - 10.1109/ICCE50685.2021.9427656
DO - 10.1109/ICCE50685.2021.9427656
M3 - Conference contribution
AN - SCOPUS:85106056923
T3 - Digest of Technical Papers - IEEE International Conference on Consumer Electronics
BT - 2021 IEEE International Conference on Consumer Electronics, ICCE 2021
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2021 IEEE International Conference on Consumer Electronics, ICCE 2021
Y2 - 10 January 2021 through 12 January 2021
ER -