Filter-Wise Quantization of Deep Neural Networks for IoT Devices

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

2 Scopus citations

Abstract

Network quantization is an effective technique for compressing deep neural networks (DNNs) for on-device machine learning on consumer devices. Existing layer-wise quantization techniques allocate different bitwidths to different network layers. In this paper, we propose a filter-wise quantization technique based on differentiable neural architecture search (DNAS). We use a two-level network structure and a novel candidate generation algorithm, which together can substantially prune the large search space. The effectiveness of our technique was validated with MobileNetV2 on ImageNet.
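
To make the contrast between layer-wise and filter-wise bitwidth allocation concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes symmetric uniform quantization simulated in floating point, a hypothetical per-filter bitwidth list, and a plain softmax mixture over candidate bitwidths as a simplified stand-in for a DNAS-style relaxation. The two-level network structure and the candidate generation algorithm from the paper are not reproduced here, and all function names are illustrative.

```python
import numpy as np


def quantize_uniform(w, bits):
    """Simulated symmetric uniform quantization of an array to `bits` bits."""
    if bits < 2:
        raise ValueError("this sketch assumes at least 2 bits")
    qmax = 2 ** (bits - 1) - 1          # largest positive integer level
    max_abs = np.max(np.abs(w))
    if max_abs == 0:
        return w.copy()                  # all-zero input needs no scaling
    scale = max_abs / qmax
    # Round to the nearest integer level, then map back to float.
    return np.clip(np.round(w / scale), -qmax, qmax) * scale


def quantize_filterwise(weights, bitwidths):
    """Quantize each output filter weights[i] with its own bitwidths[i]."""
    if weights.shape[0] != len(bitwidths):
        raise ValueError("one bitwidth per output filter is required")
    return np.stack(
        [quantize_uniform(f, b) for f, b in zip(weights, bitwidths)]
    )


def soft_quantize(filt, candidate_bits, logits):
    """Softmax-weighted mixture of candidate quantizations of one filter.

    Assumption: a simplified relaxation; the learnable logits would be
    trained by gradient descent in a differentiable search.
    """
    p = np.exp(logits - np.max(logits))  # numerically stable softmax
    p /= p.sum()
    return sum(pi * quantize_uniform(filt, b)
               for pi, b in zip(p, candidate_bits))


rng = np.random.default_rng(0)
# Hypothetical conv layer: (out_filters, in_channels, kernel_h, kernel_w).
weights = rng.normal(size=(4, 3, 3, 3))
bitwidths = [2, 4, 4, 8]                 # hypothetical per-filter allocation

wq = quantize_filterwise(weights, bitwidths)
# Mean absolute quantization error of each filter.
errors = np.abs(wq - weights).reshape(len(bitwidths), -1).mean(axis=1)
```

A layer-wise scheme corresponds to passing the same bitwidth for every filter; the filter-wise case lets each entry of `bitwidths` differ, which is what enlarges the search space that the paper's method is designed to prune.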

Original language: English
Title of host publication: 2021 IEEE International Conference on Consumer Electronics, ICCE 2021
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9781728197661
State: Published - 10 Jan 2021
Event: 2021 IEEE International Conference on Consumer Electronics, ICCE 2021 - Las Vegas, United States
Duration: 10 Jan 2021 to 12 Jan 2021

Publication series

Name: Digest of Technical Papers - IEEE International Conference on Consumer Electronics
Volume: 2021-January
ISSN (Print): 0747-668X

Conference

Conference: 2021 IEEE International Conference on Consumer Electronics, ICCE 2021
Country/Territory: United States
City: Las Vegas
Period: 10/01/21 to 12/01/21
