TY - GEN
T1 - Acceleration of DNN-Based Video Object Detection Using Temporal Dependency of the Object Size
AU - Yoo, Jeong Yeop
AU - Ko, Jong Hwan
N1 - Publisher Copyright: © 2021 IEEE.
PY - 2021
Y1 - 2021
N2 - Many methods have been proposed to improve the accuracy of deep neural network (DNN) based object detection. Some of them increase computation, which is a problem in tasks such as autonomous driving that require both high accuracy and low latency. Feature Pyramid Network (FPN) is a structure commonly used to improve the accuracy of object detection. However, its high computational cost slows down inference. To accelerate inference while maintaining the accuracy of FPN, this paper proposes dynamic acceleration of FPN-based object detection using the temporal dependency of object sizes in a video. We modify FPN so that inference is faster when targeting certain object sizes. The target object size is determined from previously observed object sizes. The modified FPN is then applied dynamically, which speeds up inference. With this method, we achieve 20.9% faster inference at the cost of a 0.06 mAP drop on the ImageNet VID validation dataset.
AB - Many methods have been proposed to improve the accuracy of deep neural network (DNN) based object detection. Some of them increase computation, which is a problem in tasks such as autonomous driving that require both high accuracy and low latency. Feature Pyramid Network (FPN) is a structure commonly used to improve the accuracy of object detection. However, its high computational cost slows down inference. To accelerate inference while maintaining the accuracy of FPN, this paper proposes dynamic acceleration of FPN-based object detection using the temporal dependency of object sizes in a video. We modify FPN so that inference is faster when targeting certain object sizes. The target object size is determined from previously observed object sizes. The modified FPN is then applied dynamically, which speeds up inference. With this method, we achieve 20.9% faster inference at the cost of a 0.06 mAP drop on the ImageNet VID validation dataset.
KW - Deep Learning Acceleration
KW - Object Detection Acceleration
KW - Video Object Detection
UR - https://www.scopus.com/pages/publications/85122940659
U2 - 10.1109/ICTC52510.2021.9620830
DO - 10.1109/ICTC52510.2021.9620830
M3 - Conference contribution
AN - SCOPUS:85122940659
T3 - International Conference on ICT Convergence
SP - 1182
EP - 1184
BT - ICTC 2021 - 12th International Conference on ICT Convergence
PB - IEEE Computer Society
T2 - 12th International Conference on Information and Communication Technology Convergence, ICTC 2021
Y2 - 20 October 2021 through 22 October 2021
ER -