Online Keyframe Selection Scheme for Semantic Video Segmentation

Research output: Chapter in Book/Report/Conference proceedingConference contributionpeer-review

Abstract

In this paper, we propose an online keyframe selection scheme for video analysis tasks based on deep reinforcement learning. Optimal keyframe selection is a crucial job for preserving the temporal updates in the video sequence since it affects the accuracy and throughput of the whole video analysis framework to a great degree. Our scheme is generic and can be used for the simultaneous decision of keyframe in any real-time video based task. We employ policy gradient reinforcement strategy to learn policy function for maximizing the expected reward. The proposed selection network has two actions (key and non-key) in the action space. State information is derived from the element-wise difference image of keyframe and the current frame. We adopt deep feature flow approach for feature propagation between keyframe and the current frame. We evaluate our scheme on the Cityscapes dataset with DeepLabv3 as segmentation network and LiteFlowNet for computing flow fields.

Original languageEnglish
Title of host publication2020 IEEE International Conference on Consumer Electronics - Asia, ICCE-Asia 2020
PublisherInstitute of Electrical and Electronics Engineers Inc.
ISBN (Electronic)9781728161648
DOIs
StatePublished - 1 Nov 2020
Externally publishedYes
Event2020 IEEE International Conference on Consumer Electronics - Asia, ICCE-Asia 2020 - Seoul, Korea, Republic of
Duration: 1 Nov 20203 Nov 2020

Publication series

Name2020 IEEE International Conference on Consumer Electronics - Asia, ICCE-Asia 2020

Conference

Conference2020 IEEE International Conference on Consumer Electronics - Asia, ICCE-Asia 2020
Country/TerritoryKorea, Republic of
CitySeoul
Period1/11/203/11/20

Keywords

  • deep reinforcement learning
  • keyframe selection scheme
  • semantic video segmentation

Fingerprint

Dive into the research topics of 'Online Keyframe Selection Scheme for Semantic Video Segmentation'. Together they form a unique fingerprint.

Cite this