Depth-Relative Self Attention for Monocular Depth Estimation

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

5 Scopus citations

Abstract

Monocular depth estimation is very challenging because a single RGB image provides incomplete clues to exact depth. To overcome this limitation, deep neural networks rely on visual hints such as size, shade, and texture extracted from RGB information. However, we observe that if such hints are overly exploited, the network can become biased toward RGB information without considering the comprehensive view. We propose a novel depth estimation model named RElative Depth Transformer (RED-T) that uses relative depth as guidance in self-attention. Specifically, the model assigns high attention weights to pixels of close depth and low attention weights to pixels of distant depth. As a result, the features of pixels at similar depths become more similar to each other and thus less prone to misused visual hints. We show that the proposed model achieves competitive results on monocular depth estimation benchmarks and is less biased toward RGB information. In addition, we propose a novel monocular depth estimation benchmark that limits the observable depth range during training in order to evaluate the robustness of the model to unseen depths.
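
The idea of weighting attention by depth proximity can be illustrated with a minimal NumPy sketch. This is not the RED-T implementation: the abstract does not specify how relative depth enters the attention computation, so the additive bias of negative absolute depth difference, the function name `depth_relative_attention`, and the `scale` parameter are all assumptions made for illustration. The per-pixel depth values are also assumed to be given.

```python
import numpy as np

def depth_relative_attention(q, k, v, depth, scale=1.0):
    """Toy self-attention with a depth-difference penalty (illustrative only).

    q, k, v: arrays of shape (n, d), one row per pixel/token.
    depth:   array of shape (n,), an assumed per-pixel depth estimate.
    scale:   hypothetical knob for how strongly depth gaps are penalized.
    """
    n, d = q.shape
    # Standard scaled dot-product scores.
    logits = q @ k.T / np.sqrt(d)
    # Pairwise absolute depth difference |depth_i - depth_j|.
    rel_depth = np.abs(depth[:, None] - depth[None, :])
    # Assumed form of the guidance: a smaller depth gap gives a larger logit.
    logits = logits - scale * rel_depth
    # Numerically stable row-wise softmax.
    logits = logits - logits.max(axis=1, keepdims=True)
    weights = np.exp(logits)
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ v, weights

# Three pixels with identical content scores; only depth differs.
q = np.zeros((3, 4))
k = np.zeros((3, 4))
v = np.eye(3)
depth = np.array([1.0, 1.1, 5.0])
out, weights = depth_relative_attention(q, k, v, depth)
```

With identical queries and keys, the first pixel attends more to the second pixel (depth 1.1) than to the third (depth 5.0), which matches the behavior the abstract describes.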

Original language: English
Title of host publication: Proceedings of the 32nd International Joint Conference on Artificial Intelligence, IJCAI 2023
Editors: Edith Elkind
Publisher: International Joint Conferences on Artificial Intelligence
Pages: 1396-1404
Number of pages: 9
ISBN (Electronic): 9781956792034
State: Published - 2023
Externally published: Yes
Event: 32nd International Joint Conference on Artificial Intelligence, IJCAI 2023 - Macao, China
Duration: 19 Aug 2023 to 25 Aug 2023

Publication series

Name: IJCAI International Joint Conference on Artificial Intelligence
Volume: 2023-August
ISSN (Print): 1045-0823

Conference

Conference: 32nd International Joint Conference on Artificial Intelligence, IJCAI 2023
Country/Territory: China
City: Macao
Period: 19/08/23 to 25/08/23
