Contrastive Feature Bin Loss for Monocular Depth Estimation

  • Song, Jihun
  • Hyun, Yoonsuk

Abstract

Recently, monocular depth estimation has achieved notable performance using encoder-decoder-based models. These models have relied on the Scale-Invariant Logarithmic (SILog) loss for effective training, leading to significant performance improvements. However, because the SILog loss is designed to reduce error variance, it may mislead the model. To address this problem, we propose the Contrastive Feature Bin (CFB) loss as an additional regularization loss. CFB loss reduces the risk of incorrect learning by ensuring that similar depths are learned similarly; it can be easily integrated into various encoder-decoder-based models and greatly enhances overall performance. Another problem common to existing monocular depth estimation models is that they sometimes demand a significant amount of memory during training. Reducing memory consumption by using smaller batch sizes, however, can cause a noticeable decline in performance, compromising reproducibility and practicality. CFB loss allows encoder-decoder-based models to achieve comparable or even superior performance with smaller batch sizes, at the cost of only modest increases in training time. Our approach improves the performance of diverse monocular depth estimation models on datasets such as NYU Depth v2 and the KITTI Eigen split. Notably, with a small batch size, it achieves up to an 11% improvement in RMSE compared to existing methods. The code is available on GitHub.
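For readers unfamiliar with the SILog loss mentioned in the abstract, the following is a minimal sketch of its commonly used form, not code from the paper. The function name `silog_loss` and the defaults `lam=0.85` and `alpha=10.0` are assumptions taken from common practice in encoder-decoder depth models, and the CFB loss itself is not shown because this record does not define it. The sketch illustrates the variance connection: the quantity under the square root equals the variance of the log-depth error plus a down-weighted squared mean.

```python
import math

def silog_loss(pred, target, lam=0.85, alpha=10.0):
    """Commonly used SILog form over paired positive depths (illustrative sketch).

    lam and alpha are assumed defaults, not values stated in this record.
    """
    # Per-pixel log-depth error d_i = log(pred_i) - log(target_i)
    d = [math.log(p) - math.log(t) for p, t in zip(pred, target)]
    mean_d = sum(d) / len(d)
    mean_d2 = sum(x * x for x in d) / len(d)
    # mean(d^2) - lam * mean(d)^2 == var(d) + (1 - lam) * mean(d)^2
    # max(..., 0.0) guards against tiny negative values from rounding.
    return alpha * math.sqrt(max(mean_d2 - lam * mean_d ** 2, 0.0))

target = [1.0, 2.0, 4.0, 8.0]
scaled = [2.0 * t for t in target]  # every depth is off by the same factor of 2

# With lam = 1 the loss is the pure standard deviation of the log error,
# so a uniform scale error is not penalized at all.
loss_pure_variance = silog_loss(scaled, target, lam=1.0)

# With lam = 0.85 only 15% of the squared mean error remains in the loss.
loss_default = silog_loss(scaled, target)
```

Under these assumptions, a prediction that is wrong by a constant factor everywhere incurs little or no loss, which is one plausible reading of how a variance-focused objective could mislead training.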

Keywords

Monocular depth estimation; contrastive learning; memory efficient training
DOI
10.1109/ACCESS.2025.3551435
Publication date
2025
Type
Article
Journal
IEEE Access
Volume
13
Pages
49584-49596