Toward Smooth Depth Driven by Selective Attention and Selective Aggregation

Abstract

The main challenges in single-image depth prediction (SIDP) stem from the lack of smooth depth ground truth and the presence of irregular and complex objects. Window-based attention mechanisms, which balance long-range dependency capture against computational cost by processing elements within a fixed grid, have advanced SIDP research, but their search range is constrained. This limitation can impede smooth depth estimation for irregular and complex objects. To address these challenges, we propose a novel attention mechanism that selectively identifies and aggregates only the most relevant information. Our approach enables flexible and efficient exploration by using data-dependent movable offsets to select substantial tokens and designating them as key-value pairs. Furthermore, we overcome the issue of small softmax values in traditional attention mechanisms through score-based grouping with top-k selection. Our feed-forward network, which incorporates a gating mechanism and grouped convolutions with varying cardinalities, refines features before passing them to subsequent layers, allowing for targeted focus on input features. Finally, we use feature maps from hierarchical decoders to estimate bin centers and per-pixel probability distributions. We introduce a 4-way selective scanning technique to aggregate these per-pixel probability distributions smoothly, resulting in a dense and continuous depth map. The proposed network, named selective attention and selective aggregate depth (SA2Depth), demonstrates state-of-the-art performance across multiple datasets compared to previous methods. © 1999-2012 IEEE.
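As a rough illustration of two ideas named in the abstract, the sketch below shows (1) attention restricted to the top-k highest-scoring tokens, so the softmax is spread over fewer entries, and (2) a per-pixel depth value computed as the probability-weighted sum of bin centers, which is a common read-out in bin-based depth estimation. This is a minimal sketch under stated assumptions, not the SA2Depth implementation: the function names `topk_attention` and `expected_depth` are hypothetical, the data-dependent offsets, score-based grouping, and 4-way selective scanning are not reproduced, and the abstract does not confirm that the final depth is read out this way.

```python
import math


def topk_attention(query, keys, values, k):
    """Attend from one query vector to only its k highest-scoring keys.

    Illustrative only: the abstract does not specify how the paper
    combines top-k selection with score-based grouping.
    """
    scale = 1.0 / math.sqrt(len(query))
    # Scaled dot-product score between the query and every key.
    scores = [scale * sum(q * x for q, x in zip(query, key)) for key in keys]
    # Indices of the k largest scores; all other tokens are dropped.
    top = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    # Softmax over the kept scores only (shifted by the max for stability).
    peak = max(scores[i] for i in top)
    weights = [math.exp(scores[i] - peak) for i in top]
    total = sum(weights)
    weights = [w / total for w in weights]
    # Weighted sum of the selected value vectors.
    dim = len(values[0])
    out = [sum(w * values[i][d] for w, i in zip(weights, top)) for d in range(dim)]
    return out, top


def expected_depth(bin_centers, probs):
    """Probability-weighted sum of bin centers for a single pixel (assumed read-out)."""
    return sum(c * p for c, p in zip(bin_centers, probs))
```

For example, calling `topk_attention` with `k` equal to the number of keys reduces to ordinary softmax attention for that query, while smaller `k` concentrates the weights on the most relevant tokens.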

Keywords

Attention mechanism; bin generation strategy; deep learning; monocular depth estimation; selective scanning; top-k selection
Title
Toward Smooth Depth Driven by Selective Attention and Selective Aggregation
Authors
Park, Cheol-Hoon; Ahn, Woo-Jin; Choi, Hyun-Duck
DOI
10.1109/TMM.2026.3660136
Publication date
2026
Type
Article in press
Journal
IEEE Transactions on Multimedia