RecSal-Net: Recursive Saliency Network for video saliency prediction

Woo, ChaeEun; Lee, SuMin; Park, Soo Min; Kim, Byung Hyung

doi:10.1016/j.neucom.2025.130822

Abstract

Video saliency prediction, which emulates the selective visual processing mechanisms of the human visual system, has found widespread applications across various domains. However, existing models often struggle to effectively integrate spatiotemporal features, particularly due to challenges associated with multi-resolution feature processing. To address these limitations, we propose the Recursive Saliency Network (RecSal-Net), a model designed to enhance the performance of video saliency prediction. The model adopts the Video Swin Transformer as its backbone to extract rich spatiotemporal features. In addition, a Recursive Feature Pyramid structure is introduced to integrate multi-resolution features while minimizing information loss. To further improve feature representation, a top-down feature integration strategy is employed, transferring high-level semantic features to lower-level feature maps. This is complemented by iterative upsampling operations, which enrich both high- and low-resolution representations. Experimental results demonstrate that RecSal-Net outperforms state-of-the-art methods on the DHF1K, Hollywood-2, and UCF Sports datasets, achieving superior performance across key evaluation metrics such as AUC-J, CC, and NSS. These findings validate the model's effectiveness in capturing long-range spatiotemporal dependencies and integrating multi-resolution features. Overall, our work underscores the potential of recursive feature modeling to advance future video saliency prediction frameworks. The code is available at https://github.com/affctivai/RecSal-Net.
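
For readers unfamiliar with the metrics named in the abstract, the following is a minimal sketch of how CC (linear correlation coefficient) and NSS (normalized scanpath saliency) are commonly defined for a single frame. It is not taken from the RecSal-Net repository: the function names `cc` and `nss`, the nested-list map format, and the use of population standard deviation are assumptions made for illustration, and AUC-J is omitted for brevity.

```python
import math


def _flatten(saliency_map):
    """Flatten a 2D map (list of rows) into a list of floats."""
    return [float(v) for row in saliency_map for v in row]


def _standardize(values):
    """Shift to zero mean and scale to unit (population) standard deviation."""
    mean = sum(values) / len(values)
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    if std == 0.0:
        raise ValueError("map is constant and cannot be standardized")
    return [(v - mean) / std for v in values]


def cc(pred, gt_density):
    """Pearson correlation between a predicted map and a ground-truth density map."""
    p = _standardize(_flatten(pred))
    g = _standardize(_flatten(gt_density))
    return sum(a * b for a, b in zip(p, g)) / len(p)


def nss(pred, fixations):
    """Mean standardized prediction value at fixated pixels (fixations > 0)."""
    p = _standardize(_flatten(pred))
    f = _flatten(fixations)
    hits = [z for z, fix in zip(p, f) if fix > 0]
    if not hits:
        raise ValueError("fixation map contains no fixations")
    return sum(hits) / len(hits)


if __name__ == "__main__":
    # Toy 2x2 example (hypothetical values, not from any dataset).
    prediction = [[0.1, 0.2], [0.3, 0.9]]
    density = [[0.0, 0.1], [0.2, 1.0]]
    fixation_mask = [[0, 0], [0, 1]]
    print("CC :", cc(prediction, density))
    print("NSS:", nss(prediction, fixation_mask))
```

Higher values indicate better agreement for both metrics. In video benchmarks such scores are typically averaged over frames, though the exact evaluation protocol used for the reported results is not stated here.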

Keywords

Video saliency prediction; Recursive pyramid network; Video Swin Transformer; Multi-stage; Spatiotemporal saliency

Title: RecSal-Net: Recursive Saliency Network for video saliency prediction

Authors: Woo, ChaeEun; Lee, SuMin; Park, Soo Min; Kim, Byung Hyung

DOI: 10.1016/j.neucom.2025.130822

Publication date: 2025-10

Type: Article

Journal: Neurocomputing

Volume: 650