MTSNet: Joint Feature Adaptation and Enhancement for Text-Guided Multi-view Martian Terrain Segmentation

  • Fang, Yang
  • Rao, Xuefeng
  • Gao, Xinbo
  • Li, Weisheng
  • Min, Zijian
Citations

WEB OF SCIENCE

2
Citations

SCOPUS

4

초록

Martian terrain segmentation plays a crucial role in autonomous navigation and safe driving of Mars rovers as well as global analysis of Martian geological landforms. However, most deep learning-based segmentation models cannot effectively handle the challenges of highly unstructured and unbalanced terrain distribution on the Martian surface, thus leading to inadequate adaptability and generalization ability. In this paper, we propose a novel multi-view Martian Terrain Segmentation framework (MTSNet) by developing an efficient Martian Terrain text-Guided Segment Anything Model (MTG-SAM) and combining it with a tailored Local Terrain Feature Enhancement Network (LTEN) to capture intricate terrain details. Specifically, the proposed MTG-SAM is equipped with a Terrain Context attention Adapter Module (TCAM) to efficiently and effectively unleashing the model adaptability and transferability on Mars-specific terrain distribution. Then, a Local Terrain Feature Enhancement Network (LTEN) is designated to compensate for the limitations of MTG-SAM in capturing the fine-grained local terrain features of Mars surface. Afterwards, a simple yet efficient Gated Fusion Module (GFM) is introduced to dynamically merge the global contextual features from MTG-SAM encoder and the local refined features from LTEN module for comprehensive terrain feature learning. Moreover, the proposed MTSNet enables terrain-specific text as prompts resolving the efficiency issue of existing methods that require costly annotation of bounding boxes or foreground points. Experimental results on AI4Mars and ConeQuest datasets demonstrate that our proposed MTSNet can effectively learns the unique Martian terrain feature distribution and achieves state-of-the-art performance on multi-view terrain segmentation from both the perspectives of the Mars rover and the satellite remote sensing. Code is available at https://github.com/raoxuefeng/mtsnet. © 2024 ACM.

키워드

contextual attention adapterfeature enhancementgated fusionmartian terrain segmentationtext prompt encoder
제목
MTSNet: Joint Feature Adaptation and Enhancement for Text-Guided Multi-view Martian Terrain Segmentation
저자
Fang, YangRao, XuefengGao, XinboLi, WeishengMin, Zijian
DOI
10.1145/3664647.3681430
발행일
2024
유형
Proceedings Paper
저널명
MM 2024 - Proceedings of the 32nd ACM International Conference on Multimedia
페이지
5092 ~ 5101