Enhancing Sign Language Recognition: A Computationally Efficient and Interpretable Multi-Modal Approach
- Tursunbaev, Chingiz;
- Toshpulatov, Mukhiddin;
- Adhikari, Nirmal;
- Lee, Wookey
Abstract
Sign Language Recognition (SLR) systems typically face a trade-off between accuracy and computational efficiency. High-performance models often rely on computationally expensive optical flow estimation or massive graph neural networks to capture the nuanced dynamics of sign gestures, rendering them unsuitable for real-time applications. In this paper, we propose a novel Doppler-Enhanced Multi-Stream Network that resolves this trade-off. Instead of relying on heavy flow estimation, we introduce a lightweight "Doppler Motion Stream" that utilizes frame-difference heatmaps to explicitly model high-speed gesture dynamics. By integrating this motion stream with traditional RGB and Keypoint modalities, our architecture effectively isolates semantic movement from background noise. We address critical challenges such as occluded hand shapes and rapid motions through this specialized multi-modal fusion. Experimental results on the Phoenix-2014 and CSL-Daily datasets demonstrate that our approach achieves state-of-the-art (SOTA) performance (18.05% WER) while maintaining a highly compact model size of 74 MB. This work sets a new benchmark for scalable, interpretable, and robust SLR systems. © 2026 IEEE.
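The abstract does not spell out how the frame-difference heatmaps are computed, so the following is only a minimal sketch of the general idea, not the authors' Doppler Motion Stream. It assumes a clip stored as a NumPy array, takes the absolute difference between consecutive frames, and scales each result by its own peak value. The function name `frame_difference_heatmaps`, the per-frame peak normalization, and the `eps` parameter are all illustrative assumptions.

```python
import numpy as np


def frame_difference_heatmaps(frames: np.ndarray, eps: float = 1e-8) -> np.ndarray:
    """Hypothetical helper: turn a clip into per-step motion heatmaps.

    frames: array of shape (T, H, W) or (T, H, W, C).
    Returns an array of shape (T - 1, H, W) with values in [0, 1].
    """
    # Cast first so that uint8 subtraction cannot wrap around.
    frames = frames.astype(np.float32)
    if frames.ndim == 4:
        # Assumption: collapse colour channels to a single intensity map.
        frames = frames.mean(axis=-1)
    # Absolute change between each frame and the one before it.
    diffs = np.abs(frames[1:] - frames[:-1])
    # Assumption: scale every heatmap by its own peak value.
    peak = diffs.max(axis=(1, 2), keepdims=True)
    return diffs / (peak + eps)


# Toy clip: a 2x2 bright square moving one pixel to the right per frame.
clip = np.zeros((3, 8, 8), dtype=np.uint8)
clip[0, 2:4, 2:4] = 255
clip[1, 2:4, 3:5] = 255
clip[2, 2:4, 4:6] = 255

heat = frame_difference_heatmaps(clip)
print(heat.shape)  # (2, 8, 8)
```

In this toy clip the static background and the pixels where the square overlaps itself stay at zero, while the leading and trailing edges of the square light up. That is the sense in which a difference map separates movement from an unchanging background, at far lower cost than estimating optical flow.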
- Title: Enhancing Sign Language Recognition: A Computationally Efficient and Interpretable Multi-Modal Approach
- Authors: Tursunbaev, Chingiz; Toshpulatov, Mukhiddin; Adhikari, Nirmal; Lee, Wookey
- Issue date: 2025
- Type: Proceedings Paper
- Journal: Proceedings of the IEEE International Conference on Big Data and Smart Computing, BIGCOMP
- Issue: 2026
- Pages: 390-393