Multi-Scale Feature-Based Spatiotemporal Pyramid Network for Hand Gesture Recognition

Abstract

Effectively capturing the spatiotemporal features of hand gestures from sequence data is crucial for gesture recognition. Existing work captures motion features between neighboring frames through well-designed temporal modeling networks; however, less attention has been paid to the spatial information contained in each frame. These approaches ignore the implicit complementary advantages of multi-scale appearance representations, which are essential to gesture recognition. We propose a multi-scale, feature-based spatiotemporal pyramid network for hand gesture recognition. It has a top-down, lateral-connection architecture designed to fuse spatial and temporal features from multiple scales in each layer. The network first outputs a coarse feature in a feedforward pass and then refines this feature in the top-down pass using features from successive lower layers. Similar to skip connections, our approach uses features from each layer of the network, but it does not attempt to output independent predictions in each layer. Furthermore, we introduce a spatiotemporal pyramid module, formed by stacking multiple successive refinement modules, to fuse the multi-scale spatial features output from each layer. We evaluate the proposed model on two publicly available benchmark hand gesture datasets. The model achieved accuracies of 85.1% and 95.4% on the depth modality of the NVGesture and EgoGesture datasets, respectively. The comparison results show that the proposed hand gesture recognition method outperforms existing state-of-the-art methods.
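
The coarse-to-fine refinement described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not state the fusion operator, so nearest-neighbour upsampling and element-wise addition are assumptions borrowed from common feature-pyramid designs, features are reduced to 1-D lists for readability, and the names `upsample_nearest`, `refine`, and `top_down_fuse` are hypothetical.

```python
from typing import List

Feature = List[float]  # toy 1-D feature map; a real network would use tensors


def upsample_nearest(x: Feature, factor: int = 2) -> Feature:
    """Nearest-neighbour upsampling: repeat every element `factor` times."""
    return [v for v in x for _ in range(factor)]


def refine(coarse: Feature, lateral: Feature) -> Feature:
    """One assumed refinement step: upsample the coarse map, add the lateral map."""
    up = upsample_nearest(coarse)
    if len(up) != len(lateral):
        raise ValueError("lateral feature must be twice as long as the coarse one")
    return [u + l for u, l in zip(up, lateral)]


def top_down_fuse(pyramid: List[Feature]) -> Feature:
    """Fuse features ordered from finest (index 0) to coarsest (last).

    Starts from the coarsest map and refines it with each successively
    finer map, returning one fused feature rather than per-level predictions.
    """
    fused = pyramid[-1]
    for lateral in reversed(pyramid[:-1]):
        fused = refine(fused, lateral)
    return fused


if __name__ == "__main__":
    # Toy feedforward outputs at three scales (lengths 4, 2, 1).
    levels = [[1.0, 2.0, 3.0, 4.0], [10.0, 20.0], [100.0]]
    print(top_down_fuse(levels))  # [111.0, 112.0, 123.0, 124.0]
```

The single return value mirrors the abstract's point that every layer contributes features while only one fused output is produced; in an actual model the addition would likely be replaced by learned refinement modules.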

Keywords

Deep Learning; Hand Gesture Recognition; Pyramid Network; Spatiotemporal Feature
Title
Multi-Scale Feature-Based Spatiotemporal Pyramid Network for Hand Gesture Recognition
Authors
Cao, Zongjing; Li, Yan; Shin, Byeong-Seok
DOI
10.22967/HCIS.2022.12.046
Publication date
2022-10-15
Type
Article
Journal
Human-centric Computing and Information Sciences
12