Egocentric Hand Activity Video Dataset and Bidirectional Motion-Priors for Hand Action Recognition

  • Seo, Jiyoung
  • In Lee, Dong
  • Lee, Pilhyeon
  • Lee, Jiwoo
  • Gil, Younhee
  • and 2 others

Abstract

Recognizing tool-based hand activities from a first-person view is a critical yet challenging task in computer vision, due to the complexity of hand-object interactions and often subtle, ambiguous motion patterns. In real-world manufacturing scenarios, these challenges are exacerbated by bidirectional action pairs whose visual cues are almost identical, with differences revealed only through subtle motion dynamics. However, existing datasets rarely capture these direction-sensitive interactions at scale, particularly in realistic tool-use contexts, which limits the ability of current models to learn the fine-grained motion dynamics essential for accurate recognition. We introduce Ego-Bi (Egocentric-Bidirectional dataset), a large-scale, real-world egocentric RGB video dataset comprising 1,223 video sequences and 622,737 frames that cover diverse tool-use activities in unconstrained environments. Ego-Bi provides an extended 38-category hand type taxonomy, detailed object-tool labels, and challenging bidirectional action pairs, offering rich semantic and temporal cues for modeling complex hand-object interactions. In addition, to address the ambiguity in motion dynamics, we propose the Bidirectional Motion Prior (BMP) module, which derives rotation and directional cues from predicted 3D hand poses to improve the class separability of visually similar actions. Experimental results on Ego-Bi demonstrate that our approach improves bidirectional action recognition accuracy by +8.96% over the baseline, while also yielding consistent gains across general action classes without requiring costly 3D pose annotations. Furthermore, the proposed motion priors generalize effectively to other egocentric benchmarks, underscoring their robustness in handling visually similar, direction-sensitive actions.

Keywords

Hands; Taxonomy; Dynamics; Three-dimensional displays; Videos; Semantics; Annotations; Transformers; Thumb; Pipelines; Hand action recognition; hand-object interaction; dynamic motion cue; hand type taxonomy; hand pose estimation
Title
Egocentric Hand Activity Video Dataset and Bidirectional Motion-Priors for Hand Action Recognition
Authors
Seo, Jiyoung; In Lee, Dong; Lee, Pilhyeon; Lee, Jiwoo; Gil, Younhee; Ramani, Karthik; Kim, Sangpil
DOI
10.1109/ACCESS.2026.3652803
Publication date
2026
Type
Article
Journal
IEEE Access
Volume
14
Pages
8128-8148