Robust 3D Human Avatar Reconstruction From Monocular Videos Using Depth Optimization and Camera Pose Estimation

Abstract

This paper presents a novel approach to 3D human avatar reconstruction from monocular RGB videos that overcomes the limitations of existing template-based methods such as BANMo. We introduce a two-fold optimization framework: first, RelPose++ provides accurate camera pose estimation; second, depth maps are incorporated to enhance 3D shape reconstruction. Our method minimizes intra-frame and inter-frame distances, improving frame-level accuracy while maintaining temporal coherence across video frames. Extensive experiments on the MEAD, Multiface, and FEED datasets show that our approach generates realistic, deformable 3D avatars and significantly improves Chamfer distance and F-score over existing methods. The framework is particularly effective in complex scenarios, such as bust-shot videos with partial views of subjects, where it yields robust, high-quality 3D reconstructions.
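
The abstract reports results in terms of Chamfer distance and F-score. As a rough illustration of what these two metrics measure, the following is a minimal NumPy sketch for two small point clouds. It is not the paper's evaluation code: the function names, the unsquared mean-of-nearest-neighbor form of the Chamfer distance, and the `threshold` parameter are assumptions, and published definitions vary (squared vs. unsquared distances, sum vs. mean).

```python
import numpy as np


def pairwise_distances(a, b):
    """Euclidean distance between every point in a (N, 3) and b (M, 3)."""
    diff = a[:, None, :] - b[None, :, :]  # shape (N, M, 3) via broadcasting
    return np.linalg.norm(diff, axis=-1)  # shape (N, M)


def chamfer_distance(pred, gt):
    """Symmetric Chamfer distance (assumed form: mean nearest-neighbor
    distance from pred to gt plus the same from gt to pred)."""
    d = pairwise_distances(pred, gt)
    return float(d.min(axis=1).mean() + d.min(axis=0).mean())


def f_score(pred, gt, threshold):
    """F-score at a distance threshold (hypothetical parameter).

    Precision: fraction of predicted points within `threshold` of gt.
    Recall: fraction of gt points within `threshold` of the prediction.
    """
    d = pairwise_distances(pred, gt)
    precision = float((d.min(axis=1) < threshold).mean())
    recall = float((d.min(axis=0) < threshold).mean())
    if precision + recall == 0.0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)
```

Lower Chamfer distance and higher F-score both indicate a reconstruction closer to the reference shape. The dense distance matrix used here is only practical for small point sets; a spatial index would be the usual choice for full meshes.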

Keywords

Three-dimensional displays; Videos; Cameras; Image reconstruction; Avatars; Shape; Pose estimation; Depth measurement; Rendering (computer graphics); Accuracy; 3D avatar reconstruction; deformable 3D avatar; monocular video integration; structural alignment in 3D
Title
Robust 3D Human Avatar Reconstruction From Monocular Videos Using Depth Optimization and Camera Pose Estimation
Authors
Kim, Kyung Min; Song, Byung Cheol
DOI
10.1109/ACCESS.2025.3556445
Publication date
2025
Type
Article
Journal
IEEE Access, vol. 13
Pages
57886-57897