Robust 3D Human Avatar Reconstruction From Monocular Videos Using Depth Optimization and Camera Pose Estimation

Kim, Kyung Min; Song, Byung Cheol

doi:10.1109/ACCESS.2025.3556445

Citations (Web of Science): 1

Citations (Scopus): 2

Abstract

This paper presents a novel approach for 3D human avatar reconstruction from monocular RGB videos, overcoming the limitations of existing template-based methods such as BANMo. We introduce a two-fold optimization framework: first, RelPose++ is used for accurate camera pose estimation; second, depth maps are incorporated to enhance 3D shape reconstruction. Our method minimizes intra-frame and inter-frame distances, improving detailed frame-level accuracy while maintaining temporal coherence across multiple video frames. Extensive experiments on the MEAD, Multiface, and FEED datasets demonstrate the superiority of our approach in generating realistic, deformable 3D avatars, achieving significant improvements in Chamfer distance and F-score compared to existing methods. The framework is particularly effective in complex scenarios, such as bust-shot videos with partial views of subjects, where it offers robust, high-quality 3D reconstructions.
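For readers unfamiliar with the two evaluation metrics named in the abstract, the following is a minimal sketch of how Chamfer distance and F-score are commonly computed between a reconstructed point set and a ground-truth point set. The abstract does not state the exact variant used in the paper, so the choices here are assumptions: unsquared Euclidean distances, a symmetric sum of the two directional means, and a caller-supplied distance threshold `tau`. The function names are hypothetical and are not taken from the authors' code.

```python
import numpy as np


def nearest_distances(a, b):
    """Distance from each point in a (N, 3) to its nearest neighbour in b (M, 3)."""
    diff = a[:, None, :] - b[None, :, :]           # (N, M, 3) pairwise differences
    return np.sqrt((diff ** 2).sum(axis=-1)).min(axis=1)


def chamfer_distance(pred, gt):
    """Symmetric Chamfer distance (assumed variant: sum of the two mean distances)."""
    return nearest_distances(pred, gt).mean() + nearest_distances(gt, pred).mean()


def f_score(pred, gt, tau):
    """Harmonic mean of precision and recall at distance threshold tau (assumed)."""
    precision = (nearest_distances(pred, gt) < tau).mean()
    recall = (nearest_distances(gt, pred) < tau).mean()
    if precision + recall == 0:
        return 0.0
    return float(2 * precision * recall / (precision + recall))


# Toy example: the prediction is the ground truth shifted by 1 unit along x.
gt = np.array([[0.0, 0.0, 0.0], [10.0, 0.0, 0.0]])
pred = gt + np.array([1.0, 0.0, 0.0])
cd = chamfer_distance(pred, gt)
fs_tight = f_score(pred, gt, tau=0.5)
fs_loose = f_score(pred, gt, tau=2.0)
```

Lower Chamfer distance and higher F-score indicate a closer match. The brute-force pairwise computation is only practical for small point sets; a spatial index would be the usual substitute for dense meshes.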

Keywords

Three-dimensional displays; Videos; Cameras; Image reconstruction; Avatars; Shape; Pose estimation; Depth measurement; Rendering (computer graphics); Accuracy; 3D avatar reconstruction; deformable 3D avatar; monocular video integration; structural alignment in 3D

Title: Robust 3D Human Avatar Reconstruction From Monocular Videos Using Depth Optimization and Camera Pose Estimation

Authors: Kim, Kyung Min; Song, Byung Cheol

DOI: 10.1109/ACCESS.2025.3556445

Published: 2025

Type: Article

Journal: IEEE Access

Volume: 13

Pages: 57886-57897