Talking human face generation: A survey

Citations

Web of Science: 27

Scopus: 36
Abstract

Talking human face generation aims to synthesize a natural human face that talks in correspondence with a given text or audio sequence. With recently developed Deep Learning (DL) methods for data generation, such as Convolutional Neural Networks (CNNs), Generative Adversarial Networks (GANs), and Neural Radiance Fields (NeRF), talking human face generation has attracted significant research interest from academia and industry. These methods have been explored and exploited recently to address several problems in image processing and computer vision. Notwithstanding notable advancements, applying them to real-world problems such as talking human face generation remains challenging. Deepfakes generated by these methods could greatly promote many fascinating applications, including augmented reality, virtual reality, computer games, teleconferencing, virtual try-on, special movie effects, and avatars. This survey reviews and discusses DL-related methods, including CNNs, GANs, and NeRF, and their application to talking human face generation. We aim to analyze existing approaches with respect to their use in talking face generation, investigate the related general problems, and highlight open research issues. We also provide quantitative and qualitative evaluations of existing approaches in the field.

Keywords

Talking human face animation; 3D face generation; Deep generative model; Autoencoder; Neural radiance field; Datasets; Evaluation metrics; Neural networks; Unsupervised learning; Mel spectrogram; ADVERSARIAL NETWORKS; 3D; GAN; CLASSIFICATION; MODEL; GEOCHEMISTRY
Title
Talking human face generation: A survey
Authors
Toshpulatov, Mukhiddin; Lee, Wookey; Lee, Suan
DOI
10.1016/j.eswa.2023.119678
Publication date
2023-06-01
Type
Review
Journal
Expert Systems with Applications
Volume 219