Talking Head Generation Based on 3D Morphable Facial Model

Hsin Yu Shen, Wen Jiin Tsai

Research output: Conference contribution › Peer-reviewed

Abstract

This paper presents a framework for one-shot talking-head video generation that takes a single person image and audio clips as input and synthesizes photo-realistic videos with natural head poses and lip motion synced to the driving audio. The main idea behind this framework is to use 3D Morphable Model (3DMM) parameters as an intermediate representation for generating the videos. We design an Expression Predictor and a Head Pose Predictor to predict facial-expression and head-pose parameters from audio, respectively, and adopt a 3DMM to extract identity and texture parameters from the reference image. With these parameters, facial images are rendered as an auxiliary signal to guide video generation. Compared to widely used facial landmarks, 3DMM parameters are more powerful in representing facial details. Experimental results show that our method can generate realistic talking-head videos and outperforms many state-of-the-art methods.
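The abstract describes a pipeline in which audio-driven parameters (expression, head pose) are combined with image-derived parameters (identity, texture) to drive a 3DMM renderer. The following is a minimal Python sketch of that data flow only; all class and function names are illustrative assumptions, and the predictor bodies are placeholders for the learned models described in the paper, not the authors' implementation.

```python
# Hypothetical sketch of the 3DMM-parameter pipeline described in the abstract.
# Names (FrameParams, predict_expression, predict_head_pose, build_frame_sequence)
# are assumptions for illustration, not the authors' code.

from dataclasses import dataclass
from typing import List


@dataclass
class FrameParams:
    """Per-frame 3DMM parameters that would drive the face renderer."""
    identity: List[float]    # extracted once from the reference image
    texture: List[float]     # extracted once from the reference image
    expression: List[float]  # predicted per frame from audio (Expression Predictor)
    head_pose: List[float]   # predicted per frame from audio (Head Pose Predictor)


def predict_expression(audio_feat: List[float]) -> List[float]:
    # Placeholder for the learned Expression Predictor.
    return [0.1 * a for a in audio_feat]


def predict_head_pose(audio_feat: List[float]) -> List[float]:
    # Placeholder for the learned Head Pose Predictor
    # (e.g. rotation angles plus translation, here a fixed-size slice).
    return [0.05 * a for a in audio_feat[:6]]


def build_frame_sequence(identity: List[float],
                         texture: List[float],
                         audio_frames: List[List[float]]) -> List[FrameParams]:
    """Hold identity/texture fixed across frames; vary expression and pose with audio."""
    return [
        FrameParams(identity, texture,
                    predict_expression(f), predict_head_pose(f))
        for f in audio_frames
    ]
```

The key design point the abstract makes is visible here: identity and texture are fixed by the single reference image, while only expression and head pose vary with the driving audio, so the rendered auxiliary frames preserve the subject's identity.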

Original language: English
Title of host publication: 2024 Picture Coding Symposium, PCS 2024 - Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (electronic): 9798350358483
DOIs
Publication status: Published - 2024
Event: 2024 Picture Coding Symposium, PCS 2024 - Taichung, Taiwan
Duration: 12 Jun 2024 to 14 Jun 2024

Publication series

Name: 2024 Picture Coding Symposium, PCS 2024 - Proceedings

Conference

Conference: 2024 Picture Coding Symposium, PCS 2024
Country/Territory: Taiwan
City: Taichung
Period: 12/06/24 to 14/06/24
