Hierarchical cross-modal talking face generation with dynamic pixel-wise loss - Academic Resource Search

Hierarchical cross-modal talking face generation with dynamic pixel-wise loss

L Chen, RK Maddox, Z Duan… - Proceedings of the IEEE …, 2019 - openaccess.thecvf.com

We devise a cascade GAN approach to generate talking face video, which is robust to
different face shapes, view angles, facial characteristics, and noisy audio conditions. Instead
of learning a direct mapping from audio to video frames, we propose first to transfer audio to
high-level structure, i.e., the facial landmarks, and then to generate video frames conditioned
on the landmarks. Compared to a direct audio-to-image approach, our cascade approach
avoids fitting spurious correlations between audiovisual signals that are irrelevant to the …

Cited by 447 | Related articles | All 10 versions

[PDF] arxiv.org

Hierarchical cross-modal talking face generation with dynamic pixel-wise loss

L Chen, RK Maddox, Z Duan, C Xu - arXiv preprint arXiv:1905.03820, 2019 - arxiv.org

Cited by 10 | Related articles | All 3 versions
