Responsive listening head generation: a benchmark dataset and baseline

Z Huang, T Zhang, W Heng, B Shi, S Zhou - European Conference on …, 2022 - Springer

Real-time video frame interpolation (VFI) is very useful in video processing, media players,
and display devices. We propose RIFE, a Real-time Intermediate Flow Estimation algorithm …

Cited by: 391 (7 versions)

Deep person generation: A survey from the perspective of face, pose, and cloth synthesis

T Sha, W Zhang, T Shen, Z Li, T Mei - ACM Computing Surveys, 2023 - dl.acm.org

Deep person generation has attracted extensive research attention due to its wide
applications in virtual agents, video conferencing, online shopping, and art/movie …

Cited by: 40 (3 versions)

Can language models learn to listen?

E Ng, S Subramanian, D Klein… - Proceedings of the …, 2023 - openaccess.thecvf.com

We present a framework for generating appropriate facial responses from a listener in
dyadic social interactions based on the speaker's words. Given an input transcription of the …

Cited by: 18 (5 versions)

From audio to photoreal embodiment: Synthesizing humans in conversations

E Ng, J Romero, T Bagautdinov, S Bai… - Proceedings of the …, 2024 - openaccess.thecvf.com

We present a framework for generating full-bodied photorealistic avatars that gesture
according to the conversational dynamics of a dyadic interaction. Given speech audio we …

Cited by: 28 (3 versions)

Emotional listener portrait: Neural listener head generation with emotion

L Song, G Yin, Z Jin, X Dong… - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com

Listener head generation centers on generating non-verbal behaviors (e.g., smile) of a
listener in reference to the information delivered by a speaker. A significant challenge when …

Cited by: 7 (2 versions)

Reactface: Multiple appropriate facial reaction generation in dyadic interactions

C Luo, S Song, W Xie, M Spitale, L Shen… - arXiv preprint arXiv …, 2023 - arxiv.org

In dyadic interaction, predicting the listener's facial reactions is challenging as different
reactions may be appropriate in response to the same speaker's behaviour. This paper …

Cited by: 13 (2 versions)

Reversible graph neural network-based reaction distribution learning for multiple appropriate facial reactions generation

T Xu, M Spitale, H Tang, L Liu, H Gunes… - arXiv preprint arXiv …, 2023 - arxiv.org

Generating facial reactions in a human-human dyadic interaction is complex and highly
dependent on the context since more than one facial reaction can be appropriate for the …

Cited by: 13 (2 versions)

Mfr-net: Multi-faceted responsive listening head generation via denoising diffusion model

J Liu, X Wang, X Fu, Y Chai, C Yu, J Dai… - Proceedings of the 31st …, 2023 - dl.acm.org

Face-to-face communication is a common scenario including roles of speakers and
listeners. Most existing research methods focus on producing speaker videos, while the …

Cited by: 8 (3 versions)

Audio-driven talking head generation with transformer and 3d morphable model

R Huang, W Zhong, G Li - Proceedings of the 30th ACM International …, 2022 - dl.acm.org

In the task of talking head generation, it is hard to learn the mapping relationship between
generated head image and input audio signal. To tackle this challenge, we propose to learn …

Cited by: 15 (2 versions)

Emotional listener portrait: Realistic listener motion simulation in conversation

L Song, G Yin, Z Jin, X Dong… - 2023 IEEE/CVF …, 2023 - ieeexplore.ieee.org

Listener head generation centers on generating non-verbal behaviors (e.g., smile) of a
listener in reference to the information delivered by a speaker. A significant challenge when …

Cited by: 10 (4 versions)