Yuping Wang
ByteDance
Verified email at bytedance.com
Title
Cited by
Year
AudioLDM 2: Learning holistic audio generation with self-supervised pretraining
H Liu, Y Yuan, X Liu, X Mei, Q Kong, Q Tian, Y Wang, W Wang, Y Wang, ...
IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2024
Cited by: 84, Year: 2024
Efficient neural music generation
MWY Lam, Q Tian, T Li, Z Yin, S Feng, M Tu, Y Ji, R Xia, M Ma, X Song, ...
Advances in Neural Information Processing Systems 36, 2024
Cited by: 38, Year: 2024
Neural dubber: Dubbing for videos according to scripts
C Hu, Q Tian, T Li, W Yuping, Y Wang, H Zhao
Advances in neural information processing systems 34, 16582-16595, 2021
Cited by: 31, Year: 2021
LM-VC: Zero-shot voice conversion via speech generation based on language models
Z Wang, Y Chen, L Xie, Q Tian, Y Wang
IEEE Signal Processing Letters, 2023
Cited by: 23, Year: 2023
Seed-TTS: A Family of High-Quality Versatile Speech Generation Models
P Anastassiou, J Chen, J Chen, Y Chen, Z Chen, Z Chen, J Cong, L Deng, ...
arXiv preprint arXiv:2406.02430, 2024
Cited by: 21, Year: 2024
NeuFA: Neural network based end-to-end forced alignment with bidirectional attention mechanism
J Li, Y Meng, Z Wu, H Meng, Q Tian, Y Wang, Y Wang
ICASSP 2022-2022 IEEE International Conference on Acoustics, Speech and …, 2022
Cited by: 20, Year: 2022
PolyVoice: Language models for speech to speech translation
Q Dong, Z Huang, Q Tian, C Xu, T Ko, Y Zhao, S Feng, T Li, K Wang, ...
arXiv preprint arXiv:2306.02982, 2023
Cited by: 19, Year: 2023
Controllable and lossless non-autoregressive end-to-end text-to-speech
Z Liu, Q Tian, C Hu, X Liu, M Wu, Y Wang, H Zhao, Y Wang
arXiv preprint arXiv:2207.06088, 2022
Cited by: 13, Year: 2022
Inferring speaking styles from multi-modal conversational context by multi-scale relational graph convolutional networks
J Li, Y Meng, X Wu, Z Wu, J Jia, H Meng, Q Tian, Y Wang, Y Wang
Proceedings of the 30th ACM International Conference on Multimedia, 5811-5820, 2022
Cited by: 12, Year: 2022
Cloning one’s voice using very limited data in the wild
D Dai, Y Chen, L Chen, M Tu, L Liu, R Xia, Q Tian, Y Wang, Y Wang
ICASSP 2022-2022 IEEE International Conference on Acoustics, Speech and …, 2022
Cited by: 10, Year: 2022
DiCLET-TTS: Diffusion model based cross-lingual emotion transfer for text-to-speech—A study between English and Mandarin
T Li, C Hu, J Cong, X Zhu, J Li, Q Tian, Y Wang, L Xie
IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2023
Cited by: 8, Year: 2023
Streaming voice conversion via intermediate bottleneck features and non-streaming teacher guidance
Y Chen, M Tu, T Li, X Li, Q Kong, J Li, Z Wang, Q Tian, Y Wang, Y Wang
ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and …, 2023
Cited by: 7, Year: 2023
Seed-ASR: Understanding diverse speech and contexts with LLM-based speech recognition
Y Bai, J Chen, J Chen, W Chen, Z Chen, C Ding, L Dong, Q Dong, Y Du, ...
arXiv preprint arXiv:2407.04675, 2024
Cited by: 4, Year: 2024
Joint Multiscale Cross-Lingual Speaking Style Transfer With Bidirectional Attention Mechanism for Automatic Dubbing
J Li, S Li, P Chen, L Zhang, Y Meng, Z Wu, H Meng, Q Tian, Y Wang, ...
IEEE/ACM Transactions on Audio, Speech, and Language Processing 32, 517-528, 2023
Cited by: 3, Year: 2023
MSM-VC: high-fidelity source style transfer for non-parallel voice conversion by multi-scale style modeling
Z Wang, X Wang, Q Xie, T Li, L Xie, Q Tian, Y Wang
IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2023
Cited by: 3, Year: 2023
StreamVoice: Streamable Context-Aware Language Modeling for Real-time Zero-Shot Voice Conversion
Z Wang, Y Chen, X Wang, Z Chen, L Xie, Y Wang, Y Wang
arXiv preprint arXiv:2401.11053, 2024
Cited by: 2, Year: 2024
Delivering speaking style in low-resource voice conversion with multi-factor constraints
Z Wang, X Wang, L Xie, Y Chen, Q Tian, Y Wang
ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and …, 2023
Cited by: 2, Year: 2023
Multi-level temporal-channel speaker retrieval for robust zero-shot voice conversion
Z Wang, L Xue, Q Kong, L Xie, Y Chen, Q Tian, Y Wang
arXiv preprint arXiv:2305.07204, 2023
Cited by: 2, Year: 2023
U-Style: Cascading U-nets with Multi-level Speaker and Style Modeling for Zero-Shot Voice Cloning
T Li, Z Wang, X Zhu, J Cong, Q Tian, Y Wang, L Xie
IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2024
Year: 2024
Zero-Shot Accent Conversion using Pseudo Siamese Disentanglement Network
D Jia, Q Tian, K Peng, J Li, Y Chen, M Ma, Y Wang, Y Wang
arXiv preprint arXiv:2212.05751, 2022
Year: 2022