Wei-Ning Hsu
Facebook AI Research (FAIR)
Verified email at csail.mit.edu
Title
Cited by
Year
HuBERT: Self-supervised speech representation learning by masked prediction of hidden units
WN Hsu, B Bolte, YHH Tsai, K Lakhotia, R Salakhutdinov, A Mohamed
IEEE/ACM Transactions on Audio, Speech, and Language Processing 29, 3451-3460, 2021
Cited by: 2195 | Year: 2021
Data2vec: A general framework for self-supervised learning in speech, vision and language
A Baevski, WN Hsu, Q Xu, A Babu, J Gu, M Auli
International Conference on Machine Learning, 1298-1312, 2022
Cited by: 725 | Year: 2022
An unsupervised autoregressive model for speech representation learning
YA Chung, WN Hsu, H Tang, J Glass
INTERSPEECH, 2019
Cited by: 442 | Year: 2019
Unsupervised learning of disentangled and interpretable representations from sequential data
WN Hsu, Y Zhang, J Glass
Thirty-first Conference on Neural Information Processing Systems (NeurIPS), 2017
Cited by: 405 | Year: 2017
Hierarchical generative modeling for controllable speech synthesis
WN Hsu, Y Zhang, RJ Weiss, H Zen, Y Wu, Y Wang, Y Cao, Y Jia, Z Chen, ...
Seventh International Conference on Learning Representations (ICLR), 2019
Cited by: 296* | Year: 2019
Unsupervised speech recognition
A Baevski, WN Hsu, A Conneau, M Auli
Advances in Neural Information Processing Systems 34, 27826-27839, 2021
Cited by: 279 | Year: 2021
On generative spoken language modeling from raw audio
K Lakhotia, E Kharitonov, WN Hsu, Y Adi, A Polyak, B Bolte, TA Nguyen, ...
Transactions of the Association for Computational Linguistics 9, 1336-1354, 2021
Cited by: 270 | Year: 2021
Speech Resynthesis from Discrete Disentangled Self-Supervised Representations
A Polyak, Y Adi, J Copet, E Kharitonov, K Lakhotia, WN Hsu, A Mohamed, ...
INTERSPEECH, 2021
Cited by: 243 | Year: 2021
Learning audio-visual speech representation by masked multimodal cluster prediction
B Shi, WN Hsu, K Lakhotia, A Mohamed
arXiv preprint arXiv:2201.02184, 2022
Cited by: 233 | Year: 2022
Robust wav2vec 2.0: Analyzing Domain Shift in Self-Supervised Pre-Training
WN Hsu, A Sriram, A Baevski, T Likhomanenko, Q Xu, V Pratap, J Kahn, ...
INTERSPEECH, 2021
Cited by: 226 | Year: 2021
Lingvo: a modular and scalable framework for sequence-to-sequence modeling
J Shen, P Nguyen, Y Wu, Z Chen, MX Chen, Y Jia, A Kannan, T Sainath, ...
arXiv preprint arXiv:1902.08295, 2019
Cited by: 202 | Year: 2019
Active learning by learning
WN Hsu, HT Lin
Proceedings of the AAAI Conference on Artificial Intelligence 29 (1), 2015
Cited by: 194 | Year: 2015
Learning Latent Representations for Speech Generation and Transformation
WN Hsu, Y Zhang, J Glass
INTERSPEECH, 1273-1277, 2017
Cited by: 182 | Year: 2017
Scaling speech technology to 1,000+ languages
V Pratap, A Tjandra, B Shi, P Tomasello, A Babu, S Kundu, A Elkahky, ...
Journal of Machine Learning Research 25 (97), 1-52, 2024
Cited by: 164 | Year: 2024
Unsupervised domain adaptation for robust speech recognition via variational autoencoder-based data augmentation
WN Hsu, Y Zhang, J Glass
2017 IEEE automatic speech recognition and understanding workshop (ASRU), 16-23, 2017
Cited by: 164 | Year: 2017
Semi-supervised training for improving data efficiency in end-to-end speech synthesis
YA Chung, Y Wang, WN Hsu, Y Zhang, RJ Skerry-Ryan
ICASSP 2019-2019 IEEE International Conference on Acoustics, Speech and …, 2019
Cited by: 138 | Year: 2019
Direct speech-to-speech translation with discrete units
A Lee, PJ Chen, C Wang, J Gu, S Popuri, X Ma, A Polyak, Y Adi, Q He, ...
arXiv preprint arXiv:2107.05604, 2021
Cited by: 135 | Year: 2021
Disentangling correlated speaker and noise for speech synthesis via data augmentation and adversarial factorization
WN Hsu, Y Zhang, RJ Weiss, YA Chung, Y Wang, Y Wu, J Glass
ICASSP 2019-2019 IEEE International Conference on Acoustics, Speech and …, 2019
Cited by: 124 | Year: 2019
Textless speech-to-speech translation on real data
A Lee, H Gong, PA Duquenne, H Schwenk, PJ Chen, C Wang, S Popuri, ...
arXiv preprint arXiv:2112.08352, 2021
Cited by: 114 | Year: 2021
Voicebox: Text-guided multilingual universal speech generation at scale
M Le, A Vyas, B Shi, B Karrer, L Sari, R Moritz, M Williamson, V Manohar, ...
Advances in neural information processing systems 36, 2024
Cited by: 113 | Year: 2024
Articles 1–20