Hifi-codec: Group-residual vector quantization for high fidelity audio codec D Yang, S Liu, R Huang, J Tian, C Weng, Y Zou arXiv preprint arXiv:2305.02765, 2023 | 37 | 2023 |
Uniaudio: An audio foundation model toward universal audio generation D Yang, J Tian, X Tan, R Huang, S Liu, X Chang, J Shi, S Zhao, J Bian, ... arXiv preprint arXiv:2310.00704, 2023 | 34 | 2023 |
Reproducing whisper-style training using an open-source toolkit and publicly available data Y Peng, J Tian, B Yan, D Berrebbi, X Chang, X Li, J Shi, S Arora, W Chen, ... 2023 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), 1-8, 2023 | 14 | 2023 |
LAE: Language-aware encoder for monolingual and multilingual asr J Tian, J Yu, C Zhang, C Weng, Y Zou, D Yu Interspeech 2022, 2022 | 14 | 2022 |
Consistent training and decoding for end-to-end speech recognition using lattice-free mmi J Tian, J Yu, C Weng, SX Zhang, D Su, D Yu, Y Zou ICASSP 2022-2022 IEEE International Conference on Acoustics, Speech and …, 2022 | 13 | 2022 |
Improving Mandarin End-to-End Speech Recognition with Word N-gram Language Model J Tian, J Yu, C Weng, Y Zou, D Yu IEEE Signal Processing Letters 29, 812-816, 2022 | 10 | 2022 |
OWSM v3.1: Better and Faster Open Whisper-Style Speech Models based on E-Branchformer Y Peng, J Tian, W Chen, S Arora, B Yan, Y Sudo, M Shakeel, K Choi, ... arXiv preprint arXiv:2401.16658, 2024 | 9 | 2024 |
Exploring speech recognition, translation, and understanding with discrete speech units: A comparative study X Chang, B Yan, K Choi, JW Jung, Y Lu, S Maiti, R Sharma, J Shi, J Tian, ... ICASSP 2024-2024 IEEE International Conference on Acoustics, Speech and …, 2024 | 8 | 2024 |
Integrating Lattice-Free MMI into End-to-End Speech Recognition J Tian, J Yu, C Weng, Y Zou, D Yu IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2022 | 8* | 2022 |
Bayes risk CTC: Controllable CTC alignment in Sequence-to-Sequence tasks J Tian, B Yan, J Yu, C Weng, D Yu, S Watanabe International Conference on Learning Representations (ICLR) 2023, 2022 | 7 | 2022 |
A random gossip BMUF process for neural language modeling Y Huang, J Tian, L Han, G Wang, X Song, D Su, D Yu ICASSP 2020-2020 IEEE International Conference on Acoustics, Speech and …, 2020 | 3 | 2020 |
The MineTrans Systems for IWSLT 2023 Offline Speech Translation and Speech-to-Speech Translation Tasks Y Du, G Zhengsheng, J Tian, Z Zhang, X Wang, J Yu, Z Tu, T Xu, E Chen Proceedings of the 20th International Conference on Spoken Language …, 2023 | 2 | 2023 |
Speaker-Aware Mixture of Mixtures Training for Weakly Supervised Speaker Extraction Z Zhao, R Gu, D Yang, J Tian, Y Zou Interspeech 2022, 2022 | 2 | 2022 |
AutoPrep: An Automatic Preprocessing Framework for In-The-Wild Speech Data J Yu, H Chen, Y Bian, X Li, Y Luo, J Tian, M Liu, J Jiang, S Wang ICASSP 2024-2024 IEEE International Conference on Acoustics, Speech and …, 2024 | | 2024 |
Bayes Risk Transducer: Transducer with Controllable Alignment Prediction J Tian, J Yu, H Chen, B Yan, C Weng, D Yu, S Watanabe arXiv preprint arXiv:2308.10107, 2023 | | 2023 |
UniAudio: Towards Universal Audio Generation with Large Language Models D Yang, J Tian, X Tan, R Huang, S Liu, H Guo, X Chang, J Shi, J Bian, ... Forty-first International Conference on Machine Learning | | |
MVoice: Multilingual Unified Voice Generation With Discrete Representation at Scale R Huang, C Zhang, Y Wang, D Yang, J Tian, L Liu, Z Ye, Z Jiang, ... | | |