Meta-TTS: Meta-learning for few-shot speaker adaptive text-to-speech SF Huang, CJ Lin, DR Liu, YC Chen, H Lee IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1558-1571, 2022 | 52 | 2022 |
Audio word2vec: Sequence-to-sequence autoencoding for unsupervised learning of audio segmentation and representation YC Chen, SF Huang, H Lee, YH Wang, CH Shen IEEE/ACM Transactions on Audio, Speech, and Language Processing 27 (9), 1481 …, 2019 | 41 | 2019 |
Phonetic-and-semantic embedding of spoken words with applications in spoken content retrieval YC Chen, SF Huang, CH Shen, HY Lee, LS Lee 2018 IEEE Spoken Language Technology Workshop (SLT), 941-948, 2018 | 40 | 2018 |
Pretrained language model embryology: The birth of ALBERT CH Chiang, SF Huang, H Lee arXiv preprint arXiv:2010.02480, 2020 | 29 | 2020 |
Stabilizing label assignment for speech separation by self-supervised pre-training SF Huang, SP Chuang, DR Liu, YC Chen, GP Yang, H Lee arXiv preprint arXiv:2010.15366, 2020 | 23* | 2020 |
Towards unsupervised automatic speech recognition trained by unaligned speech and text only YC Chen, CH Shen, SF Huang, H Lee arXiv preprint arXiv:1803.10952, 2018 | 18 | 2018 |
Almost-unsupervised speech recognition with close-to-zero resource based on phonetic structures learned from very small unpaired speech and text data YC Chen, CH Shen, SF Huang, H Lee, L Lee arXiv preprint arXiv:1810.12566, 2018 | 13 | 2018 |
Non-autoregressive Mandarin-English code-switching speech recognition SP Chuang, HJ Chang, SF Huang, H Lee 2021 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU …, 2021 | 12 | 2021 |
SpeechNet: A universal modularized model for speech processing tasks YC Chen, PH Chi, S Yang, KW Chang, J Lin, SF Huang, DR Liu, CL Liu, ... arXiv preprint arXiv:2105.03070, 2021 | 11 | 2021 |
Learning phone recognition from unpaired audio and phone sequences based on generative adversarial network D Liu, P Hsu, Y Chen, S Huang, S Chuang, D Wu, H Lee IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 230-243, 2021 | 7 | 2021 |
Improved audio embeddings by adjacency-based clustering with applications in spoken term detection SF Huang, YC Chen, H Lee, L Lee arXiv preprint arXiv:1811.02775, 2018 | 7 | 2018 |
Few-shot cross-lingual TTS using transferable phoneme embedding WP Huang, PC Chen, SF Huang, H Lee arXiv preprint arXiv:2206.15427, 2022 | 2 | 2022 |
From semi-supervised to almost-unsupervised speech recognition with very-low resource by jointly learning phonetic structures from audio and text embeddings YC Chen, SF Huang, H Lee, L Lee arXiv preprint arXiv:1904.05078, 2019 | 2 | 2019 |
Maximizing Data Efficiency for Cross-Lingual TTS Adaptation by Self-Supervised Representation Mixing and Embedding Initialization WP Huang, SF Huang, H Lee 2023 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), 1-8, 2023 | | 2023 |
Personalized Lightweight Text-to-Speech: Voice Cloning with Adaptive Structured Pruning SF Huang, CP Chen, ZS Chen, YP Tsai, H Lee ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and …, 2023 | | 2023 |