Fre-GAN: Adversarial frequency-consistent audio synthesis JH Kim, SH Lee, JH Lee, SW Lee arXiv preprint arXiv:2106.02297, 2021 | 62 | 2021 |
Multi-spectrogan: High-diversity and high-fidelity spectrogram generation with adversarial style combination for speech synthesis SH Lee, HW Yoon, HR Noh, JH Kim, SW Lee Proceedings of the AAAI Conference on Artificial Intelligence 35 (14), 13198 …, 2021 | 56 | 2021 |
HierSpeech: Bridging the Gap between Text and Speech by Hierarchical Variational Inference using Self-supervised Representations for Speech Synthesis SH Lee, SB Kim, JH Lee, E Song, MJ Hwang, SW Lee Advances in Neural Information Processing Systems, 2022 | 39 | 2022 |
Voicemixer: Adversarial voice style mixup SH Lee, JH Kim, H Chung, SW Lee Advances in Neural Information Processing Systems 34, 294-308, 2021 | 32 | 2021 |
Emoq-tts: Emotion intensity quantization for fine-grained controllable emotional text-to-speech CB Im, SH Lee, SB Kim, SW Lee ICASSP 2022-2022 IEEE International Conference on Acoustics, Speech and …, 2022 | 31 | 2022 |
Duration controllable voice conversion via phoneme-based information bottleneck SH Lee, HR Noh, WJ Nam, SW Lee IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1173-1183, 2022 | 20 | 2022 |
Hierspeech++: Bridging the gap between semantic and acoustic representation of speech by hierarchical variational inference for zero-shot speech synthesis SH Lee, HY Choi, SB Kim, SW Lee arXiv preprint arXiv:2311.12454, 2023 | 16 | 2023 |
Reinforce-aligner: Reinforcement alignment search for robust end-to-end text-to-speech H Chung, SH Lee, SW Lee arXiv preprint arXiv:2106.02830, 2021 | 16 | 2021 |
Diff-HierVC: Diffusion-based Hierarchical Voice Conversion with Robust Pitch Generation and Masked Prior for Zero-shot Speaker Adaptation HY Choi, SH Lee, SW Lee Interspeech 2023 | 15 | 2023 |
Dddm-vc: Decoupled denoising diffusion models with disentangled representation and prior mixup for verified robust voice conversion HY Choi, SH Lee, SW Lee Proceedings of the AAAI Conference on Artificial Intelligence 38 (16), 17862 …, 2024 | 14 | 2024 |
Fre-gan 2: Fast and efficient frequency-consistent audio synthesis SH Lee, JH Kim, KE Lee, SW Lee ICASSP 2022-2022 IEEE International Conference on Acoustics, Speech and …, 2022 | 14 | 2022 |
PVAE-TTS: Adaptive text-to-speech via progressive style adaptation JH Lee, SH Lee, JH Kim, SW Lee ICASSP 2022-2022 IEEE International Conference on Acoustics, Speech and …, 2022 | 14 | 2022 |
Audio dequantization for high fidelity audio generation in flow-based neural vocoder HW Yoon, SH Lee, HR Noh, SW Lee arXiv preprint arXiv:2008.06867, 2020 | 13 | 2020 |
Hiddensinger: High-quality singing voice synthesis via neural audio codec and latent diffusion models JS Hwang, SH Lee, SW Lee arXiv preprint arXiv:2306.06814, 2023 | 9 | 2023 |
Diffprosody: Diffusion-based latent prosody generation for expressive speech synthesis with prosody conditional adversarial training HS Oh, SH Lee, SW Lee IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2024 | 8 | 2024 |
HierVST: Hierarchical Adaptive Zero-shot Voice Style Transfer SH Lee, HY Choi, HS Oh, SW Lee arXiv preprint arXiv:2307.16171, 2023 | 7 | 2023 |
GC-TTS: Few-shot speaker adaptation with geometric constraints JH Kim, SH Lee, JH Lee, HG Jung, SW Lee 2021 IEEE International Conference on Systems, Man, and Cybernetics (SMC …, 2021 | 7 | 2021 |
Audio Super-Resolution With Robust Speech Representation Learning of Masked Autoencoder SB Kim, SH Lee, HY Choi, SW Lee IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2024 | 3 | 2024 |
PauseSpeech: Natural speech synthesis via pre-trained language model and pause-based prosody modeling JS Hwang, SH Lee, SW Lee Asian Conference on Pattern Recognition, 415-427, 2023 | 3 | 2023 |
StyleVC: Non-Parallel Voice Conversion with Adversarial Style Generalization IS Hwang, SH Lee, SW Lee 2022 26th International Conference on Pattern Recognition (ICPR), 23-30, 2022 | 3 | 2022 |