StarGANv2-VC: A diverse, unsupervised, non-parallel framework for natural-sounding voice conversion YA Li, A Zare, N Mesgarani arXiv preprint arXiv:2107.10394, 2021 | 78 | 2021 |
Simple framework for constructing functional spiking recurrent neural networks R Kim, Y Li, TJ Sejnowski Proceedings of the National Academy of Sciences 116 (45), 22811-22820, 2019 | 69 | 2019 |
StyleTTS 2: Towards human-level text-to-speech through style diffusion and adversarial training with large speech language models YA Li, C Han, V Raghavan, G Mischler, N Mesgarani Advances in Neural Information Processing Systems 36, 2024 | 34 | 2024 |
StyleTTS: A style-based generative model for natural and diverse text-to-speech synthesis YA Li, C Han, N Mesgarani arXiv preprint arXiv:2205.15439, 2022 | 26 | 2022 |
StyleTTS-VC: One-shot voice conversion by knowledge transfer from style-based TTS models YA Li, C Han, N Mesgarani 2022 IEEE Spoken Language Technology Workshop (SLT), 920-927, 2023 | 12 | 2023 |
Phoneme-level BERT for enhanced prosody of text-to-speech with grapheme predictions YA Li, C Han, X Jiang, N Mesgarani ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and …, 2023 | 8 | 2023 |
Learning the synaptic and intrinsic membrane dynamics underlying working memory in spiking neural network models Y Li, R Kim, TJ Sejnowski Neural Computation 33 (12), 3264-3287, 2021 | 4 | 2021 |
SLMGAN: Exploiting speech language model representations for unsupervised zero-shot voice conversion in GANs YA Li, C Han, N Mesgarani 2023 IEEE Workshop on Applications of Signal Processing to Audio and …, 2023 | 3 | 2023 |
Contextual Feature Extraction Hierarchies Converge in Large Language Models and the Brain G Mischler, YA Li, S Bickel, AD Mehta, N Mesgarani arXiv preprint arXiv:2401.17671, 2024 | 2 | 2024 |
Improved decoding of attentional selection in multi-talker environments with self-supervised learned speech representation C Han, V Choudhari, YA Li, N Mesgarani 2023 45th Annual International Conference of the IEEE Engineering in …, 2023 | 2 | 2023 |
Supervised spike sorting using deep convolutional siamese network and hierarchical clustering Y Li, S Tang, VR de Sa unpublished thesis, 2019 | 2 | 2019 |
Listen, Chat, and Edit: Text-Guided Soundscape Modification for Enhanced Auditory Experience X Jiang, C Han, YA Li, N Mesgarani arXiv preprint arXiv:2402.03710, 2024 | 1 | 2024 |
HiFTNet: A Fast High-Quality Neural Vocoder with Harmonic-plus-Noise Filter and Inverse Short Time Fourier Transform YA Li, C Han, X Jiang, N Mesgarani arXiv preprint arXiv:2309.09493, 2023 | 1 | 2023 |
Speech Slytherin: Examining the Performance and Efficiency of Mamba for Speech Separation, Recognition, and Synthesis X Jiang, YA Li, AN Florea, C Han, N Mesgarani arXiv preprint arXiv:2407.09732, 2024 | | 2024 |
Exploring Self-supervised Contrastive Learning of Spatial Sound Event Representation X Jiang, C Han, YA Li, N Mesgarani ICASSP 2024-2024 IEEE International Conference on Acoustics, Speech and …, 2024 | | 2024 |
DeCoR: Defy Knowledge Forgetting by Predicting Earlier Audio Codes X Jiang, YA Li, N Mesgarani arXiv preprint arXiv:2305.18441, 2023 | | 2023 |