Controllable Accented Text-to-Speech Synthesis With Fine and Coarse-Grained Intensity Rendering
Accented text-to-speech (TTS) synthesis seeks to generate speech with an accent (L2) as a
variant of the standard version (L1), which is challenging as L2 is different from L1 in terms …
MMGER: Multi-modal and Multi-granularity Generative Error Correction with LLM for Joint Accent and Speech Recognition
Despite notable advancements in automatic speech recognition (ASR), performance tends
to degrade when faced with adverse conditions. Generative error correction (GER) …