IndicSpeech: Text-to-speech corpus for Indian languages

A Jain, M Guo, K Srinivasan, T Chen… - arXiv preprint arXiv …, 2021 - arxiv.org

Both image-caption pairs and translation pairs provide the means to learn deep
representations of and connections between languages. We use both types of pairs in …

Cited by 85 - Related articles - All 6 versions

A study on the challenges and opportunities of speech recognition for Bengali language

MF Mridha, AQ Ohi, MA Hamid… - Artificial Intelligence …, 2022 - Springer

Speech recognition is a fascinating process that offers the opportunity to interact and
command the machine in the field of human-computer interactions. Speech recognition is a …

Cited by 23 - Related articles - All 5 versions

Mlphon: A multifunctional grapheme-phoneme conversion tool using finite state transducers

K Manohar, AR Jayan, R Rajan - IEEE Access, 2022 - ieeexplore.ieee.org

In this article we present the design and the development of a knowledge based
computational linguistic tool, Mlphon for Malayalam language. Mlphon computationally …

Cited by 14 - Related articles - All 5 versions

Rasa: Building Expressive Speech Synthesis Systems for Indian Languages in Low-resource Settings

PS Varadhan, A Sankar, G Raju, MM Khapra - arXiv preprint arXiv …, 2024 - arxiv.org

We release Rasa, the first multilingual expressive TTS dataset for any Indian language,
which contains 10 hours of neutral speech and 1-3 hours of expressive speech for each of …

Cited by 3 - Related articles - All 4 versions

Strategies in transfer learning for low-resource speech synthesis: Phone mapping, features input, and source language selection

P Do, M Coler, J Dijkstra, E Klabbers - arXiv preprint arXiv:2306.12040, 2023 - arxiv.org

We compare using a PHOIBLE-based phone mapping method and using phonological
features input in transfer learning for TTS in low-resource languages. We use diverse source …

Cited by 4 - Related articles - All 8 versions

IndicVoices-R: Unlocking a massive multilingual multi-speaker speech corpus for scaling Indian TTS

A Sankar, S Anand, PS Varadhan, S Thomas… - arXiv preprint arXiv …, 2024 - arxiv.org

Recent advancements in text-to-speech (TTS) synthesis show that large-scale models
trained with extensive web data produce highly natural-sounding output. However, such …

Cited by 1 - Related articles - All 3 versions

The LDC-IL speech corpora

N Choudhary, DG Rao - 2020 23rd Conference of the Oriental …, 2020 - ieeexplore.ieee.org

This paper introduces the first set of speech corpora released in 2019 by the Linguistic Data
Consortium for Indian Languages (LDC-IL), a scheme under the Department of Higher …

Cited by 14 - Related articles

SUST TTS Corpus: A phonetically-balanced corpus for Bangla text-to-speech synthesis

A Ahmad, MR Selim, MZ Iqbal… - Acoustical Science and …, 2021 - jstage.jst.go.jp

This paper presents the Shahjalal University of Science and Technology Text-To-Speech
Corpus (SUST TTS Corpus), a phonetically balanced speech corpus for Bangla speech …

Cited by 9 - Related articles - All 6 versions

Data-efficient training strategies for neural TTS systems

KR Prajwal, CV Jawahar - Proceedings of the 3rd ACM India Joint …, 2021 - dl.acm.org

India is a country with thousands of languages and dialects spoken across a billion-strong
population. For multi-lingual content creation and accessibility, text-to-speech systems will …

Cited by 12 - Related articles

Challenges and opportunities of speech recognition for Bengali language

MF Mridha, AQ Ohi, MA Hamid… - arXiv preprint arXiv …, 2021 - arxiv.org

Speech recognition is a fascinating process that offers the opportunity to interact and
command the machine in the field of human-computer interactions. Speech recognition is a …

Cited by 4 - Related articles - All 2 versions