Experiences in Development of Hindi Speech Corpora based on ELDA standards

Development of Text and Speech database for Hindi and Indian English specific to Mobile Communication environment.

SS Agrawal, S Sinha, P Singh, JØ Olsen - LREC, 2012 - Citeseer

This paper describes the method and experiences of text and speech data collection for
mobile communication in Indian English and Hindi. The primary data collection is done in the …

Cited by 17

Collaborative speech data acquisition for under resourced languages through crowdsourcing

S Arora, KK Arora, MK Roy, SS Agrawal… - Procedia Computer …, 2016 - Elsevier

Scarcity of resources in under-resourced languages may leave these languages behind in the
race to develop data-driven NLP systems. Crowdsourcing has come up as a …

Cited by 11

Development of Hindi mobile communication text and speech corpus

S Sinha, SS Agrawal, J Olsen - 2011 International Conference …, 2011 - ieeexplore.ieee.org

This paper describes the collection of a text and audio corpus for mobile personal
communication in Hindi. Hindi is the largest of the Indian languages, and is the first …

Cited by 10

Development of Text and Speech Corpus for Designing the Multilingual Recognition System

S Bansal, SS Agrawal - 2018 Oriental COCOSDA-International …, 2018 - ieeexplore.ieee.org

Creating a multilingual speech and text corpus manually is a very difficult and
time-consuming task. This paper presents the overall methodology and experiences of text and …

Cited by 5

Corpus design and development of an annotated speech database for Punjabi

S Bansal, S Sharan, SS Agrawal - … Oriental COCOSDA held …, 2015 - ieeexplore.ieee.org

Punjabi is an important Indo-Aryan language spoken in India and in some other countries,
especially Pakistan. It is a tonal language and its phonetic and phonological aspects have …

Cited by 4

SAMPA for Hindi and Punjabi based on their Acoustic and Phonetic Characteristics

KK Arora, S Arora, SR Singla… - Proc. International Oriental …, 2007 - academia.edu

SAMPA (Speech Assessment Methods Phonetic Alphabet) is a machine-readable
phonetic alphabet and hence facilitates easy processing of data for many applications in …

Cited by 5

Multilingual Crowdsourcing Methodology for Developing Resources for Under-resourced Indian Languages

KK Arora, S Arora, MK Roy, SS Agrawal - lt4all.elra.info

The challenge of collecting large amounts of data is intensified for under-resourced languages, especially
for the large variety of Indian languages. We propose building a common framework for …

A review on selection and correction of text and speech data for Indian languages

H Bharad, T Kodinariya - ETCEE–2015, 2015 - academia.edu

Today the world is moving towards hands-free interfacing with machines using speech
commands and/or speech recognition. The computer understands binary or machine level …