Bytes are all you need: End-to-end multilingual speech recognition and synthesis with bytes

B Li, Y Zhang, T Sainath, Y Wu… - ICASSP 2019-2019 IEEE …, 2019 - ieeexplore.ieee.org
We present two end-to-end models: Audio-to-Byte (A2B) and Byte-to-Audio (B2A), for
multilingual speech recognition and synthesis. Prior work has predominantly used …

Phonetisaurus: Exploring grapheme-to-phoneme conversion with joint n-gram models in the WFST framework

JR Novak, N Minematsu, K Hirose - Natural Language Engineering, 2016 - cambridge.org
This paper provides an analysis of several practical issues related to the theory and
implementation of Grapheme-to-Phoneme (G2P) conversion systems utilizing the Weighted …

Large vocabulary Russian speech recognition using syntactico-statistical language modeling

A Karpov, K Markov, I Kipyatkova, D Vazhenina… - Speech …, 2014 - Elsevier
Speech is the most natural way of human communication and in order to achieve convenient
and efficient human–computer interaction implementation of state-of-the-art spoken …

Unicode-based graphemic systems for limited resource languages

MJF Gales, KM Knill, A Ragni - 2015 IEEE International …, 2015 - ieeexplore.ieee.org
Large vocabulary continuous speech recognition systems require a mapping from words, or
tokens, into sub-word units to enable robust estimation of acoustic model parameters, and to …

An analytical survey of large-vocabulary Russian speech recognition systems

IS Kipyatkova, AA Karpov - Информатика и …, 2010 - proceedings.spiiras.nw.ru
Abstract: A large vocabulary is necessary for the task of transcribing
inflectional languages, since these languages are characterized by …

From speech to letters-using a novel neural network architecture for grapheme based ASR

F Eyben, M Wöllmer, B Schuller… - 2009 IEEE Workshop on …, 2009 - ieeexplore.ieee.org
Main-stream automatic speech recognition systems are based on modelling acoustic sub-
word units such as phonemes. Phonemisation dictionaries and language model based …

Computational intelligence in processing of speech acoustics: a survey

A Singh, N Kaur, V Kukreja, V Kadyan… - Complex & Intelligent …, 2022 - Springer
Speech recognition of a language is a key area in the field of pattern recognition. This paper
presents a comprehensive survey on the speech recognition techniques for non-Indian and …

[BOOK][B] Automatic processing of conversational Russian speech

The monograph outlines the range of problems associated with the specifics of automatic
analysis of conversational Russian speech in interactive dialogue systems. It describes …

Cross-lingual automatic speech recognition using tandem features

P Lal, S King - IEEE Transactions on Audio, Speech, and …, 2013 - ieeexplore.ieee.org
Automatic speech recognition depends on large amounts of transcribed speech recordings
in order to estimate the parameters of the acoustic model. Recording such large speech …

Towards the creation of reliable voice control system based on a fuzzy approach

AV Savchenko, LV Savchenko - Pattern Recognition Letters, 2015 - Elsevier
The key purpose of this paper is to train a voice control system if a small amount of user
speech data is available without need for general acoustic model if the latter does not fit to …