Deep representation learning in speech processing: Challenges, recent advances, and future trends
Research on speech processing has traditionally considered the task of designing hand-
engineered acoustic features (feature engineering) as a separate distinct problem from the …
engineered acoustic features (feature engineering) as a separate distinct problem from the …
Automatic speech recognition: a survey
Recently great strides have been made in the field of automatic speech recognition (ASR) by
using various deep learning techniques. In this study, we present a thorough comparison …
using various deep learning techniques. In this study, we present a thorough comparison …
Deep speaker recognition: Process, progress, and challenges
Speaker recognition is related to human biometrics dealing with the identification of
speakers from their speech. Speaker recognition is an active research area and being …
speakers from their speech. Speaker recognition is an active research area and being …
Learning speech-driven 3d conversational gestures from video
We propose the first approach to synthesize the synchronous 3D conversational body and
hand gestures, as well as 3D face and head animations, of a virtual character from speech …
hand gestures, as well as 3D face and head animations, of a virtual character from speech …
Ctc-segmentation of large corpora for german end-to-end speech recognition
L Kürzinger, D Winkelbauer, L Li, T Watzel… - … Conference on Speech …, 2020 - Springer
Recent end-to-end Automatic Speech Recognition (ASR) systems demonstrated the ability
to outperform conventional hybrid DNN/HMM ASR. Aside from architectural improvements in …
to outperform conventional hybrid DNN/HMM ASR. Aside from architectural improvements in …
A thorough evaluation of the Language Environment Analysis (LENA) system
In the previous decade, dozens of studies involving thousands of children across several
research disciplines have made use of a combined daylong audio-recorder and automated …
research disciplines have made use of a combined daylong audio-recorder and automated …
Decision-making for bidirectional communication in sequential human-robot collaborative tasks
Communication is critical to collaboration; however, too much of it can degrade
performance. Motivated by the need for effective use of a robot's communication modalities …
performance. Motivated by the need for effective use of a robot's communication modalities …
SPPAS-multi-lingual approaches to the automatic annotation of speech
B Bigi - The Phonetician. Journal of the International Society of …, 2015 - hal.science
The first step of most acoustic analyses unavoidably involves the alignment of recorded
speech sounds with their phonetic annotation. This step is very labor-intensive and cost …
speech sounds with their phonetic annotation. This step is very labor-intensive and cost …
Challenges of language technologies for the indigenous languages of the Americas
Indigenous languages of the American continent are highly diverse. However, they have
received little attention from the technological perspective. In this paper, we review the …
received little attention from the technological perspective. In this paper, we review the …
Semanticpaint: Interactive 3d labeling and learning at your fingertips
We present a new interactive and online approach to 3D scene understanding. Our system,
SemanticPaint, allows users to simultaneously scan their environment whilst interactively …
SemanticPaint, allows users to simultaneously scan their environment whilst interactively …