Deep representation learning in speech processing: Challenges, recent advances, and future trends

S Latif, R Rana, S Khalifa, R Jurdak, J Qadir… - arXiv preprint arXiv …, 2020 - arxiv.org
Research on speech processing has traditionally considered the task of designing
hand-engineered acoustic features (feature engineering) as a separate distinct problem from the …

Automatic speech recognition: a survey

M Malik, MK Malik, K Mehmood… - Multimedia Tools and …, 2021 - Springer
Recently great strides have been made in the field of automatic speech recognition (ASR) by
using various deep learning techniques. In this study, we present a thorough comparison …

Deep speaker recognition: Process, progress, and challenges

AQ Ohi, MF Mridha, MA Hamid, MM Monowar - IEEE Access, 2021 - ieeexplore.ieee.org
Speaker recognition is related to human biometrics dealing with the identification of
speakers from their speech. Speaker recognition is an active research area and being …

Learning speech-driven 3D conversational gestures from video

I Habibie, W Xu, D Mehta, L Liu, HP Seidel… - Proceedings of the 21st …, 2021 - dl.acm.org
We propose the first approach to synthesize the synchronous 3D conversational body and
hand gestures, as well as 3D face and head animations, of a virtual character from speech …

CTC-segmentation of large corpora for German end-to-end speech recognition

L Kürzinger, D Winkelbauer, L Li, T Watzel… - … Conference on Speech …, 2020 - Springer
Recent end-to-end Automatic Speech Recognition (ASR) systems demonstrated the ability
to outperform conventional hybrid DNN/HMM ASR. Aside from architectural improvements in …

A thorough evaluation of the Language Environment Analysis (LENA) system

A Cristia, M Lavechin, C Scaff, M Soderstrom… - Behavior research …, 2021 - Springer
In the previous decade, dozens of studies involving thousands of children across several
research disciplines have made use of a combined daylong audio-recorder and automated …

Decision-making for bidirectional communication in sequential human-robot collaborative tasks

VV Unhelkar, S Li, JA Shah - Proceedings of the 2020 ACM/IEEE …, 2020 - dl.acm.org
Communication is critical to collaboration; however, too much of it can degrade
performance. Motivated by the need for effective use of a robot's communication modalities …

SPPAS - multi-lingual approaches to the automatic annotation of speech

B Bigi - The Phonetician. Journal of the International Society of …, 2015 - hal.science
The first step of most acoustic analyses unavoidably involves the alignment of recorded
speech sounds with their phonetic annotation. This step is very labor-intensive and cost …

Challenges of language technologies for the indigenous languages of the Americas

M Mager, X Gutierrez-Vasques, G Sierra… - arXiv preprint arXiv …, 2018 - arxiv.org
Indigenous languages of the American continent are highly diverse. However, they have
received little attention from the technological perspective. In this paper, we review the …

SemanticPaint: Interactive 3D labeling and learning at your fingertips

J Valentin, V Vineet, MM Cheng, D Kim… - ACM Transactions on …, 2015 - dl.acm.org
We present a new interactive and online approach to 3D scene understanding. Our system,
SemanticPaint, allows users to simultaneously scan their environment whilst interactively …