The CMU SPHINX-4 speech recognition system

S Latif, R Rana, S Khalifa, R Jurdak, J Qadir… - arXiv preprint arXiv …, 2020 - arxiv.org

Research on speech processing has traditionally considered the task of designing
hand-engineered acoustic features (feature engineering) as a separate distinct problem from the …

Cited by 98 · 3 versions

Automatic speech recognition: a survey

M Malik, MK Malik, K Mehmood… - Multimedia Tools and …, 2021 - Springer

Recently great strides have been made in the field of automatic speech recognition (ASR) by
using various deep learning techniques. In this study, we present a thorough comparison …

Cited by 273 · 8 versions

Deep speaker recognition: Process, progress, and challenges

AQ Ohi, MF Mridha, MA Hamid, MM Monowar - IEEE Access, 2021 - ieeexplore.ieee.org

Speaker recognition is related to human biometrics dealing with the identification of
speakers from their speech. Speaker recognition is an active research area and being …

Cited by 34 · 5 versions

Learning speech-driven 3D conversational gestures from video

I Habibie, W Xu, D Mehta, L Liu, HP Seidel… - Proceedings of the 21st …, 2021 - dl.acm.org

We propose the first approach to synthesize the synchronous 3D conversational body and
hand gestures, as well as 3D face and head animations, of a virtual character from speech …

Cited by 86 · 10 versions

CTC-segmentation of large corpora for German end-to-end speech recognition

L Kürzinger, D Winkelbauer, L Li, T Watzel… - … Conference on Speech …, 2020 - Springer

Recent end-to-end Automatic Speech Recognition (ASR) systems demonstrated the ability
to outperform conventional hybrid DNN/HMM ASR. Aside from architectural improvements in …

Cited by 102 · 7 versions

A thorough evaluation of the Language Environment Analysis (LENA) system

A Cristia, M Lavechin, C Scaff, M Soderstrom… - Behavior research …, 2021 - Springer

In the previous decade, dozens of studies involving thousands of children across several
research disciplines have made use of a combined daylong audio-recorder and automated …

Cited by 108 · 25 versions

Decision-making for bidirectional communication in sequential human-robot collaborative tasks

VV Unhelkar, S Li, JA Shah - Proceedings of the 2020 ACM/IEEE …, 2020 - dl.acm.org

Communication is critical to collaboration; however, too much of it can degrade
performance. Motivated by the need for effective use of a robot's communication modalities …

Cited by 91 · 7 versions

SPPAS - multi-lingual approaches to the automatic annotation of speech

B Bigi - The Phonetician. Journal of the International Society of …, 2015 - hal.science

The first step of most acoustic analyses unavoidably involves the alignment of recorded
speech sounds with their phonetic annotation. This step is very labor-intensive and cost …

Cited by 168 · 6 versions

Challenges of language technologies for the indigenous languages of the Americas

M Mager, X Gutierrez-Vasques, G Sierra… - arXiv preprint arXiv …, 2018 - arxiv.org

Indigenous languages of the American continent are highly diverse. However, they have
received little attention from the technological perspective. In this paper, we review the …

Cited by 112 · 3 versions

SemanticPaint: Interactive 3D labeling and learning at your fingertips

J Valentin, V Vineet, MM Cheng, D Kim… - ACM Transactions on …, 2015 - dl.acm.org

We present a new interactive and online approach to 3D scene understanding. Our system,
SemanticPaint, allows users to simultaneously scan their environment whilst interactively …

Cited by 136 · 17 versions