Improving RNN transducer modeling for end-to-end speech recognition

J Li, R Zhao, H Hu, Y Gong - 2019 IEEE Automatic Speech …, 2019 - ieeexplore.ieee.org
In the last few years, an emerging trend in automatic speech recognition research is the
study of end-to-end (E2E) systems. Connectionist Temporal Classification (CTC), Attention …

On the gap between domestic robotic applications and computational intelligence

J Zhong, C Ling, A Cangelosi, A Lotfi, X Liu - Electronics, 2021 - mdpi.com
Aspired to build intelligent agents that can assist humans in daily life, researchers and
engineers, both from academia and industry, have kept advancing the state-of-the-art in …

Cough recognition based on mel-spectrogram and convolutional neural network

Q Zhou, J Shan, W Ding, C Wang, S Yuan… - Frontiers in Robotics …, 2021 - frontiersin.org
In daily life, there are a variety of complex sound sources. It is important to effectively detect
certain sounds in some situations. With the outbreak of COVID-19, it is necessary to …

Deep-FSMN for large vocabulary continuous speech recognition

S Zhang, M Lei, Z Yan, L Dai - 2018 IEEE International …, 2018 - ieeexplore.ieee.org
In this paper, we present an improved feedforward sequential memory networks (FSMN)
architecture, namely Deep-FSMN (DFSMN), by introducing skip connections between …

End-to-end neural systems for automatic children speech recognition: An empirical study

PG Shivakumar, S Narayanan - Computer Speech & Language, 2022 - Elsevier
A key desiderata for inclusive and accessible speech recognition technology is ensuring its
robust performance to children's speech. Notably, this includes the rapidly advancing neural …

Temporal modeling using dilated convolution and gating for voice-activity-detection

SY Chang, B Li, G Simko, TN Sainath… - … on acoustics, speech …, 2018 - ieeexplore.ieee.org
Voice activity detection (VAD) is the task of predicting which parts of an utterance contains
speech versus background noise. It is an important first step to determine which samples to …

The CAPIO 2017 conversational speech recognition system

KJ Han, A Chandrashekaran, J Kim, I Lane - arXiv preprint arXiv …, 2017 - arxiv.org
In this paper we show how we have achieved the state-of-the-art performance on the
industry-standard NIST 2000 Hub5 English evaluation set. We explore densely connected …

Application of a Hybrid Bi-LSTM-CRF model to the task of Russian Named Entity Recognition

TA Le, MY Arkhipov, MS Burtsev - … 20–23, 2017, Revised Selected Papers …, 2018 - Springer
Named Entity Recognition (NER) is one of the most common tasks of the natural
language processing. The purpose of NER is to find and classify tokens in text documents …

A Systematic Review on Semantic Role Labeling for Information Extraction in Low-Resource Data

ADP Ariyanto, D Purwitasari, C Fatichah - IEEE Access, 2024 - ieeexplore.ieee.org
Challenges in the big data phenomenon arise due to the existence of unstructured text data,
which is very large, comes from various sources, has various formats, and contains much …

Text normalization with convolutional neural networks

S Yolchuyeva, G Németh, B Gyires-Tóth - International Journal of Speech …, 2018 - Springer
Text normalization is a critical step in the variety of tasks involving speech and language
technologies. It is one of the vital components of natural language processing, text-to …