Improving RNN transducer modeling for end-to-end speech recognition
In the last few years, an emerging trend in automatic speech recognition research is the
study of end-to-end (E2E) systems. Connectionist Temporal Classification (CTC), Attention …
On the gap between domestic robotic applications and computational intelligence
Aspired to build intelligent agents that can assist humans in daily life, researchers and
engineers, both from academia and industry, have kept advancing the state-of-the-art in …
Cough recognition based on mel-spectrogram and convolutional neural network
Q Zhou, J Shan, W Ding, C Wang, S Yuan… - Frontiers in Robotics …, 2021 - frontiersin.org
In daily life, there are a variety of complex sound sources. It is important to effectively detect
certain sounds in some situations. With the outbreak of COVID-19, it is necessary to …
Deep-FSMN for large vocabulary continuous speech recognition
In this paper, we present an improved feedforward sequential memory networks (FSMN)
architecture, namely Deep-FSMN (DFSMN), by introducing skip connections between …
End-to-end neural systems for automatic children speech recognition: An empirical study
PG Shivakumar, S Narayanan - Computer Speech & Language, 2022 - Elsevier
A key desideratum for inclusive and accessible speech recognition technology is ensuring its
robust performance to children's speech. Notably, this includes the rapidly advancing neural …
Temporal modeling using dilated convolution and gating for voice-activity-detection
Voice activity detection (VAD) is the task of predicting which parts of an utterance contain
speech versus background noise. It is an important first step to determine which samples to …
The CAPIO 2017 conversational speech recognition system
In this paper we show how we have achieved the state-of-the-art performance on the
industry-standard NIST 2000 Hub5 English evaluation set. We explore densely connected …
Application of a Hybrid Bi-LSTM-CRF model to the task of Russian Named Entity Recognition
Named Entity Recognition (NER) is one of the most common tasks of the natural
language processing. The purpose of NER is to find and classify tokens in text documents …
A Systematic Review on Semantic Role Labeling for Information Extraction in Low-Resource Data
Challenges in the big data phenomenon arise due to the existence of unstructured text data,
which is very large, comes from various sources, has various formats, and contains much …
Text normalization with convolutional neural networks
Text normalization is a critical step in the variety of tasks involving speech and language
technologies. It is one of the vital components of natural language processing, text-to …