Acoustic modeling with deep neural networks using raw time signal for LVCSR

R Prabhavalkar, T Hori, TN Sainath… - … on Audio, Speech …, 2023 - ieeexplore.ieee.org

In the last decade of automatic speech recognition (ASR) research, the introduction of deep
learning has brought considerable reductions in word error rate of more than 50% relative …

Cited by 88 | Related articles | All 6 versions

Deep learning for audio signal processing

H Purwins, B Li, T Virtanen, J Schlüter… - IEEE Journal of …, 2019 - ieeexplore.ieee.org

Given the recent surge in developments of deep learning, this paper provides a review of the
state-of-the-art deep learning techniques for audio signal processing. Speech, music, and …

Cited by 827 | Related articles | All 7 versions

Speaker recognition from raw waveform with SincNet

M Ravanelli, Y Bengio - 2018 IEEE spoken language …, 2018 - ieeexplore.ieee.org

Deep learning is progressively gaining popularity as a viable alternative to i-vectors for
speaker recognition. Promising results have been recently obtained with Convolutional …

Cited by 900 | Related articles | All 10 versions

[PDF] WaveNet: A generative model for raw audio

A Van Den Oord, S Dieleman, H Zen… - arXiv preprint arXiv …, 2016 - academia.edu

This paper introduces WaveNet, a deep neural network for generating raw audio waveforms.
The model is fully probabilistic and autoregressive, with the predictive distribution for each …

Cited by 5525 | Related articles | All 10 versions

WaveNet: A generative model for raw audio

A Oord, S Dieleman, H Zen, K Simonyan… - arXiv preprint arXiv …, 2016 - arxiv.org

This paper introduces WaveNet, a deep neural network for generating raw audio waveforms.
The model is fully probabilistic and autoregressive, with the predictive distribution for each …

Cited by 1740 | Related articles | All 2 versions

Unsupervised speech representation learning using WaveNet autoencoders

J Chorowski, RJ Weiss, S Bengio… - … /ACM transactions on …, 2019 - ieeexplore.ieee.org

We consider the task of unsupervised extraction of meaningful latent representations of
speech by applying autoencoding neural networks to speech waveforms. The goal is to …

Cited by 390 | Related articles | All 11 versions

[BOOK] Automatic speech recognition

D Yu, L Deng - 2016 - Springer

Automatic Speech Recognition (ASR), which is aimed to enable natural human–machine
interaction, has been an intensive research area for decades. Many core technologies, such …

Cited by 1522 | Related articles | All 9 versions

Very deep convolutional neural networks for raw waveforms

W Dai, C Dai, S Qu, J Li, S Das - 2017 IEEE international …, 2017 - ieeexplore.ieee.org

Learning acoustic models directly from the raw waveform data with minimal processing is
challenging. Current waveform-based models have generally used very few (~2) …

Cited by 490 | Related articles | All 7 versions

[PDF] Learning the speech front-end with raw waveform CLDNNs

TN Sainath, RJ Weiss, AW Senior, KW Wilson… - Interspeech, 2015 - isca-archive.org

Learning an acoustic model directly from the raw waveform has been an active area of
research. However, waveform-based models have not yet matched the performance of …

Cited by 618 | Related articles | All 10 versions

The discriminative lexicon: A unified computational model for the lexicon and lexical processing in comprehension and production grounded not in (de)composition …

RH Baayen, YY Chuang, E Shafaei-Bajestan… - …, 2019 - Wiley Online Library

The discriminative lexicon is introduced as a mathematical and computational model of the
mental lexicon. This novel theory is inspired by word and paradigm morphology but …

Cited by 230 | Related articles | All 26 versions