The AMI system for the transcription of speech in meetings

S Latif, R Rana, S Khalifa, R Jurdak, J Qadir… - arXiv preprint arXiv …, 2020 - arxiv.org

Research on speech processing has traditionally considered the task of designing hand-
engineered acoustic features (feature engineering) as a separate distinct problem from the …

Cited by 112 - Related articles - All 3 versions

ESPnet: End-to-end speech processing toolkit

S Watanabe, T Hori, S Karita, T Hayashi… - arXiv preprint arXiv …, 2018 - arxiv.org

This paper introduces a new open source platform for end-to-end speech processing named
ESPnet. ESPnet mainly focuses on end-to-end automatic speech recognition (ASR), and …

Cited by 1702 - Related articles - All 15 versions

THCHS-30: A free Chinese speech corpus

D Wang, X Zhang - arXiv preprint arXiv:1512.01882, 2015 - arxiv.org

Speech data is crucially important for speech recognition research. There are quite some
speech databases that can be purchased at prices that are reasonable for most research …

Cited by 276 - Related articles - All 6 versions

Data quality: The other face of big data

B Saha, D Srivastava - 2014 IEEE 30th international conference …, 2014 - ieeexplore.ieee.org

In our Big Data era, data is being generated, collected and analyzed at an unprecedented
scale, and data-driven decision making is sweeping through all aspects of society. Recent …

Cited by 345 - Related articles - All 5 versions

Building and evaluation of a real room impulse response dataset

I Szöke, M Skácel, L Mošner, J Paliesek… - IEEE Journal of …, 2019 - ieeexplore.ieee.org

This paper presents BUT ReverbDB, a dataset of real room impulse responses (RIR),
background noises, and retransmitted speech data. The retransmitted data include …

Cited by 159 - Related articles - All 5 versions

DOVER-Lap: A method for combining overlap-aware diarization outputs

D Raj, LP Garcia-Perera, Z Huang… - 2021 IEEE Spoken …, 2021 - ieeexplore.ieee.org

Several advances have been made recently towards handling overlapping speech for
speaker diarization. Since speech and natural language tasks often benefit from ensemble …

Cited by 83 - Related articles - All 10 versions

The CALO meeting assistant system

G Tur, A Stolcke, L Voss, S Peters… - … on Audio, Speech …, 2010 - ieeexplore.ieee.org

The CALO Meeting Assistant (MA) provides for distributed meeting capture, annotation,
automatic transcription and semantic analysis of multiparty meetings, and is part of the larger …

Cited by 250 - Related articles - All 30 versions

Unified architecture for multichannel end-to-end speech recognition with neural beamforming

T Ochiai, S Watanabe, T Hori… - IEEE Journal of …, 2017 - ieeexplore.ieee.org

This paper proposes a unified architecture for end-to-end automatic speech recognition
(ASR) to encompass microphone-array signal processing such as a state-of-the-art neural …

Cited by 106 - Related articles - All 7 versions

Recognition and understanding of meetings: the AMI and AMIDA projects

S Renals, T Hain, H Bourlard - 2007 IEEE Workshop on …, 2007 - ieeexplore.ieee.org

The AMI and AMIDA projects are concerned with the recognition and interpretation of
multiparty meetings. Within these projects we have: developed an infrastructure for …

Cited by 191 - Related articles - All 21 versions

Multichannel end-to-end speech recognition

T Ochiai, S Watanabe, T Hori… - … conference on machine …, 2017 - proceedings.mlr.press

The field of speech recognition is in the midst of a paradigm shift: end-to-end neural
networks are challenging the dominance of hidden Markov models as a core technology …

Cited by 125 - Related articles - All 14 versions