The IBM 2007 speech transcription system for European parliamentary speeches

The IBM Attila speech recognition toolkit

H Soltau, G Saon, B Kingsbury - 2010 IEEE Spoken Language …, 2010 - ieeexplore.ieee.org

We describe the design of IBM's Attila speech recognition toolkit. We show how the
combination of a highly modular and efficient library of low-level C++ classes with simple …

Cited by 171 - Related articles - All 8 versions

Simultaneous translation of lectures and speeches

C Fügen, A Waibel, M Kolss - Machine translation, 2007 - Springer

With increasing globalization, communication across language and cultural boundaries is
becoming an essential requirement of doing business, delivering education, and providing …

Cited by 165 - Related articles - All 19 versions

Identifying keyword occurrences in audio data

VN Gupta, G Boulianne - US Patent 8,423,363, 2013 - Google Patents

Occurrences of one or more keywords in audio data are identified using a speech
recognizer employing a language model to derive a transcript of the keywords. The …

Cited by 95 - Related articles - All 4 versions

Decoding-time prediction of non-verbalized tokens

J Fritsch, A Deoras, D Koll - US Patent 8,918,317, 2014 - Google Patents

Non-verbalized tokens, such as punctuation, are automatically predicted and inserted into a
transcription of speech in which the tokens were not explicitly verbalized. Token prediction …

Cited by 51 - Related articles - All 4 versions

Uncertainty decoding for noise robust speech recognition

H Liao, MJF Gales - 2009 - Citeseer

It is well known that the performance of automatic speech recognition degrades in noisy
conditions. To address this, typically the noise is removed from the features or the models …

Cited by 74 - Related articles - All 11 versions

Bag-of-word normalized n-gram models.

A Sethy, B Ramabhadran - INTERSPEECH, 2008 - isca-archive.org

The Bag-Of-Word (BOW) model uses a fixed length vector of word counts to
represent text. Although the model disregards word sequence information, it has been …

Cited by 45 - Related articles - All 4 versions

An iterative relative entropy minimization-based data selection approach for n-gram model adaptation

A Sethy, PG Georgiou, B Ramabhadran… - IEEE transactions on …, 2009 - ieeexplore.ieee.org

Performance of statistical n-gram language models depends heavily on the amount of
training text material and the degree to which the training text matches the domain of …

Cited by 44 - Related articles - All 8 versions

End-to-end speech endpoint detection utilizing acoustic and language modeling knowledge for online low-latency speech recognition

I Hwang, JH Chang - IEEE access, 2020 - ieeexplore.ieee.org

Speech endpoint detection (EPD) benefits from the decoder state features (DSFs) of online
automatic speech recognition (ASR) system. However, the DSFs are obtained via the ASR …

Cited by 9 - Related articles - All 5 versions

Transcription system using automatic speech recognition for the Japanese Parliament (Diet)

T Kawahara - Proceedings of the AAAI Conference on Artificial …, 2012 - ojs.aaai.org

This article describes a new automatic transcription system in the Japanese Parliament
which deploys our automatic speech recognition (ASR) technology. To achieve high …

Cited by 33 - Related articles - All 15 versions

Statistical transformation of language and pronunciation models for spontaneous speech recognition

Y Akita, T Kawahara - IEEE Transactions on Audio, Speech …, 2009 - ieeexplore.ieee.org

We propose a novel approach based on a statistical transformation framework for language
and pronunciation modeling of spontaneous speech. Since it is not practical to train a …

Cited by 31 - Related articles - All 13 versions