- Academic resource search

Exploring the integration of large language models into automatic speech recognition systems: An empirical study

Z Min, J Wang - International Conference on Neural Information …, 2023 - Springer

This paper explores the integration of Large Language Models (LLMs) into Automatic
Speech Recognition (ASR) systems to improve transcription accuracy. The increasing …

Cited by 26 · Related articles · All 3 versions

[PDF] arxiv.org

Improving large-scale deep biasing with phoneme features and text-only data in streaming transducer

J Qiu, L Huang, B Li, J Zhang, L Lu… - 2023 IEEE Automatic …, 2023 - ieeexplore.ieee.org

Deep biasing for the Transducer can improve the recognition performance of rare words or
contextual entities, which is essential in practical applications, especially for streaming …

Cited by 3 · Related articles · All 3 versions

[PDF] researchgate.net

Contextual Spelling Correction with Large Language Models

G Song, Z Wu, G Pundak, A Chandorkar… - 2023 IEEE Automatic …, 2023 - ieeexplore.ieee.org

Contextual Spelling Correction (CSC) models are used to improve automatic speech
recognition (ASR) quality given user-specific context. Typically, context is modeled as a large …

Cited by 4 · Related articles

[PDF] springer.com

Server-side rescoring of spoken entity-centric knowledge queries for virtual assistants

Y Zhang, S Gondala, T Fraga-Silva… - International Journal of …, 2024 - Springer

On-device virtual assistants (VAs) powered by automatic speech recognition (ASR) require
effective knowledge integration for the challenging entity-rich query recognition. In this …

Cited by 2 · Related articles · All 3 versions

[PDF] arxiv.org

Integrating lattice-free MMI into end-to-end speech recognition

J Tian, J Yu, C Weng, Y Zou… - IEEE/ACM Transactions on …, 2022 - ieeexplore.ieee.org

In automatic speech recognition (ASR) research, discriminative criteria have achieved
superior performance in DNN-HMM systems. Given this success, the adoption of …

Cited by 10 · Related articles · All 5 versions

[PDF] arxiv.org

O-1: Self-training with Oracle and 1-best Hypothesis

MK Baskar, A Rosenberg, B Ramabhadran… - arXiv preprint arXiv …, 2023 - arxiv.org

We introduce O-1, a new self-training objective to reduce training bias and unify training and
evaluation metrics for speech recognition. O-1 is a faster variant of Expected Minimum …

Effective internal language model training and fusion for factorized transducer model

J Guo, N Moritz, Y Ma, F Seide, C Wu… - ICASSP 2024-2024 …, 2024 - ieeexplore.ieee.org

The internal language model (ILM) of the neural transducer has been widely studied. In most
prior work, it is mainly used for estimating the ILM score and is subsequently subtracted …

Cited by 2 · Related articles · All 3 versions

[PDF] arxiv.org

Correction Focused Language Model Training For Speech Recognition

Y Ma, Z Liu, O Kalinli - ICASSP 2024-2024 IEEE International …, 2024 - ieeexplore.ieee.org

Language models (LMs) have been commonly adopted to boost the performance of
automatic speech recognition (ASR) particularly in domain adaptation tasks. Conventional …

Cited by 4 · Related articles · All 3 versions

[PDF] arxiv.org

Improving Rare Words Recognition through Homophone Extension and Unified Writing for Low-resource Cantonese Speech Recognition

HL Chung, J Li, P Liu, WK Leung… - … on Chinese Spoken …, 2022 - ieeexplore.ieee.org

Homophone characters are common in tonal syllable-based languages, such as Mandarin
and Cantonese. The data-intensive end-to-end Automatic Speech Recognition (ASR) …

Cited by 2 · Related articles · All 3 versions

[PDF] arxiv.org

Spelling Correction through Rewriting of Non-Autoregressive ASR Lattices

L Velikovich, C Li, D Caseiro, S Kumar… - arXiv preprint arXiv …, 2024 - arxiv.org

For end-to-end Automatic Speech Recognition (ASR) models, recognizing personal or rare
phrases can be hard. A promising way to improve accuracy is through spelling correction (or …