Prompting the hidden talent of web-scale speech models for zero-shot task generalization

P Peng, B Yan, S Watanabe, D Harwath - arXiv preprint arXiv:2305.11095, 2023 - arxiv.org
We investigate the emergent abilities of the recently proposed web-scale speech model
Whisper, by adapting it to unseen tasks with prompt engineering. We selected three tasks …

Approximate nearest neighbour phrase mining for contextual speech recognition

M Bleeker, P Swietojanski, S Braun… - arXiv preprint arXiv …, 2023 - arxiv.org
This paper presents an extension to train end-to-end Context-Aware Transformer
Transducer (CATT) models by using a simple yet efficient method of mining hard negative …

LAE-ST-MOE: Boosted Language-Aware Encoder Using Speech Translation Auxiliary Task for E2E Code-Switching ASR

G Ma, W Wang, Y Li, Y Yang, B Du… - 2023 IEEE Automatic …, 2023 - ieeexplore.ieee.org
Recently, to mitigate the confusion between different languages in code-switching (CS)
automatic speech recognition (ASR), the conditionally factorized models, such as the …

Cross-lingual Knowledge Transfer and Iterative Pseudo-labeling for Low-Resource Speech Recognition with Transducers

J Silovsky, L Deng, A Argueta, T Arvizo, R Hsiao… - arXiv preprint arXiv …, 2023 - arxiv.org
Voice technology has become ubiquitous recently. However, the accuracy, and hence
experience, in different languages varies significantly, which makes the technology not …