End-to-end speech-to-text translation: A survey

N Sethiya, CK Maurya - Computer Speech & Language, 2024 - Elsevier
Speech-to-Text (ST) translation pertains to the task of converting speech signals in
one language to text in another language. It finds its application in various domains, such as …

Reducing interpretative ambiguity in an educational environment with ChatGPT

F Garcia-Varela, Z Bekerman, M Nussbaum… - Computers & …, 2025 - Elsevier
The study posits that both concrete and abstract words are crucial for effective
communication, particularly in educational contexts where the interplay between these forms …

Large language model based generative error correction: A challenge and baselines for speech recognition, speaker tagging, and emotion recognition

CHH Yang, T Park, Y Gong, Y Li, Z Chen, YT Lin… - arXiv preprint arXiv …, 2024 - arxiv.org
Given recent advances in generative AI technology, a key question is how large language
models (LLMs) can enhance acoustic modeling tasks using text decoding results from a …

Advancing holistic educational goals through generative language-based technologies

M Nussbaum, Z Bekerman - Learning and Individual Differences, 2025 - Elsevier
We explore the transformative potential of generative language-based technologies in
educational reform, moving beyond traditional cognitive transmission towards a more …

Towards Achieving Human Parity on End-to-end Simultaneous Speech Translation via LLM Agent

S Cheng, Z Huang, T Ko, H Li, N Peng, L Xu… - arXiv preprint arXiv …, 2024 - arxiv.org
In this paper, we present Cross Language Agent--Simultaneous Interpretation, CLASI, a
high-quality and human-like Simultaneous Speech Translation (SiST) System. Inspired by …

A Survey on Speech Large Language Models

J Peng, Y Wang, Y Xi, X Li, K Yu - arXiv preprint arXiv:2410.18908, 2024 - arxiv.org
Large Language Models (LLMs) exhibit strong contextual understanding and remarkable
multi-task performance. Therefore, researchers have been seeking to integrate LLMs in the …

Chain-of-Thought Prompting for Speech Translation

K Hu, Z Chen, CHH Yang, P Żelasko… - arXiv preprint arXiv …, 2024 - arxiv.org
Large language models (LLMs) have demonstrated remarkable advancements in language
understanding and generation. Building on the success of text-based LLMs, recent research …

CoT-ST: Enhancing LLM-based Speech Translation with Multimodal Chain-of-Thought

Y Du, Z Ma, Y Yang, K Deng, X Chen, B Yang… - arXiv preprint arXiv …, 2024 - arxiv.org
Speech Language Models (SLMs) have demonstrated impressive performance on speech
translation tasks. However, existing research primarily focuses on direct instruction fine …

HW-TSC's Submission to the CCMT 2024 Machine Translation Tasks

Z Wu, Y Luo, D Wei, J Zheng, B Wei, Z Li… - arXiv preprint arXiv …, 2024 - arxiv.org
This paper presents the submission of Huawei Translation Services Center (HW-TSC) to
machine translation tasks of the 20th China Conference on Machine Translation (CCMT …

Blending LLMs into Cascaded Speech Translation: KIT's Offline Speech Translation System for IWSLT 2024

S Koneru, TB Nguyen, NQ Pham, D Liu, Z Li… - arXiv preprint arXiv …, 2024 - arxiv.org
Large Language Models (LLMs) are currently under exploration for various tasks, including
Automatic Speech Recognition (ASR), Machine Translation (MT), and even End-to-End …