SeamlessM4T: Massively Multilingual & Multimodal Machine Translation

L Barrault, YA Chung, MC Meglioli, D Dale… - arXiv preprint arXiv …, 2023 - arxiv.org
What does it take to create the Babel Fish, a tool that can help individuals translate speech
between any two languages? While recent breakthroughs in text-based models have …

Seamless: Multilingual Expressive and Streaming Speech Translation

L Barrault, YA Chung, MC Meglioli, D Dale… - arXiv preprint arXiv …, 2023 - arxiv.org
Large-scale automatic speech translation systems today lack key features that help
machine-mediated communication feel seamless when compared to human-to-human dialogue. In …

Multilingual large language model: A survey of resources, taxonomy and frontiers

L Qin, Q Chen, Y Zhou, Z Chen, Y Li, L Liao… - arXiv preprint arXiv …, 2024 - arxiv.org
Multilingual Large Language Models are capable of using powerful Large Language
Models to handle and respond to queries in multiple languages, which achieves remarkable …

Multi-resolution HuBERT: Multi-resolution speech self-supervised learning with masked unit prediction

J Shi, H Inaguma, X Ma, I Kulikov, A Sun - arXiv preprint arXiv:2310.02720, 2023 - arxiv.org
Existing Self-Supervised Learning (SSL) models for speech typically process speech signals
at a fixed resolution of 20 milliseconds. This approach overlooks the varying informational …

SALM: Speech-augmented language model with in-context learning for speech recognition and translation

Z Chen, H Huang, A Andrusenko… - ICASSP 2024-2024 …, 2024 - ieeexplore.ieee.org
We present a novel Speech Augmented Language Model (SALM) with multitask and
in-context learning capabilities. SALM comprises a frozen text LLM, an audio encoder, a …

Evaluating multilingual speech translation under realistic conditions with resegmentation and terminology

E Salesky, K Darwish, M Al-Badrashiny… - Proceedings of the …, 2023 - aclanthology.org
We present the ACL 60/60 evaluation sets for multilingual translation of ACL 2022 technical
presentations into 10 target languages. This dataset enables further research into …

QUESPA Submission for the IWSLT 2024 Dialectal and Low-resource Speech Translation Task

JE Ortega, RJ Zevallos, IS Ahmad… - Proceedings of the 21st …, 2024 - aclanthology.org
This article describes the QUESPA team speech translation (ST) submissions for the
Quechua to Spanish (QUE–SPA) track featured in the Evaluation Campaign of IWSLT 2024 …

Evaluating self-supervised speech representations for indigenous American languages

CC Chen, W Chen, R Zevallos, J Ortega - arXiv preprint arXiv:2310.03639, 2023 - arxiv.org
The application of self-supervision to speech representation learning has garnered
significant interest in recent years, due to its scalability to large amounts of unlabeled data …

NAVER LABS Europe's Multilingual Speech Translation Systems for the IWSLT 2023 Low-Resource Track

E Gow-Smith, A Berard, MZ Boito… - arXiv preprint arXiv …, 2023 - arxiv.org
This paper presents NAVER LABS Europe's systems for Tamasheq-French and
Quechua-Spanish speech translation in the IWSLT 2023 Low-Resource track. Our work attempts to …

PolyVoice: Language Models for Speech to Speech Translation

Q Dong, Z Huang, Q Tian, C Xu, T Ko… - The Twelfth …, 2023 - openreview.net
With the huge success of GPT models in natural language processing, there is a growing
interest in applying language modeling approaches to speech tasks. Currently, the dominant …