EMMeTT: Efficient Multimodal Machine Translation Training

P Żelasko, Z Chen, M Wang, D Galvez… - arXiv preprint arXiv …, 2024 - arxiv.org
A rising interest in the modality extension of foundation language models warrants
discussion on the most effective, and efficient, multimodal training approach. This work …

Language Model Can Listen While Speaking

Z Ma, Y Song, C Du, J Cong, Z Chen, Y Wang… - arXiv preprint arXiv …, 2024 - arxiv.org
Dialogue serves as the most natural manner of human-computer interaction (HCI). Recent
advancements in speech language models (SLM) have significantly enhanced speech …