Findings of the IWSLT 2022 Evaluation Campaign

A Anastasopoulos, L Barrault, L Bentivogli… - Proceedings of the 19th …, 2022 - cris.fbk.eu
The evaluation campaign of the 19th International Conference on Spoken Language
Translation featured eight shared tasks: (i) Simultaneous speech translation, (ii) Offline …

The Multilingual TEDx corpus for speech recognition and translation

E Salesky, M Wiesner, J Bremerman, R Cattoni… - arXiv preprint arXiv …, 2021 - arxiv.org
We present the Multilingual TEDx corpus, built to support speech recognition (ASR) and
speech translation (ST) research across many non-English source languages. The corpus is …

Understanding the brain with attention: A survey of transformers in brain sciences

C Chen, H Wang, Y Chen, Z Yin, X Yang, H Ning… - Brain‐X, 2023 - Wiley Online Library
Owing to their superior capabilities and advanced achievements, Transformers have
gradually attracted attention with regard to understanding complex brain processing …

Adaptive multilingual speech recognition with pretrained models

NQ Pham, A Waibel, J Niehues - arXiv preprint arXiv:2205.12304, 2022 - arxiv.org
Multilingual speech recognition with supervised learning has achieved great results as
reflected in recent research. With the development of pretraining methods on audio and text …

AV2AV: Direct Audio-Visual Speech to Audio-Visual Speech Translation with Unified Audio-Visual Speech Representation

J Choi, SJ Park, M Kim, YM Ro - Proceedings of the IEEE …, 2024 - openaccess.thecvf.com
This paper proposes a novel direct Audio-Visual Speech to Audio-Visual Speech
Translation (AV2AV) framework where the input and output of the system are multimodal (i.e., …

From block-Toeplitz matrices to differential equations on graphs: towards a general theory for scalable masked Transformers

K Choromanski, H Lin, H Chen… - International …, 2022 - proceedings.mlr.press
In this paper we provide, to the best of our knowledge, the first comprehensive approach for
incorporating various masking mechanisms into Transformers architectures in a scalable …

Incorporating relative position information in transformer-based sign language recognition and translation

N Aloysius, M Geetha, P Nedungadi - IEEE Access, 2021 - ieeexplore.ieee.org
Recent advancements in machine translation tasks, with the advent of attention mechanisms
and Transformer networks, have accelerated the research in Sign Language Translation …

A reverse positional encoding multi-head attention-based neural machine translation model for Arabic dialects

LH Baniata, S Kang, IKE Ampomah - Mathematics, 2022 - mdpi.com
Languages with a grammatical structure that allows free word order, such as Arabic
dialects, are considered a challenge for neural machine translation (NMT) models because …

ESPnet-ST IWSLT 2021 offline speech translation system

H Inaguma, B Yan, S Dalmia, P Guo, J Shi… - arXiv preprint arXiv …, 2021 - arxiv.org
This paper describes the ESPnet-ST group's IWSLT 2021 submission in the offline speech
translation track. This year we made various efforts on training data, architecture, and audio …

Variable attention masking for configurable transformer transducer speech recognition

P Swietojanski, S Braun, D Can… - ICASSP 2023-2023 …, 2023 - ieeexplore.ieee.org
This work studies the use of attention masking in transformer transducer based speech
recognition for building a single configurable model for different deployment scenarios. We …