- Academic resource search

Spoken instruction understanding in air traffic control: Challenge, technique, and application

Y Lin - Aerospace, 2021 - mdpi.com

In air traffic control (ATC), speech communication with radio transmission is the primary way
to exchange information between the controller and aircrew. A wealth of contextual …

Cited by 71 · Related articles · All 10 versions

wav2vec: Unsupervised pre-training for speech recognition

S Schneider, A Baevski, R Collobert, M Auli - arXiv preprint arXiv …, 2019 - arxiv.org

We explore unsupervised pre-training for speech recognition by learning representations of
raw audio. wav2vec is trained on large amounts of unlabeled audio data and the resulting …

Cited by 1575 · Related articles · All 12 versions

MER 2023: Multi-label learning, modality robustness, and semi-supervised learning

Z Lian, H Sun, L Sun, K Chen, M Xu, K Wang… - Proceedings of the 31st …, 2023 - dl.acm.org

The first Multimodal Emotion Recognition Challenge (MER 2023) was successfully held at
ACM Multimedia. The challenge focuses on system robustness and consists of three distinct …

Cited by 41 · Related articles · All 5 versions

SMIN: Semi-supervised multi-modal interaction network for conversational emotion recognition

Z Lian, B Liu, J Tao - IEEE Transactions on Affective Computing, 2022 - ieeexplore.ieee.org

Conversational emotion recognition is a crucial research topic in human-computer
interactions. Due to the heavy annotation cost and inevitable label ambiguity, collecting …

Cited by 55 · Related articles · All 4 versions

Multimodal cross-and self-attention network for speech emotion recognition

L Sun, B Liu, J Tao, Z Lian - ICASSP 2021-2021 IEEE …, 2021 - ieeexplore.ieee.org

Speech Emotion Recognition (SER) requires a thorough understanding of both the linguistic
content of an utterance (i.e., textual information) and how the speaker utters it (i.e., acoustic …

Cited by 73 · Related articles · All 3 versions

Improving transformer-based speech recognition using unsupervised pre-training

D Jiang, X Lei, W Li, N Luo, Y Hu, W Zou… - arXiv preprint arXiv …, 2019 - arxiv.org

Speech recognition technologies are gaining enormous popularity in various industrial
applications. However, building a good speech recognition system usually requires large …

Cited by 101 · Related articles · All 4 versions

Contrastive unsupervised learning for speech emotion recognition

M Li, B Yang, J Levy, A Stolcke… - ICASSP 2021-2021 …, 2021 - ieeexplore.ieee.org

Speech emotion recognition (SER) is a key technology to enable more natural human-
machine communication. However, SER has long suffered from a lack of public large-scale …

Cited by 59 · Related articles · All 5 versions

Improving speech recognition models with small samples for air traffic control systems

Y Lin, Q Li, B Yang, Z Yan, H Tan, Z Chen - Neurocomputing, 2021 - Elsevier

In the domain of air traffic control (ATC) systems, efforts to train a practical automatic speech
recognition (ASR) model always face the problem of small training samples since the …

Cited by 41 · Related articles · All 4 versions

Speech-XLNet: Unsupervised acoustic model pretraining for self-attention networks

X Song, G Wang, Z Wu, Y Huang, D Su, D Yu… - arXiv preprint arXiv …, 2019 - arxiv.org

Self-attention network (SAN) can benefit significantly from the bi-directional representation
learning through unsupervised pretraining paradigms such as BERT and XLNet. In this …

Cited by 56 · Related articles · All 7 versions

TDFNet: Transformer-based deep-scale fusion network for multimodal emotion recognition

Z Zhao, Y Wang, G Shen, Y Xu… - IEEE/ACM Transactions …, 2023 - ieeexplore.ieee.org

As deep learning technology research continues to progress, artificial intelligence
technology is gradually empowering various fields. To achieve a more natural human …

Cited by 7 · Related articles · All 2 versions