A multi-purpose audio-visual corpus for multi-modal Persian speech recognition: The Arman-AV dataset

Transfer learning-based nonstationary traffic flow prediction using AdaRNN and DCORAL

L Zang, T Wang, B Zhang, C Li - Expert Systems with Applications, 2024 - Elsevier

Traffic flow prediction is an integral part of an intelligent transportation system (ITS) for
proactive transportation planning and management in public transit network systems …

Cited by 3 · Related articles · All 2 versions

Audio–visual speech recognition based on regulated transformer and spatio–temporal fusion strategy for driver assistive systems

D Ryumin, A Axyonov, E Ryumina, D Ivanko… - Expert Systems with …, 2024 - Elsevier

This article presents a research methodology for audio–visual speech recognition (AVSR) in
driver assistive systems. These systems necessitate ongoing interaction with drivers while …

Cited by 11 · Related articles

PSscheduler: A parameter synchronization scheduling algorithm for distributed machine learning in reconfigurable optical networks

L Liu, X Xu, P Zhou, X Chen, D Ergu, H Yu, G Sun… - Neurocomputing, 2025 - Elsevier

With the increasing size of training datasets and models, parameter synchronization stage
puts a heavy burden on the network, and communication has become one of the main …

Speech Recognition for Intelligent System in Service Robots: A Review

R Atika, S Dwijayanti… - … Conference on Electrical …, 2024 - ieeexplore.ieee.org

Speech recognition and response system technology in service robot research continues to
evolve alongside technical advances and the increasing demand for intelligent automation …

ManaTTS Persian: a recipe for creating TTS datasets for lower resource languages

MF Qharabagh, Z Dehghanian, HR Rabiee - arXiv preprint arXiv …, 2024 - arxiv.org

In this study, we introduce ManaTTS, the most extensive publicly accessible single-speaker
Persian corpus, and a comprehensive framework for collecting transcribed speech datasets …

Cited by 1 · Related articles

AnnoTheia: A Semi-Automatic Annotation Toolkit for Audio-Visual Speech Technologies

JM Acosta-Triana, D Gimeno-Gómez… - arXiv preprint arXiv …, 2024 - arxiv.org

More than 7,000 known languages are spoken around the world. However, due to the lack
of annotated resources, only a small fraction of them are currently covered by speech …

Cited by 1 · Related articles · All 3 versions
