Audio-visual feature fusion via deep neural networks for automatic speech recognition

AS Dhanjal, W Singh - Multimedia Tools and Applications, 2024 - Springer

The continuous development in Automatic Speech Recognition has grown and
demonstrated its enormous potential in Human Interaction Communication systems. It is …

Cited by 38 Related articles

Trends in audio signal feature extraction methods

G Sharma, K Umapathy, S Krishnan - Applied Acoustics, 2020 - Elsevier

Audio signal processing algorithms generally involve analysis of signal, extracting its
properties, predicting its behaviour, recognizing if any pattern is present in the signal, and …

Cited by 470 Related articles All 3 versions

A review on classifying abnormal behavior in crowd scene

AA Afiq, MA Zakariya, MN Saad, AA Nurfarzana… - Journal of Visual …, 2019 - Elsevier

Crowd behavior analysis has become one of the new areas of interest in the computer vision
community due to the increasing demands from surveillance and security industries. It is …

Cited by 66 Related articles All 5 versions

Detecting respiratory pathologies using convolutional neural networks and variational autoencoders for unbalancing data

MT García-Ordás, JA Benítez-Andrades… - Sensors, 2020 - mdpi.com

The aim of this paper was the detection of pathologies through respiratory sounds. The
ICBHI (International Conference on Biomedical and Health Informatics) Benchmark was …

Cited by 123 Related articles All 11 versions

[HTML] Smart home security solutions using facial authentication and speaker recognition through artificial neural networks

N Saxena, D Varshney - International Journal of Cognitive Computing in …, 2021 - Elsevier

In this paper, a holistic solution for Smart Home Security is implemented which helps in
improving privacy and security using two independent and emerging technologies of facial …

Cited by 55 Related articles All 2 versions

Fast evaluation of crack growth path using time series forecasting

DTT Do, J Lee, H Nguyen-Xuan - Engineering Fracture Mechanics, 2019 - Elsevier

This paper aims at forecasting the crack propagation in risk assessment of engineering
structures based on time series algorithms named “long short-term memory” and “multi-layer …

Cited by 51 Related articles All 3 versions

[PDF] A review on voice-based interface for human-robot interaction

AA Badr, AK Abdul-Hassan - Iraqi Journal for Electrical and Electronic …, 2020 - iasj.net

With the recent developments of technology and the advances in artificial intelligence and
machine learning techniques, it has become possible for the robot to understand and …

Cited by 35 Related articles All 9 versions

Attention-block deep learning based features fusion in wearable social sensor for mental wellbeing evaluations

J Jin, B Gao, S Yang, B Zhao, L Luo, WL Woo - IEEE Access, 2020 - ieeexplore.ieee.org

With the progressive increase of stress, anxiety and depression in working and living
environment, mental health assessment becomes an important social interaction research …

Cited by 42 Related articles All 8 versions

A large-scale UAV audio dataset and audio-based UAV classification using CNN

Y Wang, Z Chu, I Ku, EC Smith… - 2022 Sixth IEEE …, 2022 - ieeexplore.ieee.org

The increased popularity and accessibility of UAVs may create potential threats.
Researchers have been developing UAV detection and classification systems with different …

Cited by 16 Related articles All 7 versions

Bimodal variational autoencoder for audiovisual speech recognition

HM Sayed, HE ElDeeb, SA Taie - Machine Learning, 2023 - Springer

Multimodal fusion is the idea of combining information in a joint representation of multiple
modalities. The goal of multimodal fusion is to improve the accuracy of results from …

Cited by 15 Related articles All 6 versions