Amharic spoken digits recognition using convolutional neural network

TA Ayall, C Zhou, H Liu, GM Brhanemeskel, ST Abate… - Journal of Big Data, 2024 - Springer
Spoken digits recognition (SDR) is a type of supervised automatic speech recognition, which
is required in various human–machine interaction applications. It is utilized in phone-based …

Cross-Corpus Multilingual Speech Emotion Recognition: Amharic vs. Other Languages

EA Retta, R Sutcliffe, J Mahmood, MA Berwo… - Applied Sciences, 2023 - mdpi.com
In a conventional speech emotion recognition (SER) task, a classifier for a given language is
trained on a pre-existing dataset for that same language. However, where training data for a …

MPSA-Conformer-CTC/Attention: A High-Accuracy, Low-Complexity End-to-End Approach for Tibetan Speech Recognition

C Wu, H Sun, K Huang, L Wu - Sensors, 2024 - mdpi.com
This study addresses the challenges of low accuracy and high computational demands in
Tibetan speech recognition by investigating the application of end-to-end networks. We …

MEMO: Memory-Guided Diffusion for Expressive Talking Video Generation

L Zheng, Y Zhang, H Guo, J Pan, Z Tan, J Lu… - arXiv preprint arXiv …, 2024 - arxiv.org
Recent advances in video diffusion models have unlocked new potential for realistic audio-
driven talking video generation. However, achieving seamless audio-lip synchronization …

EmoBox: Multilingual Multi-corpus Speech Emotion Recognition Toolkit and Benchmark

Z Ma, M Chen, H Zhang, Z Zheng, W Chen, X Li… - arXiv preprint arXiv …, 2024 - arxiv.org
Speech emotion recognition (SER) is an important part of human-computer interaction,
receiving extensive attention from both industry and academia. However, the current …

Speech-Based Emotion Analysis Using Log-Mel Spectrograms and MFCC Features

AK Yetkin, H Köse - 2023 31st Signal Processing and …, 2023 - ieeexplore.ieee.org
This study proposes a method for recognizing emotions from speech using Mel
spectrograms and MFCC features which capture the spectral features of speech signals. To …

A Spectrogram Image and Spectral Coefficients Assisted Vector Quantization for Amharic Speech Emotion Recognition

AB DAGIM - 2024 - ir.bdu.edu.et
Speech emotion recognition (SER) is focused on predicting emotion from speech data,
irrespective of the semantic contents. However, it is pertinent to note that variability in …

Dialect-Based Noisy Speech Dataset, Pre-Processing Tools, and Recognition Models for Amharic

TT Assfaw, T Abebe, B Yalew… - … on Information and …, 2022 - ieeexplore.ieee.org
Nowadays, to facilitate deep learning for voice recognition, a large number of dialect-based
speech dataset for different languages have been produced. However, not such more …

Speech Emotion Recognition: Recent Advances and Current Trends

S Belkacem - International Conference on Artificial Intelligence and …, 2023 - Springer
Speech emotion recognition has become an important endeavor in Human-computer-
Interaction. Emotion recognition and understanding by computer are used for medical …

MEMO: Memory-Guided and Emotion-Aware Talking Video Generation

L Zheng, Y Zhang, HA Guo, J Pan, Z Tan, J Lu, C Tang… - openreview.net
Advances in video diffusion models have unlocked the potential for realistic audio-driven
talking video generation. However, it is still highly challenging to ensure seamless audio-lip …