Speech emotion recognition using 3D convolutions and attention-based sliding recurrent networks with auditory front-ends

Z Peng, X Li, Z Zhu, M Unoki, J Dang, M Akagi - IEEE Access, 2020 - ieeexplore.ieee.org
Emotion information from speech can effectively help robots understand a speaker's intentions
in natural human-robot interaction. The human auditory system can easily track temporal …

Multi-resolution modulation-filtered cochleagram feature for LSTM-based dimensional emotion recognition from speech

Z Peng, J Dang, M Unoki, M Akagi - Neural Networks, 2021 - Elsevier
Continuous dimensional emotion recognition from speech helps robots or virtual agents
capture the temporal dynamics of a speaker's emotional state in natural human–robot …

Robust voice activity detection using an auditory-inspired masked modulation encoder based convolutional attention network

N Li, L Wang, M Ge, M Unoki, S Li, J Dang - Speech Communication, 2024 - Elsevier
Deep learning has revolutionized voice activity detection (VAD) by offering promising
solutions. However, directly applying traditional features, such as raw waveforms and Mel …

Relationship between contributions of temporal amplitude envelope of speech and modulation transfer function in room acoustics to perception of noise-vocoded …

M Unoki, Z Zhu - Acoustical Science and Technology, 2020 - jstage.jst.go.jp
Speech signals can be represented as a sum of amplitude-modulated frequency bands. This
sum can also be regarded as a temporal amplitude envelope (TAE) with temporal fine …

Envelope estimation using geometric properties of a discrete real signal

CHT Santos, V Pereira - Digital Signal Processing, 2022 - Elsevier
Despite being an elusive concept, the temporal amplitude envelope of a signal is essential
for its complete characterization, being the primary information-carrying medium in spoken …

Increasing speech intelligibility and naturalness in noise based on concepts of modulation spectrum and modulation transfer function

T Ngo, R Kubo, M Akagi - Speech Communication, 2021 - Elsevier
This study focuses on identifying effective features for controlling speech to increase speech
intelligibility under adverse conditions. Previous approaches either cancel noise throughout …

Contribution of modulation spectral features on the perception of vocal-emotion using noise-vocoded speech

Z Zhu, R Miyauchi, Y Araki, M Unoki - Acoustical Science and …, 2018 - jstage.jst.go.jp
Previous studies on noise-vocoded speech showed that the temporal modulation cues
provided by the temporal envelope play an important role in the perception of vocal emotion …

Enhancing Dimensional Emotion Recognition from Speech through Modulation-Filtered Cochleagram and Parallel Attention Recurrent Network

Z Peng, H Zeng, Y Li, Y Du, J Dang - Electronics, 2023 - mdpi.com
Dimensional emotion can better describe rich and fine-grained emotional states than
categorical emotion. In the realm of human–robot interaction, the ability to continuously …

Contribution of common modulation spectral features to vocal-emotion recognition of noise-vocoded speech in noisy reverberant environments

T Guo, Z Zhu, S Kidani, M Unoki - Applied Sciences, 2022 - mdpi.com
In one study on vocal emotion recognition using noise-vocoded speech (NVS), the high
similarities between modulation spectral features (MSFs) and the results of vocal-emotion …

A study of salient modulation domain features for speaker identification

SW McKnight, AOT Hogg, VW Neo… - 2021 Asia-Pacific …, 2021 - ieeexplore.ieee.org
This paper studies the ranges of acoustic and modulation frequencies of speech most
relevant for identifying speakers and compares the speaker-specific information present in …