Speaker identification through artificial intelligence techniques: A comprehensive review and research challenges

R Jahangir, YW Teh, HF Nweke, G Mujtaba… - Expert Systems with …, 2021 - Elsevier
Speech is a powerful medium of communication that always conveys rich and useful
information, such as gender, accent, and other unique characteristics of a speaker. These …

A survey of music emotion recognition

D Han, Y Kong, J Han, G Wang - Frontiers of Computer Science, 2022 - Springer
Music is the language of emotions. In recent years, music emotion recognition has attracted
widespread attention in the academic and industrial community since it can be widely used …

openSMILE: the Munich versatile and fast open-source audio feature extractor

F Eyben, M Wöllmer, B Schuller - Proceedings of the 18th ACM …, 2010 - dl.acm.org
We introduce the openSMILE feature extraction toolkit, which unites feature extraction
algorithms from the speech processing and the Music Information Retrieval communities …

Essentia: An audio analysis library for music information retrieval

D Bogdanov, N Wack, E Gómez Gutiérrez… - Britto A, Gouyon F …, 2013 - repositori.upf.edu
We present Essentia 2.0, an open-source C++ library for audio analysis and audio-based
music information retrieval released under the Affero GPL license. It contains an extensive …

A Matlab toolbox for musical feature extraction from audio

O Lartillot, P Toiviainen - International Conference on Digital Audio Effects, 2007 - dafx.de
We present MIRtoolbox, an integrated set of functions written in Matlab, dedicated to the
extraction of musical features from audio files. The design is based on a modular framework …

Towards an intelligent framework for multimodal affective data analysis

S Poria, E Cambria, A Hussain, GB Huang - Neural Networks, 2015 - Elsevier
An increasingly large amount of multimodal content is posted on social media websites such
as YouTube and Facebook every day. In order to cope with the growth of so much …

MHTN: Modal-adversarial hybrid transfer network for cross-modal retrieval

X Huang, Y Peng, M Yuan - IEEE Transactions on Cybernetics, 2018 - ieeexplore.ieee.org
Cross-modal retrieval has drawn wide interest for retrieval across different modalities (such
as text, image, video, audio, and 3-D model). However, existing methods based on a deep …

YAAFE, an Easy to Use and Efficient Audio Feature Extraction Software

B Mathieu, S Essid, T Fillon, J Prado, G Richard - ISMIR, 2010 - Citeseer
Music Information Retrieval systems are commonly built on a feature extraction
stage. For applications involving automatic classification (e.g. speech/music discrimination …

Multi-pathway generative adversarial hashing for unsupervised cross-modal retrieval

J Zhang, Y Peng - IEEE Transactions on Multimedia, 2019 - ieeexplore.ieee.org
Cross-modal hashing aims to map heterogeneous cross-modal data into a common
Hamming space, which can realize fast and flexible retrieval across different modalities …

Ensemble learning of hybrid acoustic features for speech emotion recognition

K Zvarevashe, O Olugbara - Algorithms, 2020 - mdpi.com
Automatic recognition of emotion is important for facilitating seamless interactivity between a
human being and an intelligent robot towards the full realization of a smart society. The …