Speaker identification through artificial intelligence techniques: A comprehensive review and research challenges
Speech is a powerful medium of communication that always conveys rich and useful
information, such as gender, accent, and other unique characteristics of a speaker. These …
A survey of music emotion recognition
D Han, Y Kong, J Han, G Wang - Frontiers of Computer Science, 2022 - Springer
Music is the language of emotions. In recent years, music emotion recognition has attracted
widespread attention in the academic and industrial community since it can be widely used …
openSMILE: the Munich versatile and fast open-source audio feature extractor
We introduce the openSMILE feature extraction toolkit, which unites feature extraction
algorithms from the speech processing and the Music Information Retrieval communities …
Essentia: An audio analysis library for music information retrieval
D Bogdanov, N Wack, E Gómez Gutiérrez… - Britto A, Gouyon F …, 2013 - repositori.upf.edu
We present Essentia 2.0, an open-source C++ library for audio analysis and audio-based
music information retrieval released under the Affero GPL license. It contains an extensive …
A Matlab toolbox for musical feature extraction from audio
O Lartillot, P Toiviainen - International conference on digital audio effects, 2007 - dafx.de
We present MIRtoolbox, an integrated set of functions written in Matlab, dedicated to the
extraction of musical features from audio files. The design is based on a modular framework …
Towards an intelligent framework for multimodal affective data analysis
An increasingly large amount of multimodal content is posted on social media websites such
as YouTube and Facebook every day. In order to cope with the growth of such so much …
MHTN: Modal-adversarial hybrid transfer network for cross-modal retrieval
Cross-modal retrieval has drawn wide interest for retrieval across different modalities (such
as text, image, video, audio, and 3-D model). However, existing methods based on a deep …
YAAFE, an Easy to Use and Efficient Audio Feature Extraction Software
Music Information Retrieval systems are commonly built on a feature extraction
stage. For applications involving automatic classification (e.g. speech/music discrimination …
Multi-pathway generative adversarial hashing for unsupervised cross-modal retrieval
Cross-modal hashing aims to map heterogeneous cross-modal data into a common
Hamming space, which can realize fast and flexible retrieval across different modalities …
Ensemble learning of hybrid acoustic features for speech emotion recognition
K Zvarevashe, O Olugbara - Algorithms, 2020 - mdpi.com
Automatic recognition of emotion is important for facilitating seamless interactivity between a
human being and intelligent robot towards the full realization of a smart society. The …