Experimenting with 1D CNN architectures for generic audio classification

Semi-supervised machine condition monitoring by learning deep discriminative audio features

I Thoidis, M Giouvanakis, G Papanikolaou - Electronics, 2021 - mdpi.com

In this study, we aim to learn highly descriptive representations for a wide set of machinery
sounds and exploit this knowledge to perform condition monitoring of mechanical …

Cited by 14 · Related articles · All 4 versions

Audiovisual speaker indexing for Web-TV automations

N Vryzas, L Vrysis, C Dimoulas - Expert Systems with Applications, 2021 - Elsevier

The current paper introduces a multimodal framework to provide Web-TV automations for
live broadcasting and overall big streaming data management. The term indexing refers to …

Cited by 12 · Related articles · All 3 versions

A prototype web application to support human-centered audiovisual content authentication and crowdsourcing

N Vryzas, A Katsaounidou, L Vrysis, R Kotsakis… - Future Internet, 2022 - mdpi.com

Media authentication relies on the detection of inconsistencies that may indicate malicious
editing in audio and video files. Traditionally, authentication processes are performed by …

Cited by 8 · Related articles · All 8 versions

Semantic crowdsourcing of soundscapes heritage: a mojo model for data-driven storytelling

ME Stamatiadou, I Thoidis, N Vryzas, L Vrysis… - Sustainability, 2021 - mdpi.com

The current paper focuses on the development of an enhanced Mobile Journalism (MoJo)
model for soundscape heritage crowdsourcing, data-driven storytelling, and management in …

Cited by 13 · Related articles · All 7 versions

Enhanced Temporal Feature Integration in Audio Semantics via Alpha-Stable Modeling

L Vrysis, L Hadjileontiadis, I Thoidis, C Dimoulas… - Journal of the Audio …, 2021 - aes.org

Modern feature-based methodologies in semantic audio applications attempt to capture the
temporal dependency of successive feature observations, which form the so-called texture …

Cited by 15 · Related articles

Enhanced Speech Emotion Recognition with Efficient Channel Attention Guided Deep CNN-BiLSTM Framework

NK Kundu, S Kobir, MR Ahmed, T Aktar… - arXiv preprint arXiv …, 2024 - arxiv.org

Speech emotion recognition (SER) is crucial for enhancing affective computing and
enriching the domain of human-computer interaction. However, the main challenge in SER …

A citizen science approach to support joint air quality and noise monitoring in urban areas

ME Stamatiadou, N Vryzas, L Vrysis, T Saridou… - … Society Convention 152, 2022 - aes.org

In the present work, a crowdsourcing approach is designed, to investigate the correlation
between air and noise pollution in urban areas. Citizens are requested to provide air quality …

Cited by 5 · Related articles · All 3 versions

Temporal auditory coding features for causal speech enhancement

I Thoidis, L Vrysis, D Markou, G Papanikolaou - Electronics, 2020 - mdpi.com

Perceptually motivated audio signal processing and feature extraction have played a key
role in the determination of high-level semantic processes and the development of emerging …

Cited by 6 · Related articles · All 5 versions

Noise invariant feature pooling for the internet of audio things

C Nalmpantis, L Vrysis, D Vlachava… - Multimedia Tools and …, 2022 - Springer

This manuscript discusses the robustness to noise of deep learning models for two audio
classification tasks. The first task is a speaker recognition application, trying to identify five …

Cited by 1 · Related articles · All 5 versions

PERSA+: A Deep Learning Front-End for Context-Agnostic Audio Classification

L Vrysis, I Thoidis, C Dimoulas… - arXiv preprint arXiv …, 2021 - arxiv.org

Deep learning has been applied to diverse audio semantics tasks, enabling the construction
of models that learn hierarchical levels of features from high-dimensional raw data …