- Academic resource search

Pengi: An audio language model for audio tasks

S Deshmukh, B Elizalde, R Singh… - Advances in Neural …, 2023 - proceedings.neurips.cc

In the domain of audio processing, Transfer Learning has facilitated the rise of Self-
Supervised Learning and Zero-Shot Learning techniques. These approaches have led to …

Cited by 54 · Related articles · All 5 versions

PVASS-MDD: predictive visual-audio alignment self-supervision for multimodal deepfake detection

Y Yu, X Liu, R Ni, S Yang, Y Zhao… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org

Deepfake techniques can forge the visual or audio signals in the video, which leads to
inconsistencies between visual and audio (VA) signals. Therefore, multimodal detection …

Cited by 8 · Related articles

Self-labeling with feature transfer for speech emotion recognition

G Wen, H Liao, H Li, P Wen, T Zhang, S Gao… - Knowledge-Based …, 2022 - Elsevier

Most speech emotion recognition methods based on frames have obtained good results in
many applications. However, they segment each speech sample into smaller frames that are …

Cited by 11 · Related articles · All 2 versions

[PDF] arxiv.org

Interpreting glottal flow dynamics for detecting COVID-19 from voice

S Deshmukh, M Al Ismail… - ICASSP 2021-2021 IEEE …, 2021 - ieeexplore.ieee.org

In the pathogenesis of COVID-19, impairment of respiratory functions is often one of the key
symptoms. Studies show that in these cases, voice production is also adversely affected …

Cited by 37 · Related articles · All 8 versions

[PDF] epa.gov

A multi-modal wildfire prediction and early-warning system based on a novel machine learning framework

RT Bhowmik, YS Jung, JA Aguilera, M Prunicki… - Journal of environmental …, 2023 - Elsevier

Wildfires are increasingly impacting the environment and human health. Among the top 20
California wildfires, those in 2020–2021 burned more acres than the last century combined …

Cited by 9 · Related articles · All 8 versions

[PDF] arxiv.org

Audio retrieval with WavText5K and CLAP training

S Deshmukh, B Elizalde, H Wang - arXiv preprint arXiv:2209.14275, 2022 - arxiv.org

Audio-Text retrieval takes a natural language query to retrieve relevant audio files in a
database. Conversely, Text-Audio retrieval takes an audio file as a query to retrieve relevant …

Cited by 39 · Related articles · All 4 versions

[HTML] mdpi.com

A personalized respiratory disease exacerbation prediction technique based on a novel spatio-temporal machine learning architecture and local …

RT Bhowmik, SP Most - Electronics, 2022 - mdpi.com

Chronic respiratory diseases, such as the Chronic Obstructive Pulmonary Disease (COPD)
and asthma, are a serious health crisis, affecting a large number of people globally and …

Cited by 10 · Related articles · All 3 versions

[HTML] mdpi.com

Weakly supervised U-Net with limited upsampling for sound event detection

S Lee, H Kim, GJ Jang - Applied Sciences, 2023 - mdpi.com

Featured Application: Audio classification; music information retrieval; audio scene
characterization; temporal localization of sound sources; audio indexing; audio surveillance …

Cited by 2 · Related articles · All 4 versions

[PDF] arxiv.org

Sound event detection guided by semantic contexts of scenes

N Tonami, K Imoto, R Nagase… - ICASSP 2022-2022 …, 2022 - ieeexplore.ieee.org

Some studies have revealed that contexts of scenes (e.g., "home," "office," and "cooking") are
advantageous for sound event detection (SED). Mobile devices and sensing technologies …

Cited by 5 · Related articles · All 3 versions

[PDF] arxiv.org

A multi-modal wildfire prediction and personalized early-warning system based on a novel machine learning framework

RT Bhowmik - arXiv preprint arXiv:2208.09079, 2022 - arxiv.org

Wildfires are increasingly impacting the environment, human health and safety. Among the
top 20 California wildfires, those in 2020-2021 burned more acres than the last century …

Cited by 2 · Related articles