ElasticAST: An Audio Spectrogram Transformer for All Length and Resolutions

J Feng, MH Erol, JS Chung, A Senocak - arXiv preprint arXiv:2407.08691, 2024 - arxiv.org
Transformers have rapidly overtaken CNN-based architectures as the new standard in audio
classification. Transformer-based models, such as the Audio Spectrogram Transformers …

From Coarse to Fine: Efficient Training for Audio Spectrogram Transformers

J Feng, MH Erol, JS Chung… - ICASSP 2024-2024 IEEE …, 2024 - ieeexplore.ieee.org
Transformers have become central to recent advances in audio classification. However,
training an audio spectrogram transformer, e.g., AST, from scratch can be resource and time …

Mixer is more than just a model

Q Ji, Y Wang, L Sun - arXiv preprint arXiv:2402.18007, 2024 - arxiv.org
Recently, MLP structures have regained popularity, with MLP-Mixer standing out as a
prominent example. In the field of computer vision, MLP-Mixer is noted for its ability to extract …