A review of deep learning techniques for speech processing

A Mehrish, N Majumder, R Bharadwaj, R Mihalcea… - Information …, 2023 - Elsevier
The field of speech processing has undergone a transformative shift with the advent of deep
learning. The use of multiple processing layers has enabled the creation of models capable …

Large-scale multi-modal pre-trained models: A comprehensive survey

X Wang, G Chen, G Qian, P Gao, XY Wei… - Machine Intelligence …, 2023 - Springer
With the urgent demand for generalized deep models, many pre-trained big models are
proposed, such as bidirectional encoder representations from transformers (BERT), vision transformer (ViT) …

Multimodal learning with transformers: A survey

P Xu, X Zhu, DA Clifton - IEEE Transactions on Pattern Analysis …, 2023 - ieeexplore.ieee.org
Transformer is a promising neural network learner, and has achieved great success in
various machine learning tasks. Thanks to the recent prevalence of multimodal applications …

Crossformer: Transformer utilizing cross-dimension dependency for multivariate time series forecasting

Y Zhang, J Yan - The eleventh international conference on learning …, 2023 - openreview.net
Recently, many deep models have been proposed for multivariate time series (MTS)
forecasting. In particular, Transformer-based models have shown great potential because …

Recipe for a general, powerful, scalable graph transformer

L Rampášek, M Galkin, VP Dwivedi… - Advances in …, 2022 - proceedings.neurips.cc
We propose a recipe on how to build a general, powerful, scalable (GPS) graph Transformer
with linear complexity and state-of-the-art results on a diverse set of benchmarks. Graph …

Vision GNN: An image is worth graph of nodes

K Han, Y Wang, J Guo, Y Tang… - Advances in neural …, 2022 - proceedings.neurips.cc
Network architecture plays a key role in the deep learning-based computer vision system.
The widely-used convolutional neural network and transformer treat the image as a grid or …

SNR-aware low-light image enhancement

X Xu, R Wang, CW Fu, J Jia - Proceedings of the IEEE/CVF …, 2022 - openaccess.thecvf.com
This paper presents a new solution for low-light image enhancement by collectively
exploiting Signal-to-Noise-Ratio-aware transformers and convolutional models to …

Transformers in time series: A survey

Q Wen, T Zhou, C Zhang, W Chen, Z Ma, J Yan… - arXiv preprint arXiv …, 2022 - arxiv.org
Transformers have achieved superior performance in many tasks in natural language
processing and computer vision, which also triggered great interest in the time series …

An effective CNN and Transformer complementary network for medical image segmentation

F Yuan, Z Zhang, Z Fang - Pattern Recognition, 2023 - Elsevier
The Transformer network was originally proposed for natural language processing. Due to
its powerful representation ability for long-range dependency, it has been extended for …

Spike-driven transformer

M Yao, J Hu, Z Zhou, L Yuan, Y Tian… - Advances in neural …, 2024 - proceedings.neurips.cc
Spiking Neural Networks (SNNs) provide an energy-efficient deep learning option
due to their unique spike-based event-driven (i.e., spike-driven) paradigm. In this paper, we …