Model compression and hardware acceleration for neural networks: A comprehensive survey

L Deng, G Li, S Han, L Shi, Y Xie - Proceedings of the IEEE, 2020 - ieeexplore.ieee.org
Domain-specific hardware is becoming a promising topic against the backdrop of slowing improvement in general-purpose processors due to the foreseeable end of Moore's Law …

A survey on deep learning for multimodal data fusion

J Gao, P Li, Z Chen, J Zhang - Neural Computation, 2020 - direct.mit.edu
With the wide deployments of heterogeneous networks, huge amounts of data with
characteristics of high volume, high variety, high velocity, and high veracity are generated …

Mamba: Linear-time sequence modeling with selective state spaces

A Gu, T Dao - arXiv preprint arXiv:2312.00752, 2023 - arxiv.org
Foundation models, now powering most of the exciting applications in deep learning, are
almost universally based on the Transformer architecture and its core attention module …

RWKV: Reinventing RNNs for the Transformer era

B Peng, E Alcaide, Q Anthony, A Albalak… - arXiv preprint arXiv …, 2023 - arxiv.org
Transformers have revolutionized almost all natural language processing (NLP) tasks but
suffer from memory and computational complexity that scales quadratically with sequence …

Artificial intelligence-enabled detection and assessment of Parkinson's disease using nocturnal breathing signals

Y Yang, Y Yuan, G Zhang, H Wang, YC Chen, Y Liu… - Nature medicine, 2022 - nature.com
There are currently no effective biomarkers for diagnosing Parkinson's disease (PD) or
tracking its progression. Here, we developed an artificial intelligence (AI) model to detect PD …

Simplified state space layers for sequence modeling

JTH Smith, A Warrington, SW Linderman - arXiv preprint arXiv:2208.04933, 2022 - arxiv.org
Models using structured state space sequence (S4) layers have achieved state-of-the-art
performance on long-range sequence modeling tasks. An S4 layer combines linear state …

Combining recurrent, convolutional, and continuous-time models with linear state space layers

A Gu, I Johnson, K Goel, K Saab… - Advances in neural …, 2021 - proceedings.neurips.cc
Recurrent neural networks (RNNs), temporal convolutions, and neural differential equations
(NDEs) are popular families of deep learning models for time-series data, each with unique …

Underspecification presents challenges for credibility in modern machine learning

A D'Amour, K Heller, D Moldovan, B Adlam… - Journal of Machine …, 2022 - jmlr.org
Machine learning (ML) systems often exhibit unexpectedly poor behavior when they are
deployed in real-world domains. We identify underspecification in ML pipelines as a key …

Hierarchically gated recurrent neural network for sequence modeling

Z Qin, S Yang, Y Zhong - Advances in Neural Information …, 2024 - proceedings.neurips.cc
Transformers have surpassed RNNs in popularity due to their superior abilities in parallel
training and long-term dependency modeling. Recently, there has been a renewed interest …

Delving into deep imbalanced regression

Y Yang, K Zha, Y Chen, H Wang… - … conference on machine …, 2021 - proceedings.mlr.press
Real-world data often exhibit imbalanced distributions, where certain target values have
significantly fewer observations. Existing techniques for dealing with imbalanced data focus …