Long-short transformer: Efficient transformers for language and vision

C Zhu, W Ping, C Xiao, M Shoeybi… - Advances in neural …, 2021 - proceedings.neurips.cc
Transformers have achieved success in both language and vision domains. However, it is
prohibitively expensive to scale them to long sequences such as long documents or high …

Exploring the limits of large scale pre-training

S Abnar, M Dehghani, B Neyshabur… - arXiv preprint arXiv …, 2021 - arxiv.org
Recent developments in large-scale machine learning suggest that by scaling up data,
model size and training time properly, one might observe that improvements in pre-training …

The efficiency misnomer

M Dehghani, A Arnab, L Beyer, A Vaswani… - arXiv preprint arXiv …, 2021 - arxiv.org
Model efficiency is a critical aspect of developing and deploying machine learning models.
Inference time and latency directly affect the user experience, and some applications have …

KVT: k-NN Attention for Boosting Vision Transformers

P Wang, X Wang, F Wang, M Lin, S Chang, H Li… - European conference on …, 2022 - Springer
Convolutional Neural Networks (CNNs) have dominated computer vision for years,
due to their ability to capture locality and translation invariance. Recently, many vision …
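
A minimal sketch of the k-NN attention idea named in the title: each query attends only to its top-k most similar keys rather than to every key. Function name, shapes, and the top-k masking scheme are illustrative assumptions, not the authors' implementation.

```python
# Illustrative k-NN attention: keep only the top-k scores per query,
# mask the rest to -inf, then softmax over the survivors.
import torch

def knn_attention(q, k, v, topk=8):
    """q, k, v: (batch, heads, seq_len, dim). Returns the attention output."""
    scale = q.shape[-1] ** -0.5
    scores = torch.matmul(q, k.transpose(-2, -1)) * scale   # (B, H, N, N)
    topk_vals, _ = scores.topk(topk, dim=-1)                # sorted descending
    threshold = topk_vals[..., -1, None]                     # k-th largest score per query
    masked = scores.masked_fill(scores < threshold, float("-inf"))
    attn = masked.softmax(dim=-1)                            # zero weight outside the top-k
    return torch.matmul(attn, v)
```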

Scenic: A jax library for computer vision research and beyond

M Dehghani, A Gritsenko, A Arnab… - Proceedings of the …, 2022 - openaccess.thecvf.com
Scenic is an open-source (https://github.com/google-research/scenic) JAX library with a
focus on transformer-based models for computer vision research and beyond. The goal of …

Diagnosis of schizophrenia based on the data of various modalities: biomarkers and machine learning techniques

MG Sharaev, IK Malashenkova… - Современные …, 2022 - cyberleninka.ru
Schizophrenia is a socially significant mental disorder that frequently results in severe forms of
disability. Diagnosis, choice of treatment tactics, and rehabilitation in clinical psychiatry are …

FLORA: Fine-grained Low-Rank Architecture Search for Vision Transformer

CC Chang, YY Sung, S Yu, NC Huang… - Proceedings of the …, 2024 - openaccess.thecvf.com
Vision Transformers (ViTs) have recently demonstrated success across a myriad of
computer vision tasks. However, their elevated computational demands pose significant …

Learning a fourier transform for linear relative positional encodings in transformers

K Choromanski, S Li, V Likhosherstov… - International …, 2024 - proceedings.mlr.press
We propose a new class of linear Transformers called FourierLearner-Transformers (FLTs),
which incorporate a wide range of relative positional encoding mechanisms (RPEs). These …
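
An illustrative sketch of the general idea of learning a relative positional encoding in the Fourier domain: the bias b(i-j) is parameterized by learnable frequencies and coefficients. This is an assumption-laden toy version of the RPE component only, not the paper's linear-attention formulation; class and parameter names are hypothetical.

```python
# Sketch: relative positional bias as a learned Fourier series,
# b(i - j) = sum_m c_m * cos(f_m * (i - j)), added to attention logits.
import torch
import torch.nn as nn

class FourierRelativeBias(nn.Module):
    def __init__(self, num_features=32):
        super().__init__()
        self.freqs = nn.Parameter(torch.randn(num_features))   # learnable frequencies f_m
        self.coeffs = nn.Parameter(torch.randn(num_features))  # learnable amplitudes c_m

    def forward(self, seq_len):
        pos = torch.arange(seq_len, dtype=torch.float32)
        rel = pos[:, None] - pos[None, :]                        # (N, N) relative distances
        basis = torch.cos(rel[..., None] * self.freqs)           # (N, N, M)
        return (basis * self.coeffs).sum(-1)                     # (N, N) bias matrix
```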

Softmax bottleneck makes language models unable to represent multi-mode word distributions

HS Chang, A McCallum - Proceedings of the 60th Annual Meeting of the …, 2022 - par.nsf.gov
Neural language models (LMs) such as GPT-2 estimate the probability distribution over the
next word by a softmax over the vocabulary. The softmax layer produces the distribution …
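
A minimal sketch of the standard output layer the snippet describes: a single linear map from the final hidden state to vocabulary logits, followed by a softmax. Dimensions and variable names are assumed for illustration.

```python
# Standard softmax-over-vocabulary output layer of a neural LM.
import torch
import torch.nn as nn

hidden_size, vocab_size = 768, 50257            # GPT-2-like sizes (assumed)
output_layer = nn.Linear(hidden_size, vocab_size, bias=False)

h = torch.randn(1, hidden_size)                 # final hidden state for one position
logits = output_layer(h)                        # (1, vocab_size)
probs = logits.softmax(dim=-1)                  # probability of each next word
# The distribution is produced from a single logit vector of rank at most
# hidden_size, which is the expressiveness limit ("softmax bottleneck")
# that the paper analyzes.
```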

Quickskill: Novice skill estimation in online multiplayer games

C Zhang, K Wang, H Chen, G Fan, Y Li, L Wu… - Proceedings of the 31st …, 2022 - dl.acm.org
Matchmaking systems are vital for creating fair matches in online multiplayer games, which
directly affects players' satisfaction and game experience. Most of the matchmaking …