Advances and challenges in meta-learning: A technical review
A Vettoruzzo, MR Bouguelia… - … on Pattern Analysis …, 2024 - ieeexplore.ieee.org
Meta-learning empowers learning systems with the ability to acquire knowledge from
multiple tasks, enabling faster adaptation and generalization to new tasks. This review …
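A minimal sketch of the gradient-based meta-learning idea this review covers: learn a shared initialization that adapts to a new task with one inner gradient step. This is a first-order MAML-style toy on 1-D linear-regression tasks; the task distribution, learning rates, and function names are illustrative assumptions, not the review's setup.

```python
# First-order MAML-style sketch: meta-train an initialization theta so that
# a single inner gradient step adapts it to each new task.
import numpy as np

rng = np.random.default_rng(0)

def sample_task():
    a, b = rng.uniform(-2, 2, size=2)            # task-specific slope / intercept
    x = rng.uniform(-1, 1, size=(20, 1))
    y = a * x + b
    return np.hstack([x, np.ones_like(x)]), y    # design matrix [x, 1], targets

def loss_and_grad(theta, X, y):
    err = X @ theta - y
    return float(np.mean(err ** 2)), 2 * X.T @ err / len(y)

theta = np.zeros((2, 1))                         # meta-learned initialization
inner_lr, outer_lr = 0.1, 0.01

for step in range(2000):
    X, y = sample_task()
    Xs, ys, Xq, yq = X[:10], y[:10], X[10:], y[10:]   # support / query split
    _, g_support = loss_and_grad(theta, Xs, ys)
    adapted = theta - inner_lr * g_support            # one inner adaptation step
    _, g_query = loss_and_grad(adapted, Xq, yq)
    theta -= outer_lr * g_query                       # first-order outer update

# At test time, adapt the learned initialization to a new task with one step.
X, y = sample_task()
_, g = loss_and_grad(theta, X[:10], y[:10])
adapted = theta - inner_lr * g
print("query loss after one adaptation step:", loss_and_grad(adapted, X[10:], y[10:])[0])
```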
Brain-inspired learning in artificial neural networks: a review
Artificial neural networks (ANNs) have emerged as an essential tool in machine learning,
achieving remarkable success across diverse domains, including image and speech …
Transformers learn in-context by gradient descent
J Von Oswald, E Niklasson… - International …, 2023 - proceedings.mlr.press
At present, the mechanisms of in-context learning in Transformers are not well understood
and remain mostly an intuition. In this paper, we suggest that training Transformers on auto …
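A toy reconstruction of the paper's core observation (not the authors' code): for in-context linear regression, one unnormalized linear self-attention read-out over the prompt tokens coincides with one gradient-descent step on the in-context least-squares loss started from zero weights. Dimensions and the learning rate below are assumptions.

```python
# Linear attention over in-context (x_i, y_i) pairs vs. one explicit GD step.
import numpy as np

rng = np.random.default_rng(1)
d, N, lr = 5, 32, 0.5
W_true = rng.normal(size=(1, d))
X = rng.normal(size=(N, d))            # in-context inputs x_i
Y = X @ W_true.T                       # in-context targets y_i
x_q = rng.normal(size=(d,))            # query token

# (a) one GD step on L(W) = 1/(2N) * sum_i ||W x_i - y_i||^2, starting from W = 0
grad_at_zero = -(Y.T @ X) / N
W_after_one_step = -lr * grad_at_zero
pred_gd = W_after_one_step @ x_q

# (b) linear attention over the context: query = x_q, keys = x_i, values = y_i
attn_scores = X @ x_q                  # <x_i, x_q>, no softmax
pred_attn = (lr / N) * attn_scores @ Y

print(np.allclose(pred_gd, pred_attn))   # True: the two predictions coincide
```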
Why can GPT learn in-context? Language models implicitly perform gradient descent as meta-optimizers
Large pretrained language models have shown surprising in-context learning (ICL) ability.
With a few demonstration input-label pairs, they can predict the label for an unseen input …
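A minimal sketch of the in-context learning setup this paper analyzes: demonstration input-label pairs are concatenated into the prompt and a frozen language model predicts the label for a new input, with no weight updates. The sentiment task and the small gpt2 checkpoint are illustrative stand-ins, not the paper's actual model or benchmark.

```python
# Few-shot prompting: demonstrations in the context, prediction for the query.
from transformers import pipeline

demonstrations = [
    ("the movie was wonderful", "positive"),
    ("a complete waste of time", "negative"),
    ("I loved every minute of it", "positive"),
]
query = "the plot was dull and predictable"

prompt = "".join(f"Review: {x}\nSentiment: {y}\n\n" for x, y in demonstrations)
prompt += f"Review: {query}\nSentiment:"

generator = pipeline("text-generation", model="gpt2")
completion = generator(prompt, max_new_tokens=2, do_sample=False)[0]["generated_text"]
print(completion[len(prompt):].strip())   # the in-context prediction for the query
```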
Transformers as statisticians: Provable in-context learning with in-context algorithm selection
Neural sequence models based on the transformer architecture have demonstrated
remarkable in-context learning (ICL) abilities, where they can perform new tasks …
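An explicit sketch of the statistical behavior this paper proves transformers can emulate internally (an illustrative reconstruction, not the authors' construction): run several candidate estimators on the in-context examples and select among them in context, here by held-out error on a split of the prompt. The ridge candidates and the train/validation split are assumptions.

```python
# In-context algorithm selection among ridge regressions at different lambdas.
import numpy as np

rng = np.random.default_rng(2)
d, N = 10, 40
w_true = rng.normal(size=d)
X = rng.normal(size=(N, d))
y = X @ w_true + 0.1 * rng.normal(size=N)     # in-context (input, output) pairs
x_query = rng.normal(size=d)

def ridge_fit(X, y, lam):
    return np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)

candidates = [0.0, 0.1, 1.0, 10.0]            # lam = 0 is ordinary least squares
Xtr, ytr, Xval, yval = X[:30], y[:30], X[30:], y[30:]

val_errors = []
for lam in candidates:
    w = ridge_fit(Xtr, ytr, lam)
    val_errors.append(np.mean((Xval @ w - yval) ** 2))

best_lam = candidates[int(np.argmin(val_errors))]   # in-context selection step
w_selected = ridge_fit(X, y, best_lam)              # refit on the full context
print("selected lambda:", best_lam, "query prediction:", w_selected @ x_query)
```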
Transformers as algorithms: Generalization and stability in in-context learning
In-context learning (ICL) is a type of prompting where a transformer model operates on a
sequence of (input, output) examples and performs inference on-the-fly. In this work, we …
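A sketch of the quantity this line of work studies (an illustrative stand-in, not the paper's setup): view ICL as running a learning algorithm on the (input, output) pairs in the prompt, and measure how the query-point risk shrinks as the number of in-context examples grows, averaged over tasks. Ordinary least squares plays the role of the algorithm; all settings are assumptions.

```python
# Average query error of an "in-context algorithm" vs. context length.
import numpy as np

rng = np.random.default_rng(3)
d, n_tasks = 5, 500

def avg_query_error(n_context):
    errs = []
    for _ in range(n_tasks):
        w = rng.normal(size=d)                         # a fresh task per prompt
        X = rng.normal(size=(n_context, d))
        y = X @ w + 0.1 * rng.normal(size=n_context)
        w_hat, *_ = np.linalg.lstsq(X, y, rcond=None)  # algorithm run on the prompt
        x_q = rng.normal(size=d)
        errs.append((x_q @ w_hat - x_q @ w) ** 2)
    return np.mean(errs)

for n in (8, 16, 32, 64):
    print(f"context length {n:3d}: avg query error {avg_query_error(n):.4f}")
```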
Large language models as general pattern machines
We observe that pre-trained large language models (LLMs) are capable of autoregressively
completing complex token sequences--from arbitrary ones procedurally generated by …
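A small sketch of the setup this paper describes, with illustrative assumptions throughout: procedurally generate a symbolic sequence, remap it onto arbitrary tokens, and ask a pretrained autoregressive LM to complete it in context. gpt2 is a stand-in for the larger models the paper studies.

```python
# Procedurally generated pattern, remapped to arbitrary tokens, completed by an LM.
import random
from transformers import pipeline

random.seed(0)
vocab = random.sample(["foo", "bar", "baz", "qux", "zip", "wub"], 2)

# Abstract pattern over two slots: A B A B B A B B B ... (the B run grows by one)
slots = []
for k in range(1, 5):
    slots.append(0)
    slots.extend([1] * k)
sequence = [vocab[s] for s in slots]

prompt = " ".join(sequence[:-3])          # hold out the tail as the completion target
target = " ".join(sequence[-3:])

generator = pipeline("text-generation", model="gpt2")
out = generator(prompt, max_new_tokens=10, do_sample=False)[0]["generated_text"]
print("prompt:    ", prompt)
print("target:    ", target)
print("completion:", out[len(prompt):].strip())
```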
Supervised pretraining can learn in-context reinforcement learning
Large transformer models trained on diverse datasets have shown a remarkable ability to
learn in-context, achieving high few-shot performance on tasks they were not explicitly …
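A sketch of the kind of data a supervised pretraining recipe for in-context RL might consume (an illustrative reconstruction of the idea, not the paper's pipeline): each training example is an interaction history from one task, and the supervised label is a good action for that task, so a sequence model trained on such pairs can later act by conditioning on a fresh history. The bandit environment and exploration policy are assumptions.

```python
# Building (history -> good action) pairs for supervised in-context RL pretraining.
import numpy as np

rng = np.random.default_rng(4)
n_arms, horizon, n_tasks = 5, 30, 1000

def make_pretraining_example():
    arm_means = rng.uniform(0, 1, size=n_arms)        # one bandit task
    context = []
    for _ in range(horizon):                          # history from a random policy
        a = int(rng.integers(n_arms))
        r = float(rng.normal(arm_means[a], 0.1))
        context.append((a, r))
    label = int(np.argmax(arm_means))                 # supervised target: best arm
    return context, label

dataset = [make_pretraining_example() for _ in range(n_tasks)]
context, label = dataset[0]
print("first transitions:", context[:3], "-> target action:", label)
# A transformer (or other sequence model) would be trained to map `context`
# to `label`; at test time it conditions on a new history and acts in-context.
```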
Structured state space models for in-context reinforcement learning
Structured state space sequence (S4) models have recently achieved state-of-the-art
performance on long-range sequence modeling tasks. These models also have fast …
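A minimal sketch of the linear state-space recurrence underlying S4-style models (illustrative parameters, not the paper's parameterization): after discretization the layer is x_k = A_bar x_{k-1} + B_bar u_k, y_k = C x_k, which can run step by step (fast at inference) or, equivalently, as a convolution with the kernel K_i = C A_bar^i B_bar (fast in parallel during training).

```python
# Recurrent vs. convolutional view of a discretized linear state-space layer.
import numpy as np

rng = np.random.default_rng(5)
state_dim, seq_len = 4, 16
A_bar = 0.9 * np.eye(state_dim) + 0.05 * rng.normal(size=(state_dim, state_dim))
B_bar = rng.normal(size=(state_dim, 1))
C = rng.normal(size=(1, state_dim))
u = rng.normal(size=seq_len)

# Recurrent form: one state update per step.
x = np.zeros((state_dim, 1))
y_rec = []
for k in range(seq_len):
    x = A_bar @ x + B_bar * u[k]
    y_rec.append(float(C @ x))

# Convolutional form: precompute the kernel, then a causal convolution.
kernel = np.array([float(C @ np.linalg.matrix_power(A_bar, i) @ B_bar)
                   for i in range(seq_len)])
y_conv = [float(np.dot(kernel[:k + 1][::-1], u[:k + 1])) for k in range(seq_len)]

print(np.allclose(y_rec, y_conv))   # True: both views compute the same output
```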
Long-range transformers for dynamic spatiotemporal forecasting
Multivariate time series forecasting focuses on predicting future values based on historical
context. State-of-the-art sequence-to-sequence models rely on neural attention between …
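A small sketch of the spatiotemporal tokenization idea in this line of work (an illustrative reading, with assumed shapes and embeddings): instead of one token per time step covering all variables, flatten the multivariate series into one token per (time step, variable) pair, so a single attention pass can relate any variable at any time to any other.

```python
# Flattened (time, variable) tokens followed by one self-attention pass.
import numpy as np

rng = np.random.default_rng(6)
T, V, d = 12, 3, 8                      # time steps, variables, embedding size

series = rng.normal(size=(T, V))        # historical multivariate context
time_emb = rng.normal(size=(T, d))
var_emb = rng.normal(size=(V, d))

# One token per (t, v): value projected into d dims plus time and variable embeddings.
value_proj = rng.normal(size=(1, d))
tokens = np.stack([series[t, v] * value_proj[0] + time_emb[t] + var_emb[v]
                   for t in range(T) for v in range(V)])      # (T*V, d)

# Single-head scaled dot-product self-attention over the flattened sequence.
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))
Q, K, Vmat = tokens @ Wq, tokens @ Wk, tokens @ Wv
scores = Q @ K.T / np.sqrt(d)
weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
weights /= weights.sum(axis=-1, keepdims=True)
out = weights @ Vmat
print(out.shape)                        # (T*V, d): attention across time *and* variables
```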