Advances and challenges in meta-learning: A technical review
A Vettoruzzo, MR Bouguelia… - … on Pattern Analysis …, 2024 - ieeexplore.ieee.org
Meta-learning empowers learning systems with the ability to acquire knowledge from
multiple tasks, enabling faster adaptation and generalization to new tasks. This review …
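A minimal sketch of the gradient-based meta-learning idea this review covers: learn a shared initialization that adapts to a new task with one inner gradient step. This is a first-order MAML-style toy on 1-D linear-regression tasks; the task distribution, learning rates, and function names are illustrative assumptions, not the review's setup.

```python
# First-order MAML-style sketch: meta-train an initialization theta so that
# a single inner gradient step adapts it to each new task.
import numpy as np

rng = np.random.default_rng(0)

def sample_task():
    a, b = rng.uniform(-2, 2, size=2)            # task-specific slope / intercept
    x = rng.uniform(-1, 1, size=(20, 1))
    y = a * x + b
    return np.hstack([x, np.ones_like(x)]), y    # design matrix [x, 1], targets

def loss_and_grad(theta, X, y):
    err = X @ theta - y
    return float(np.mean(err ** 2)), 2 * X.T @ err / len(y)

theta = np.zeros((2, 1))                         # meta-learned initialization
inner_lr, outer_lr = 0.1, 0.01

for step in range(2000):
    X, y = sample_task()
    Xs, ys, Xq, yq = X[:10], y[:10], X[10:], y[10:]   # support / query split
    _, g_support = loss_and_grad(theta, Xs, ys)
    adapted = theta - inner_lr * g_support            # one inner adaptation step
    _, g_query = loss_and_grad(adapted, Xq, yq)
    theta -= outer_lr * g_query                       # first-order outer update

# At test time, adapt the learned initialization to a new task with one step.
X, y = sample_task()
_, g = loss_and_grad(theta, X[:10], y[:10])
adapted = theta - inner_lr * g
print("query loss after one adaptation step:", loss_and_grad(adapted, X[10:], y[10:])[0])
```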
Brain-inspired learning in artificial neural networks: a review
Artificial neural networks (ANNs) have emerged as an essential tool in machine learning,
achieving remarkable success across diverse domains, including image and speech …
Transformers learn in-context by gradient descent
J Von Oswald, E Niklasson… - International …, 2023 - proceedings.mlr.press
At present, the mechanisms of in-context learning in Transformers are not well understood
and remain mostly an intuition. In this paper, we suggest that training Transformers on auto …
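A toy reconstruction of the paper's core observation (not the authors' code): for in-context linear regression, one unnormalized linear self-attention read-out over the prompt tokens coincides with one gradient-descent step on the in-context least-squares loss started from zero weights. Dimensions and the learning rate below are assumptions.

```python
# Linear attention over in-context (x_i, y_i) pairs vs. one explicit GD step.
import numpy as np

rng = np.random.default_rng(1)
d, N, lr = 5, 32, 0.5
W_true = rng.normal(size=(1, d))
X = rng.normal(size=(N, d))            # in-context inputs x_i
Y = X @ W_true.T                       # in-context targets y_i
x_q = rng.normal(size=(d,))            # query token

# (a) one GD step on L(W) = 1/(2N) * sum_i ||W x_i - y_i||^2, starting from W = 0
grad_at_zero = -(Y.T @ X) / N
W_after_one_step = -lr * grad_at_zero
pred_gd = W_after_one_step @ x_q

# (b) linear attention over the context: query = x_q, keys = x_i, values = y_i
attn_scores = X @ x_q                  # <x_i, x_q>, no softmax
pred_attn = (lr / N) * attn_scores @ Y

print(np.allclose(pred_gd, pred_attn))   # True: the two predictions coincide
```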
Why can GPT learn in-context? Language models implicitly perform gradient descent as meta-optimizers
Large pretrained language models have shown surprising in-context learning (ICL) ability.
With a few demonstration input-label pairs, they can predict the label for an unseen input …
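A minimal sketch of the in-context learning setup this paper analyzes: demonstration input-label pairs are concatenated into the prompt and a frozen language model predicts the label for a new input, with no weight updates. The sentiment task and the small gpt2 checkpoint are illustrative stand-ins, not the paper's actual model or benchmark.

```python
# Few-shot prompting: demonstrations in the context, prediction for the query.
from transformers import pipeline

demonstrations = [
    ("the movie was wonderful", "positive"),
    ("a complete waste of time", "negative"),
    ("I loved every minute of it", "positive"),
]
query = "the plot was dull and predictable"

prompt = "".join(f"Review: {x}\nSentiment: {y}\n\n" for x, y in demonstrations)
prompt += f"Review: {query}\nSentiment:"

generator = pipeline("text-generation", model="gpt2")
completion = generator(prompt, max_new_tokens=2, do_sample=False)[0]["generated_text"]
print(completion[len(prompt):].strip())   # the in-context prediction for the query
```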
Transformers as statisticians: Provable in-context learning with in-context algorithm selection
Neural sequence models based on the transformer architecture have demonstrated
remarkable in-context learning (ICL) abilities, where they can perform new tasks …
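An explicit sketch of the statistical behavior this paper proves transformers can emulate internally (an illustrative reconstruction, not the authors' construction): run several candidate estimators on the in-context examples and select among them in context, here by held-out error on a split of the prompt. The ridge candidates and the train/validation split are assumptions.

```python
# In-context algorithm selection among ridge regressions at different lambdas.
import numpy as np

rng = np.random.default_rng(2)
d, N = 10, 40
w_true = rng.normal(size=d)
X = rng.normal(size=(N, d))
y = X @ w_true + 0.1 * rng.normal(size=N)     # in-context (input, output) pairs
x_query = rng.normal(size=d)

def ridge_fit(X, y, lam):
    return np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)

candidates = [0.0, 0.1, 1.0, 10.0]            # lam = 0 is ordinary least squares
Xtr, ytr, Xval, yval = X[:30], y[:30], X[30:], y[30:]

val_errors = []
for lam in candidates:
    w = ridge_fit(Xtr, ytr, lam)
    val_errors.append(np.mean((Xval @ w - yval) ** 2))

best_lam = candidates[int(np.argmin(val_errors))]   # in-context selection step
w_selected = ridge_fit(X, y, best_lam)              # refit on the full context
print("selected lambda:", best_lam, "query prediction:", w_selected @ x_query)
```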
Transformers as algorithms: Generalization and stability in in-context learning
In-context learning (ICL) is a type of prompting where a transformer model operates on a
sequence of (input, output) examples and performs inference on-the-fly. In this work, we …
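A sketch of the quantity this line of work studies (an illustrative stand-in, not the paper's setup): view ICL as running a learning algorithm on the (input, output) pairs in the prompt, and measure how the query-point risk shrinks as the number of in-context examples grows, averaged over tasks. Ordinary least squares plays the role of the algorithm; all settings are assumptions.

```python
# Average query error of an "in-context algorithm" vs. context length.
import numpy as np

rng = np.random.default_rng(3)
d, n_tasks = 5, 500

def avg_query_error(n_context):
    errs = []
    for _ in range(n_tasks):
        w = rng.normal(size=d)                         # a fresh task per prompt
        X = rng.normal(size=(n_context, d))
        y = X @ w + 0.1 * rng.normal(size=n_context)
        w_hat, *_ = np.linalg.lstsq(X, y, rcond=None)  # algorithm run on the prompt
        x_q = rng.normal(size=d)
        errs.append((x_q @ w_hat - x_q @ w) ** 2)
    return np.mean(errs)

for n in (8, 16, 32, 64):
    print(f"context length {n:3d}: avg query error {avg_query_error(n):.4f}")
```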
Large language models as general pattern machines
We observe that pre-trained large language models (LLMs) are capable of autoregressively
completing complex token sequences--from arbitrary ones procedurally generated by …
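A small sketch of the setup this paper describes, with illustrative assumptions throughout: procedurally generate a symbolic sequence, remap it onto arbitrary tokens, and ask a pretrained autoregressive LM to complete it in context. gpt2 is a stand-in for the larger models the paper studies.

```python
# Procedurally generated pattern, remapped to arbitrary tokens, completed by an LM.
import random
from transformers import pipeline

random.seed(0)
vocab = random.sample(["foo", "bar", "baz", "qux", "zip", "wub"], 2)

# Abstract pattern over two slots: A B A B B A B B B ... (the B run grows by one)
slots = []
for k in range(1, 5):
    slots.append(0)
    slots.extend([1] * k)
sequence = [vocab[s] for s in slots]

prompt = " ".join(sequence[:-3])          # hold out the tail as the completion target
target = " ".join(sequence[-3:])

generator = pipeline("text-generation", model="gpt2")
out = generator(prompt, max_new_tokens=10, do_sample=False)[0]["generated_text"]
print("prompt:    ", prompt)
print("target:    ", target)
print("completion:", out[len(prompt):].strip())
```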
Supervised pretraining can learn in-context reinforcement learning
Large transformer models trained on diverse datasets have shown a remarkable ability to
learn in-context, achieving high few-shot performance on tasks they were not explicitly …
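A sketch of the kind of data a supervised pretraining recipe for in-context RL might consume (an illustrative reconstruction of the idea, not the paper's pipeline): each training example is an interaction history from one task, and the supervised label is a good action for that task, so a sequence model trained on such pairs can later act by conditioning on a fresh history. The bandit environment and exploration policy are assumptions.

```python
# Building (history -> good action) pairs for supervised in-context RL pretraining.
import numpy as np

rng = np.random.default_rng(4)
n_arms, horizon, n_tasks = 5, 30, 1000

def make_pretraining_example():
    arm_means = rng.uniform(0, 1, size=n_arms)        # one bandit task
    context = []
    for _ in range(horizon):                          # history from a random policy
        a = int(rng.integers(n_arms))
        r = float(rng.normal(arm_means[a], 0.1))
        context.append((a, r))
    label = int(np.argmax(arm_means))                 # supervised target: best arm
    return context, label

dataset = [make_pretraining_example() for _ in range(n_tasks)]
context, label = dataset[0]
print("first transitions:", context[:3], "-> target action:", label)
# A transformer (or other sequence model) would be trained to map `context`
# to `label`; at test time it conditions on a new history and acts in-context.
```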
Structured state space models for in-context reinforcement learning
Structured state space sequence (S4) models have recently achieved state-of-the-art
performance on long-range sequence modeling tasks. These models also have fast …
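A minimal sketch of the linear state-space recurrence underlying S4-style models (illustrative parameters, not the paper's parameterization): after discretization the layer is x_k = A_bar x_{k-1} + B_bar u_k, y_k = C x_k, which can run step by step (fast at inference) or, equivalently, as a convolution with the kernel K_i = C A_bar^i B_bar (fast in parallel during training).

```python
# Recurrent vs. convolutional view of a discretized linear state-space layer.
import numpy as np

rng = np.random.default_rng(5)
state_dim, seq_len = 4, 16
A_bar = 0.9 * np.eye(state_dim) + 0.05 * rng.normal(size=(state_dim, state_dim))
B_bar = rng.normal(size=(state_dim, 1))
C = rng.normal(size=(1, state_dim))
u = rng.normal(size=seq_len)

# Recurrent form: one state update per step.
x = np.zeros((state_dim, 1))
y_rec = []
for k in range(seq_len):
    x = A_bar @ x + B_bar * u[k]
    y_rec.append(float(C @ x))

# Convolutional form: precompute the kernel, then a causal convolution.
kernel = np.array([float(C @ np.linalg.matrix_power(A_bar, i) @ B_bar)
                   for i in range(seq_len)])
y_conv = [float(np.dot(kernel[:k + 1][::-1], u[:k + 1])) for k in range(seq_len)]

print(np.allclose(y_rec, y_conv))   # True: both views compute the same output
```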
Long-range transformers for dynamic spatiotemporal forecasting
Multivariate time series forecasting focuses on predicting future values based on historical
context. State-of-the-art sequence-to-sequence models rely on neural attention between …
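A small sketch of the spatiotemporal tokenization idea in this line of work (an illustrative reading, with assumed shapes and embeddings): instead of one token per time step covering all variables, flatten the multivariate series into one token per (time step, variable) pair, so a single attention pass can relate any variable at any time to any other.

```python
# Flattened (time, variable) tokens followed by one self-attention pass.
import numpy as np

rng = np.random.default_rng(6)
T, V, d = 12, 3, 8                      # time steps, variables, embedding size

series = rng.normal(size=(T, V))        # historical multivariate context
time_emb = rng.normal(size=(T, d))
var_emb = rng.normal(size=(V, d))

# One token per (t, v): value projected into d dims plus time and variable embeddings.
value_proj = rng.normal(size=(1, d))
tokens = np.stack([series[t, v] * value_proj[0] + time_emb[t] + var_emb[v]
                   for t in range(T) for v in range(V)])      # (T*V, d)

# Single-head scaled dot-product self-attention over the flattened sequence.
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))
Q, K, Vmat = tokens @ Wq, tokens @ Wk, tokens @ Wv
scores = Q @ K.T / np.sqrt(d)
weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
weights /= weights.sum(axis=-1, keepdims=True)
out = weights @ Vmat
print(out.shape)                        # (T*V, d): attention across time *and* variables
```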