Designing neural networks through neuroevolution

KO Stanley, J Clune, J Lehman… - Nature Machine …, 2019 - nature.com
Much of recent machine learning has focused on deep learning, in which neural network
weights are trained through variants of stochastic gradient descent. An alternative approach …
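
As a rough illustration of that alternative, the sketch below evolves the weights of a toy linear model with a simple mutate-and-select loop rather than gradient descent; the model, fitness function, and hyperparameters are illustrative assumptions, not the algorithm of the paper.

# Minimal neuroevolution sketch (toy setup assumed for illustration):
# weights are improved by mutation and selection instead of gradient descent.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(64, 3))                      # toy inputs (assumption)
true_w = np.array([1.0, -2.0, 0.5])
y = X @ true_w                                    # toy regression targets (assumption)

def fitness(w):
    return -np.mean((X @ w - y) ** 2)             # negative mean squared error

population = [rng.normal(size=3) for _ in range(20)]
for generation in range(100):
    parents = sorted(population, key=fitness, reverse=True)[:5]   # keep the fittest
    population = [p + 0.1 * rng.normal(size=3)                    # mutate to form offspring
                  for p in parents for _ in range(4)]

best = max(population, key=fitness)
print("best fitness:", fitness(best))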

Artificial neural networks for neuroscientists: a primer

GR Yang, XJ Wang - Neuron, 2020 - cell.com
Artificial neural networks (ANNs) are essential tools in machine learning that have drawn
increasing attention in neuroscience. Besides offering powerful techniques for data analysis …

Meta-learning in neural networks: A survey

T Hospedales, A Antoniou, P Micaelli… - IEEE transactions on …, 2021 - ieeexplore.ieee.org
The field of meta-learning, or learning-to-learn, has seen a dramatic rise in interest in recent
years. Contrary to conventional approaches to AI where tasks are solved from scratch using …
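
One concrete instantiation of learning-to-learn, used here only to illustrate the idea (the survey covers many other families), is a first-order MAML-style loop in which an outer update improves an initialization so that a few inner gradient steps adapt quickly to each new task; the 1-D regression tasks and hyperparameters below are assumptions.

# First-order, MAML-style learning-to-learn sketch (illustrative assumptions:
# 1-D linear regression tasks, squared loss, single inner step).
import numpy as np

rng = np.random.default_rng(0)

def sample_task(n=20):
    slope = rng.uniform(0.5, 2.0)                        # a task = regression on y = slope * x
    x = rng.normal(size=n)
    return x, slope * x

def grad(w, x, y):
    return 2.0 * np.mean((w * x - y) * x)                # d/dw of mean squared error

w_init, inner_lr, outer_lr = 0.0, 0.05, 0.01
for step in range(2000):
    x, y = sample_task()
    xs, ys, xq, yq = x[:10], y[:10], x[10:], y[10:]      # support / query split
    w_adapted = w_init - inner_lr * grad(w_init, xs, ys) # fast adaptation to this task
    w_init -= outer_lr * grad(w_adapted, xq, yq)         # meta-update of the initialization

print("meta-learned initialization:", w_init)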

A survey of deep meta-learning

M Huisman, JN Van Rijn, A Plaat - Artificial Intelligence Review, 2021 - Springer
Deep neural networks can achieve great successes when presented with large data sets
and sufficient computational resources. However, their ability to learn new concepts quickly …

Frozen pretrained transformers as universal computation engines

K Lu, A Grover, P Abbeel, I Mordatch - Proceedings of the AAAI …, 2022 - ojs.aaai.org
We investigate the capability of a transformer pretrained on natural language to generalize
to other modalities with minimal finetuning--in particular, without finetuning of the self …
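
A sketch of the kind of selective finetuning described here, in PyTorch style: the pretrained self-attention and feedforward blocks are frozen and only the remaining parameters (for example, the input embedding and output head) stay trainable. The name-matching rules below are assumptions about how a given model's parameters are named, not the paper's code.

# Freeze a pretrained transformer's core and finetune only the I/O layers.
# Assumes a PyTorch module whose parameter names contain "attn"/"mlp" for the
# blocks to freeze; adjust the patterns to the actual model.
import torch

def freeze_core(model: torch.nn.Module):
    for name, param in model.named_parameters():
        # Freeze self-attention and feedforward weights; keep embeddings,
        # layer norms, and the output head trainable (one common variant).
        frozen = ("attn" in name) or ("mlp" in name)
        param.requires_grad = not frozen
    return [p for p in model.parameters() if p.requires_grad]

# trainable = freeze_core(pretrained_model)
# optimizer = torch.optim.Adam(trainable, lr=1e-4)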

A survey of meta-reinforcement learning

J Beck, R Vuorio, EZ Liu, Z Xiong, L Zintgraf… - arXiv preprint arXiv …, 2023 - arxiv.org
While deep reinforcement learning (RL) has fueled multiple high-profile successes in
machine learning, it is held back from more widespread adoption by its often poor data …

Random feature attention

H Peng, N Pappas, D Yogatama, R Schwartz… - arXiv preprint arXiv …, 2021 - arxiv.org
Transformers are state-of-the-art models for a variety of sequence modeling tasks. At their
core is an attention function which models pairwise interactions between the inputs at every …
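
The random-feature idea can be sketched in a few lines: replace the exponential in softmax attention with a feature map whose inner products approximate it in expectation, so the attention cost becomes linear in sequence length. The positive feature map used below is a standard choice but differs from the paper's trigonometric features; the dimensions are illustrative.

# Random-feature approximation of softmax attention (sketch; positive random
# features are assumed here instead of the paper's trigonometric features).
import numpy as np

rng = np.random.default_rng(0)
n, d, m = 6, 8, 256            # sequence length, head dim, number of random features (assumed)
Q, K, V = (rng.normal(size=(n, d)) / d**0.25 for _ in range(3))
W = rng.normal(size=(d, m))    # random projection shared by queries and keys

def phi(X):
    # feature map with E[phi(q) @ phi(k)] = exp(q . k)
    return np.exp(X @ W - 0.5 * np.sum(X**2, axis=-1, keepdims=True)) / np.sqrt(m)

num = phi(Q) @ (phi(K).T @ V)              # (n, d): linear in sequence length
den = phi(Q) @ phi(K).sum(axis=0)          # replaces the softmax denominator
approx = num / den[:, None]

exact = np.exp(Q @ K.T)
exact = (exact / exact.sum(axis=1, keepdims=True)) @ V
print("max abs deviation from exact attention:", np.abs(approx - exact).max())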

Linear transformers are secretly fast weight programmers

I Schlag, K Irie, J Schmidhuber - International Conference on …, 2021 - proceedings.mlr.press
We show the formal equivalence of linearised self-attention mechanisms and fast weight
controllers from the early '90s, where a slow neural net learns by gradient descent to …
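
The equivalence the abstract points to can be made concrete: causal linearised attention maintains a "fast weight" matrix that is written to with a key-value outer product at every step and read out by the current query. The sketch below assumes an elu(x)+1 feature map and a plain additive update, leaving out the normalisation and delta-rule variants discussed in the paper.

# Causal linear attention as a fast weight program: the "slow" projections
# produce keys/values whose outer products update a fast weight matrix,
# which each query then reads out.
import numpy as np

def feature_map(x):
    return np.where(x > 0, x + 1.0, np.exp(x))      # elu(x) + 1, keeps features positive

def fast_weight_attention(Q, K, V):
    d_k, d_v = Q.shape[1], V.shape[1]
    W_fast = np.zeros((d_k, d_v))                   # fast weights, start empty
    z = np.zeros(d_k)                               # running normalizer
    out = np.zeros_like(V)
    for t in range(Q.shape[0]):
        k, q = feature_map(K[t]), feature_map(Q[t])
        W_fast += np.outer(k, V[t])                 # "write": rank-1 fast weight update
        z += k
        out[t] = (q @ W_fast) / (q @ z + 1e-9)      # "read": query the fast weights
    return out

rng = np.random.default_rng(0)
Q, K, V = (rng.normal(size=(5, 4)) for _ in range(3))
print(fast_weight_attention(Q, K, V))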

Stabilizing transformers for reinforcement learning

E Parisotto, F Song, J Rae, R Pascanu… - International …, 2020 - proceedings.mlr.press
Owing to their ability to both effectively integrate information over long time horizons and
scale to massive amounts of data, self-attention architectures have recently shown …

Meta-learning with warped gradient descent

S Flennerhag, AA Rusu, R Pascanu, F Visin… - arXiv preprint arXiv …, 2019 - arxiv.org
Learning an efficient update rule from data that promotes rapid learning of new tasks from
the same distribution remains an open problem in meta-learning. Typically, previous works …
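
As a loose sketch of what a learned update rule can look like, the gradient below is passed through a preconditioning "warp" before the step is applied; the fixed warp matrix P and the toy quadratic loss are stand-ins (assumptions), not the paper's meta-learned warp layers.

# Warped gradient step sketch: instead of w -= lr * grad, the gradient is first
# transformed by a warp. Here P is a fixed matrix standing in for learned warp
# layers; the ill-conditioned quadratic loss is an assumption.
import numpy as np

rng = np.random.default_rng(0)
A = np.diag([10.0, 1.0])                 # quadratic loss 0.5 * w^T A w
P = np.diag([0.1, 1.0])                  # stand-in for a meta-learned warp/preconditioner
w = rng.normal(size=2)

for step in range(50):
    grad = A @ w                         # task-loss gradient
    w -= 0.1 * (P @ grad)                # warped update reshapes the descent geometry

print("final loss:", 0.5 * w @ A @ w)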