Supervised pretraining can learn in-context reinforcement learning

J Lee, A Xie, A Pacchiano, Y Chandak… - Advances in …, 2024 - proceedings.neurips.cc
Large transformer models trained on diverse datasets have shown a remarkable ability to
learn in-context, achieving high few-shot performance on tasks they were not explicitly …

Bigger, better, faster: Human-level Atari with human-level efficiency

M Schwarzer, JSO Ceron, A Courville… - International …, 2023 - proceedings.mlr.press
We introduce a value-based RL agent, which we call BBF, that achieves super-human
performance in the Atari 100K benchmark. BBF relies on scaling the neural networks used …

Harms from increasingly agentic algorithmic systems

A Chan, R Salganik, A Markelius, C Pang… - Proceedings of the …, 2023 - dl.acm.org
Research in Fairness, Accountability, Transparency, and Ethics (FATE) has established
many sources and forms of algorithmic harm, in domains as diverse as health care, finance …

Foundational challenges in assuring alignment and safety of large language models

U Anwar, A Saparov, J Rando, D Paleka… - arXiv preprint arXiv …, 2024 - arxiv.org
This work identifies 18 foundational challenges in assuring the alignment and safety of large
language models (LLMs). These challenges are organized into three different categories …

Foundation models in robotics: Applications, challenges, and the future

R Firoozi, J Tucker, S Tian, A Majumdar, J Sun… - arXiv preprint arXiv …, 2023 - arxiv.org
We survey applications of pretrained foundation models in robotics. Traditional deep
learning models in robotics are trained on small datasets tailored for specific tasks, which …

Deep reinforcement learning with plasticity injection

E Nikishin, J Oh, G Ostrovski, C Lyle… - Advances in …, 2024 - proceedings.neurips.cc
A growing body of evidence suggests that neural networks employed in deep reinforcement
learning (RL) gradually lose their plasticity, the ability to learn from new data; however, the …

A survey on transformers in reinforcement learning

W Li, H Luo, Z Lin, C Zhang, Z Lu, D Ye - arXiv preprint arXiv:2301.03044, 2023 - arxiv.org
Transformers have been considered the dominant neural architecture in NLP and CV, mostly
under supervised settings. Recently, a similar surge in the use of Transformers has appeared in …

Emergent agentic transformer from chain of hindsight experience

H Liu, P Abbeel - International Conference on Machine …, 2023 - proceedings.mlr.press
Large transformer models powered by diverse data and model scale have dominated
natural language modeling and computer vision and pushed the frontier of multiple AI areas …

The mechanistic basis of data dependence and abrupt learning in an in-context classification task

G Reddy - The Twelfth International Conference on Learning …, 2023 - openreview.net
Transformer models exhibit in-context learning: the ability to accurately predict the response
to a novel query based on illustrative examples in the input sequence, which contrasts with …

A survey of progress on cooperative multi-agent reinforcement learning in open environment

L Yuan, Z Zhang, L Li, C Guan, Y Yu - arXiv preprint arXiv:2312.01058, 2023 - arxiv.org
Multi-agent Reinforcement Learning (MARL) has gained wide attention in recent years and
has made progress in various fields. Specifically, cooperative MARL focuses on training a …