Supervised pretraining can learn in-context reinforcement learning
Large transformer models trained on diverse datasets have shown a remarkable ability to
learn in-context, achieving high few-shot performance on tasks they were not explicitly …
Epistemic neural networks
Intelligence relies on an agent's knowledge of what it does not know. This capability can be
assessed based on the quality of joint predictions of labels across multiple inputs. In …
Self-exploring language models: Active preference elicitation for online alignment
Preference optimization, particularly through Reinforcement Learning from Human
Feedback (RLHF), has achieved significant success in aligning Large Language Models …
Making RL with preference-based feedback efficient via randomization
Reinforcement Learning algorithms that learn from human feedback (RLHF) need to be
efficient in terms of statistical complexity, computational complexity, and query complexity. In …
Efficient exploration for LLMs
We present evidence of substantial benefit from efficient exploration in gathering human
feedback to improve large language models. In our experiments, an agent sequentially …
Position paper: Bayesian deep learning in the age of large-scale AI
In the current landscape of deep learning research, there is a predominant emphasis on
achieving high predictive accuracy in supervised tasks involving large image and language …
Position: Bayesian Deep Learning is Needed in the Age of Large-Scale AI
In the current landscape of deep learning research, there is a predominant emphasis on
achieving high predictive accuracy in supervised tasks involving large image and language …
Reinforcement Learning: An Overview
K Murphy - arXiv preprint arXiv:2412.05265, 2024 - arxiv.org
This manuscript gives a big-picture, up-to-date overview of the field of (deep) reinforcement
learning and sequential decision making, covering value-based RL, policy-gradient …
Pearl: A Production-ready Reinforcement Learning Agent
Reinforcement learning (RL) is a versatile framework for optimizing long-term goals.
Although many real-world problems can be formalized with RL, learning and deploying a …
Satisficing exploration for deep reinforcement learning
A default assumption in the design of reinforcement-learning algorithms is that a decision-
making agent always explores to learn optimal behavior. In sufficiently complex …