Epistemic neural networks
Intelligence relies on an agent's knowledge of what it does not know. This capability can be
assessed based on the quality of joint predictions of labels across multiple inputs. In …
Reinforcement learning, bit by bit
Reinforcement learning agents have demonstrated remarkable achievements in simulated
environments. Data efficiency poses an impediment to carrying this success over to real …
Posterior meta-replay for continual learning
C Henning, M Cervera, F D'Angelo… - Advances in neural …, 2021 - proceedings.neurips.cc
Learning a sequence of tasks without access to iid observations is a widely studied form of
continual learning (CL) that remains challenging. In principle, Bayesian learning directly …
Scalable neural contextual bandit for recommender systems
High-quality recommender systems ought to deliver both innovative and relevant content
through effective and exploratory interactions with users. Yet, supervised learning-based …
Controllable Pareto multi-task learning
A multi-task learning (MTL) system aims at solving multiple related tasks at the same time.
With a fixed model capacity, the tasks would be conflicted with each other, and the system …
Approximate Thompson sampling via epistemic neural networks
Thompson sampling (TS) is a popular heuristic for action selection, but it requires sampling
from a posterior distribution. Unfortunately, this can become computationally intractable in …
Efficient exploration for LLMs
We present evidence of substantial benefit from efficient exploration in gathering human
feedback to improve large language models. In our experiments, an agent sequentially …
Visual affordance prediction for guiding robot exploration
H Bharadhwaj, A Gupta… - 2023 IEEE International …, 2023 - ieeexplore.ieee.org
Motivated by the intuitive understanding humans have about the space of possible
interactions, and the ease with which they can generalize this understanding to previously …
Uncertainty estimation for language reward models
Language models can learn a range of capabilities from unsupervised training on text
corpora. However, to solve a particular problem (such as text summarization) it is typically …
Meta-learning via hypernetworks
D Zhao, S Kobayashi… - 4th Workshop on …, 2020 - research-collection.ethz.ch
Recent developments in few-shot learning have shown that during fast adaption, gradient-
based meta-learners mostly rely on embedding features of powerful pretrained networks …