Universal off-policy evaluation

Y Chandak, S Niekum, B da Silva… - Advances in …, 2021 - proceedings.neurips.cc
When faced with sequential decision-making problems, it is often useful to be able to predict
what would happen if decisions were made using a new policy. Those predictions must …
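
A recurring construction in this line of work: importance-weight whole trajectories so that any functional of the return distribution (mean, variance, quantiles) can be read off from behavior data. Below is a minimal sketch of an importance-weighted pseudo-CDF of returns, assuming per-trajectory ratios are precomputed; the names and the self-normalized construction are illustrative, not the paper's exact estimator.

```python
import numpy as np

def weighted_return_cdf(returns, weights):
    """Importance-weighted empirical CDF of trajectory returns.

    returns: per-trajectory returns G_i logged under the behavior policy.
    weights: per-trajectory ratios prod_t pi_e(a_t|s_t) / pi_b(a_t|s_t).
    """
    order = np.argsort(returns)
    g = np.asarray(returns)[order]
    w = np.asarray(weights)[order]
    cdf = np.cumsum(w) / np.sum(w)   # self-normalized for stability
    return g, cdf

def quantile_from_cdf(g, cdf, q):
    """Read off the q-quantile of the estimated return distribution."""
    return g[np.searchsorted(cdf, q)]

# Toy usage with synthetic data
rng = np.random.default_rng(0)
G = rng.normal(1.0, 0.5, size=1000)      # returns logged under pi_b
W = rng.lognormal(0.0, 0.3, size=1000)   # stand-in importance ratios
g, cdf = weighted_return_cdf(G, W)
print("median under pi_e:", quantile_from_cdf(g, cdf, 0.5))
```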

Subgaussian and differentiable importance sampling for off-policy evaluation and learning

AM Metelli, A Russo, M Restelli - Advances in neural …, 2021 - proceedings.neurips.cc
Importance Sampling (IS) is a widely used building block for a large variety of off-policy
estimation and learning algorithms. However, empirical and theoretical studies have …
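
For context, here is the plain IS estimator next to the classic hard-truncation fix whose bias-variance trade-off this paper improves on; the clip below is the standard non-differentiable baseline, not the paper's smooth correction.

```python
import numpy as np

def is_estimate(rewards, weights):
    """Plain importance sampling: unbiased but heavy-tailed."""
    return np.mean(weights * rewards)

def truncated_is_estimate(rewards, weights, c):
    """Hard weight truncation: biased, but with controlled tails.
    The paper replaces this non-differentiable clip with a smooth,
    differentiable weight correction."""
    return np.mean(np.minimum(weights, c) * rewards)

rng = np.random.default_rng(1)
w = rng.lognormal(0.0, 2.0, size=10_000)   # heavy-tailed ratios pi_e/pi_b
r = rng.normal(1.0, 0.1, size=10_000)
print(is_estimate(r, w), truncated_is_estimate(r, w, c=10.0))
```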

Offline reinforcement learning with closed-form policy improvement operators

J Li, E Zhang, M Yin, Q Bai, YX Wang… - … on Machine Learning, 2023 - proceedings.mlr.press
Behavior-constrained policy optimization has been demonstrated to be a successful
paradigm for tackling Offline Reinforcement Learning. By exploiting historical transitions, a …
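
One widely used closed-form improvement operator in the behavior-constrained setting solves a KL-penalized objective exactly: the improved policy is the behavior policy reweighted by exponentiated values. The sketch shows that generic operator, not necessarily the operators proposed in this paper.

```python
import numpy as np

def kl_constrained_improvement(behavior_probs, q_values, temperature):
    """Closed-form solution of max_pi E_pi[Q] - temperature * KL(pi || pi_b):
    pi*(a|s) is proportional to pi_b(a|s) * exp(Q(s,a) / temperature)."""
    logits = np.log(behavior_probs) + q_values / temperature
    logits -= logits.max()          # numerical stability
    p = np.exp(logits)
    return p / p.sum()

pi_b = np.array([0.25, 0.25, 0.25, 0.25])   # behavior policy at one state
q = np.array([1.0, 2.0, 0.5, 1.5])
print(kl_constrained_improvement(pi_b, q, temperature=1.0))
```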

Off-policy evaluation with deficient support using side information

N Felicioni, M Ferrari Dacrema… - Advances in …, 2022 - proceedings.neurips.cc
The Off-Policy Evaluation (OPE) problem consists in evaluating the performance of
new policies from the data collected by another one. OPE is crucial when evaluating a new …
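
Deficient support means the ratio pi_e/pi_b is undefined wherever the logging policy never acts, so plain IS is biased. A quick diagnostic is to measure how much target-policy mass falls on uncovered actions; this minimal check does not attempt the paper's contribution, which is using side information to handle that mass.

```python
import numpy as np

def support_deficiency(pi_e, pi_b, eps=0.0):
    """Fraction of target-policy mass on actions the behavior policy
    never plays (pi_b <= eps). Nonzero means plain IS is biased."""
    uncovered = pi_b <= eps
    return float(pi_e[uncovered].sum())

pi_b = np.array([0.5, 0.5, 0.0, 0.0])   # logger never plays actions 2, 3
pi_e = np.array([0.2, 0.2, 0.3, 0.3])   # target puts 60% of its mass there
print(support_deficiency(pi_e, pi_b))    # 0.6: most of pi_e is unobservable
```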

Inferring smooth control: Monte Carlo posterior policy iteration with Gaussian processes

J Watson, J Peters - Conference on Robot Learning, 2023 - proceedings.mlr.press
Monte Carlo methods have become increasingly relevant for control of non-differentiable
systems, approximate dynamics models, and learning from data. These methods scale to …
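
A rough reading of the approach: draw whole control sequences from a smooth Gaussian-process prior over time, score them on the black-box system, and average them with exponentiated-return weights. The sketch below is an MPPI-style toy version of that idea; the kernel, temperature, and cost are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
T, n_samples = 50, 256
t = np.linspace(0.0, 1.0, T)

# RBF kernel over time: prior samples are smooth action sequences
K = np.exp(-0.5 * ((t[:, None] - t[None, :]) / 0.1) ** 2)
L = np.linalg.cholesky(K + 1e-6 * np.eye(T))
actions = (L @ rng.standard_normal((T, n_samples))).T   # (n_samples, T)

target = np.sin(2 * np.pi * t)                  # toy reference trajectory
returns = -np.sum((actions - target) ** 2, axis=1)

# Exponentiated-return (softmax) weights -> posterior mean control
beta = 0.05
w = np.exp(beta * (returns - returns.max()))
w /= w.sum()
posterior_mean = w @ actions                    # smooth control estimate
print(posterior_mean[:5])
```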

On the relation between policy improvement and off-policy minimum-variance policy evaluation

AM Metelli, S Meta, M Restelli - Uncertainty in Artificial …, 2023 - proceedings.mlr.press
Off-policy methods are the basis of a large number of effective Policy Optimization (PO)
algorithms. In this setting, Importance Sampling (IS) is typically employed for off-policy …
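
The classical fact behind this connection: for estimating E_p[f], the minimum-variance proposal is q*(x) proportional to p(x)|f(x)|, so the variance-optimal behavior distribution already concentrates where returns are large. A 1-D numerical check in a case where q* happens to have closed form (p = N(0,1), f(x) = exp(x) gives q* = N(1,1), and the weighted integrand becomes constant):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 100_000

# Goal: E_p[f] with p = N(0,1), f(x) = exp(x); true value is exp(0.5).
x_p = rng.normal(0.0, 1.0, n)
naive = np.exp(x_p)                     # crude Monte Carlo from p

# Minimum-variance proposal q*(x) ∝ p(x)|f(x)|, here exactly N(1, 1).
x_q = rng.normal(1.0, 1.0, n)
# p(x)/q*(x): normalizing constants cancel since both are unit-variance
w = np.exp(-0.5 * x_q**2) / np.exp(-0.5 * (x_q - 1.0)**2)
optimal = w * np.exp(x_q)               # identically exp(0.5)

print(naive.mean(), naive.std())        # right mean, large spread
print(optimal.mean(), optimal.std())    # same mean, (near-)zero variance
```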

Identification of efficient sampling techniques for probabilistic voltage stability analysis of renewable-rich power systems

M Alzubaidi, KN Hasan, L Meegahapola, MT Rahman - Energies, 2021 - mdpi.com
This paper presents a comparative analysis of six sampling techniques to identify an efficient
and accurate sampling technique to be applied to probabilistic voltage stability assessment …
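
Latin hypercube sampling is the usual first alternative such comparisons consider: stratify each input into n equal-probability bins and draw one sample per bin, which typically tightens the estimate at the same budget. A self-contained sketch with a toy response function standing in for the power-flow computation:

```python
import numpy as np

def latin_hypercube(n, d, rng):
    """One sample per equal-probability stratum in each of d dimensions."""
    strata = rng.permuted(np.tile(np.arange(n), (d, 1)), axis=1).T
    return (strata + rng.random((n, d))) / n

rng = np.random.default_rng(4)
n, d = 200, 3
f = lambda u: np.sin(u).sum(axis=1)   # stand-in for a stability index

# Repeat both estimators to compare their spread at equal sample budget
mc_means = [f(rng.random((n, d))).mean() for _ in range(200)]
lhs_means = [f(latin_hypercube(n, d, rng)).mean() for _ in range(200)]
print(np.std(mc_means), np.std(lhs_means))   # LHS is typically tighter
```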

Training Recommenders Over Large Item Corpus With Importance Sampling

D Lian, Z Gao, X Song, Y Li, Q Liu… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org
By predicting a personalized ranking on a set of items, item recommendation helps users
determine the information they need. While optimizing a ranking-focused loss is more in line …
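
Training over a large corpus typically replaces the full softmax with a sampled one, and the sampled logits must be corrected by the log-probability of the sampling distribution, or the loss is biased toward frequently sampled items. A minimal sketch of that standard logQ correction; the shapes and the popularity proposal are illustrative.

```python
import numpy as np

def sampled_softmax_loss(user_emb, item_embs, pos_idx, neg_idx, q):
    """Sampled softmax with importance correction: subtracting log q(item)
    from each logit makes the sampled loss estimate the full-softmax loss.

    q: proposal probability of each item id (e.g., popularity-based).
    """
    idx = np.concatenate(([pos_idx], neg_idx))
    logits = item_embs[idx] @ user_emb - np.log(q[idx])   # logQ correction
    logits -= logits.max()
    log_probs = logits - np.log(np.exp(logits).sum())
    return -log_probs[0]                                  # positive in slot 0

rng = np.random.default_rng(5)
n_items, dim = 10_000, 32
items = rng.normal(size=(n_items, dim))
user = rng.normal(size=dim)
pop = rng.random(n_items)
q = pop / pop.sum()                                  # popularity proposal
negs = rng.choice(n_items, size=64, p=q, replace=False)
print(sampled_softmax_loss(user, items, pos_idx=3, neg_idx=negs, q=q))
```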

Lifelong hyper-policy optimization with multiple importance sampling regularization

P Liotet, F Vidaich, AM Metelli, M Restelli - Proceedings of the AAAI …, 2022 - ojs.aaai.org
Learning in a lifelong setting, where the dynamics continually evolve, is a hard challenge for
current reinforcement learning algorithms. Yet this would be a much-needed feature for …
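
Pooling data from several past policies calls for multiple importance sampling; the balance heuristic weights each sample against the mixture of all behavior distributions, which keeps the weights bounded even when a single behavior policy matches the target poorly. A generic sketch (the paper's hyper-policy estimator and its regularization are more specific):

```python
import numpy as np

def balance_heuristic_estimate(x, f, target_pdf, behavior_pdfs, counts):
    """MIS with the balance heuristic:
    w(x) = target(x) / sum_k alpha_k * behavior_k(x), alpha_k = n_k / n."""
    n = counts.sum()
    mixture = sum((nk / n) * pdf(x) for nk, pdf in zip(counts, behavior_pdfs))
    w = target_pdf(x) / mixture
    return np.mean(w * f(x))

gauss = lambda m, s: (lambda x: np.exp(-0.5 * ((x - m) / s) ** 2)
                      / (s * np.sqrt(2 * np.pi)))

rng = np.random.default_rng(6)
# Data pooled from two past behavior distributions
xs = np.concatenate([rng.normal(0, 1, 500), rng.normal(2, 1, 500)])
est = balance_heuristic_estimate(
    xs, f=lambda x: x**2,
    target_pdf=gauss(1.0, 1.0),
    behavior_pdfs=[gauss(0.0, 1.0), gauss(2.0, 1.0)],
    counts=np.array([500, 500]))
print(est)   # estimates E[x^2] under N(1,1); true value is 2.0
```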

IWDA: Importance weighting for drift adaptation in streaming supervised learning problems

F Fedeli, AM Metelli, F Trovò… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org
Distribution drift is an important issue for practical applications of machine learning (ML). In
particular, in streaming ML, the data distribution may change over time, yielding the problem …
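
The standard importance-weighting recipe for drift: reweight old samples by the density ratio p_new(x)/p_old(x), which can be estimated without explicit density models by training a probabilistic classifier to distinguish the two windows and applying Bayes' rule. A minimal sketch using scikit-learn's LogisticRegression; the estimator choice is an assumption, not the paper's IWDA procedure.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(7)
old = rng.normal(0.0, 1.0, size=(1000, 2))   # pre-drift window
new = rng.normal(0.7, 1.0, size=(1000, 2))   # post-drift window

# Discriminate old (y=0) vs new (y=1); by Bayes' rule with equal priors,
# p_new(x) / p_old(x) = P(y=1|x) / P(y=0|x).
X = np.vstack([old, new])
y = np.repeat([0, 1], [len(old), len(new)])
clf = LogisticRegression().fit(X, y)

p = clf.predict_proba(old)[:, 1]
weights = p / (1.0 - p)                    # importance weights for old data
weights *= len(weights) / weights.sum()    # normalize to mean 1
print(weights.min(), weights.mean(), weights.max())
```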