Approximate Bayesian inference with the weighted likelihood bootstrap

MA Newton, AE Raftery - Journal of the Royal Statistical Society …, 1994 - academic.oup.com
We introduce the weighted likelihood bootstrap (WLB) as a way to simulate approximately
from a posterior distribution. This method is often easy to implement, requiring only an …

Empowering the 6G cellular architecture with Open RAN

M Polese, M Dohler, F Dressler… - IEEE Journal on …, 2023 - ieeexplore.ieee.org
Innovation and standardization in 5G have brought advancements to every facet of the
cellular architecture. This ranges from the introduction of new frequency bands and …

Better algorithms for stochastic bandits with adversarial corruptions

A Gupta, T Koren, K Talwar - Conference on Learning …, 2019 - proceedings.mlr.press
We study the stochastic multi-armed bandits problem in the presence of adversarial
corruption. We present a new algorithm for this problem whose regret is nearly optimal …

Adaptive reward-poisoning attacks against reinforcement learning

X Zhang, Y Ma, A Singla, X Zhu - … Conference on Machine …, 2020 - proceedings.mlr.press
In reward-poisoning attacks against reinforcement learning (RL), an attacker can perturb the
environment reward $r_t$ into $r_t + \delta_t$ at each step, with the goal of forcing the RL …

Universal off-policy evaluation

Y Chandak, S Niekum, B da Silva… - Advances in …, 2021 - proceedings.neurips.cc
When faced with sequential decision-making problems, it is often useful to be able to predict
what would happen if decisions were made using a new policy. Those predictions must …

Reward poisoning in reinforcement learning: Attacks against unknown learners in unknown environments

A Rakhsha, X Zhang, X Zhu, A Singla - arXiv preprint arXiv:2102.08492, 2021 - arxiv.org
We study black-box reward poisoning attacks against reinforcement learning (RL), in which
an adversary aims to manipulate the rewards to mislead a sequence of RL agents with …

One more step towards reality: Cooperative bandits with imperfect communication

U Madhushani, A Dubey, N Leonard… - Advances in Neural …, 2021 - proceedings.neurips.cc
The cooperative bandit problem is increasingly becoming relevant due to its applications in
large-scale decision-making. However, most research for this problem focuses exclusively …

Conformal off-policy prediction in contextual bandits

MF Taufiq, JF Ton, R Cornish… - Advances in Neural …, 2022 - proceedings.neurips.cc
Most off-policy evaluation methods for contextual bandits have focused on the expected
outcome of a policy, which is estimated via methods that at best provide only asymptotic …

Online and distribution-free robustness: Regression and contextual bandits with Huber contamination

S Chen, F Koehler, A Moitra… - 2021 IEEE 62nd Annual …, 2022 - ieeexplore.ieee.org
In this work we revisit two classic high-dimensional online learning problems, namely linear
regression and contextual bandits, from the perspective of adversarial robustness. Existing …

Best of both worlds: Stochastic & adversarial best-arm identification

Y Abbasi-Yadkori, P Bartlett… - … on learning theory, 2018 - proceedings.mlr.press
We study bandit best-arm identification with arbitrary and potentially adversarial rewards. A
simple random uniform learner obtains the optimal rate of error in the adversarial scenario …