Improved regret for zeroth-order stochastic convex bandits
T Lattimore, A Gyorgy - Conference on Learning Theory, 2021 - proceedings.mlr.press
Improved Regret for Zeroth-Order Stochastic Convex Bandits Page 1 Proceedings of Machine
Learning Research vol 134:1–27, 2021 34th Annual Conference on Learning Theory Improved …
Bandit convex optimisation
T Lattimore - arXiv preprint arXiv:2402.06535, 2024 - arxiv.org
Bandit convex optimisation is a fundamental framework for studying zeroth-order convex
optimisation. These notes cover the many tools used for this problem, including cutting plane …
Efficient bandit convex optimization: Beyond linear losses
AS Suggala, P Ravikumar… - Conference on Learning …, 2021 - proceedings.mlr.press
We study the problem of online learning with bandit feedback, where a learner aims to
minimize a sequence of adversarially generated loss functions, while only observing the …
Improved Regret for Bandit Convex Optimization with Delayed Feedback
We investigate bandit convex optimization (BCO) with delayed feedback, where only the
loss value of the action is revealed under an arbitrary delay. Previous studies have …
Adaptive bandit convex optimization with heterogeneous curvature
We consider the problem of adversarial bandit convex optimization, that is, online learning
over a sequence of arbitrary convex loss functions with only one function evaluation for each …
Provably correct SGD-based exploration for generalized stochastic bandit problem
Bandit problems have been widely used in wireless communication systems which involve
generalized reward models and may suffer high computational complexity. Despite the …
Contextual Continuum Bandits: Static Versus Dynamic Regret
We study the contextual continuum bandits problem, where the learner sequentially receives
a side information vector and has to choose an action in a convex set, minimizing a function …
CONGO: Compressive Online Gradient Optimization
J Carleton, P Vijaykumar, D Saxena… - arXiv preprint arXiv …, 2024 - arxiv.org
We address the challenge of zeroth-order online convex optimization where the objective
function's gradient exhibits sparsity, indicating that only a small number of dimensions …
Projection-Free Bandit Convex Optimization over Strongly Convex Sets
Projection-free algorithms for bandit convex optimization have received increasing attention,
due to the ability to deal with the bandit feedback and complicated constraints …
Learning Time-Varying Convexifications of Multiple Fairness Measures
Q Zhou, J Marecek, RN Shorten - openreview.net
There is an increasing appreciation that one may need to consider multiple measures of
fairness, e.g., considering multiple group and individual fairness notions. The relative weights …