Online learning: A comprehensive survey

SCH Hoi, D Sahoo, J Lu, P Zhao - Neurocomputing, 2021 - Elsevier
Online learning represents a family of machine learning methods, where a learner attempts
to tackle some predictive (or any type of decision-making) task by learning from a sequence …

[Book] Bandit algorithms

T Lattimore, C Szepesvári - 2020 - books.google.com
Decision-making in the face of uncertainty is a significant challenge in machine learning,
and the multi-armed bandit model is a commonly used framework to address it. This …

A modern introduction to online learning

F Orabona - arXiv preprint arXiv:1912.13213, 2019 - arxiv.org
In this monograph, I introduce the basic concepts of Online Learning through a modern view
of Online Convex Optimization. Here, online learning refers to the framework of regret …

Introduction to online convex optimization

E Hazan - Foundations and Trends® in Optimization, 2016 - nowpublishers.com
This monograph portrays optimization as a process. In many practical applications the
environment is so complex that it is infeasible to lay out a comprehensive theoretical model …

Regret analysis of stochastic and nonstochastic multi-armed bandit problems

S Bubeck, N Cesa-Bianchi - Foundations and Trends® in …, 2012 - nowpublishers.com
Multi-armed bandit problems are the most basic examples of sequential decision problems
with an exploration-exploitation trade-off. This is the balance between staying with the option …

Online learning and online convex optimization

S Shalev-Shwartz - Foundations and Trends® in Machine …, 2012 - nowpublishers.com
Online learning is a well established learning paradigm which has both theoretical and
practical appeals. The goal of online learning is to make a sequence of accurate predictions …

Online learning algorithms

N Cesa-Bianchi, F Orabona - Annual review of statistics and its …, 2021 - annualreviews.org
Online learning is a framework for the design and analysis of algorithms that build predictive
models by processing data one at the time. Besides being computationally efficient, online …

[PDF] Adaptive subgradient methods for online learning and stochastic optimization

J Duchi, E Hazan, Y Singer - Journal of machine learning research, 2011 - jmlr.org
We present a new family of subgradient methods that dynamically incorporate knowledge of
the geometry of the data observed in earlier iterations to perform more informative gradient …

[Book] Optimization for machine learning

S Sra, S Nowozin, SJ Wright - 2011 - books.google.com
An up-to-date account of the interplay between optimization and machine learning,
accessible to students and researchers in both communities. The interplay between …

Learning adversarial Markov decision processes with bandit feedback and unknown transition

C Jin, T Jin, H Luo, S Sra, T Yu - International Conference on …, 2020 - proceedings.mlr.press
We consider the task of learning in episodic finite-horizon Markov decision processes with
an unknown transition function, bandit feedback, and adversarial losses. We propose an …