Preference-based online learning with dueling bandits: A survey
In machine learning, the notion of multi-armed bandits refers to a class of online learning
problems, in which an agent is supposed to simultaneously explore and exploit a given set …
Optimal Thompson sampling strategies for support-aware CVaR bandits
In this paper we study a multi-arm bandit problem in which the quality of each arm is
measured by the Conditional Value at Risk (CVaR) at some level alpha of the reward …
Sequential estimation of quantiles with applications to A/B testing and best-arm identification
Page 1 Bernoulli 28(3), 2022, 1704–1728 https://doi.org/10.3150/21-BEJ1388 Sequential …
Risk verification of stochastic systems with neural network controllers
Motivated by the fragility of neural network (NN) controllers in safety-critical applications, we
present a data-driven framework for verifying the risk of stochastic dynamical systems with …
STL robustness risk over discrete-time stochastic processes
L Lindemann, N Matni… - 2021 60th IEEE …, 2021 - ieeexplore.ieee.org
We present a framework to interpret signal temporal logic (STL) formulas over discrete-time
stochastic processes in terms of the induced risk. Each realization of a stochastic process …
Quantile context-aware social IoT service big data recommendation with D2D communication
With the rapid development of the Internet-of-Things (IoT) networks, millions of IoT services
provided through wireless networks are waiting for people's exploration. Such a large …
Distribution-free model-agnostic regression calibration via nonparametric methods
In this paper, we consider the uncertainty quantification problem for regression models.
Specifically, we consider an individual calibration objective for characterizing the quantiles …
Risk of stochastic systems for temporal logic specifications
The wide availability of data coupled with the computational advances in artificial
intelligence and machine learning promise to enable many future technologies such as …
Rapid regression detection in software deployments through sequential testing
The practice of continuous deployment has enabled companies to reduce time-to-market by
increasing the rate at which software can be deployed. However, deploying more frequently …
Quantile bandits for best arms identification
We consider a variant of the best arm identification task in stochastic multi-armed bandits.
Motivated by risk-averse decision-making problems, our goal is to identify a set of $ m …