A unified framework for stochastic optimization
WB Powell - European Journal of Operational Research, 2019 - Elsevier
Stochastic optimization is an umbrella term that includes over a dozen fragmented
communities, using a patchwork of sometimes overlapping notational systems with …
Google vizier: A service for black-box optimization
Any sufficiently complex system acts as a black box when it becomes easier to experiment
with than to understand. Hence, black-box optimization has become increasingly important …
Linearly parameterized bandits
P Rusmevichientong… - Mathematics of Operations …, 2010 - pubsonline.informs.org
We consider bandit problems involving a large (possibly infinite) collection of arms, in which
the expected reward of each arm is a linear function of an r-dimensional random vector Z∈ …
Dynamic assortment with demand learning for seasonal consumer goods
Companies such as Zara and World Co. have recently implemented novel product
development processes and supply chain architectures enabling them to make more …
The knowledge gradient algorithm for a general class of online learning problems
We derive a one-period look-ahead policy for finite- and infinite-horizon online optimal
learning problems with Gaussian rewards. Our approach is able to handle the case where …
A linear response bandit problem
A Goldenshluger, A Zeevi - Stochastic Systems, 2013 - pubsonline.informs.org
We consider a two-armed bandit problem which involves sequential sampling from two
non-homogeneous populations. The response in each is determined by a random covariate …
Bayesian policy reuse
A long-lived autonomous agent should be able to respond online to novel instances of tasks
from a familiar domain. Acting online requires 'fast' responses, in terms of rapid convergence …
A structured multiarmed bandit problem and the greedy policy
AJ Mersereau, P Rusmevichientong… - IEEE Transactions on …, 2009 - ieeexplore.ieee.org
We consider a multiarmed bandit problem where the expected reward of each arm is a
linear function of an unknown scalar with a prior distribution. The objective is to choose a …
Apollo: Transferable architecture exploration
The looming end of Moore's Law and ascending use of deep learning drives the design of
custom accelerators that are optimized for specific neural architectures. Architecture …
A unified framework for optimization under uncertainty
WB Powell - … challenges in complex, networked and risky …, 2016 - pubsonline.informs.org
Stochastic optimization, also known as optimization under uncertainty, is studied by over a
dozen communities, often (but not always) with different notational systems and styles …