The policy graph decomposition of multistage stochastic programming problems

O Dowson - Networks, 2020 - Wiley Online Library
We propose the policy graph as a structured way of formulating a general class of multistage
stochastic programming problems in a way that leads to a natural decomposition. We also …

Generalized benders decomposition with continual learning for hybrid model predictive control in dynamic environment

X Lin - arXiv preprint arXiv:2310.03344, 2023 - arxiv.org
Hybrid model predictive control (MPC) with both continuous and discrete variables is widely
applicable to robotic control tasks, especially those involving contact with the environment …
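As a rough illustration of the Benders idea behind this line of work (not the paper's hybrid-MPC algorithm), a cutting-plane loop approximates a convex subproblem value function from below using subgradient cuts. The function `v`, its subgradient, and the grid-scan master problem below are all invented for the sketch:

```python
# Minimal cutting-plane sketch of the generalized Benders idea.
# Master: min_y theta  s.t. theta >= v(yk) + g(yk) * (y - yk)  for each cut k.
# The subproblem value v(y) is convex; its subgradient supplies each cut.

def v(y):
    # Convex "subproblem" value, in closed form for the sketch.
    return (y - 3.0) ** 2 + 1.0

def subgrad(y):
    # Subgradient (here, gradient) of v at y.
    return 2.0 * (y - 3.0)

def benders(y0=0.0, lo=-10.0, hi=10.0, iters=30):
    cuts = []                                   # (slope, intercept) pairs: theta >= a*y + b
    y = y0
    for _ in range(iters):
        g = subgrad(y)
        cuts.append((g, v(y) - g * y))          # cut generated at the current y
        # Solve the master by scanning a fine grid, to keep the sketch solver-free.
        grid = [lo + i * (hi - lo) / 2000 for i in range(2001)]
        y = min(grid, key=lambda yy: max(a * yy + b for a, b in cuts))
    return y, v(y)

y_star, v_star = benders()   # converges near the minimizer y = 3, value 1
```

A real generalized Benders scheme would solve the master as a mixed-integer program and the subproblem as a (dual) optimization; the grid scan stands in for both here.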

Periodical multistage stochastic programs

A Shapiro, L Ding - SIAM Journal on Optimization, 2020 - SIAM
In some applications the considered multistage stochastic programs have a periodical
behavior. We show that in such cases it is possible to drastically reduce the number of …

Dynamic inventory repositioning in on-demand rental networks

S Benjaafar, D Jiang, X Li, X Li - Management Science, 2022 - pubsonline.informs.org
We consider a rental service with a fixed number of rental units distributed across multiple
locations. The units are accessed by customers without prior reservation and on an on …

Fast and Continual Learning for Hybrid Control Policies using Generalized Benders Decomposition

X Lin - arXiv preprint arXiv:2401.00917, 2024 - arxiv.org
Hybrid model predictive control with both continuous and discrete variables is widely
applicable to robotic control tasks, especially those involving contact with the environment …

Accelerate Hybrid Model Predictive Control using Generalized Benders Decomposition

X Lin - arXiv preprint arXiv:2406.00780, 2024 - arxiv.org
Hybrid model predictive control with both continuous and discrete variables is widely
applicable to robotics tasks. Due to the combinatorial complexity, the solving speed of hybrid …

Learning continuous Q-functions using generalized Benders cuts

J Warrington - 2019 18th European Control Conference (ECC), 2019 - ieeexplore.ieee.org
Q-functions are widely used in discrete-time learning and control to model future costs
arising from a given control policy, when the initial state and input are given. Although some …
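The Q-functions defined here can be made concrete on a toy problem. The tiny deterministic shortest-path MDP below (its states, actions, and costs) is invented for the sketch and is not the paper's continuous formulation, which approximates Q with generalized Benders cuts:

```python
# Tabular Q-iteration sketch: Q(s, a) = cost(s, a) + min_a' Q(s', a').
# Q(s, a) is the future cost from state s when the first input is a.
import itertools

states = [0, 1, 2, 3]           # state 3 is absorbing with zero cost
actions = [0, 1]                # 0: stay put, 1: move right

def step(s, a):
    return min(s + a, 3)

def cost(s, a):
    return 0.0 if s == 3 else 1.0 + 0.5 * (1 - a)   # staying is pricier

def q_iteration(sweeps=50):
    Q = {(s, a): 0.0 for s in states for a in actions}
    for _ in range(sweeps):
        for s, a in itertools.product(states, actions):
            s2 = step(s, a)
            Q[(s, a)] = cost(s, a) + min(Q[(s2, b)] for b in actions)
    return Q

Q = q_iteration()
# Greedy policy: from any state, pick argmin_a Q(s, a); here that is "move right".
```

On this instance the iteration converges exactly (the optimal policy reaches the absorbing state), giving Q[(0, 1)] = 3.0, Q[(1, 1)] = 2.0, Q[(2, 1)] = 1.0.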

Gradient-bounded dynamic programming for submodular and concave extensible value functions with probabilistic performance guarantees

D Lebedev, P Goulart, K Margellos - Automatica, 2022 - Elsevier
We consider stochastic dynamic programming problems with high-dimensional, discrete
state-spaces and finite, discrete-time horizons that prohibit direct computation of the value …

A Convex Programming Approach to Data-Driven Risk-Averse Reinforcement Learning

Y Han, M Mazouchi, S Nageshrao… - arXiv preprint arXiv …, 2021 - arxiv.org
This paper presents a model-free reinforcement learning (RL) algorithm to solve the risk-
averse optimal control (RAOC) problem for discrete-time nonlinear systems. While …

Gradient-bounded dynamic programming with submodular and concave extensible value functions

D Lebedev, P Goulart, K Margellos - IFAC-PapersOnLine, 2020 - Elsevier
We consider dynamic programming problems with finite, discrete-time horizons and
prohibitively high-dimensional, discrete state-spaces for direct computation of the value …