Online Policy Learning and Inference by Matrix Completion

C Duan, J Li, D Xia - arXiv preprint arXiv:2404.17398, 2024 - arxiv.org
Making online decisions can be challenging when features are sparse and orthogonal to
historical ones, especially when the optimal policy is learned through collaborative filtering …

Online Learning and Resource Allocation: Algorithms under Non-stationarity

Y Wang, W You, J Jiang - Working paper, 2024 - papers.ssrn.com
We consider an online stochastic optimization problem with multiple resource constraints
over a finite horizon. In each time period, the decision maker selects an action from a convex …