Spectral entry-wise matrix estimation for low-rank reinforcement learning

S Stojanovic, Y Jedra… - Advances in Neural …, 2023 - proceedings.neurips.cc
We study matrix estimation problems arising in reinforcement learning with low-rank
structure. In low-rank bandits, the matrix to be recovered specifies the expected arm …

Shapley meets uniform: An axiomatic framework for attribution in online advertising

R Singal, O Besbes, A Desir, V Goyal… - The world wide web …, 2019 - dl.acm.org
One of the central challenges in online advertising is attribution, namely, assessing the
contribution of individual advertiser actions including emails, display ads and search ads to …

CP factor model for dynamic tensors

Y Han, D Yang, CH Zhang… - Journal of the Royal …, 2024 - academic.oup.com
Observations in various applications are frequently represented as a time series of
multidimensional arrays, called tensor time series, preserving the inherent multidimensional …

Online statistical inference for matrix contextual bandit

Q Han, WW Sun, Y Zhang - arXiv preprint arXiv:2212.11385, 2022 - arxiv.org
Contextual bandit has been widely used for sequential decision-making based on the
current contextual information and historical feedback data. In modern applications, such …

Optimal high-order tensor SVD via tensor-train orthogonal iteration

Y Zhou, AR Zhang, L Zheng… - IEEE transactions on …, 2022 - ieeexplore.ieee.org
This paper studies a general framework for high-order tensor SVD. We propose a new
computationally efficient algorithm, tensor-train orthogonal iteration (TTOI), that aims to …

Nearly optimal latent state decoding in block MDPs

Y Jedra, J Lee, A Proutiere… - … Conference on Artificial …, 2023 - proceedings.mlr.press
We consider the problem of model estimation in episodic Block MDPs. In these MDPs, the
decision maker has access to rich observations or contexts generated from a small number …

Singular value distribution of dense random matrices with block Markovian dependence

J Sanders, A Van Werde - Stochastic Processes and their Applications, 2023 - Elsevier
A block Markov chain is a Markov chain whose state space can be partitioned into a finite
number of clusters such that the transition probabilities only depend on the clusters. Block …

Sparsity-Constraint Optimization via Splicing Iteration

Z Wang, J Zhu, J Zhu, B Tang, H Lin… - arXiv preprint arXiv …, 2024 - arxiv.org
Sparsity-constraint optimization has wide applicability in signal processing, statistics, and
machine learning. Existing fast algorithms must burdensomely tune parameters, such as the …

From Self-Attention to Markov Models: Unveiling the Dynamics of Generative Transformers

ME Ildiz, Y Huang, Y Li, AS Rawat, S Oymak - arXiv preprint arXiv …, 2024 - arxiv.org
Modern language models rely on the transformer architecture and attention mechanism to
perform language understanding and text generation. In this work, we study learning a 1 …

Speed up the cold-start learning in two-sided bandits with many arms

M Bayati, J Cao, W Chen - arXiv preprint arXiv:2210.00340, 2022 - arxiv.org
Multi-armed bandit (MAB) algorithms are efficient approaches to reduce the opportunity cost
of online experimentation and are used by companies to find the best product from …