Convergence of entropy-regularized natural policy gradient with linear function approximation

S Cayci, N He, R Srikant - SIAM Journal on Optimization, 2024 - SIAM
Natural policy gradient (NPG) methods, equipped with function approximation and entropy
regularization, achieve impressive empirical success in reinforcement learning problems …

Finite-time analysis of entropy-regularized neural natural actor-critic algorithm

S Cayci, N He, R Srikant - arXiv preprint arXiv:2206.00833, 2022 - arxiv.org
Natural actor-critic (NAC) and its variants, equipped with the representation power of neural
networks, have demonstrated impressive empirical success in solving Markov decision …

Sample complexity and overparameterization bounds for temporal-difference learning with neural network approximation

S Cayci, S Satpathi, N He… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org
In this article, we study the dynamics of temporal-difference (TD) learning with neural
network-based value function approximation over a general state space, namely, neural TD …

Investigating overparameterization for non-negative matrix factorization in collaborative filtering

Y Kawakami, M Sugiyama - Proceedings of the 15th ACM Conference …, 2021 - dl.acm.org
Overparameterization is one of the key techniques in modern machine learning, where a
model with higher complexity can generalize better on test data against the common …

TOPS: Transition-Based Volatility-Reduced Policy Search

X Liangliang, L Daoming, P Yangchen - Lecture notes in computer …, 2022 - par.nsf.gov
Existing risk-averse reinforcement learning approaches still face several challenges,
including the lack of global optimality guarantee and the necessity of learning from long-term …

Explicit Regularization for Overparameterized Models

TY Huang - 2023 - dspace.mit.edu
In many learning problems, it is desirable to incorporate explicit regularization in the
objective to avoid overfitting the data. Typically, the regularized objective is solved via …

TOPS: Transition-Based Volatility-Reduced Policy Search

L Xu, D Lyu, Y Pan, A Jiang, B Liu - International Conference on …, 2022 - Springer
Existing risk-averse reinforcement learning approaches still face several challenges,
including the lack of global optimality guarantee and the necessity of learning from long-term …

Dynamical systems perspectives in machine learning

S Satpathi - 2021 - ideals.illinois.edu
We look at two facets of machine learning from a perspective of dynamical systems, that is,
the data generated from a dynamical system and the iterative inference algorithm posed as …