Convergence of entropy-regularized natural policy gradient with linear function approximation
Natural policy gradient (NPG) methods, equipped with function approximation and entropy
regularization, achieve impressive empirical success in reinforcement learning problems …
Finite-time analysis of entropy-regularized neural natural actor-critic algorithm
Natural actor-critic (NAC) and its variants, equipped with the representation power of neural
networks, have demonstrated impressive empirical success in solving Markov decision …
Sample complexity and overparameterization bounds for temporal-difference learning with neural network approximation
In this article, we study the dynamics of temporal-difference (TD) learning with neural
network-based value function approximation over a general state space, namely, neural TD …
Investigating overparameterization for non-negative matrix factorization in collaborative filtering
Y Kawakami, M Sugiyama - Proceedings of the 15th ACM Conference …, 2021 - dl.acm.org
Overparameterization is one of the key techniques in modern machine learning, where a
model with the higher complexity can generalize better on test data against the common …
TOPS: Transition-Based Volatility-Reduced Policy Search
X Liangliang, L Daoming, P Yangchen - Lecture notes in computer …, 2022 - par.nsf.gov
Existing risk-averse reinforcement learning approaches still face several challenges,
including the lack of global optimality guarantee and the necessity of learning from long-term …
Explicit Regularization for Overparameterized Models
TY Huang - 2023 - dspace.mit.edu
In many learning problems, it is desirable to incorporate explicit regularization in the
objective to avoid overfitting the data. Typically, the regularized objective is solved via …
Dynamical systems perspectives in machine learning
S Satpathi - 2021 - ideals.illinois.edu
We look at two facets of machine learning from a perspective of dynamical systems, that is,
the data generated from a dynamical system and the iterative inference algorithm posed as …