Title | Authors | Venue | Cited by | Year |
Stochastic Recursive Algorithms for Optimization: Simultaneous Perturbation Methods | S Bhatnagar, HL Prasad, LA Prashanth | Springer 434, 302, 2013 | 419* | 2013 |
Reinforcement Learning With Function Approximation for Traffic Signal Control | P LA, S Bhatnagar | Intelligent Transportation Systems, IEEE Transactions on, 1-10, 2011 | 381 | 2011 |
Actor-critic algorithms for risk-sensitive MDPs | P La, M Ghavamzadeh | Advances in neural information processing systems 26, 2013 | 308 | 2013 |
Reinforcement learning with average cost for adaptive control of traffic lights at intersections | LA Prashanth, S Bhatnagar | 2011 14th International IEEE Conference on Intelligent Transportation …, 2011 | 89 | 2011 |
Cumulative prospect theory meets reinforcement learning: Prediction and control | LA Prashanth, C Jie, M Fu, S Marcus, C Szepesvári | International Conference on Machine Learning, 1406-1415, 2016 | 85 | 2016 |
Variance-Constrained Actor-Critic Algorithms for Discounted and Average Reward MDPs | LA Prashanth, M Ghavamzadeh | arXiv preprint arXiv:1403.6530, 2014 | 81 | 2014 |
Policy gradients for CVaR-constrained MDPs | LA Prashanth | International Conference on Algorithmic Learning Theory, 155-169, 2014 | 71 | 2014 |
Two-timescale algorithms for learning Nash equilibria in general-sum stochastic games | HL Prasad, P LA, S Bhatnagar | Proceedings of the 2015 International Conference on Autonomous Agents and …, 2015 | 70 | 2015 |
Concentration of risk measures: A Wasserstein distance approach | SP Bhat, P LA | Advances in neural information processing systems 32, 2019 | 55 | 2019 |
Concentration bounds for empirical conditional value-at-risk: The unbounded case | RK Kolla, LA Prashanth, SP Bhat, K Jagannathan | Operations Research Letters 47 (1), 16-20, 2019 | 55 | 2019 |
Threshold tuning using stochastic optimization for graded signal control | LA Prashanth, S Bhatnagar | IEEE Transactions on Vehicular Technology 61 (9), 3865-3880, 2012 | 54 | 2012 |
On TD(0) with function approximation: Concentration bounds and a centered variant with exponential convergence | N Korda, P La | International conference on machine learning, 626-634, 2015 | 51 | 2015 |
Concentration bounds for CVaR estimation: The cases of light-tailed and heavy-tailed distributions | LA Prashanth, K Jagannathan, RK Kolla | Proceedings of the 37th International Conference on Machine Learning, 5577-5586, 2020 | 50 | 2020 |
Stochastic optimization in a cumulative prospect theory framework | C Jie, LA Prashanth, M Fu, S Marcus, C Szepesvári | IEEE Transactions on Automatic Control 63 (9), 2867-2882, 2018 | 49 | 2018 |
Risk-sensitive reinforcement learning: A constrained optimization viewpoint | LA Prashanth, M Fu | arXiv 2018, 2018 | 35 | 2018 |
Adaptive system optimization using random directions stochastic approximation | LA Prashanth, S Bhatnagar, M Fu, S Marcus | IEEE Transactions on Automatic Control 62 (5), 2223-2238, 2017 | 35 | 2017 |
Risk-sensitive reinforcement learning via policy gradient search | LA Prashanth, MC Fu | Foundations and Trends® in Machine Learning 15 (5), 537-693, 2022 | 28 | 2022 |
Analysis of stochastic approximation for efficient least squares regression and LSTD | LA Prashanth, N Korda, R Munos | arXiv preprint arXiv:1306.2557, 2013 | 26* | 2013 |
Concentration bounds for temporal difference learning with linear function approximation: the case of batch data and uniform sampling | LA Prashanth, N Korda, R Munos | Machine Learning 110 (3), 559-618, 2021 | 17 | 2021 |
(Bandit) Convex Optimization with Biased Noisy Gradient Oracles | X Hu, LA Prashanth, A György, C Szepesvári | International Conference on Artificial Intelligence and Statistics (AISTATS …, 2016 | 17 | 2016 |