Prashanth L.A.: Academic Profile

Citations

	All	Since 2019
Citations	2229	1468
h-index	18	17
i10-index	31	27

Citations per year

Year	Citations
2011	9
2012	15
2013	50
2014	64
2015	80
2016	99
2017	76
2018	127
2019	180
2020	200
2021	272
2022	287
2023	379
2024	149

Public access

Available: 19 articles
Not available: 0 articles

Based on funders' open-access mandates

Co-authors

Shalabh Bhatnagar: Professor in the Department of Computer Science and Automation, Indian Institute of Science (verified email at iisc.ac.in)
Michael C. Fu: University of Maryland (verified email at umd.edu)
Mohammad Ghavamzadeh: Amazon (verified email at amazon.com)
Krishna Jagannathan: Professor, Department of Electrical Engineering, IIT Madras (verified email at ee.iitm.ac.in)
H L Prasad: Chairman and CTO at Astrome Technologies (verified email at csa.iisc.ernet.in)
Rémi Munos: Google DeepMind (verified email at inria.fr)
Ravi Kumar Kolla: IIT Madras (verified email at ee.iitm.ac.in)
Csaba Szepesvari: DeepMind & University of Alberta (verified email at cs.ualberta.ca)
Sanjay P. Bhat: Tata Consultancy Services Limited (verified email at tcs.com)
Cheng Jie: Pinterest LLC, University of Maryland, College Park, Walmart Global Tech (verified email at pinterest.com)
Nirmit Desai: IBM Research (verified email at us.ibm.com)
Nirav Bhavsar: M.S. Scholar in the Department of Computer Science and Engineering, Indian Institute of Technology (verified email at cse.iitm.ac.in)
Nithia Vijayan: Research Fellow, School of Computing, National University of Singapore (verified email at comp.nus.edu.sg)
Aditya Gopalan: Indian Institute of Science, Bangalore (verified email at iisc.ac.in)
Doina Precup: DeepMind and McGill University (verified email at cs.mcgill.ca)
Gargi Dasgupta: IBM Research Lab (verified email at in.ibm.com)
Gandharv Patil: McGill University, Mila (verified email at mail.mcgill.ca)
Dheeraj Nagaraj: Research Scientist, Google (verified email at google.com)
Steven I. Marcus: Professor of Electrical and Computer Engineering, University of Maryland (verified email at umd.edu)
Andras Gyorgy: DeepMind (verified email at google.com)

Prashanth L.A.

Associate Professor, Department of Computer Science and Engg., IIT Madras

Verified email at cse.iitm.ac.in

Research interests: reinforcement learning, simulation optimization, multi-armed bandits


Title	Cited by	Year
Stochastic Recursive Algorithms for Optimization: Simultaneous Perturbation Methods. S Bhatnagar, HL Prasad, LA Prashanth. Springer 434, 302, 2013	419*	2013
Reinforcement Learning With Function Approximation for Traffic Signal Control. P LA, S Bhatnagar. Intelligent Transportation Systems, IEEE Transactions on, 1-10, 2011	381	2011
Actor-critic algorithms for risk-sensitive MDPs. P La, M Ghavamzadeh. Advances in neural information processing systems 26, 2013	308	2013
Reinforcement learning with average cost for adaptive control of traffic lights at intersections. LA Prashanth, S Bhatnagar. 2011 14th International IEEE Conference on Intelligent Transportation …, 2011	89	2011
Cumulative prospect theory meets reinforcement learning: Prediction and control. LA Prashanth, C Jie, M Fu, S Marcus, C Szepesvári. International Conference on Machine Learning, 1406-1415, 2016	85	2016
Variance-Constrained Actor-Critic Algorithms for Discounted and Average Reward MDPs. LA Prashanth, M Ghavamzadeh. arXiv preprint arXiv:1403.6530, 2014	81	2014
Policy gradients for CVaR-constrained MDPs. LA Prashanth. International Conference on Algorithmic Learning Theory, 155-169, 2014	71	2014
Two-timescale algorithms for learning Nash equilibria in general-sum stochastic games. HL Prasad, P LA, S Bhatnagar. Proceedings of the 2015 International Conference on Autonomous Agents and …, 2015	70	2015
Concentration of risk measures: A Wasserstein distance approach. SP Bhat, P LA. Advances in neural information processing systems 32, 2019	55	2019
Concentration bounds for empirical conditional value-at-risk: The unbounded case. RK Kolla, LA Prashanth, SP Bhat, K Jagannathan. Operations Research Letters 47 (1), 16-20, 2019	55	2019
Threshold tuning using stochastic optimization for graded signal control. LA Prashanth, S Bhatnagar. IEEE Transactions on Vehicular Technology 61 (9), 3865-3880, 2012	54	2012
On TD(0) with function approximation: Concentration bounds and a centered variant with exponential convergence. N Korda, P La. International conference on machine learning, 626-634, 2015	51	2015
Concentration bounds for CVaR estimation: The cases of light-tailed and heavy-tailed distributions. LA Prashanth, K Jagannathan, RK Kolla. Proceedings of the 37th International Conference on Machine Learning, 5577-5586, 2020	50	2020
Stochastic optimization in a cumulative prospect theory framework. C Jie, LA Prashanth, M Fu, S Marcus, C Szepesvári. IEEE Transactions on Automatic Control 63 (9), 2867-2882, 2018	49	2018
Risk-sensitive reinforcement learning: A constrained optimization viewpoint. LA Prashanth, M Fu. arXiv 2018, 2018	35	2018
Adaptive system optimization using random directions stochastic approximation. LA Prashanth, S Bhatnagar, M Fu, S Marcus. IEEE Transactions on Automatic Control 62 (5), 2223-2238, 2017	35	2017
Risk-sensitive reinforcement learning via policy gradient search. LA Prashanth, MC Fu. Foundations and Trends® in Machine Learning 15 (5), 537-693, 2022	28	2022
Analysis of stochastic approximation for efficient least squares regression and LSTD. LA Prashanth, N Korda, R Munos. arXiv preprint arXiv:1306.2557, 2013	26*	2013
Concentration bounds for temporal difference learning with linear function approximation: the case of batch data and uniform sampling. LA Prashanth, N Korda, R Munos. Machine Learning 110 (3), 559-618, 2021	17	2021
(Bandit) Convex Optimization with Biased Noisy Gradient Oracles. X Hu, LA Prashanth, A György, C Szepesvári. International Conference on Artificial Intelligence and Statistics (AISTATS …, 2016	17	2016
