An approximate solution method for large risk-averse Markov decision processes M Petrik, D Subramanian arXiv preprint arXiv:1210.4901, 2012 | 187 | 2012 |
Finite-sample analysis of proximal gradient TD algorithms B Liu, J Liu, M Ghavamzadeh, S Mahadevan, M Petrik arXiv preprint arXiv:2006.14364, 2020 | 175 | 2020 |
Safe policy improvement by minimizing robust baseline regret M Ghavamzadeh, M Petrik, Y Chow Advances in Neural Information Processing Systems 29, 2016 | 153 | 2016 |
Feature selection using regularization in approximate linear programs for Markov decision processes M Petrik, G Taylor, R Parr, S Zilberstein arXiv preprint arXiv:1005.1860, 2010 | 91 | 2010 |
An Analysis of Laplacian Methods for Value Function Approximation in MDPs. M Petrik IJCAI, 2574-2579, 2007 | 91 | 2007 |
Biasing approximate dynamic programming with a lower discount factor M Petrik, B Scherrer Advances in Neural Information Processing Systems 21, 2008 | 70 | 2008 |
Fast Bellman updates for robust MDPs CP Ho, M Petrik, W Wiesemann International Conference on Machine Learning, 1979-1988, 2018 | 68 | 2018 |
Beyond confidence regions: Tight Bayesian ambiguity sets for robust MDPs M Petrik, RH Russel Advances in Neural Information Processing Systems 32, 2019 | 67 | 2019 |
A practical method for solving contextual bandit problems using decision trees AN Elmachtoub, R McNellis, S Oh, M Petrik arXiv preprint arXiv:1706.04687, 2017 | 61 | 2017 |
Learning parallel portfolios of algorithms M Petrik, S Zilberstein Annals of Mathematics and Artificial Intelligence 48, 85-106, 2006 | 61 | 2006 |
Partial policy iteration for L1-robust Markov decision processes CP Ho, M Petrik, W Wiesemann Journal of Machine Learning Research 22 (275), 1-46, 2021 | 55 | 2021 |
Tight approximations of dynamic risk measures DA Iancu, M Petrik, D Subramanian Mathematics of Operations Research 40 (3), 655-682, 2015 | 46 | 2015 |
Constraint relaxation in approximate linear programs M Petrik, S Zilberstein Proceedings of the 26th Annual International Conference on Machine Learning …, 2009 | 46 | 2009 |
A bilinear programming approach for multiagent planning M Petrik, S Zilberstein Journal of Artificial Intelligence Research 35, 235-274, 2009 | 45 | 2009 |
RAAM: The benefits of robustness in approximating aggregated MDPs in reinforcement learning M Petrik, D Subramanian Advances in Neural Information Processing Systems 27, 2014 | 44 | 2014 |
Bayesian robust optimization for imitation learning D Brown, S Niekum, M Petrik Advances in Neural Information Processing Systems 33, 2479-2491, 2020 | 38 | 2020 |
Average-Reward Decentralized Markov Decision Processes. M Petrik, S Zilberstein IJCAI, 1997-2002, 2007 | 38 | 2007 |
Social media and customer behavior analytics for personalized customer engagements S Buckley, M Ettl, P Jain, R Luss, M Petrik, RK Ravi, C Venkatramani IBM Journal of Research and Development 58 (5/6), 7:1-7:12, 2014 | 33 | 2014 |
Proximal Gradient Temporal Difference Learning Algorithms. B Liu, J Liu, M Ghavamzadeh, S Mahadevan, M Petrik IJCAI, 4195-4199, 2016 | 32 | 2016 |
Anytime coordination using separable bilinear programs M Petrik, S Zilberstein Proceedings of the National Conference on Artificial Intelligence 22 (1), 750, 2007 | 32 | 2007 |