Analysis of Thompson sampling for the multi-armed bandit problem S Agrawal, N Goyal Conference on learning theory, 39.1-39.26, 2012 | 1551 | 2012 |
Thompson sampling for contextual bandits with linear payoffs S Agrawal, N Goyal International conference on machine learning, 127-135, 2013 | 1177 | 2013 |
Near-optimal regret bounds for Thompson sampling S Agrawal, N Goyal Journal of the ACM (JACM) 64 (5), 1-24, 2017 | 667 | 2017 |
A dynamic near-optimal algorithm for online linear programming S Agrawal, Z Wang, Y Ye Operations Research 62 (4), 876-890, 2014 | 346 | 2014 |
Optimistic posterior sampling for reinforcement learning: worst-case regret bounds S Agrawal, R Jia Advances in Neural Information Processing Systems 30, 2017 | 241 | 2017 |
A framework for high-accuracy privacy-preserving mining S Agrawal, JR Haritsa 21st International Conference on Data Engineering (ICDE'05), 193-204, 2005 | 241 | 2005 |
A near-optimal exploration-exploitation approach for assortment selection S Agrawal, V Avadhanula, V Goyal, A Zeevi Proceedings of the 2016 ACM Conference on Economics and Computation, 599-600, 2016 | 227 | 2016 |
Bandits with concave rewards and convex knapsacks S Agrawal, NR Devanur Proceedings of the fifteenth ACM conference on Economics and computation …, 2014 | 226 | 2014 |
Reinforcement learning for integer programming: Learning to cut Y Tang, S Agrawal, Y Faenza International conference on machine learning, 9367-9376, 2020 | 203 | 2020 |
Fast Algorithms for Online Stochastic Convex Programming S Agrawal, NR Devanur SODA 2015, 2015 | 193 | 2015 |
Price of correlations in stochastic optimization S Agrawal, Y Ding, A Saberi, Y Ye Operations Research 60 (1), 150-162, 2012 | 169 | 2012 |
Linear contextual bandits with knapsacks S Agrawal, N Devanur Advances in neural information processing systems 29, 2016 | 164 | 2016 |
Bandits with delayed, aggregated anonymous feedback C Pike-Burke, S Agrawal, C Szepesvari, S Grunewalder International Conference on Machine Learning, 4105-4113, 2018 | 128 | 2018 |
Thompson sampling for the MNL-bandit S Agrawal, V Avadhanula, V Goyal, A Zeevi Conference on learning theory, 76-78, 2017 | 122 | 2017 |
Discretizing continuous action space for on-policy optimization Y Tang, S Agrawal Proceedings of the AAAI Conference on Artificial Intelligence 34 (04), 5981-5988, 2020 | 121 | 2020 |
On addressing efficiency concerns in privacy-preserving mining S Agrawal, V Krishnan, JR Haritsa Database Systems for Advanced Applications: 9th International Conference …, 2004 | 111 | 2004 |
An efficient algorithm for contextual bandits with knapsacks, and an extension to concave objectives S Agrawal, NR Devanur, L Li Conference on Learning Theory, 4-18, 2016 | 107 | 2016 |
Learning in structured MDPs with convex cost functions: Improved regret bounds for inventory management S Agrawal, R Jia Proceedings of the 2019 ACM Conference on Economics and Computation, 743-744, 2019 | 76 | 2019 |
Efficient detection of distributed constraint violations S Agrawal, S Deb, KVM Naidu, R Rastogi 2007 IEEE 23rd International Conference on Data Engineering, 1320-1324, 2006 | 76 | 2006 |
A unified framework for dynamic prediction market design S Agrawal, E Delage, M Peters, Z Wang, Y Ye Operations Research 59 (3), 550-568, 2011 | 67 | 2011 |