Curriculum Learning for Reinforcement Learning Domains: A Framework and Survey S Narvekar, B Peng, M Leonetti, J Sinapov, ME Taylor, P Stone Journal of Machine Learning Research (JMLR 2020) 21, 1-50, 2020 | 475 | 2020 |
Weighted QMIX: Expanding Monotonic Value Function Factorisation T Rashid, G Farquhar, B Peng, S Whiteson Thirty-fourth Conference on Neural Information Processing Systems (NeurIPS 2020), 2020 | 332* | 2020 |
Interactive learning from policy-dependent human feedback J MacGlashan, MK Ho, R Loftin, B Peng, G Wang, DL Roberts, ME Taylor, ... 34th International Conference on Machine Learning (ICML 2017), 2285-2294, 2017 | 314 | 2017 |
RODE: Learning Roles to Decompose Multi-Agent Tasks T Wang, T Gupta, A Mahajan, B Peng, S Whiteson, C Zhang International Conference on Learning Representations (ICLR 2021), 2020 | 189 | 2020 |
FACMAC: Factored Multi-Agent Centralised Policy Gradients B Peng, T Rashid, CAS de Witt, PA Kamienny, PHS Torr, W Böhmer, ... 35th Conference on Neural Information Processing Systems (NeurIPS 2021), 2021 | 182 | 2021 |
Learning behaviors via human-delivered discrete feedback: modeling implicit feedback strategies to speed up learning R Loftin, B Peng, J MacGlashan, ML Littman, ME Taylor, J Huang, ... Autonomous Agents and Multi-Agent Systems (JAAMAS 2016) 30 (1), 30-59, 2016 | 125 | 2016 |
A strategy-aware technique for learning behaviors from discrete human feedback RT Loftin, J MacGlashan, B Peng, ME Taylor, ML Littman, J Huang, ... Twenty-Eighth AAAI Conference on Artificial Intelligence (AAAI 2014), 2014 | 80 | 2014 |
Deep Multi-Agent Reinforcement Learning for Decentralized Continuous Cooperative Control CS de Witt, B Peng (equal contribution), PA Kamienny, P Torr, W Böhmer, ... arXiv preprint arXiv:2003.06709, 2020 | 76 | 2020 |
Randomized Entity-wise Factorization for Multi-Agent Reinforcement Learning S Iqbal, CAS de Witt, B Peng, W Böhmer, S Whiteson, F Sha 38th International Conference on Machine Learning (ICML 2021), 2021 | 74* | 2021 |
A need for speed: Adapting agent action speed to improve task learning from non-expert humans B Peng, J MacGlashan, R Loftin, ML Littman, DL Roberts, ME Taylor Autonomous Agents and Multiagent Systems (AAMAS 2016), 2016 | 57 | 2016 |
UneVEn: Universal Value Exploration for Multi-Agent Reinforcement Learning T Gupta, A Mahajan, B Peng, W Böhmer, S Whiteson 38th International Conference on Machine Learning (ICML 2021), 2021 | 49 | 2021 |
Optimistic Exploration even with a Pessimistic Initialisation T Rashid, B Peng, W Böhmer, S Whiteson International Conference on Learning Representations (ICLR 2020), 2020 | 47 | 2020 |
Regularized Softmax Deep Multi-Agent Q-Learning L Pan, T Rashid, B Peng, L Huang, S Whiteson 35th Conference on Neural Information Processing Systems (NeurIPS 2021), 2021 | 30* | 2021 |
Learning something from nothing: Leveraging implicit human feedback strategies R Loftin, B Peng, J MacGlashan, ML Littman, ME Taylor, J Huang, ... The 23rd IEEE International Symposium on Robot and Human Interactive …, 2014 | 30 | 2014 |
Training an agent to ground commands with reward and punishment J MacGlashan, M Littman, R Loftin, B Peng, D Roberts, M Taylor Workshops at the Twenty-Eighth AAAI Conference on Artificial Intelligence, 2014 | 25 | 2014 |
Curriculum Design for Machine Learners in Sequential Decision Tasks B Peng, J MacGlashan, R Loftin, ML Littman, DL Roberts, ME Taylor IEEE Transactions on Emerging Topics in Computational Intelligence 2 (4 …, 2018 | 18 | 2018 |
An empirical study of non-expert curriculum design for machine learners B Peng, J MacGlashan, R Loftin, ML Littman, DL Roberts, ME Taylor Proceedings of the IJCAI Interactive Machine Learning Workshop, 2016 | 14 | 2016 |
Convergent Actor Critic by Humans J MacGlashan, ML Littman, DL Roberts, R Loftin, B Peng, ME Taylor International Conference on Intelligent Robots and Systems (IROS 2016), 2016 | 12 | 2016 |
Towards integrating real-time crowd advice with reinforcement learning GV de la Cruz, B Peng, WS Lasecki, ME Taylor Proceedings of the 20th International Conference on Intelligent User …, 2015 | 10 | 2015 |
Generating real-time crowd advice to improve reinforcement learning agents GV de la Cruz, B Peng, WS Lasecki, ME Taylor Workshops at the Twenty-Ninth AAAI Conference on Artificial Intelligence, 2015 | 4 | 2015 |