RIDM: Reinforced inverse dynamics modeling for learning from a single observed demonstration BS Pavse, F Torabi, JP Hanna, G Warnell, P Stone IEEE Robotics and Automation Letters, presented at International Conference …, 2019 | 35 | 2019 |
Reducing Sampling Error in Batch Temporal Difference Learning BS Pavse, I Durugkar, JP Hanna, P Stone International Conference on Machine Learning (ICML), 2020 | 15 | 2020 |
UT Austin Villa: RoboCup 2018 3D simulation league champions P MacAlpine, F Torabi, B Pavse, J Sigmon, P Stone RoboCup 2018: Robot World Cup XXII 22, 462-475, 2019 | 10 | 2019 |
UT Austin Villa: RoboCup 2019 3D Simulation League Competition and Technical Challenge Champions P MacAlpine, F Torabi, B Pavse, P Stone | 8 | 2019 |
State-action similarity-based representations for off-policy evaluation B Pavse, J Hanna Advances in Neural Information Processing Systems 36, 2024 | 2 | 2024 |
Scaling marginalized importance sampling to high-dimensional state-spaces via state abstraction BS Pavse, JP Hanna Proceedings of the AAAI Conference on Artificial Intelligence 37 (8), 9417-9425, 2023 | 2 | 2023 |
Tackling Unbounded State Spaces in Continuing Task Reinforcement Learning BS Pavse, Y Chen, Q Xie, JP Hanna arXiv preprint arXiv:2306.01896, 2023 | 2 | 2023 |
Learning to stabilize online reinforcement learning in unbounded state spaces BS Pavse, M Zurek, Y Chen, Q Xie, JP Hanna arXiv preprint arXiv:2306.01896, 2023 | 1 | 2023 |
Replacing Implicit Regression with Classification in Policy Gradient Reinforcement Learning JP Hanna, BS Pavse, AN Harish Finding the Frame: An RLC Workshop for Examining Conceptual Frameworks | | |
Revisiting Familiar Places in an Infinite World: Continuing RL in Unbounded State Spaces BS Pavse, Y Chen, Q Xie, JP Hanna | | |
On Sampling Error in Batch Action-Value Prediction Algorithms BS Pavse, JP Hanna, I Durugkar, P Stone | | |