Learning inverse dynamics models in O(n) time with LSTM networks E Rueckert, M Nakatenus, S Tosatto, J Peters 2017 IEEE-RAS 17th International Conference on Humanoid Robotics (Humanoids …, 2017 | 79 | 2017 |
Boosted Fitted Q-Iteration S Tosatto, C D'Eramo, M Pirotta, M Restelli International Conference on Machine Learning, 2017 | 47 | 2017 |
Contextual latent-movements off-policy optimization for robotic manipulation skills S Tosatto, G Chalvatzaki, J Peters 2021 IEEE International Conference on Robotics and Automation (ICRA), 10815 …, 2021 | 17 | 2021 |
A Nonparametric Off-Policy Policy Gradient S Tosatto, J Carvalho, H Abdulsamad, J Peters International Conference on Artificial Intelligence and Statistics (AISTATS), 2020 | 14 | 2020 |
Model-free Policy Learning with Reward Gradients Q Lan, S Tosatto, H Farrahi, A Mahmood arXiv preprint arXiv:2103.05147, 2021 | 9 | 2021 |
Dynamic Decision Frequency with Continuous Options A Karimi, J Jin, J Luo, AR Mahmood, M Jagersand, S Tosatto 2023 IEEE/RSJ International Conference on Intelligent Robots and Systems …, 2023 | 5 | 2023 |
An alternate policy gradient estimator for softmax policies S Garg, S Tosatto, Y Pan, M White, AR Mahmood arXiv preprint arXiv:2112.11622, 2021 | 5 | 2021 |
Batch reinforcement learning with a nonparametric off-policy policy gradient S Tosatto, J Carvalho, J Peters IEEE Transactions on Pattern Analysis and Machine Intelligence 44 (10), 5996 …, 2021 | 5 | 2021 |
An upper bound of the bias of Nadaraya-Watson kernel regression under Lipschitz assumptions S Tosatto, R Akrour, J Peters Stats 4 (1), 1-17, 2020 | 5 | 2020 |
Exploration Driven By an Optimistic Bellman Equation S Tosatto, C D'Eramo, J Pajarinen, M Restelli, J Peters International Joint Conference on Neural Networks, 2019 | 5 | 2019 |
Deep probabilistic movement primitives with a bayesian aggregator M Przystupa, F Haghverd, M Jagersand, S Tosatto 2023 IEEE/RSJ International Conference on Intelligent Robots and Systems …, 2023 | 3 | 2023 |
A temporal-difference approach to policy gradient estimation S Tosatto, A Patterson, M White, R Mahmood International Conference on Machine Learning, 21609-21632, 2022 | 2 | 2022 |
A Gradient Critic for Policy Gradient Estimation S Tosatto, A Patterson, M White, AR Mahmood Sixteenth European Workshop on Reinforcement Learning, 2023 | 1 | 2023 |
Variable-Decision Frequency Option Critic. A Karimi, J Jin, J Luo, AR Mahmood, M Jägersand, S Tosatto CoRR, 2022 | | 2022 |
Off-Policy Reinforcement Learning for Robotics S Tosatto Technische Universität Darmstadt, 2021 | | 2021 |
Dimensionality Reduction of Movement Primitives in Parameter Space S Tosatto, J Stadtmüller, J Peters arXiv preprint arXiv:2003.02634, 2020 | | 2020 |
An Upper Bound of the Bias of Nadaraya-Watson Kernel Regression under Lipschitz Assumptions. Stats 2021, 4, 1–17 S Tosatto, R Akrour, J Peters | | 2020 |
Technical Report: “Exploration Driven by an Optimistic Bellman Equation” S Tosatto, C D’Eramo, J Pajarinen, M Restelli, J Peters | | 2018 |
Pink Noise LQR: How does Colored Noise affect the Optimal Policy in RL? J Hollenstein, M Zaric, S Tosatto, J Piater ICML 2024 Workshop: Foundations of Reinforcement Learning and Control … | | |
Making Policy Gradient Estimators for Softmax Policies More Robust to Non-stationarities S Garg, S Tosatto, Y Pan, M White, AR Mahmood | | |