Title | Authors | Venue | Cited by | Year |
Gemini: a family of highly capable multimodal models | G Team, R Anil, S Borgeaud, Y Wu, JB Alayrac, J Yu, R Soricut, ... | arXiv preprint arXiv:2312.11805, 2023 | 931 | 2023 |
What matters for on-policy deep actor-critic methods? a large-scale study | M Andrychowicz, A Raichuk, P Stańczyk, M Orsini, S Girgin, R Marinier, ... | International Conference on Learning Representations, 2021 | 392* | 2021 |
A theory of regularized Markov decision processes | M Geist, B Scherrer, O Pietquin | International Conference on Machine Learning, 2160-2169, 2019 | 308 | 2019 |
Human activity recognition using recurrent neural networks | D Singh, E Merdivan, I Psychoula, J Kropf, S Hanke, M Geist, A Holzinger | Machine Learning and Knowledge Extraction: First IFIP TC 5, WG 8.4, 8.9, 12 …, 2017 | 208 | 2017 |
Approximate modified policy iteration and its application to the game of Tetris | B Scherrer, M Ghavamzadeh, V Gabillon, B Lesner, M Geist | J. Mach. Learn. Res. 16 (49), 1629-1676, 2015 | 153 | 2015 |
IQ-Learn: Inverse soft-Q Learning for Imitation | D Garg, S Chakraborty, C Cundy, J Song, M Geist, S Ermon | arXiv preprint arXiv:2106.12142, 2022 | 129 | 2022 |
Primal Wasserstein imitation learning | R Dadashi, L Hussenot, M Geist, O Pietquin | arXiv preprint arXiv:2006.04678, 2020 | 128 | 2020 |
Inverse reinforcement learning through structured classification | E Klein, M Geist, B Piot, O Pietquin | Advances in Neural Information Processing Systems 25, 2012 | 125 | 2012 |
Kalman temporal differences | M Geist, O Pietquin | Journal of Artificial Intelligence Research 39, 483-532, 2010 | 123 | 2010 |
Algorithmic survey of parametric value function approximation | M Geist, O Pietquin | IEEE Transactions on Neural Networks and Learning Systems 24 (6), 845-867, 2013 | 122* | 2013 |
On the convergence of model free learning in mean field games | R Elie, J Pérolat, M Laurière, M Geist, O Pietquin | Proceedings of the AAAI Conference on Artificial Intelligence 34 (05), 7143-7150, 2020 | 121* | 2020 |
Sample-efficient batch reinforcement learning for dialogue management optimization | O Pietquin, M Geist, S Chandramohan, H Frezza-Buet | ACM Transactions on Speech and Language Processing (TSLP) 7 (3), 1-21, 2011 | 120 | 2011 |
Fictitious play for mean field games: Continuous time analysis and applications | S Perrin, J Pérolat, M Laurière, M Geist, R Elie, O Pietquin | Advances in Neural Information Processing Systems 33, 13199-13213, 2020 | 119 | 2020 |
User simulation in dialogue systems using inverse reinforcement learning | S Chandramohan, M Geist, F Lefevre, O Pietquin | Interspeech 2011, 1025-1028, 2011 | 118 | 2011 |
Off-policy learning with eligibility traces: a survey | M Geist, B Scherrer | J. Mach. Learn. Res. 15 (1), 289-333, 2014 | 109 | 2014 |
Bridging the gap between imitation learning and inverse reinforcement learning | B Piot, M Geist, O Pietquin | IEEE Transactions on Neural Networks and Learning Systems 28 (8), 1814-1826, 2016 | 108 | 2016 |
Leverage the average: an analysis of KL regularization in reinforcement learning | N Vieillard, T Kozuno, B Scherrer, O Pietquin, R Munos, M Geist | Advances in Neural Information Processing Systems 33, 12163-12174, 2020 | 107* | 2020 |
Convolutional and recurrent neural networks for activity recognition in smart environment | D Singh, E Merdivan, S Hanke, J Kropf, M Geist, A Holzinger | Towards Integrative Machine Learning and Knowledge Extraction: BIRS Workshop …, 2017 | 96 | 2017 |
Munchausen reinforcement learning | N Vieillard, O Pietquin, M Geist | Advances in Neural Information Processing Systems 33, 4235-4246, 2020 | 92 | 2020 |
Boosted Bellman residual minimization handling expert demonstrations | B Piot, M Geist, O Pietquin | Machine Learning and Knowledge Discovery in Databases: European Conference …, 2014 | 91 | 2014 |