Mohammad Gheshlaghi Azar: Academic Profile

Citations

	All	Since 2019
Citations	12616	12108
h-index	27	26
i10-index	34	33

Citations per year

Year	Citations
2015	41
2016	52
2017	76
2018	261
2019	563
2020	839
2021	1812
2022	2968
2023	3692
2024	2219

Open-access publications (based on funder open-access mandates)

Available: 3 articles
Not available: 0 articles

Co-authors

Rémi Munos, Google DeepMind (verified email at inria.fr)
Bilal Piot, Google Deepmind (verified email at google.com)
Michal Valko, Llama @ Meta Paris & Inria & MVA - Ex: Gemini and BYOL @ Google DeepMind (verified email at meta.com)
Zhaohan Daniel Guo, DeepMind (verified email at google.com)
Florent Altché, Research Engineer, DeepMind (verified email at google.com)
Jean-bastien Grill (verified email at google.com)
Corentin Tallec, DeepMind (verified email at google.com)
Hado van Hasselt, Research Scientist, DeepMind; Honorary Professor, UCL (verified email at google.com)
Florian STRUB, Cohere (verified email at cohere.com)
Pierre Richemond, Google DeepMind (verified email at deepmind.com)
Hilbert Johan Kappen, Radboud University (verified email at science.ru.nl)
Will Dabney, DeepMind (verified email at google.com)
Elena Buchatskaya, Research Engineer, Google DeepMind (verified email at google.com)
Matteo Hessel, Research Engineer, Google DeepMind (verified email at google.com)
Dan Horgan, Google DeepMind (verified email at google.com)
Eva L. Dyer, Georgia Institute of Technology (verified email at gatech.edu)
Mark Rowland, Research Scientist, Google DeepMind (verified email at google.com)
Shantanu Thakoor, Research Engineer at DeepMind (verified email at google.com)
Carl Doersch, Google DeepMind (verified email at google.com)
Tom Schaul, Senior Staff Scientist, DeepMind (verified email at nyu.edu)


Mohammad Gheshlaghi Azar

Cohere

Verified email at google.com

Research interests: RL for Generative AI, Self-Supervised Learning, Exploration, Optimization


Title	Authors	Venue	Cited by	Year
Bootstrap your own latent-a new approach to self-supervised learning	JB Grill, F Strub, F Altché, C Tallec, P Richemond, E Buchatskaya, ...	Advances in neural information processing systems 33, 21271-21284, 2020	5982	2020
Rainbow: Combining improvements in deep reinforcement learning	M Hessel, J Modayil, H Van Hasselt, T Schaul, G Ostrovski, W Dabney, ...	Proceedings of the AAAI conference on artificial intelligence 32 (1), 2018	2602	2018
Minimax regret bounds for reinforcement learning	MG Azar, I Osband, R Munos	International conference on machine learning, 263-272, 2017	814	2017
koray kavukcuoglu, Remi Munos, and Michal Valko. Bootstrap your own latent-a new approach to self-supervised learning	JB Grill, F Strub, F Altché, C Tallec, P Richemond, E Buchatskaya, ...	Advances in neural information processing systems 33, 21271-21284, 2020	445	2020
Large-scale representation learning on graphs via bootstrapping	S Thakoor, C Tallec, MG Azar, M Azabou, EL Dyer, R Munos, P Veličković, ...	arXiv preprint arXiv:2102.06514, 2021	376*	2021
Minimax PAC bounds on the sample complexity of reinforcement learning with a generative model	M Gheshlaghi Azar, R Munos, HJ Kappen	Machine learning 91, 325-349, 2013	294	2013
Speedy Q-Learning	MG Azar, M Ghavamzadeh, HJ Kappen, R Munos	Advances in Neural Information Processing Systems, 2411-2419, 2011	206*	2011
The reactor: A fast and sample-efficient actor-critic agent for reinforcement learning	A Gruslys, W Dabney, MG Azar, B Piot, M Bellemare, R Munos	arXiv preprint arXiv:1704.04651, 2017	166*	2017
Dynamic Policy Programming	M Gheshlaghi Azar, V Gomez, HJ Kappen	Journal of Machine Learning Research 13, 3207-3245, 2012	150	2012
Bootstrap latent-predictive representations for multitask reinforcement learning	ZD Guo, BA Pires, B Piot, JB Grill, F Altché, R Munos, MG Azar	International Conference on Machine Learning, 3875-3886, 2020	143	2020
Observe and look further: Achieving consistent performance on atari	T Pohlen, B Piot, T Hester, MG Azar, D Horgan, D Budden, G Barth-Maron, ...	arXiv preprint arXiv:1805.11593, 2018	136	2018
A general theoretical paradigm to understand learning from human preferences	MG Azar, ZD Guo, B Piot, R Munos, M Rowland, M Valko, D Calandriello	International Conference on Artificial Intelligence and Statistics, 4447-4455, 2024	134	2024
Sequential transfer in multi-armed bandit with finite set of models	MG Azar, A Lazaric, E Brunskill	Advances in Neural Information Processing Systems, 2220-2228, 2013	118	2013
On the sample complexity of reinforcement learning with a generative model	MG Azar, R Munos, B Kappen	arXiv preprint arXiv:1206.6461, 2012	114	2012
Hindsight credit assignment	A Harutyunyan, W Dabney, T Mesnard, M Gheshlaghi Azar, B Piot, ...	Advances in neural information processing systems 32, 2019	95	2019
Neural predictive belief representations	ZD Guo, MG Azar, B Piot, BA Pires, R Munos	arXiv preprint arXiv:1811.06407, 2018	88	2018
Meta-learning of sequential strategies	PA Ortega, JX Wang, M Rowland, T Genewein, Z Kurth-Nelson, ...	arXiv preprint arXiv:1905.03030, 2019	87	2019
Stochastic optimization of a locally smooth function under correlated bandit feedback	MG Azar, A Lazaric, E Brunskill	31st International Conference on Machine Learning (ICML), 2014	66*	2014
A cryptography-based approach for movement decoding	EL Dyer, M Gheshlaghi Azar, MG Perich, HL Fernandes, S Naufel, ...	Nature biomedical engineering 1 (12), 967-976, 2017	63	2017
Byol-explore: Exploration by bootstrapped prediction	Z Guo, S Thakoor, M Pîslar, B Avila Pires, F Altché, C Tallec, A Saade, ...	Advances in neural information processing systems 35, 31855-31870, 2022	58	2022
