Deep reinforcement learning from human preferences PF Christiano, J Leike, T Brown, M Martic, S Legg, D Amodei Advances in neural information processing systems 30, 2017 | 2448 | 2017 |
AI safety gridworlds J Leike, M Martic, V Krakovna, PA Ortega, T Everitt, A Lefrancq, L Orseau, ... arXiv preprint arXiv:1711.09883, 2017 | 328 | 2017 |
Scalable agent alignment via reward modeling: a research direction J Leike, D Krueger, T Everitt, M Martic, V Maini, S Legg arXiv preprint arXiv:1811.07871, 2018 | 275 | 2018 |
Penalizing side effects using stepwise relative reachability V Krakovna, L Orseau, R Kumar, M Martic, S Legg arXiv preprint arXiv:1806.01186, 2018 | 56 | 2018 |
Avoiding side effects by considering future tasks V Krakovna, L Orseau, R Ngo, M Martic, S Legg Advances in Neural Information Processing Systems 33, 19064-19074, 2020 | 43 | 2020 |
Meta-trained agents implement Bayes-optimal agents V Mikulik, G Delétang, T McGrath, T Genewein, M Martic, S Legg, ... Advances in neural information processing systems 33, 18691-18703, 2020 | 38 | 2020 |
Algorithms for causal reasoning in probability trees T Genewein, T McGrath, G Delétang, V Mikulik, M Martic, S Legg, ... arXiv preprint arXiv:2010.12237, 2020 | 20 | 2020 |
Measuring and avoiding side effects using relative reachability V Krakovna, L Orseau, M Martic, S Legg arXiv preprint arXiv:1806.01186, 2018 | 20 | 2018 |
Scalable agent alignment via reward modeling: A research direction J Leike, D Krueger, T Everitt, M Martic, V Maini, S Legg arXiv preprint arXiv:1811.07871, 2018 | 15 | 2018 |
Causal analysis of agent behavior for AI safety G Delétang, J Grau-Moya, M Martic, T Genewein, T McGrath, V Mikulik, ... arXiv preprint arXiv:2103.03938, 2021 | 10 | 2021 |
Scaling shared model governance via model splitting M Martic, J Leike, A Trask, M Hessel, S Legg, P Kohli arXiv preprint arXiv:1812.05979, 2018 | 3 | 2018 |
AI safety gridworlds. CoRR abs/1711.09883 (2017) J Leike, M Martic, V Krakovna, PA Ortega, T Everitt, A Lefrancq, L Orseau, ... arXiv preprint arXiv:1711.09883, 2017 | 3 | 2017 |
Twitter sentiment analysis for foreign exchange market movement prediction M Martic | | 2014 |