NICE: Non-linear independent components estimation L Dinh, D Krueger, Y Bengio arXiv preprint arXiv:1410.8516, 2014 | 2300 | 2014 |
A closer look at memorization in deep networks D Krueger, N Ballas, S Jastrzebski, D Arpit, MS Kanwal, T Maharaj, ... International Conference on Machine Learning (ICML) 2017, 2017 | 1917* | 2017 |
Out-of-distribution generalization via risk extrapolation (REx) D Krueger, E Caballero, JH Jacobsen, A Zhang, J Binas, D Zhang, ... International Conference on Machine Learning, 5815-5826, 2021 | 792 | 2021 |
Neural autoregressive flows CW Huang, D Krueger, A Lacoste, A Courville International Conference on Machine Learning (ICML) 2018, 2018 | 503 | 2018 |
Zoneout: Regularizing RNNs by randomly preserving hidden activations D Krueger, T Maharaj, J Kramár, M Pezeshki, N Ballas, NR Ke, A Goyal, ... International Conference on Learning Representations (ICLR) 2017, 2016 | 380 | 2016 |
Toward trustworthy AI development: mechanisms for supporting verifiable claims M Brundage, S Avin, J Wang, H Belfield, G Krueger, G Hadfield, H Khlaaf, ... arXiv preprint arXiv:2004.07213, 2020 | 346 | 2020 |
Scalable agent alignment via reward modeling: a research direction J Leike, D Krueger, T Everitt, M Martic, V Maini, S Legg arXiv preprint arXiv:1811.07871, 2018 | 257 | 2018 |
Open problems and fundamental limitations of reinforcement learning from human feedback S Casper, X Davies, C Shi, TK Gilbert, J Scheurer, J Rando, R Freedman, ... arXiv preprint arXiv:2307.15217, 2023 | 209 | 2023 |
Bayesian hypernetworks D Krueger, CW Huang, R Islam, R Turner, A Lacoste, A Courville arXiv preprint arXiv:1710.04759, 2017 | 165 | 2017 |
Defining and characterizing reward gaming J Skalse, N Howe, D Krasheninnikov, D Krueger Advances in Neural Information Processing Systems 35, 9460-9471, 2022 | 120 | 2022 |
Zero-bias autoencoders and the benefits of co-adapting features K Konda, R Memisevic, D Krueger International Conference on Learning Representations (ICLR) 2015, 2014 | 106* | 2014 |
Nested LSTMs JRA Moniz, D Krueger Asian Conference on Machine Learning, 530-544, 2017 | 87 | 2017 |
Regularizing RNNs by stabilizing activations D Krueger, R Memisevic International Conference on Learning Representations (ICLR) 2016, 2015 | 81 | 2015 |
Goal misgeneralization in deep reinforcement learning LL Di Langosco, J Koch, LD Sharkey, J Pfau, D Krueger International Conference on Machine Learning, 12004-12019, 2022 | 75 | 2022 |
Hidden Incentives for Auto-Induced Distributional Shift D Krueger, T Maharaj, J Leike arXiv preprint arXiv:2009.09153, 2020 | 58* | 2020 |
Broken neural scaling laws E Caballero, K Gupta, I Rish, D Krueger arXiv preprint arXiv:2210.14891, 2022 | 50 | 2022 |
AI research considerations for human existential safety (ARCHES) A Critch, D Krueger arXiv preprint arXiv:2006.04948, 2020 | 50 | 2020 |
Managing AI risks in an era of rapid progress Y Bengio, G Hinton, A Yao, D Song, P Abbeel, YN Harari, YQ Zhang, ... arXiv preprint arXiv:2310.17688, 2023 | 44 | 2023 |
Harms from increasingly agentic algorithmic systems A Chan, R Salganik, A Markelius, C Pang, N Rajkumar, D Krasheninnikov, ... Proceedings of the 2023 ACM Conference on Fairness, Accountability, and …, 2023 | 36 | 2023 |
Mechanistic mode connectivity ES Lubana, EJ Bigelow, RP Dick, D Krueger, H Tanaka International Conference on Machine Learning, 22965-23004, 2023 | 35 | 2023 |