Self-modification of policy and utility function in rational agents T Everitt, D Filan, M Daswani, M Hutter Artificial General Intelligence: 9th International Conference, AGI 2016, New …, 2016 | 31 | 2016 |
Clusterability in neural networks D Filan, S Casper, S Hod, C Wild, A Critch, S Russell arXiv preprint arXiv:2103.03386, 2021 | 27 | 2021 |
Modeling agents with probabilistic programs O Evans, A Stuhlmüller, J Salvatier, D Filan URL: http://agentmodels.org, 2017 | 25 | 2017 |
Pruned Neural Networks are Surprisingly Modular D Filan, S Hod, C Wild, A Critch, S Russell arXiv preprint arXiv:2003.04881, 2020 | 18 | 2020 |
Graphical clusterability and local specialization in deep neural networks S Casper, S Hod, D Filan, C Wild, A Critch, S Russell ICLR 2022 Workshop on PAIR^2Struct: Privacy …, 2022 | 9 | 2022 |
Detecting modularity in deep neural networks S Hod, S Casper, D Filan, C Wild, A Critch, S Russell | 9 | 2021 |
Quantifying local specialization in deep neural networks S Hod, D Filan, S Casper, A Critch, S Russell arXiv preprint arXiv:2110.08058, 2021 | 8 | 2021 |
Loss bounds and time complexity for speed priors D Filan, J Leike, M Hutter Artificial Intelligence and Statistics, 1394-1402, 2016 | 6 | 2016 |
Exploring hierarchy-aware inverse reinforcement learning C Cundy, D Filan arXiv preprint arXiv:1807.05037, 2018 | 4 | 2018 |
Importance and Coherence: Methods for Evaluating Modularity in Neural Networks S Hod, S Casper, D Filan, C Wild, A Critch, S Russell | 2 | 2020 |
A nice representation of the Laplacian D Filan | | 2022 |
Agents Using Speed Priors D Filan | | 2015 |
What would it have looked like if it looked like I were in a superposition? D Filan, JJ Hope arXiv preprint arXiv:1509.04398, 2015 | | 2015 |
Loss Bounds and Time Complexity for Speed Priors: Supplementary Material D Filan, J Leike, M Hutter | | |