Tom Lieberum: Academic Profile - Academic Resource Search

Citations

	All	Since 2019
Citations	261	261
h-index	4	4
i10-index	2	2

Citations per year (bar chart): 2022: 3; 2023: 103; 2024: 155

Co-authors

Erik Jenner, UC Berkeley. Verified email at berkeley.edu

Tom Lieberum

Google DeepMind

Verified email at deepmind.com

Interests: deep learning, large language models, interpretability


Title	Cited by	Year
Progress measures for grokking via mechanistic interpretability. N Nanda, L Chan, T Lieberum, J Smith, J Steinhardt. arXiv preprint arXiv:2301.05217, 2023	206	2023
Does circuit analysis interpretability scale? Evidence from multiple choice capabilities in Chinchilla. T Lieberum, M Rahtz, J Kramár, G Irving, R Shah, V Mikulik. arXiv preprint arXiv:2307.09458, 2023	38	2023
AtP*: An efficient and scalable method for localizing LLM behaviour to components. J Kramár, T Lieberum, R Shah, N Nanda. arXiv preprint arXiv:2403.00745, 2024	8	2024
Retrospective on the 2021 MineRL BASALT competition on learning from human feedback. R Shah, SH Wang, C Wild, S Milani, A Kanervisto, VG Goecks, ... NeurIPS 2021 Competitions and Demonstrations Track, 259-272, 2022	8	2022
Retrospective on the 2021 BASALT Competition on Learning from Human Feedback. R Shah, SH Wang, C Wild, S Milani, A Kanervisto, VG Goecks, ... arXiv preprint arXiv:2204.07123, 2022	1	2022
Replication: Fairness without demographics through Adversarially Reweighted Learning. E Jenner, T Lieberum, FP Nolte, N Rutsch

Articles 1–6
