Peter Hase - Academic Profile - Scholar Search

Cited by

	All	Since 2019
Citations	1408	1408
h-index	16	16
i10-index	18	18

Citations per year (bar chart): 2020: 22, 2021: 107, 2022: 236, 2023: 433, 2024: 605

Public access

3 articles available
0 articles not available

Based on funding mandates

Co-authors

Mohit Bansal - Parker Distinguished Professor, Computer Science, UNC Chapel Hill - Verified email at cs.unc.edu
Swarnadeep Saha - PhD Student, University of North Carolina at Chapel Hill - Verified email at cs.unc.edu
Cynthia Rudin - Professor of Computer Science, ECE, Statistics, and Biostatistics & Bioinformatics, Duke University - Verified email at cs.duke.edu
Shiyue Zhang - Bloomberg AI - Verified email at cs.unc.edu
Srini Iyer - FAIR - Verified email at fb.com
Asma Ghandeharioun - Research Scientist, Google Research - Verified email at google.com
Been Kim - Google DeepMind - Verified email at csail.mit.edu
Zhuofan Ying - Columbia University - Verified email at columbia.edu
Peter Clark - Allen Institute for Artificial Intelligence (AI2) - Verified email at allenai.org
Sarah Wiegreffe - Allen Institute for AI & University of Washington - Verified email at allenai.org

Peter Hase

PhD Student, University of North Carolina at Chapel Hill

Verified email at cs.unc.edu - Homepage

Interpretable Machine Learning, Natural Language Processing


Title	Cited by	Year
Evaluating explainable AI: Which algorithmic explanations help users predict model behavior? P Hase, M Bansal. arXiv preprint arXiv:2005.01831, 2020	283	2020
Open problems and fundamental limitations of reinforcement learning from human feedback. S Casper, X Davies, C Shi, TK Gilbert, J Scheurer, J Rando, R Freedman, ... arXiv preprint arXiv:2307.15217, 2023	242	2023
GrIPS: Gradient-free, edit-based instruction search for prompting large language models. A Prasad, P Hase, X Zhou, M Bansal. arXiv preprint arXiv:2203.07281, 2022	123	2022
Interpretable image recognition with hierarchical prototypes. P Hase, C Chen, O Li, C Rudin. Proceedings of the AAAI Conference on Human Computation and Crowdsourcing 7 …, 2019	110	2019
Do language models have beliefs? Methods for detecting, updating, and visualizing model beliefs. P Hase, M Diab, A Celikyilmaz, X Li, Z Kozareva, V Stoyanov, M Bansal, ... arXiv preprint arXiv:2111.13654, 2021	96*	2021
FastIF: Scalable influence functions for efficient model interpretation and debugging. H Guo, NF Rajani, P Hase, M Bansal, C Xiong. arXiv preprint arXiv:2012.15781, 2020	90	2020
Leakage-adjusted simulatability: Can models generate non-trivial explanations of their behavior in natural language? P Hase, S Zhang, H Xie, M Bansal. arXiv preprint arXiv:2010.04119, 2020	84	2020
Does localization inform editing? Surprising differences in causality-based localization vs. knowledge editing in language models. P Hase, M Bansal, B Kim, A Ghandeharioun. Advances in Neural Information Processing Systems 36, 2024	72	2024
The Out-of-Distribution Problem in Explainability and Search Methods for Feature Importance Explanations. P Hase, H Xie, M Bansal. Advances in Neural Information Processing Systems 34, 2021	70	2021
When can models learn from explanations? A formal framework for understanding the roles of explanation data. P Hase, M Bansal. arXiv preprint arXiv:2102.02201, 2021	64	2021
Rethinking machine unlearning for large language models. S Liu, Y Yao, J Jia, S Casper, N Baracaldo, P Hase, X Xu, Y Yao, H Li, ... arXiv preprint arXiv:2402.08787, 2024	34	2024
Foundational challenges in assuring alignment and safety of large language models. U Anwar, A Saparov, J Rando, D Paleka, M Turpin, P Hase, ES Lubana, ... arXiv preprint arXiv:2404.09932, 2024	29	2024
Can sensitive information be deleted from LLMs? Objectives for defending against extraction attacks. V Patil, P Hase, M Bansal. arXiv preprint arXiv:2309.17410, 2023	26	2023
Can language models teach? Teacher explanations improve student performance via personalization. S Saha, P Hase, M Bansal. Advances in Neural Information Processing Systems 36, 2024	21*	2024
Summarization programs: Interpretable abstractive summarization with neural modular trees. S Saha, S Zhang, P Hase, M Bansal. arXiv preprint arXiv:2209.10492, 2022	16	2022
Low-cost algorithmic recourse for users with uncertain cost functions. P Yadav, P Hase, M Bansal. arXiv preprint arXiv:2111.01235, 2021	16	2021
VisFIS: Visual feature importance supervision with right-for-the-right-reason objectives. Z Ying, P Hase, M Bansal. Advances in Neural Information Processing Systems 35, 17057-17072, 2022	11	2022
Are hard examples also harder to explain? A study with human and model-generated explanations. S Saha, P Hase, N Rajani, M Bansal. arXiv preprint arXiv:2211.07517, 2022	10	2022
Shall I compare thee to a machine-written sonnet? An approach to algorithmic sonnet generation. J Benhardt, P Hase, L Zhu, C Rudin. arXiv preprint arXiv:1811.05067, 2018	5	2018
The unreasonable effectiveness of easy training data for hard tasks. P Hase, M Bansal, P Clark, S Wiegreffe. arXiv preprint arXiv:2401.06751, 2024	4	2024

Articles 1–20
